scripts — tests

Module: scripts-tests Cohesion: 0.80 Members: 0

The scripts/tests module houses the Real-Conditions Test Suite for Code Buddy. The suite validates the application's core functionality by importing real modules and, where applicable, making live API calls. Unlike unit tests that rely heavily on mocks, it exercises the integration and behavior of subsystems under conditions close to a production environment.

Purpose

The primary goal of this test suite is to ensure the stability and correctness of Code Buddy's internal logic and external integrations. It achieves this by:

This suite is crucial for catching integration bugs and regressions that might be missed by isolated unit tests.

How to Run

To execute the entire test suite, follow these steps:

  1. Set your Google API Key: Many tests interact with the Gemini API.
    export GOOGLE_API_KEY="AIza..."

  2. Execute the runner script:
    npx tsx scripts/run-all-tests.ts

Test reports are automatically saved to .custom-output/gemini-extended-test-{timestamp}.json.
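
Step 1 can be guarded with a small preflight check. The sketch below is an assumption, not part of the documented runner: requireApiKey and maskKey are hypothetical helper names, and this section does not describe how run-all-tests.ts itself reacts to a missing key.

```typescript
// Hypothetical preflight helpers; not taken from scripts/run-all-tests.ts.

// Return the trimmed key, or throw with an actionable message when it is missing.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.GOOGLE_API_KEY?.trim();
  if (!key) {
    throw new Error(
      'GOOGLE_API_KEY is not set. Export it first, e.g. export GOOGLE_API_KEY="AIza..."',
    );
  }
  return key;
}

// Mask the key before logging so the full value never reaches a terminal or report.
function maskKey(key: string): string {
  return key.length <= 4 ? "****" : `${key.slice(0, 4)}... (${key.length} chars)`;
}
```

A runner could call requireApiKey(process.env) before importing any category file, so a missing key fails once up front instead of once per API test.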

Architecture and Execution Flow

The test suite is structured around a set of TypeScript files (cat-*.ts) that define categories and individual tests. The scripts/run-all-tests.ts script acts as the orchestrator, importing these category files and executing their defined tests.

Key Components (scripts/tests/types.ts)

The types.ts file defines the fundamental structures and helper functions for the test harness:

    interface TestDef {
      name: string; // Unique name for the test (e.g., "91.1-instantiation")
      fn: () => Promise<{ pass: boolean; metadata?: Record<string, unknown>; tokenUsage?: TokenUsage }>; // The test logic
      timeout?: number; // Optional timeout in milliseconds
      retries?: number; // Optional number of retries for flaky tests
      mandatory?: boolean; // If true, failure aborts the entire suite
    }
    interface CategoryDef {
      name: string; // Name of the category (e.g., "Cat 91: Lessons Tracker")
      tests: TestDef[]; // Array of TestDef objects
      abortOnFirst?: boolean; // If true, category execution stops on first test failure
    }
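
As an illustration of how these shapes fit together, here is a minimal sketch of a category with two tests. The interfaces are repeated locally so the block runs on its own. The TokenUsage fields, the category name, and both test bodies are assumptions made for the example, not code from the suite.

```typescript
// TokenUsage is referenced but not shown in this section; these fields are assumed.
interface TokenUsage { inputTokens: number; outputTokens: number }
interface TestResult { pass: boolean; metadata?: Record<string, unknown>; tokenUsage?: TokenUsage }

interface TestDef {
  name: string;
  fn: () => Promise<TestResult>;
  timeout?: number;
  retries?: number;
  mandatory?: boolean;
}

interface CategoryDef {
  name: string;
  tests: TestDef[];
  abortOnFirst?: boolean;
}

// Hypothetical category: names and logic are illustrative only.
const exampleCategory: CategoryDef = {
  name: "Cat 0: Example",
  abortOnFirst: true, // stop this category at the first failing test
  tests: [
    {
      name: "0.1-array-sum",
      timeout: 5_000, // milliseconds
      fn: async () => {
        const total = [1, 2, 3].reduce((a, b) => a + b, 0);
        return { pass: total === 6, metadata: { total } };
      },
    },
    {
      name: "0.2-json-roundtrip",
      retries: 1, // allow one extra attempt
      fn: async () => {
        const input = { id: 7, tags: ["a", "b"] };
        const output = JSON.parse(JSON.stringify(input));
        return { pass: output.id === 7 && output.tags.length === 2 };
      },
    },
  ],
};
```

Returning metadata alongside pass keeps diagnostic values in the result rather than in console output.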

Execution Flow

The run-all-tests.ts script dynamically imports all cat-*.ts files. For each file, it iterates through the exported CategoryDef arrays, calling runCategory for each. runCategory then calls runTest for every TestDef within that category. The TestDef.fn is where the actual application code from src/ is imported and exercised.

graph TD
    A["scripts/run-all-tests.ts"] --> B{"Import cat-*.ts files"};
    B --> C{"For each CategoryDef"};
    C --> D["runCategory(category.name, category.tests)"];
    D --> E{"For each TestDef in category.tests"};
    E --> F["runTest(testDef, category)"];
    F --> G["runWithRetry(testDef.fn)"];
    G --> H{"testDef.fn()"};
    H --> I["Imports src/module.js"];
    I --> J["Executes src/module.js logic"];
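
The flow above can be sketched as a small harness. This is not the actual run-all-tests.ts implementation. The signatures, the 30-second default timeout, the withTimeout helper, and the TestOutcome shape are all assumptions, chosen to match the documented meaning of timeout, retries, mandatory, and abortOnFirst.

```typescript
interface TestResult { pass: boolean; metadata?: Record<string, unknown> }
interface TestDef {
  name: string;
  fn: () => Promise<TestResult>;
  timeout?: number;
  retries?: number;
  mandatory?: boolean;
}
interface CategoryDef { name: string; tests: TestDef[]; abortOnFirst?: boolean }
interface TestOutcome { name: string; pass: boolean; attempts: number; error?: string }

// Reject if the work does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Run fn up to retries + 1 times; a throw, a timeout, or pass === false is a failed attempt.
async function runWithRetry(test: TestDef, defaultTimeoutMs = 30_000): Promise<TestOutcome> {
  const maxAttempts = (test.retries ?? 0) + 1;
  let lastError: string | undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = await withTimeout(test.fn(), test.timeout ?? defaultTimeoutMs);
      if (result.pass) return { name: test.name, pass: true, attempts: attempt };
      lastError = "test returned pass: false";
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { name: test.name, pass: false, attempts: maxAttempts, error: lastError };
}

// Run a category in order; report whether a mandatory failure should abort the suite.
async function runCategory(
  category: CategoryDef,
): Promise<{ outcomes: TestOutcome[]; abortSuite: boolean }> {
  const outcomes: TestOutcome[] = [];
  for (const test of category.tests) {
    const outcome = await runWithRetry(test);
    outcomes.push(outcome);
    if (!outcome.pass && test.mandatory) return { outcomes, abortSuite: true };
    if (!outcome.pass && category.abortOnFirst) break;
  }
  return { outcomes, abortSuite: false };
}
```

Treating a thrown error, a timeout, and pass: false as the same kind of failed attempt keeps the retry loop simple. A real runner may distinguish between them in its report.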

For API-dependent tests (e.g., cat-api-advanced.ts, cat-api-gemini-extended.ts), the run-all-tests.ts script also calls initApiAdvanced and initApiGeminiExtended to inject the GeminiProvider instance and the GOOGLE_API_KEY into the test files, allowing them to make authenticated API calls.
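
The injection step can be pictured as a module-level setter that the runner calls before executing the tests. The sketch below only illustrates that pattern. ProviderLike is a stand-in for GeminiProvider, the initApiAdvanced parameters are assumed because this section does not show its signature, and apiEchoTest is a hypothetical test.

```typescript
// Stand-in for the real provider type, which lives in src/ and is not reproduced here.
interface ProviderLike {
  complete(prompt: string): Promise<string>;
}

// Module-level state that API tests read when they run.
let provider: ProviderLike | null = null;
let apiKey: string | null = null;

// Assumed signature: called once by the runner before any API test executes.
function initApiAdvanced(injectedProvider: ProviderLike, injectedKey: string): void {
  provider = injectedProvider;
  apiKey = injectedKey;
}

// Hypothetical API test: fails fast with a reason when nothing was injected.
async function apiEchoTest(): Promise<{ pass: boolean; metadata?: Record<string, unknown> }> {
  if (!provider || !apiKey) {
    return { pass: false, metadata: { reason: "provider not initialised" } };
  }
  const reply = await provider.complete("Reply with the single word: pong");
  return { pass: reply.toLowerCase().includes("pong"), metadata: { reply } };
}
```

Because the provider is injected instead of constructed inside each test file, the same test body can run against a stub during local development and against the live API in a full run.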

Test Structure and Organization

Tests are organized into categories, each focusing on a specific module or subsystem. Files follow the cat-<description>.ts naming pattern (e.g., cat-agent-advanced.ts), so the file name indicates which area it covers.

Each cat-*.ts file exports one or more functions (e.g., cat91LessonsTracker(), cat92TodoTracker()) that return an array of TestDef objects. The tests within these arrays are numbered sequentially (e.g., 91.1, 91.2), making it easy to reference specific test cases.
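
One way to keep the numbering sequential is to derive test names from their position in the array. The helper below is a sketch under that assumption. numbered and cat0Example are hypothetical names, and the suite may simply write the names by hand.

```typescript
interface TestResult { pass: boolean; metadata?: Record<string, unknown> }
interface TestDef { name: string; fn: () => Promise<TestResult> }

// Hypothetical helper: builds "<category>.<index>-<slug>" names from array order.
function numbered(
  categoryNumber: number,
  tests: Array<[slug: string, fn: TestDef["fn"]]>,
): TestDef[] {
  return tests.map(([slug, fn], index) => ({
    name: `${categoryNumber}.${index + 1}-${slug}`,
    fn,
  }));
}

// Illustrative category function in the style of cat91LessonsTracker(); not the real body.
function cat0Example(): TestDef[] {
  return numbered(0, [
    ["instantiation", async () => ({ pass: new Map().size === 0 })],
    ["insert-and-read", async () => {
      const store = new Map<string, number>([["a", 1]]);
      return { pass: store.get("a") === 1 };
    }],
  ]);
}
```

With this approach, inserting a test in the middle renumbers the ones after it. That is convenient for ordering but changes the identifiers of existing tests, so hand-written names may be preferable when test numbers are referenced elsewhere.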

Example: cat-agent-advanced.ts

This file covers several agent-related functionalities:

Many tests in cat-agent-advanced.ts (and other files) demonstrate common patterns like:

Coverage and Scope

The test suite provides extensive coverage across Code Buddy's architecture. The TESTS.md file includes a detailed "Summary by Subsystem" table, which is the best reference for understanding the breadth of coverage.

Key areas covered include:

The suite includes approximately 560 non-API tests (unit/integration), 26 API tests, and a few mixed tests, so both internal logic and external integrations are exercised.

Integration with the Codebase

The scripts/tests module is tightly integrated with the main src/ codebase. Test functions directly import and instantiate classes or call functions from src/ modules. For example, cat91LessonsTracker imports LessonsTracker from ../../src/agent/lessons-tracker.js. This direct import strategy ensures that the tests are validating the actual production code, not just mocked interfaces.

The GOOGLE_API_KEY is passed to the GeminiProvider and CodeBuddyClient instances, which are then used by the API tests. This setup allows the tests to perform real network requests to the LLM, providing confidence in the API integration layer.

Developers contributing to Code Buddy should refer to this test suite to understand how different modules are expected to behave and to add new tests for any new features or bug fixes. Adhering to the existing TestDef and CategoryDef patterns will ensure consistency and maintainability of the test suite.