tests — observability
This document describes the run-store.test.ts module, which is the test suite for the RunStore class located in src/observability/run-store.ts.
RunStore Test Suite Documentation
Introduction
The run-store.test.ts module provides comprehensive unit and integration tests for the RunStore class. RunStore is a critical component responsible for managing the lifecycle, events, metrics, and artifacts of individual execution runs within the system. This test suite ensures the reliability, data integrity, and correct behavior of RunStore's persistence and retrieval mechanisms.
Purpose of the Test Suite
The primary purpose of this test suite is to:
- Verify Core Functionality: Ensure that RunStore correctly creates, updates, and retrieves run data, including events, metrics, and artifacts.
- Validate Data Integrity: Confirm that data written to disk (e.g., events.jsonl, metrics.json, artifacts) is correctly stored and can be accurately read back.
- Test Edge Cases: Cover scenarios like unknown run IDs, concurrent operations (implicitly through rapid run creation), and data retention limits.
- Ensure Resource Management: Verify that temporary files and directories are created and cleaned up properly, and that write streams are handled correctly.
- Confirm Singleton Behavior: Validate that RunStore adheres to its singleton pattern.
Core Functionality Under Test
The test suite is structured around the key public methods and behaviors of the RunStore class.
1. Run Lifecycle Management (startRun, endRun)
- startRun(objective: string, meta?: RunMetadata):
  - Verifies that startRun generates unique run IDs (e.g., run_...).
  - Confirms that a dedicated directory for each run is created, containing events.jsonl and metrics.json files, along with an artifacts subdirectory.
  - Ensures that a run_start event is correctly recorded in events.jsonl with the provided objective and metadata.
- endRun(runId: string, status: RunStatus):
  - Tests that endRun updates the run's status (e.g., completed, failed, cancelled) and records the endedAt timestamp in the run's summary.
  - Confirms that a run_end event is emitted to events.jsonl with the final status.
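The on-disk layout these tests check can be illustrated with a small stand-alone sketch. It does not use the real RunStore: createRunDir is a hypothetical helper, and the exact fields of the run_start event are an assumption based on the description above.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for the layout described above; not the real RunStore.
function createRunDir(baseDir: string, runId: string, objective: string): string {
  const runDir = path.join(baseDir, runId);
  // Creating artifacts/ recursively also creates the run directory itself.
  fs.mkdirSync(path.join(runDir, "artifacts"), { recursive: true });
  // Assumed event shape: one JSON object per line in events.jsonl.
  const startEvent = { type: "run_start", runId, ts: Date.now(), objective };
  fs.writeFileSync(path.join(runDir, "events.jsonl"), JSON.stringify(startEvent) + "\n");
  fs.writeFileSync(path.join(runDir, "metrics.json"), JSON.stringify({}));
  return runDir;
}

const layoutBase = fs.mkdtempSync(path.join(os.tmpdir(), "runstore-layout-"));
const layoutRunDir = createRunDir(layoutBase, "run_example", "demo objective");
const layoutEntries = fs.readdirSync(layoutRunDir).sort();
const firstEvent = JSON.parse(
  fs.readFileSync(path.join(layoutRunDir, "events.jsonl"), "utf8").trim()
);
fs.rmSync(layoutBase, { recursive: true, force: true });
```

A test in this style would assert on the directory entries and on the first parsed event rather than on implementation details of the store.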
2. Event Logging (emit, getEvents)
- emit(runId: string, event: RunEvent):
  - Validates that arbitrary RunEvent objects are appended to the events.jsonl file for the specified run.
  - Ensures that each emitted event automatically includes a timestamp (ts) and the runId.
  - Tests the resilience of emit by confirming it silently ignores attempts to emit events for unknown runIds.
- getEvents(runId: string):
  - Verifies that getEvents can correctly read and parse all events from a run's events.jsonl file.
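The append-then-parse round trip behind these checks can be sketched as follows. This is a simplified illustration under stated assumptions: appendEvent and readEvents are hypothetical helpers, the event types are made up, and synchronous appends are used here in place of the write streams the real class uses.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type SketchEvent = { type: string; [key: string]: unknown };

// Hypothetical helper: stamp the event with ts and runId, then append one JSON line.
function appendEvent(file: string, runId: string, event: SketchEvent): void {
  const line = JSON.stringify({ ...event, ts: Date.now(), runId });
  fs.appendFileSync(file, line + "\n");
}

// Hypothetical helper: parse every non-empty line of a JSONL file.
function readEvents(file: string): Array<Record<string, unknown>> {
  return fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter((l) => l.trim().length > 0)
    .map((l) => JSON.parse(l) as Record<string, unknown>);
}

const eventsDir = fs.mkdtempSync(path.join(os.tmpdir(), "runstore-events-"));
const eventsFile = path.join(eventsDir, "events.jsonl");
appendEvent(eventsFile, "run_example", { type: "tool_call", name: "search" });
appendEvent(eventsFile, "run_example", { type: "tool_result", ok: true });
const parsedEvents = readEvents(eventsFile);
fs.rmSync(eventsDir, { recursive: true, force: true });
```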
3. Artifact Management (saveArtifact, getArtifact)
- saveArtifact(runId: string, filename: string, content: string | Buffer):
  - Tests that artifacts are correctly written to the artifacts/ subdirectory within a run's directory.
  - Confirms that the content written matches the input.
  - Ensures that saved artifacts are listed in the artifacts array when retrieving a run record via getRun.
- getArtifact(runId: string, filename: string):
  - Validates that getArtifact can retrieve the content of a previously saved artifact.
  - Tests that getArtifact returns null for non-existent artifacts.
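A minimal sketch of the save, read back, and missing-file cases is given here. writeArtifact and readArtifact are hypothetical stand-ins, and returning the content as a UTF-8 string is an assumption; the real getArtifact may return a different type.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: write content under <runDir>/artifacts/<filename>.
function writeArtifact(runDir: string, filename: string, content: string | Buffer): void {
  const dir = path.join(runDir, "artifacts");
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, filename), content);
}

// Hypothetical helper: return null instead of throwing when the artifact is missing.
function readArtifact(runDir: string, filename: string): string | null {
  const file = path.join(runDir, "artifacts", filename);
  return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : null;
}

const artifactRunDir = fs.mkdtempSync(path.join(os.tmpdir(), "runstore-artifacts-"));
writeArtifact(artifactRunDir, "report.txt", "hello artifact");
const savedContent = readArtifact(artifactRunDir, "report.txt");
const missingContent = readArtifact(artifactRunDir, "missing.txt");
const listedArtifacts = fs.readdirSync(path.join(artifactRunDir, "artifacts"));
fs.rmSync(artifactRunDir, { recursive: true, force: true });
```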
4. Run Retrieval and Listing (getRun, listRuns)
- getRun(runId: string):
  - Verifies that getRun returns a complete RunRecord for a known runId, including its summary, metrics, and a list of artifacts.
  - Tests that getRun returns null for unknown runIds.
- listRuns(limit?: number):
  - Ensures that listRuns returns runs sorted by startedAt in descending order (most recent first).
  - Validates that the limit parameter correctly restricts the number of returned runs.
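The ordering and limit behaviour can be expressed as a small pure function. This is only an illustration: RunSummarySketch and listMostRecent are hypothetical names, and treating startedAt as a numeric timestamp is an assumption.

```typescript
// Hypothetical summary shape; the real RunRecord has more fields.
interface RunSummarySketch {
  runId: string;
  startedAt: number; // assumed to be a numeric timestamp
}

// Sort a copy by startedAt descending, then apply the optional limit.
function listMostRecent(runs: RunSummarySketch[], limit?: number): RunSummarySketch[] {
  const sorted = [...runs].sort((a, b) => b.startedAt - a.startedAt);
  return limit === undefined ? sorted : sorted.slice(0, limit);
}

const sampleRuns: RunSummarySketch[] = [
  { runId: "run_a", startedAt: 100 },
  { runId: "run_b", startedAt: 300 },
  { runId: "run_c", startedAt: 200 },
];
const allRunIds = listMostRecent(sampleRuns).map((r) => r.runId);
const limitedRunIds = listMostRecent(sampleRuns, 2).map((r) => r.runId);
```

Sorting a copy rather than the input keeps the helper free of side effects, which makes assertions on the original array safe.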
5. Metrics Updates (updateMetrics)
- While updateMetrics is not directly tested in its own describe block, its effect is verified through getRun. The test confirms that metrics updated via store.updateMetrics(runId, { totalTokens: 1000, totalCost: 0.01 }) are correctly reflected in the RunRecord returned by getRun.
6. Data Retention and Pruning
- The test suite includes a specific test case to ensure that RunStore's internal pruning mechanism limits the total number of stored runs (e.g., to a maximum of 30 runs), preventing unbounded disk usage.
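One way such a retention limit could work is sketched here, with a limit of 3 instead of 30 to keep the example small. pruneRunDirs is a hypothetical helper, and the assumption that directory names sort chronologically is made only for this illustration; the real pruning logic is not described in this document.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: delete the oldest run directories beyond maxRuns.
// Assumes directory names sort chronologically (oldest first).
function pruneRunDirs(baseDir: string, maxRuns: number): string[] {
  const dirs = fs
    .readdirSync(baseDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name)
    .sort();
  const excess = dirs.slice(0, Math.max(0, dirs.length - maxRuns));
  for (const name of excess) {
    fs.rmSync(path.join(baseDir, name), { recursive: true, force: true });
  }
  return excess;
}

const pruneBase = fs.mkdtempSync(path.join(os.tmpdir(), "runstore-prune-"));
for (const name of ["run_001", "run_002", "run_003", "run_004", "run_005"]) {
  fs.mkdirSync(path.join(pruneBase, name));
}
const prunedDirs = pruneRunDirs(pruneBase, 3);
const remainingDirs = fs.readdirSync(pruneBase).sort();
fs.rmSync(pruneBase, { recursive: true, force: true });
```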
7. Singleton Pattern Enforcement
- A dedicated test confirms that RunStore.getInstance() always returns the same instance of the RunStore, upholding the singleton design pattern.
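The identity check, and the effect of resetting the cached instance between tests, can be shown with a generic singleton. SingletonSketch and its resetInstance method are hypothetical; the actual tests are described as resetting the internal _instance property directly.

```typescript
// Generic singleton sketch; not the real RunStore class.
class SingletonSketch {
  private static _instance: SingletonSketch | null = null;

  private constructor() {}

  // Lazily create the shared instance on first access.
  static getInstance(): SingletonSketch {
    if (SingletonSketch._instance === null) {
      SingletonSketch._instance = new SingletonSketch();
    }
    return SingletonSketch._instance;
  }

  // Hypothetical test hook: drop the cached instance so the next call creates a new one.
  static resetInstance(): void {
    SingletonSketch._instance = null;
  }
}

const firstInstance = SingletonSketch.getInstance();
const secondInstance = SingletonSketch.getInstance();
SingletonSketch.resetInstance();
const instanceAfterReset = SingletonSketch.getInstance();
```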
Test Environment and Utilities
The test suite uses several helper functions and Jest hooks to manage the test environment.
- makeTmpDir(): Creates a unique temporary directory for each test using os.tmpdir() and fs.mkdtempSync. This ensures isolation between tests and prevents side effects.
- cleanDir(dir: string): Recursively removes a given directory, used for cleaning up temporary test data. It includes error handling to ignore cleanup failures.
- beforeEach():
  - Initializes a new temporary directory (tmpDir).
  - Creates a new RunStore instance, pointing it to the tmpDir.
  - Resets activeRunIds to track runs created within the current test.
- afterEach():
  - Iterates through all activeRunIds and calls store.endRun() to ensure all write streams are properly closed. This must happen before the temporary directory is deleted.
  - Includes a setTimeout (e.g., 80ms) to allow Node.js write streams time to flush their buffers and close the underlying file handles. Without this, fs.rmSync might fail due to open files.
  - Calls cleanDir(tmpDir) to remove all test-generated files and directories.
  - Resets the internal _instance property of the RunStore singleton to null, ensuring that RunStore.getInstance() returns a fresh instance for the next test block.
- startRun(objective: string, meta?: Parameters<...>[1]): A wrapper function around store.startRun that also adds the newly created runId to the activeRunIds array, simplifying cleanup in afterEach.
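The two directory helpers might look roughly like the sketch below. The function names come from the description above, but the bodies and the "run-store-test-" prefix are assumptions, not the suite's actual code.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Create a unique temporary directory; the prefix is an assumed example.
function makeTmpDir(): string {
  return fs.mkdtempSync(path.join(os.tmpdir(), "run-store-test-"));
}

// Recursively remove a directory, ignoring any cleanup failure.
function cleanDir(dir: string): void {
  try {
    fs.rmSync(dir, { recursive: true, force: true });
  } catch {
    // Cleanup is best-effort; a failure here should not fail the test.
  }
}

const helperTmpDir = makeTmpDir();
fs.writeFileSync(path.join(helperTmpDir, "scratch.txt"), "temporary data");
const existedBeforeClean = fs.existsSync(helperTmpDir);
cleanDir(helperTmpDir);
const existsAfterClean = fs.existsSync(helperTmpDir);
```

Because fs.mkdtempSync appends random characters to the prefix, each call yields a distinct directory, which is what gives the tests their isolation.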
Dependencies
This test module directly depends on:
- fs, os, path: Node.js built-in modules for file system operations.
- RunStore (from ../../src/observability/run-store.js): The class under test.
How to Run Tests
Assuming Jest or a compatible test runner is configured, these tests can typically be executed from the project root using:
npm test tests/observability/run-store.test.ts
# or
yarn test tests/observability/run-store.test.ts
Or, to run all tests:
npm test