src — performance

Module: src-performance Cohesion: 0.80 Members: 0

The src/performance module is the central hub for all performance-related optimizations, monitoring, and benchmarking within the codebase. Its primary goal is to ensure the application, especially when interacting with LLMs and external tools, operates efficiently, quickly, and within defined resource constraints.

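This module provides mechanisms for lazy loading of heavy modules, caching the results of deterministic tool calls, optimizing external API requests, benchmarking LLM performance, and monitoring memory usage and startup time.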

Module Architecture

The performance module is structured around a central PerformanceManager that orchestrates several specialized components.

graph TD
    subgraph Core Performance
        PM[PerformanceManager]
        LL[LazyLoader]
        TC[ToolCache]
        RO[RequestOptimizer]
    end

    subgraph Utilities
        BS[BenchmarkSuite]
        MM[MemoryMonitor]
        ST[StartupTimer]
        SC[SemanticCache]
    end

    PM -- Manages/Integrates --> LL
    PM -- Manages/Integrates --> TC
    PM -- Manages/Integrates --> RO
    PM -- Uses --> SC
    TC -- Uses --> SC

    src/performance/index.ts -- Re-exports --> PM
    src/performance/index.ts -- Re-exports --> LL
    src/performance/index.ts -- Re-exports --> TC
    src/performance/index.ts -- Re-exports --> RO
    src/performance/index.ts -- Re-exports --> BS
    src/performance/index.ts -- Re-exports --> MM
    src/performance/index.ts -- Re-exports --> ST

    MM -- from ../utils --> src/performance/index.ts
    ST -- from ../utils --> src/performance/index.ts
    SC -- from ../utils --> src/performance/performance-manager.ts
    SC -- from ../utils --> src/performance/tool-cache.ts

Key Components

  1. PerformanceManager: The central orchestrator.
  2. LazyLoader: Manages on-demand loading of modules.
  3. ToolCache: Caches results of deterministic tool calls.
  4. RequestOptimizer: Optimizes external API requests.
  5. BenchmarkSuite: Provides comprehensive LLM performance benchmarking.
  6. Re-exported Utilities: MemoryMonitor and StartupTimer from src/utils.

1. PerformanceManager (src/performance/performance-manager.ts)

The PerformanceManager is the core component for managing and coordinating all performance-related aspects of the application. It acts as a unified interface for enabling/disabling optimizations, recording metrics, and retrieving performance summaries.

Purpose

How it Works

Upon initialize(), the manager sets up instances of LazyLoader, ToolCache, RequestOptimizer, and SemanticCache based on its configuration. It then subscribes to events from these components (e.g., module:loaded from LazyLoader, hit/miss from ToolCache, success/failure/deduplicated from RequestOptimizer) to automatically record performance metrics.

Developers can use measureOperation() (a utility function wrapping manager.measure()) to easily track the performance of any asynchronous function.
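
As an illustration of this pattern, the sketch below times an asynchronous function and records a metric whether it resolves or rejects. The class name MiniMeasurer, the Metric shape, and the trimming logic are assumptions made for this example; the actual signatures of measureOperation() and manager.measure() are not shown in this document.

```typescript
// Hypothetical metric shape; the real manager's metric type may differ.
interface Metric {
  name: string;
  durationMs: number;
  success: boolean;
}

class MiniMeasurer {
  readonly metrics: Metric[] = [];
  private retention: number;

  // `retention` mirrors the idea behind `metricsRetention`: keep only the newest N metrics.
  constructor(retention: number) {
    this.retention = retention;
  }

  async measure<T>(name: string, fn: () => Promise<T>): Promise<T> {
    const start = Date.now();
    let success = false;
    try {
      const result = await fn();
      success = true;
      return result;
    } finally {
      // Runs on both success and failure, so failed operations are timed too.
      this.metrics.push({ name, durationMs: Date.now() - start, success });
      // Drop the oldest entries once the retention limit is exceeded.
      if (this.metrics.length > this.retention) {
        this.metrics.splice(0, this.metrics.length - this.retention);
      }
    }
  }
}
```

Recording in a finally block means a rejected operation still produces a metric, and the original error propagates to the caller unchanged.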

Core API

Configuration (PerformanceConfig)

Controls which optimizations are enabled, performance budgets, and metric retention.

interface PerformanceConfig {
  enabled: boolean;           // Overall enable/disable switch
  lazyLoading: boolean;
  toolCaching: boolean;
  requestOptimization: boolean;
  apiCaching: boolean;
  budgetMs: number;           // Threshold for 'budget:exceeded' event
  enableMetrics: boolean;
  metricsRetention: number;   // How many metrics to keep
}

Events

The PerformanceManager extends EventEmitter and emits events, including budget:exceeded, which is tied to the budgetMs threshold in PerformanceConfig.

Singleton Access

The getPerformanceManager() function ensures a single instance of the manager throughout the application lifecycle. initializePerformanceManager() is an async helper to get and initialize the manager.
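
The accessor pattern described here, and reused by the other components in this module, might look roughly like the sketch below. DemoManager, getDemoManager(), initializeDemoManager(), and resetDemoManager() are stand-in names for illustration only; they are not the module's real exports.

```typescript
// Stand-in for a component that needs asynchronous setup.
class DemoManager {
  initialized = false;

  async initialize(): Promise<void> {
    this.initialized = true;
  }
}

// Module-level slot holding the single shared instance.
let demoInstance: DemoManager | null = null;

// Creates the instance on first access and returns the same one afterwards.
function getDemoManager(): DemoManager {
  if (demoInstance === null) {
    demoInstance = new DemoManager();
  }
  return demoInstance;
}

// Async helper: fetch the singleton and initialize it if that has not happened yet.
async function initializeDemoManager(): Promise<DemoManager> {
  const manager = getDemoManager();
  if (!manager.initialized) {
    await manager.initialize();
  }
  return manager;
}

// Clears the slot so the next access builds a fresh instance (useful in tests).
function resetDemoManager(): void {
  demoInstance = null;
}
```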


2. LazyLoader (src/performance/lazy-loader.ts)

The LazyLoader is designed to improve application startup time by deferring the loading of heavy modules until they are actually needed.

Purpose

How it Works

Modules are register()ed with a loader function (typically an import()) and a name. When get(name) is called, the module's loader is executed only if the module hasn't been loaded yet. Once loaded, the instance is cached for subsequent calls.
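
A minimal sketch of this register-then-get flow follows. MiniLazyLoader is a hypothetical class written for this example, and caching the in-flight promise (so concurrent callers share one load) is a design assumption rather than a documented behavior of LazyLoader.

```typescript
type Loader<T> = () => Promise<T>;

class MiniLazyLoader {
  private loaders = new Map<string, Loader<unknown>>();
  // Caches the load promise, so the loader runs at most once per module name.
  // Note: in this sketch a failed load stays cached as a rejected promise.
  private loaded = new Map<string, Promise<unknown>>();

  register<T>(name: string, loader: Loader<T>): void {
    this.loaders.set(name, loader);
  }

  get<T>(name: string): Promise<T> {
    let promise = this.loaded.get(name);
    if (promise === undefined) {
      const loader = this.loaders.get(name);
      if (loader === undefined) {
        return Promise.reject(new Error(`Module not registered: ${name}`));
      }
      // First access: run the loader now and remember the result.
      promise = loader();
      this.loaded.set(name, promise);
    }
    return promise as Promise<T>;
  }
}
```

In real use the loader would typically be something like `() => import("./heavy-module.js")` (a hypothetical path), so the module's code is not evaluated until the first get() call.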

The schedulePreload() and scheduleIdlePreload() methods allow for intelligent background loading of modules after initial startup or during idle times, based on configured priorities and dependencies.

Core API

Configuration (LazyLoaderConfig)

interface LazyLoaderConfig {
  preloadDelay: number;       // Delay before starting preload
  preloadModules: string[];   // Modules to preload automatically
  enableMetrics: boolean;
  maxParallelLoads: number;   // Max concurrent loads during preload
  idlePreload: boolean;       // Enable idle-time preloading
}

LoadPriority constants (CRITICAL, HIGH, NORMAL, LOW, DEFERRED) help categorize modules for preloading.

Events

Integration

The module provides registerCommonModules(), initializeLazyLoader(), and initializeCLILazyLoader() to quickly set up the loader with common application dependencies and specific strategies for CLI startup. createDeferredLoader() is a helper for deferring initialization until after initial UI render.

Singleton Access

getLazyLoader() provides a singleton instance. resetLazyLoader() clears and resets it.


3. ToolCache (src/performance/tool-cache.ts)

The ToolCache optimizes tool calls by caching their results, especially for deterministic operations. It leverages semantic similarity to match similar queries, not just exact matches.

Purpose

How it Works

ToolCache wraps a SemanticCache instance. When getOrExecute() is called, it first uses isCacheable() to check whether the tool call can be cached (based on tool name, arguments, and exclusion patterns). If it can, ToolCache tries to retrieve a semantically similar result from the underlying SemanticCache. On a hit, the cached result is returned. Otherwise, executeFn is called, and its result is stored in the cache and then returned.

Mutable tools (e.g., bash, create_file) are explicitly excluded from caching.
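
The control flow above can be sketched as follows. To stay self-contained, this example swaps the SemanticCache for an exact-match Map keyed on the tool name and serialized arguments, so it does not perform semantic matching. MiniToolCache, the tool names in DEMO_CACHEABLE, and the hit/miss counters are assumptions for illustration.

```typescript
interface CacheEntry {
  value: unknown;
  expiresAt: number;
}

// Illustrative stand-ins for MUTABLE_TOOLS and `cacheableTools`.
const DEMO_MUTABLE = new Set(["bash", "create_file"]);
const DEMO_CACHEABLE = new Set(["read_file", "search"]); // hypothetical tool names

class MiniToolCache {
  private entries = new Map<string, CacheEntry>();
  private ttlMs: number;
  hits = 0;
  misses = 0;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  isCacheable(tool: string): boolean {
    return DEMO_CACHEABLE.has(tool) && !DEMO_MUTABLE.has(tool);
  }

  async getOrExecute<T>(
    tool: string,
    args: Record<string, unknown>,
    executeFn: () => Promise<T>,
  ): Promise<T> {
    // Non-cacheable calls always run and never touch the cache.
    if (!this.isCacheable(tool)) {
      return executeFn();
    }
    // Exact-match key; JSON.stringify is sensitive to property order.
    const key = `${tool}:${JSON.stringify(args)}`;
    const entry = this.entries.get(key);
    if (entry !== undefined && entry.expiresAt > Date.now()) {
      this.hits++;
      return entry.value as T;
    }
    this.misses++;
    const value = await executeFn();
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    return value;
  }
}
```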

Core API

Configuration (ToolCacheConfig)

interface ToolCacheConfig {
  enabled: boolean;
  ttlMs: number;               // Time-to-live for cache entries
  maxEntries: number;
  similarityThreshold: number; // For semantic matching
  cacheableTools: Set<string>; // List of tools that can be cached
  excludePatterns: RegExp[];   // Patterns in args that prevent caching
}

MUTABLE_TOOLS is a hardcoded set of tools that are never cached.

Events

The ToolCache forwards cache:hit and cache:miss events from its internal SemanticCache as hit and miss respectively.

Integration

Singleton Access

getToolCache() provides a singleton instance. resetToolCache() disposes and resets it.


4. RequestOptimizer (src/performance/request-optimizer.ts)

The RequestOptimizer is designed to make external API requests more robust and efficient by managing concurrency, batching, deduplication, and retries.

Purpose

How it Works

Requests are submitted via execute() with a key (for deduplication) and an executeFn. Requests are added to an internal queue and processed by processQueue() respecting maxConcurrent limits.
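
One way to picture the queueing and deduplication described above is the sketch below. MiniRequestOptimizer is a hypothetical reduction written for this example: it omits batching, retries, and timeouts, and the choice to share the in-flight promise between callers with the same key is an assumption about how deduplication could work.

```typescript
class MiniRequestOptimizer {
  private queue: Array<() => void> = [];
  private inFlight = new Map<string, Promise<unknown>>();
  private active = 0;
  private maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  execute<T>(key: string, executeFn: () => Promise<T>): Promise<T> {
    // Deduplication: a second caller with the same key shares the pending promise.
    const existing = this.inFlight.get(key);
    if (existing !== undefined) {
      return existing as Promise<T>;
    }

    const promise = new Promise<T>((resolve, reject) => {
      // Queue a starter; it only runs once a concurrency slot is free.
      this.queue.push(() => {
        this.active++;
        const done = () => {
          this.active--;
          this.inFlight.delete(key);
          this.processQueue();
        };
        Promise.resolve()
          .then(executeFn)
          .then(
            (value) => { done(); resolve(value); },
            (error) => { done(); reject(error); },
          );
      });
    });

    this.inFlight.set(key, promise);
    this.processQueue();
    return promise;
  }

  private processQueue(): void {
    // Start queued requests until the concurrency limit is reached.
    while (this.active < this.maxConcurrent && this.queue.length > 0) {
      const start = this.queue.shift();
      if (start) start();
    }
  }
}
```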

Core API

Configuration (RequestConfig)

interface RequestConfig {
  maxConcurrent: number;      // Max parallel requests
  batchWindowMs: number;      // Window for batching (currently used for scheduling queue processing)
  maxRetries: number;
  retryBaseDelayMs: number;   // Base delay for exponential backoff
  timeoutMs: number;          // Timeout for individual requests
  deduplicate: boolean;
}
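
To make the interplay of maxRetries and retryBaseDelayMs concrete, here is a hedged sketch of a retry loop with exponential backoff. The doubling factor, the zero-based attempt counter, and the absence of jitter are assumptions; the document only states that retryBaseDelayMs is the base delay. backoffDelayMs() and withRetries() are hypothetical helpers, not functions exported by the module.

```typescript
// Delay before the retry that follows a failed attempt (zero-based):
// attempt 0 -> base, attempt 1 -> 2 * base, attempt 2 -> 4 * base, ...
function backoffDelayMs(attempt: number, retryBaseDelayMs: number): number {
  return retryBaseDelayMs * 2 ** attempt;
}

// Calls `fn` once, then retries up to `maxRetries` more times on failure.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  retryBaseDelayMs: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) {
        const delay = backoffDelayMs(attempt, retryBaseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

With retryBaseDelayMs set to 100 and maxRetries set to 3, this sketch would wait 100 ms, 200 ms, and 400 ms between attempts before giving up.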

Events

Integration

Singleton Access

getRequestOptimizer() provides a singleton instance. resetRequestOptimizer() clears and resets it.


5. BenchmarkSuite (src/performance/benchmark-suite.ts)

The BenchmarkSuite provides a comprehensive framework for measuring the performance of LLM interactions.

Purpose

How it Works

The run() method orchestrates the benchmarking process:

  1. Warmup Runs: Executes a few runs to "warm up" the LLM or system, preventing initial cold-start penalties from skewing results.
  2. Benchmark Runs: Executes the specified number of runs, either sequentially or concurrently based on configuration.
  3. executeRun(): For each run, it calls a provided BenchmarkCallback (which typically wraps an LLM API call), measures TTFT, total time, token counts, and calculates TPS and cost.
  4. calculateSummary(): After all runs, it aggregates the results, calculates percentile statistics (p50, p95, p99), averages, and standard deviations for key metrics.
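
The aggregation in step 4 and the TPS figure from step 3 can be illustrated with the sketch below. The nearest-rank percentile method and the population standard deviation are assumptions, since the document does not say which definitions calculateSummary() uses; DemoRun, tokensPerSecond(), percentile(), and summarize() are hypothetical names.

```typescript
// Hypothetical per-run measurements.
interface DemoRun {
  totalTimeMs: number;
  outputTokens: number;
}

// Tokens per second for a single run.
function tokensPerSecond(run: DemoRun): number {
  return run.totalTimeMs > 0 ? run.outputTokens / (run.totalTimeMs / 1000) : 0;
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  return sortedAsc[index] as number;
}

// Mean, population standard deviation, and p50/p95/p99 for one metric.
function summarize(values: number[]) {
  const sorted = [...values].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  const variance =
    sorted.reduce((sum, v) => sum + (v - mean) ** 2, 0) / sorted.length;
  return {
    mean,
    stdDev: Math.sqrt(variance),
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}
```

With only a handful of runs, p95 and p99 under nearest-rank both collapse to the slowest run, which is one reason to use a larger run count when tail latency matters.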

Core API

Configuration (BenchmarkConfig)

interface BenchmarkConfig {
  warmupRuns?: number;
  runs?: number;
  concurrency?: number;
  timeout?: number;
  monitorVRAM?: boolean;
  prompts?: BenchmarkPrompt[]; // Prompts to use for benchmarking
}

DEFAULT_PROMPTS provides a set of diverse prompts for common use cases.

Events

The BenchmarkSuite extends EventEmitter and emits progress events.

Integration

Singleton Access

getBenchmarkSuite() provides a singleton instance. resetBenchmarkSuite() clears and resets it.


6. Re-exported Utilities (src/performance/index.ts)

The src/performance/index.ts file re-exports several utility modules from src/utils that are crucial for performance monitoring and analysis, making them easily accessible under the performance module namespace.

MemoryMonitor (../utils/memory-monitor.js)

Provides functionality to monitor application memory usage, including RSS, heap total, and heap used. It can track memory pressure and provide snapshots over time.

StartupTimer (../utils/startup-timing.js)

A utility for measuring and tracking different phases of application startup. This helps identify bottlenecks and optimize initialization sequences.
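
Phase-based startup timing of this kind could be sketched as follows. The real StartupTimer API is not shown in this document, so MiniStartupTimer and its startPhase(), endPhase(), and slowestPhase() methods are hypothetical, as are the phase names in the usage note.

```typescript
class MiniStartupTimer {
  private starts = new Map<string, number>();
  readonly durations = new Map<string, number>();
  private now: () => number;

  // The clock is injectable so phases can be timed deterministically in tests.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  startPhase(name: string): void {
    this.starts.set(name, this.now());
  }

  // Ends a phase, records its duration in milliseconds, and returns it.
  endPhase(name: string): number {
    const startedAt = this.starts.get(name);
    if (startedAt === undefined) {
      throw new Error(`Phase was never started: ${name}`);
    }
    const elapsed = this.now() - startedAt;
    this.starts.delete(name);
    this.durations.set(name, elapsed);
    return elapsed;
  }

  // Name of the longest completed phase, or undefined if none has finished.
  slowestPhase(): string | undefined {
    let slowest: string | undefined;
    let max = -Infinity;
    for (const [name, ms] of this.durations) {
      if (ms > max) {
        max = ms;
        slowest = name;
      }
    }
    return slowest;
  }
}
```

Wrapping phases such as "config" and "plugins" in startPhase()/endPhase() pairs and then calling slowestPhase() points at the initialization step most worth optimizing.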


How to Contribute and Extend