src — learning
The src/learning module provides a Persistent Learning System designed for continuous improvement of the application's intelligence and effectiveness. It tracks various operational aspects, learns from them, and generates insights to guide future actions. All collected data is persisted to an SQLite database, allowing for long-term learning and adaptation.
Purpose and Features
The primary goal of this module is to enable the system to learn from its own operations and interactions. This includes:
- Repair Learning: Understanding which strategies successfully resolve specific error types.
- Convention Detection: Identifying common coding patterns and practices within projects.
- Tool Effectiveness Tracking: Monitoring the success rates and performance of various internal tools.
- Insight Generation: Synthesizing learned data into actionable recommendations and statistics.
Core Component: PersistentLearning Class
The heart of this module is the PersistentLearning class. It acts as the central orchestrator for all learning activities, managing data recording, retrieval, and analysis across different domains.
Overview
The PersistentLearning class:
- Extends Node.js's EventEmitter, allowing other parts of the system to subscribe to learning events (e.g., repair:recorded, convention:recorded, tool:recorded).
- Interacts with the database layer, primarily through getAnalyticsRepository() for repair and tool data, and directly with getDatabaseManager() for convention data.
- Provides methods to record new learning events, query learned data, and generate aggregated statistics or insights.
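The event subscription described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the LearningEvents class is a hypothetical stand-in for PersistentLearning, and the payload passed to listeners is an assumption. Only the event name repair:recorded comes from the text.

```typescript
import { EventEmitter } from "node:events";

// Assumed payload; field names follow the RepairAttempt description in this document.
interface RepairAttempt {
  errorMessage: string;
  errorType: string;
  strategy: string;
  success: boolean;
  attempts: number;
}

// Hypothetical stand-in for PersistentLearning: emits an event per recorded attempt.
class LearningEvents extends EventEmitter {
  recordRepairAttempt(attempt: RepairAttempt): void {
    // A real implementation would persist the attempt before emitting.
    this.emit("repair:recorded", attempt);
  }
}

const learning = new LearningEvents();
const seen: string[] = [];

// Subscribers receive the attempt synchronously when it is recorded.
learning.on("repair:recorded", (attempt: RepairAttempt) => {
  seen.push(`${attempt.strategy}:${attempt.success}`);
});

learning.recordRepairAttempt({
  errorMessage: "Cannot find module 'foo'",
  errorType: "import",
  strategy: "install-dependency",
  success: true,
  attempts: 1,
});
```

Because EventEmitter dispatches synchronously, the listener has already run by the time recordRepairAttempt returns.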
Singleton Access
To ensure a single, consistent learning state across the application, PersistentLearning is implemented as a singleton.
- getPersistentLearning(): PersistentLearning: The canonical way to access the PersistentLearning instance. It initializes the instance if it doesn't already exist.
- resetPersistentLearning(): void: Primarily used for testing or reinitialization, this function clears the current instance and its event listeners.
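A lazy singleton with a reset hook is commonly written along these lines. The names below carry a Sketch suffix to mark them as hypothetical; the real reset function also clears event listeners, which this sketch omits.

```typescript
// Hypothetical stand-in class; the real PersistentLearning has a much larger API.
class PersistentLearningSketch {
  readonly createdAt = Date.now();
}

let instance: PersistentLearningSketch | null = null;

// Lazily create the shared instance on first access.
function getPersistentLearningSketch(): PersistentLearningSketch {
  if (instance === null) {
    instance = new PersistentLearningSketch();
  }
  return instance;
}

// Drop the shared instance so the next access builds a fresh one (useful in tests).
function resetPersistentLearningSketch(): void {
  instance = null;
}

const first = getPersistentLearningSketch();
const second = getPersistentLearningSketch(); // same object as `first`
resetPersistentLearningSketch();
const third = getPersistentLearningSketch(); // new object after the reset
```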
Learning Domains
The PersistentLearning class categorizes its learning into three main domains: Repair, Conventions, and Tools.
1. Repair Learning
This domain focuses on understanding and improving the process of fixing errors.
- RepairAttempt interface: Defines the structure for recording a single attempt to fix an error, including errorMessage, errorType, strategy, success, attempts, and optional context like language or fixCode.
- recordRepairAttempt(attempt: RepairAttempt): void: Records the outcome of a repair attempt. It normalizes the errorMessage to ensure consistent pattern matching.
- getBestRepairStrategies(errorMessage, options): RepairLearning[]: Given an error message, this method queries the learned data to suggest the most successful repair strategies, optionally filtered by errorType, language, or framework.
- getRepairStats(): LearningStats['repair']: Provides aggregated statistics on repair learning, such as total patterns learned, average success rate, and top-performing strategies.
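The record-then-rank flow can be illustrated with an in-memory sketch. RepairMemory, StrategyScore, and the ranking rule (success rate, then sample size) are assumptions made for illustration; the real module persists to SQLite and may rank differently.

```typescript
// Field names follow the RepairAttempt description above; optional fields are omitted.
interface RepairAttempt {
  errorMessage: string;
  errorType: string;
  strategy: string;
  success: boolean;
  attempts: number;
}

interface StrategyScore {
  strategy: string;
  successRate: number;
  total: number;
}

type Counts = { ok: number; total: number };

// Hypothetical in-memory stand-in for the SQLite-backed store.
class RepairMemory {
  private readonly counts = new Map<string, Map<string, Counts>>();

  // Placeholder normalization; the real module strips variable parts of the message.
  private key(errorMessage: string): string {
    return errorMessage.trim().toLowerCase();
  }

  record(attempt: RepairAttempt): void {
    const pattern = this.key(attempt.errorMessage);
    const byStrategy = this.counts.get(pattern) ?? new Map<string, Counts>();
    const entry = byStrategy.get(attempt.strategy) ?? { ok: 0, total: 0 };
    entry.total += 1;
    if (attempt.success) entry.ok += 1;
    byStrategy.set(attempt.strategy, entry);
    this.counts.set(pattern, byStrategy);
  }

  // Rank strategies by success rate, breaking ties by sample size.
  best(errorMessage: string): StrategyScore[] {
    const byStrategy = this.counts.get(this.key(errorMessage));
    if (!byStrategy) return [];
    return Array.from(byStrategy.entries())
      .map(([strategy, c]) => ({ strategy, successRate: c.ok / c.total, total: c.total }))
      .sort((a, b) => b.successRate - a.successRate || b.total - a.total);
  }
}

const memory = new RepairMemory();
const base = { errorMessage: "Cannot find name 'x'", errorType: "type", attempts: 1 };
memory.record({ ...base, strategy: "add-import", success: true });
memory.record({ ...base, strategy: "add-import", success: true });
memory.record({ ...base, strategy: "reinstall", success: false });
const ranked = memory.best("cannot find name 'x'");
```

Keying on a normalized message is what lets the lookup succeed even though the query differs in case from the recorded message.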
2. Convention Detection
This domain aims to identify and track common coding patterns and conventions within specific projects.
- ConventionDetection interface: Describes a detected convention, including projectId, category (e.g., 'naming', 'style'), pattern, and optional description and examples.
- recordConvention(detection: ConventionDetection): void: Stores a detected convention. If a similar convention already exists for the project, it updates its occurrences and confidence score.
- getProjectConventions(projectId, options): Convention[]: Retrieves conventions specific to a given project, with optional filtering by category or minConfidence.
- getConventionStats(projectId?): LearningStats['conventions']: Returns statistics about detected conventions, including total count, breakdown by category, and average confidence.
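The update-on-repeat behavior can be sketched in memory as below. ConventionStore, the Convention shape, the matching key, and in particular the confidence formula are assumptions for illustration; the document does not specify how the real module computes confidence.

```typescript
interface ConventionDetection {
  projectId: string;
  category: string;
  pattern: string;
  description?: string;
  examples?: string[];
}

// Assumed stored shape: the detection plus the two counters named in the text.
interface Convention extends ConventionDetection {
  occurrences: number;
  confidence: number;
}

// Hypothetical in-memory stand-in for the conventions table.
class ConventionStore {
  private readonly byKey = new Map<string, Convention>();

  record(detection: ConventionDetection): Convention {
    const key = [detection.projectId, detection.category, detection.pattern].join("|");
    const occurrences = (this.byKey.get(key)?.occurrences ?? 0) + 1;
    // Assumed scoring rule: confidence grows with repeated sightings and stays below 1.
    const confidence = occurrences / (occurrences + 1);
    const stored: Convention = { ...detection, occurrences, confidence };
    this.byKey.set(key, stored);
    return stored;
  }

  forProject(
    projectId: string,
    options: { category?: string; minConfidence?: number } = {},
  ): Convention[] {
    return Array.from(this.byKey.values()).filter(
      (c) =>
        c.projectId === projectId &&
        (options.category === undefined || c.category === options.category) &&
        c.confidence >= (options.minConfidence ?? 0),
    );
  }
}

const store = new ConventionStore();
store.record({ projectId: "demo", category: "naming", pattern: "camelCase functions" });
store.record({ projectId: "demo", category: "naming", pattern: "camelCase functions" });
store.record({ projectId: "demo", category: "style", pattern: "2-space indent" });
const confident = store.forProject("demo", { minConfidence: 0.6 });
```

With this scoring rule, a convention seen once has confidence 0.5 and falls below the 0.6 threshold, while one seen twice passes it.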
3. Tool Effectiveness Tracking
This domain monitors the usage and performance of various internal tools.
- ToolUsage interface: Defines the data recorded for each tool interaction, including toolName, success, timeMs, cacheHit, and optional projectId.
- recordToolUsage(usage: ToolUsage): void: Records an instance of tool usage.
- getToolStats(projectId?): LearningStats['tools']: Provides aggregated statistics on tool usage, such as total tools tracked, top-performing tools, and average cache hit rate.
- getBestToolForTask(taskType, projectId?): { tool: string; confidence: number } | null: Recommends the most effective tool for a given taskType (e.g., 'search', 'edit'), based on historical success rates.
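A recommendation based on historical success rates might look like the sketch below. ToolTracker, the tool names, and the candidatesByTask mapping are hypothetical; how the real module maps a task type to candidate tools is not described here, and using the raw success rate as the confidence value is an assumption.

```typescript
interface ToolUsage {
  toolName: string;
  success: boolean;
  timeMs: number;
  cacheHit: boolean;
  projectId?: string;
}

// Hypothetical mapping from task type to candidate tools.
const candidatesByTask: Record<string, string[]> = {
  search: ["grep", "semantic-search"],
  edit: ["patch"],
};

// Hypothetical in-memory stand-in for the tool statistics store.
class ToolTracker {
  private readonly usages: ToolUsage[] = [];

  record(usage: ToolUsage): void {
    this.usages.push(usage);
  }

  // Fraction of successful runs, or null when the tool has never been used.
  successRate(toolName: string): number | null {
    const runs = this.usages.filter((u) => u.toolName === toolName);
    if (runs.length === 0) return null;
    return runs.filter((u) => u.success).length / runs.length;
  }

  // Pick the candidate with the highest success rate; null if none has data.
  bestFor(taskType: string): { tool: string; confidence: number } | null {
    let best: { tool: string; confidence: number } | null = null;
    for (const tool of candidatesByTask[taskType] ?? []) {
      const rate = this.successRate(tool);
      if (rate !== null && (best === null || rate > best.confidence)) {
        best = { tool, confidence: rate };
      }
    }
    return best;
  }
}

const tracker = new ToolTracker();
tracker.record({ toolName: "grep", success: true, timeMs: 12, cacheHit: false });
tracker.record({ toolName: "grep", success: false, timeMs: 15, cacheHit: false });
tracker.record({ toolName: "semantic-search", success: true, timeMs: 80, cacheHit: true });
tracker.record({ toolName: "semantic-search", success: true, timeMs: 75, cacheHit: true });
const bestSearch = tracker.bestFor("search");
const bestUnknown = tracker.bestFor("deploy");
```

Returning null for a task type with no usable history mirrors the nullable return type in the documented signature.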
Data Persistence and Database Interaction
All learning data is persisted to an SQLite database. The PersistentLearning module interacts with the database layer in two primary ways:
- AnalyticsRepository: For repair_learning and tool_stats data, PersistentLearning delegates to getAnalyticsRepository(). This repository likely encapsulates specific SQL operations for these domains.
- Direct DatabaseManager access: For conventions data and some specific statistical queries (e.g., getRepairStats's topStrategies query), PersistentLearning directly uses getDatabaseManager().getDatabase() to execute SQL statements. This allows for more flexible and domain-specific queries not covered by the AnalyticsRepository.
Insights and Reporting
Beyond raw data, the module provides mechanisms to synthesize information:
- LearningStats interface: A comprehensive interface that aggregates statistics from all three learning domains.
- LearningInsight interface: Represents a generated insight, including its type, a descriptive message, a confidence score, and optional data.
- generateInsights(projectId?): LearningInsight[]: Analyzes the current learning data to produce a list of actionable insights, such as the most effective repair strategy or a frequently detected convention. Insights are sorted by confidence.
- getStats(projectId?): LearningStats: A convenience method to retrieve all aggregated learning statistics.
- formatStats(projectId?): string: Formats the comprehensive learning statistics into a human-readable string, suitable for display in logs or user interfaces.
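As a rough illustration of turning statistics into confidence-sorted insights, consider the sketch below. The LearningInsight fields follow the description above, but StatsSketch, the type values "repair" and "convention", and the message wording are all hypothetical.

```typescript
interface LearningInsight {
  type: string;
  message: string;
  confidence: number;
  data?: unknown;
}

// Hypothetical, trimmed-down stats shape used only for this sketch.
interface StatsSketch {
  topStrategy?: { strategy: string; successRate: number };
  topConvention?: { pattern: string; confidence: number };
}

function generateInsightsSketch(stats: StatsSketch): LearningInsight[] {
  const insights: LearningInsight[] = [];
  if (stats.topStrategy) {
    insights.push({
      type: "repair",
      message: `Most effective repair strategy: ${stats.topStrategy.strategy}`,
      confidence: stats.topStrategy.successRate,
      data: stats.topStrategy,
    });
  }
  if (stats.topConvention) {
    insights.push({
      type: "convention",
      message: `Frequently detected convention: ${stats.topConvention.pattern}`,
      confidence: stats.topConvention.confidence,
      data: stats.topConvention,
    });
  }
  // Highest-confidence insights come first.
  return insights.sort((a, b) => b.confidence - a.confidence);
}

const insights = generateInsightsSketch({
  topStrategy: { strategy: "add-import", successRate: 0.8 },
  topConvention: { pattern: "camelCase functions", confidence: 0.9 },
});
```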
Key Internal Logic: Error Normalization
A critical private method is normalizeErrorPattern(error: string). This function preprocesses raw error messages by removing variable elements like line numbers, file paths, and specific identifiers. This ensures that different occurrences of the same underlying error are recognized as such, allowing for effective pattern matching and learning in the repair domain.
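A normalization of this kind could be approximated with a few regular expressions, as in the sketch below. The specific patterns and the placeholder tokens (<path>, <n>, <id>) are assumptions chosen for illustration; the actual private method may use different rules.

```typescript
// Sketch of error normalization: replace variable parts with fixed placeholders.
function normalizeErrorPatternSketch(error: string): string {
  return error
    // File paths (POSIX or Windows style), e.g. /src/app/index.ts or C:\src\app.ts
    .replace(/(?:[A-Za-z]:)?(?:[\\/][\w.-]+)+/g, "<path>")
    // Standalone numbers such as line and column positions
    .replace(/\b\d+\b/g, "<n>")
    // Quoted identifiers or literals
    .replace(/(['"`])[^'"`]*\1/g, "<id>")
    // Collapse whitespace and ignore case
    .replace(/\s+/g, " ")
    .trim()
    .toLowerCase();
}

const a = normalizeErrorPatternSketch(
  "TypeError: Cannot read property 'foo' of undefined at /src/app/index.ts:42:7",
);
const b = normalizeErrorPatternSketch(
  "TypeError: Cannot read property 'bar' of undefined at /lib/util.js:7:19",
);
const sameError = a === b;
```

Both inputs reduce to the same pattern string, so repair attempts recorded against either message would contribute to the same learned entry.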
System Overview
graph TD
subgraph Learning Module
PL[PersistentLearning]
getPL(getPersistentLearning)
end
subgraph Database Layer
AR[AnalyticsRepository]
DM[DatabaseManager]
end
subgraph Consumers
IM[Integration Module]
BA[BaseAgent]
Tests[Tests]
end
getPL --> PL
PL -- uses --> AR
PL -- uses --> DM
IM -- calls --> getPL
IM -- calls --> PL::recordRepairAttempt
IM -- calls --> PL::getBestRepairStrategies
IM -- calls --> PL::recordToolUsage
IM -- calls --> PL::getStats
BA -- calls --> PL::formatStats
Tests -- calls --> PL::getToolStats
Tests -- calls --> PL::getRepairStats
Integration Points and Usage
The PersistentLearning module is designed to be integrated throughout the application where learning opportunities arise.
- src/database/integration.ts: This module appears to be a primary consumer, acting as a bridge to the learning system. It calls getPersistentLearning() to obtain the instance and then invokes methods like recordRepairAttempt(), getBestRepairStrategies(), recordToolUsage(), and getStats(). This suggests that other parts of the system might interact with PersistentLearning indirectly through integration.ts.
- src/agent/base-agent.ts: Agents can leverage the learning system for self-reflection or reporting. The formatStats() method is called here, indicating agents might display learning statistics.
- Tests: Various unit and integration tests directly interact with PersistentLearning methods to verify its functionality.
By centralizing learning logic in this module, the system can continuously adapt and improve its performance based on real-world interactions, making it more robust and intelligent over time.