src — intelligence
The src/intelligence module provides a suite of analytical and adaptive capabilities designed to enhance developer productivity and code quality. It acts as the "brain" of the system, offering insights, recommendations, and personalized experiences based on code analysis, project context, and user interactions.
This module is composed of several distinct sub-modules, each focusing on a specific aspect of intelligence:
- Anomaly Detector: Identifies potential issues and anti-patterns in code.
- Proactive Suggestions Engine: Generates contextual recommendations based on project state.
- Refactoring Recommender: Pinpoints opportunities for code improvement and modernization.
- Semantic Search Engine: Enables intelligent search through conversation history.
- Task Complexity Estimator: Analyzes task descriptions to estimate effort and risks.
- User Preferences Learning: Learns and adapts to the user's coding and communication styles.
Intelligence Module Overview
The core principle behind the intelligence module is to provide actionable insights without requiring explicit user queries for every analysis. It aims to anticipate developer needs, highlight potential problems, and guide towards best practices, ultimately reducing cognitive load and improving development workflows.
Core Principles
- Pattern-Based Analysis: Many components rely on regular expressions and keyword matching to identify specific code constructs, project states, or task characteristics.
- Contextual Awareness: Information from various sources (Git, package.json, file system, conversation history) is aggregated to provide relevant suggestions.
- Adaptability: The UserPreferences component allows the system to learn and tailor its output to individual developer styles.
- Actionable Output: Recommendations often include concrete suggestions, commands, or examples to facilitate immediate action.
Components
Anomaly Detector (anomaly-detector.ts)
The Anomaly Detector is responsible for scanning code for unusual patterns, potential security vulnerabilities, performance anti-patterns, and inconsistencies.
Purpose
To automatically identify and report common code issues that might otherwise go unnoticed, helping maintain code quality and security standards.
How it Works
The primary function, detectAnomalies, takes a code string, language, and optional filePath. It iterates through a predefined list of ANOMALY_PATTERNS. Each pattern is a regular expression designed to catch specific code smells or issues.
For each pattern, it first checks whether the pattern is language-specific and skips it if it does not apply to the provided language. It then scans the code line by line. When a line matches, a CodeAnomaly object is created with details such as category, severity, message, and line number. Multi-line patterns are handled separately by a scan over the full code string.
After all patterns are checked, detectAnomalies compiles an AnomalyReport which includes a summary of total anomalies, and counts by category and severity.
The formatAnomalyReport function then takes this structured report and renders it into a human-readable string, grouped by severity. The getAnomalyStats function aggregates reports from multiple files to provide overall project statistics.
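The line-by-line part of this process can be sketched as follows. This is a minimal illustration, not the module's actual code: the two sample rules, the `languages` field, and the exact shape of the summary are assumptions, and the full-code scan for multi-line patterns is left out.

```typescript
// Illustrative sketch only: the rules and field names below are assumptions,
// not the contents of the real ANOMALY_PATTERNS list.
type AnomalySeverity = "info" | "warning" | "error" | "critical";

interface AnomalyPattern {
  pattern: RegExp;
  category: string;
  severity: AnomalySeverity;
  message: string;
  languages?: string[]; // omitted means the rule applies to every language
}

interface CodeAnomaly {
  category: string;
  severity: AnomalySeverity;
  message: string;
  line: number; // 1-based line number
}

interface AnomalyReport {
  anomalies: CodeAnomaly[];
  summary: {
    total: number;
    byCategory: Record<string, number>;
    bySeverity: Record<string, number>;
  };
}

const SAMPLE_PATTERNS: AnomalyPattern[] = [
  {
    pattern: /\beval\s*\(/,
    category: "security",
    severity: "critical",
    message: "Avoid eval(); it executes arbitrary code.",
    languages: ["javascript", "typescript"],
  },
  {
    pattern: /\bTODO\b/,
    category: "logic",
    severity: "info",
    message: "Unresolved TODO comment.",
  },
];

function detectAnomaliesSketch(code: string, language: string): AnomalyReport {
  const anomalies: CodeAnomaly[] = [];
  const lines = code.split("\n");

  for (const rule of SAMPLE_PATTERNS) {
    // Skip language-specific rules that do not cover the given language.
    if (rule.languages && !rule.languages.includes(language)) continue;

    // Patterns carry no "g" flag, so test() keeps no state between lines.
    lines.forEach((text, index) => {
      if (rule.pattern.test(text)) {
        anomalies.push({
          category: rule.category,
          severity: rule.severity,
          message: rule.message,
          line: index + 1,
        });
      }
    });
  }

  // Build the summary: totals plus counts per category and per severity.
  const byCategory: Record<string, number> = {};
  const bySeverity: Record<string, number> = {};
  for (const a of anomalies) {
    byCategory[a.category] = (byCategory[a.category] ?? 0) + 1;
    bySeverity[a.severity] = (bySeverity[a.severity] ?? 0) + 1;
  }
  return { anomalies, summary: { total: anomalies.length, byCategory, bySeverity } };
}
```

Keeping each rule as plain data (pattern plus metadata) means new checks can be added without touching the scanning loop.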
Key Data Structures
- AnomalySeverity: 'info' | 'warning' | 'error' | 'critical'
- AnomalyCategory: e.g., 'security', 'performance', 'logic'
- CodeAnomaly: Detailed information about a single detected anomaly.
- AnomalyReport: Contains a list of CodeAnomaly objects and a summary for a given file.
- AnomalyPattern: Internal interface defining a detection rule, including its pattern (RegExp), category, severity, and message.
Example Flow
graph TD
A[Code String, Language] --> B{detectAnomalies}
B --> C{Iterate ANOMALY_PATTERNS}
C --> D{Check language compatibility}
D -- Match --> E{Scan code line-by-line}
E -- Match --> F[Create CodeAnomaly]
C --> G{Scan full code for multi-line patterns}
G -- Match --> F
F --> H[Collect Anomalies]
H --> I{Calculate Summary}
I --> J[Return AnomalyReport]
J --> K{formatAnomalyReport}
K --> L[Formatted String Output]
Proactive Suggestions Engine (proactive-suggestions.ts)
This component analyzes the current state of a project and generates actionable suggestions across various domains like Git workflow, testing, documentation, and security.
Purpose
To provide developers with timely, contextual advice and reminders to improve project health and adherence to best practices.
How it Works
The generateSuggestions asynchronous function is the entry point. It first calls analyzeProjectContext to gather comprehensive information about the project. This context includes:
- Git Status: Obtained via getGitStatus, which executes git commands using child_process.execSync to determine branch, uncommitted changes, unpushed commits, etc.
- Package Info: Reads package.json to get dependencies, scripts, and check for lockfiles.
- File Existence Checks: Uses fs-extra.pathExists to check for common test directories, documentation files (README.md, docs/), and CI/CD configurations (.github/workflows, .gitlab-ci.yml).
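As a rough illustration of the Git status step, the sketch below turns the text output of `git status --porcelain` into change counts. Which git commands getGitStatus actually runs is not stated here, so the choice of command, the `GitChangeCounts` shape, and the function name are all assumptions.

```typescript
// Assumption: the status text comes from `git status --porcelain` (v1 format),
// where each line is "XY path": X = index (staged) state, Y = working-tree state.
interface GitChangeCounts {
  staged: number;
  unstaged: number;
  untracked: number;
}

function parsePorcelainStatus(output: string): GitChangeCounts {
  const counts: GitChangeCounts = { staged: 0, unstaged: 0, untracked: 0 };

  // Lines are not trimmed: a leading space is a meaningful status column.
  for (const line of output.split("\n")) {
    if (line.length < 3) continue; // skip blank lines

    const indexState = line[0];
    const worktreeState = line[1];

    if (indexState === "?" && worktreeState === "?") {
      counts.untracked++;
      continue;
    }
    if (indexState === "!") continue; // ignored files

    if (indexState !== " ") counts.staged++;
    if (worktreeState !== " ") counts.unstaged++;
  }
  return counts;
}
```

In Node.js, the input string could come from `execSync("git status --porcelain", { cwd: projectPath, encoding: "utf8" })`. Keeping the parser separate from the process call makes it testable without a repository.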
Once the ProjectContext is assembled, generateSuggestions dispatches to several helper functions:
- generateGitSuggestions(context.gitStatus)
- generatePackageSuggestions(context.packageInfo)
- generateTestingSuggestions(projectPath, context)
- generateDocSuggestions(projectPath, context)
- generateSecuritySuggestions(projectPath, context)
- generateWorkflowSuggestions(context)
Each of these functions applies specific rules to the context data and returns an array of ProactiveSuggestion objects. These suggestions are then aggregated and sorted by SuggestionPriority before being returned.
The formatSuggestions function provides a display-ready string representation of the suggestions.
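The rule-then-sort pattern described above might look like the following sketch. The `GitStatusSketch` fields, the threshold of 10 changes, and the suggestion texts are illustrative assumptions; only the priority values come from the data structures listed in this section.

```typescript
type SuggestionPriority = "low" | "medium" | "high" | "urgent";

interface ProactiveSuggestion {
  type: string;
  priority: SuggestionPriority;
  title: string;
  command?: string;
}

// Hypothetical subset of GitStatus; the field names are assumptions.
interface GitStatusSketch {
  uncommittedChanges: number;
  unpushedCommits: number;
}

// Lower rank sorts first, so urgent items lead the list.
const PRIORITY_RANK: Record<SuggestionPriority, number> = {
  urgent: 0,
  high: 1,
  medium: 2,
  low: 3,
};

function gitSuggestionsSketch(git: GitStatusSketch): ProactiveSuggestion[] {
  const out: ProactiveSuggestion[] = [];
  if (git.unpushedCommits > 0) {
    out.push({
      type: "git",
      priority: "medium",
      title: `Push ${git.unpushedCommits} local commit(s)`,
      command: "git push",
    });
  }
  // Threshold chosen purely for illustration.
  if (git.uncommittedChanges > 10) {
    out.push({
      type: "git",
      priority: "high",
      title: "Commit or stash a large set of pending changes",
      command: "git status",
    });
  }
  return out;
}

function sortByPriority(items: ProactiveSuggestion[]): ProactiveSuggestion[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...items].sort((a, b) => PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority]);
}
```

Mapping priorities to numeric ranks keeps the comparison explicit rather than relying on alphabetical order of the priority strings.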
Key Data Structures
- SuggestionType: e.g., 'git', 'code-quality', 'security'
- SuggestionPriority: 'low' | 'medium' | 'high' | 'urgent'
- ProactiveSuggestion: Details a single suggestion, including title, description, and optional action or command.
- ProjectContext: Aggregates various pieces of project information.
- GitStatus: Specific details about the Git repository state.
- PackageInfo: Details extracted from package.json.
Integration
- Incoming Calls: handleSuggest (from commands/handlers/suggest-handler.ts) is a known consumer of generateSuggestions.
- Outgoing Calls: getGitStatus internally calls execSync (which is an outgoing call to src/desktop-automation/base-native-provider.ts). generateSecuritySuggestions calls fs.readFile (an outgoing call to src/sandbox/e2b-sandbox.ts).
Execution Flow: HandleSuggest → ExecSync
- handleSuggest (commands/handlers/suggest-handler.ts) initiates the process.
- It calls generateSuggestions (src/intelligence/proactive-suggestions.ts).
- generateSuggestions calls analyzeProjectContext to gather project data.
- analyzeProjectContext calls getGitStatus to query the Git repository.
- getGitStatus executes git commands using execSync (src/desktop-automation/base-native-provider.ts) to retrieve status information.
Refactoring Recommender (refactoring-recommender.ts)
This component analyzes code to identify common code smells and opportunities for refactoring, aiming to improve readability, maintainability, and performance.
Purpose
To guide developers in improving code quality by suggesting specific refactoring techniques based on detected patterns.
How it Works
The analyzeForRefactoring function takes code, language, and filePath. Similar to the Anomaly Detector, it uses a set of REFACTORING_RULES, each with a pattern (RegExp), category, priority, and suggestion.
It iterates through these rules, applying them to the code. Rules can be multiLine or single-line. For multi-line rules, matchAll is used on the entire code string; for single-line rules, each line is checked. When a match is found, a RefactoringRecommendation is created, including an estimated impact on readability, maintainability, and performance (calculated by getImpactEstimate).
After processing all rules, analyzeForRefactoring calls calculateSummary to aggregate statistics by category and priority, and compute an overallScore (0-100, higher is better). The result is a RefactoringReport.
The formatRefactoringReport function then converts this report into a formatted string for display. Helper functions like getPriorityRecommendations and getRecommendationsByCategory allow filtering the results, and estimateRefactoringEffort provides a time estimate for addressing the recommendations.
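The difference between single-line and multi-line rules can be sketched as below. The two sample rules and the `RuleMatch` shape are assumptions made for illustration; impact estimation and the summary score are omitted.

```typescript
interface RefactoringRule {
  pattern: RegExp;
  category: string;
  priority: "low" | "medium" | "high" | "critical";
  suggestion: string;
  multiLine?: boolean;
}

// Hypothetical, reduced stand-in for RefactoringRecommendation.
interface RuleMatch {
  category: string;
  priority: string;
  suggestion: string;
  line: number; // 1-based line where the match starts
}

// Sample rules invented for this sketch, not the real REFACTORING_RULES.
const SAMPLE_RULES: RefactoringRule[] = [
  {
    pattern: /\bvar\s+\w+/,
    category: "modernize",
    priority: "low",
    suggestion: "Prefer let or const over var.",
  },
  {
    // matchAll() throws a TypeError unless the pattern has the "g" flag.
    pattern: /if\s*\([^)]*\)\s*\{\s*return true;?\s*\}\s*else\s*\{\s*return false;?\s*\}/g,
    category: "simplify",
    priority: "medium",
    suggestion: "Return the condition directly.",
    multiLine: true,
  },
];

function applyRulesSketch(code: string): RuleMatch[] {
  const found: RuleMatch[] = [];
  const lines = code.split("\n");

  for (const rule of SAMPLE_RULES) {
    const base = { category: rule.category, priority: rule.priority, suggestion: rule.suggestion };

    if (rule.multiLine) {
      // Match against the whole string, then derive the line number from
      // the count of newlines that precede the match offset.
      for (const m of code.matchAll(rule.pattern)) {
        const line = code.slice(0, m.index ?? 0).split("\n").length;
        found.push({ ...base, line });
      }
    } else {
      lines.forEach((text, i) => {
        if (rule.pattern.test(text)) found.push({ ...base, line: i + 1 });
      });
    }
  }
  return found;
}
```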
Key Data Structures
- RefactoringCategory: e.g., 'extract-method', 'simplify', 'modernize'
- RefactoringPriority: 'low' | 'medium' | 'high' | 'critical'
- RefactoringRecommendation: Details a single refactoring opportunity, including location, suggestion, and estimated impact.
- RefactoringReport: Contains a list of RefactoringRecommendation objects and a summary for a given file.
- RefactoringRule: Internal interface defining a refactoring detection rule.
Semantic Search Engine (semantic-search.ts)
The Semantic Search Engine provides intelligent search capabilities over conversation history, allowing users to quickly find relevant past interactions.
Purpose
To enable efficient and context-aware retrieval of past conversation messages, supporting knowledge recall and continuity.
How it Works
This component is implemented as a stateful class, SemanticSearchEngine. It maintains an in-memory index of ConversationMessage objects.
- Initialization & Persistence: The constructor loads messages from a JSON file (~/.codebuddy/search-index.json) using fs-extra. Messages are stored in an array, and a wordIndex (a Map) maps words to message IDs for fast lookup. loadIndex and saveIndex handle disk persistence.
- Indexing: addMessage and addMessages append new conversations. Each message's content and metadata (tools) are tokenized (tokenize function, which filters stop words) and added to the wordIndex.
- Trimming: To prevent memory bloat, trimIfNeeded automatically prunes older messages and rebuilds the index if MAX_MESSAGES is exceeded. trimWordIndex limits entries per word.
- Searching: The search method takes a query and SearchOptions.
  - It first tokenizes the query.
  - findCandidates uses the wordIndex for fast retrieval of messages containing query terms (exact and prefix matches).
  - For each candidate message, scoreMessage calculates a relevance score based on:
    - Exact phrase match.
    - Term frequency and fuzzy matching (fuzzyMatch).
    - Coverage of query terms.
    - Recency bonus.
    - Role bonus (user messages slightly preferred).
  - Messages are filtered by role, sessionId, and dateRange as specified in SearchOptions.
  - Results are sorted by score and limited.
  - If contextSize is specified, addContext retrieves surrounding messages from the same session.
- Utility Functions: findSimilar, getRecent, getBySession, getStats, clear, pruneOlderThan, and formatResults provide additional functionality.
- Singleton: getSemanticSearchEngine ensures only one instance of the engine exists, managing a shared index.
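The indexing and search steps above can be illustrated with a much smaller in-memory version. This is a sketch under stated assumptions: the `TinySearchIndex` class, the stop-word list, the `Map<string, Set<string>>` index type, and the scoring weights are all hypothetical, and persistence, trimming, fuzzy matching, recency, and role bonuses are left out.

```typescript
interface ConversationMessage {
  id: string;
  role: "user" | "assistant";
  content: string;
}

interface ScoredMessage {
  message: ConversationMessage;
  score: number;
}

// Tiny illustrative stop-word list; the real list is not shown in this document.
const STOP_WORDS = new Set(["the", "a", "an", "to", "of", "and", "in", "is"]);

function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 1 && !STOP_WORDS.has(w));
}

class TinySearchIndex {
  private messages = new Map<string, ConversationMessage>();
  // Inverted index: word -> ids of the messages that contain it.
  private wordIndex = new Map<string, Set<string>>();

  addMessage(msg: ConversationMessage): void {
    this.messages.set(msg.id, msg);
    for (const word of tokenize(msg.content)) {
      const ids = this.wordIndex.get(word) ?? new Set<string>();
      ids.add(msg.id);
      this.wordIndex.set(word, ids);
    }
  }

  search(query: string, limit = 5): ScoredMessage[] {
    const terms = [...new Set(tokenize(query))];
    if (terms.length === 0) return [];

    // Candidate lookup: message id -> query terms it matched (exact or prefix).
    const matchedTerms = new Map<string, Set<string>>();
    for (const term of terms) {
      for (const [word, ids] of this.wordIndex) {
        if (!word.startsWith(term)) continue; // startsWith also covers exact matches
        for (const id of ids) {
          const seen = matchedTerms.get(id) ?? new Set<string>();
          seen.add(term);
          matchedTerms.set(id, seen);
        }
      }
    }

    // Score = share of query terms covered, plus a bonus for the exact phrase.
    const phrase = query.trim().toLowerCase();
    const results: ScoredMessage[] = [];
    for (const [id, seen] of matchedTerms) {
      const message = this.messages.get(id);
      if (!message) continue;
      const coverage = seen.size / terms.length;
      const phraseBonus = message.content.toLowerCase().includes(phrase) ? 0.5 : 0;
      results.push({ message, score: coverage + phraseBonus });
    }
    return results.sort((a, b) => b.score - a.score).slice(0, limit);
  }
}
```

The inverted index means only messages sharing at least one term with the query are ever scored, which is what keeps lookups fast as history grows.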
Key Data Structures
- ConversationMessage: Represents a single message in the conversation history.
- SearchResult: Contains a matching ConversationMessage, its score, highlights, and matchType.
- SearchOptions: Allows specifying search parameters like limit, role, dateRange, fuzzyMatch, etc.
- wordIndex: A Map that serves as the core inverted index for fast word-to-message-ID lookup.
Example Flow (Search)
graph TD
A[Query, Options] --> B{search}
B --> C{tokenize Query}
C --> D{"findCandidates (using wordIndex)"}
D --> E{Filter Candidates by Options}
E --> F{scoreMessage for each candidate}
F -- Score, Highlights, MatchType --> G[Collect SearchResults]
G --> H{Sort by Score}
H --> I{Limit Results}
I -- if contextSize > 0 --> J{addContext}
J --> K["Return SearchResult[]"]
Task Complexity Estimator (task-complexity-estimator.ts)
This component analyzes natural language task descriptions to estimate their complexity, identify risks, and provide effort estimates.
Purpose
To assist developers and project managers in understanding the scope, challenges, and time investment required for a given task.
How it Works
The estimateTaskComplexity function is the core of this component.
- Task Classification: It iterates through TASK_PATTERNS (regular expressions) to classify the task into a TaskCategory (e.g., 'bug-fix', 'feature', 'refactor'). Each pattern has a baseComplexity score and initial ComplexityFactors.
- Complexity Modifiers: It then applies COMPLEXITY_MODIFIERS (more regex patterns) to adjust the complexityMultiplier based on keywords indicating urgency, legacy code, broad scope, etc.
- Score Calculation: A final complexityScore (1-100) is calculated, and a ComplexityLevel is derived using getComplexityLevel.
- Factor Adjustment: ComplexityFactors (scope, technical debt, unknowns, dependencies, testing effort, regression risk) are refined based on the task description and modifiers.
- Risk Assessment: RiskAssessment objects are generated, including both predefined risks from TASK_PATTERNS and generic risks derived from high ComplexityFactors. getRiskLevel determines the severity.
- Effort Estimation: calculateEffort uses the complexityScore and factors to determine minHours, maxHours, typicalHours, and a breakdown for planning, implementation, testing, and review.
- Suggestions: generateSuggestions provides contextual advice based on the task category, complexity, and factors.
- Confidence: A confidence score is assigned based on how well the task description matched known patterns.
The formatTaskEstimate function renders the detailed estimate into a readable string, including a visual bar for factors using renderBar. compareTasks allows estimating and comparing multiple tasks.
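The first three steps (classification, modifiers, score and level) can be sketched as follows. All patterns, base scores, multipliers, level thresholds, and the fallback values for unmatched tasks are invented for illustration; the real TASK_PATTERNS and COMPLEXITY_MODIFIERS are not reproduced here.

```typescript
type ComplexityLevel = "trivial" | "simple" | "moderate" | "complex" | "very-complex";

interface TaskPatternSketch {
  pattern: RegExp;
  category: string;
  baseComplexity: number;
}

interface ModifierSketch {
  pattern: RegExp;
  multiplier: number;
}

// Hypothetical rules standing in for TASK_PATTERNS.
const SAMPLE_TASK_PATTERNS: TaskPatternSketch[] = [
  { pattern: /\b(fix|bug|crash|error)\b/i, category: "bug-fix", baseComplexity: 30 },
  { pattern: /\b(add|implement|create|build)\b/i, category: "feature", baseComplexity: 50 },
  { pattern: /\b(refactor|cleanup|restructure)\b/i, category: "refactor", baseComplexity: 40 },
];

// Hypothetical rules standing in for COMPLEXITY_MODIFIERS.
const SAMPLE_MODIFIERS: ModifierSketch[] = [
  { pattern: /\blegacy\b/i, multiplier: 1.5 },
  { pattern: /\b(entire|all|whole)\b/i, multiplier: 1.3 },
  { pattern: /\b(simple|small|minor)\b/i, multiplier: 0.7 },
];

// Assumed thresholds: five equal bands across the 1-100 range.
function levelForScore(score: number): ComplexityLevel {
  if (score <= 20) return "trivial";
  if (score <= 40) return "simple";
  if (score <= 60) return "moderate";
  if (score <= 80) return "complex";
  return "very-complex";
}

function estimateSketch(description: string) {
  // Step 1: the first matching pattern decides the category and base score.
  const matched = SAMPLE_TASK_PATTERNS.find((p) => p.pattern.test(description));
  const category = matched ? matched.category : "unknown";
  const base = matched ? matched.baseComplexity : 40; // fallback is an assumption

  // Step 2: every matching modifier scales the multiplier.
  let multiplier = 1;
  for (const mod of SAMPLE_MODIFIERS) {
    if (mod.pattern.test(description)) multiplier *= mod.multiplier;
  }

  // Step 3: clamp the score into 1-100 and derive the level.
  const score = Math.min(100, Math.max(1, Math.round(base * multiplier)));
  return { category, score, level: levelForScore(score) };
}
```

For example, "Fix crash in legacy payment module" matches the bug-fix pattern (base 30) and the legacy modifier (1.5), giving a score of 45 and the level "moderate" under these sample values.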
Key Data Structures
- ComplexityLevel: 'trivial' | 'simple' | 'moderate' | 'complex' | 'very-complex'
- RiskLevel: 'low' | 'medium' | 'high' | 'critical'
- TaskCategory: e.g., 'bug-fix', 'feature', 'refactor'
- ComplexityFactors: Numerical scores (1-10) for various aspects influencing complexity.
- TaskEstimate: The comprehensive output of the estimation, including category, complexity, factors, risks, effort, and suggestions.
- EffortEstimate: Detailed breakdown of estimated hours.
- TaskPattern: Internal interface defining rules for task classification.
- COMPLEXITY_MODIFIERS: Internal array for adjusting complexity based on keywords.
Example Flow
graph TD
A[Task Description] --> B{estimateTaskComplexity}
B --> C{"Classify Task (TASK_PATTERNS)"}
C --> D{"Apply Modifiers (COMPLEXITY_MODIFIERS)"}
D --> E{Calculate Complexity Score & Level}
E --> F{Adjust Complexity Factors}
F --> G{Assess Risks}
G --> H{"Calculate Effort (calculateEffort)"}
H --> I{Generate Suggestions}
I --> J[Return TaskEstimate]
J --> K{formatTaskEstimate}
K --> L[Formatted String Output]
User Preferences Learning (user-preferences.ts)
This component learns and adapts to a user's individual preferences, including coding style, tool usage, and communication style.
Purpose
To personalize the system's behavior and output, making it more aligned with the user's habits and expectations, thereby improving user experience and efficiency.
How it Works
The PreferencesManager is a stateful class that manages UserPreferences.
- Initialization & Persistence: The constructor loads preferences from a JSON file (~/.codebuddy/preferences.json). loadPreferences and savePreferences handle disk I/O. Default preferences are used if no file exists.
- Coding Style Learning: learnFromCode analyzes a provided code string to infer preferences like indentation (spaces/tabs, size), quotes (single/double), semicolons, and braceStyle. It also calls detectPatterns to identify common coding patterns (e.g., naming conventions, import styles).
- Tool Usage Tracking: recordToolUsage tracks how often tools are used, their success rates, average response times, and preferred options. This data helps the system recommend tools or configure them appropriately.
- Communication Style: updateCommunicationStyle allows explicit setting of preferences like verbosity, includeExplanations, and responseFormat.
- Custom Rules: Users can define CustomRules, which are stored and can be retrieved via getActiveRules.
- Management Functions: Methods like getPreferences, updateCodingStyle, addPattern, removeCustomRule, setLearningEnabled, reset, export, and import provide comprehensive control over preferences.
- Singleton: getPreferencesManager ensures a single, shared instance of the manager.
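To make the coding-style learning step more concrete, here is a minimal sketch of how indentation, quote style, and semicolon usage could be inferred from a code string. The heuristics, the `DetectedStyle` shape, and the function name are assumptions; brace style and pattern detection are not covered.

```typescript
// Hypothetical result shape; the real CodingStyle interface may differ.
interface DetectedStyle {
  indentation: "spaces" | "tabs";
  indentSize: number; // spaces per level; reported as 1 for tabs
  quotes: "single" | "double";
  semicolons: boolean;
}

function detectStyleSketch(code: string): DetectedStyle {
  const lines = code.split("\n").filter((l) => l.trim().length > 0);

  // Indentation: count tab-indented lines and record space-indent widths.
  let tabLines = 0;
  const spaceWidths: number[] = [];
  for (const line of lines) {
    if (line.startsWith("\t")) {
      tabLines++;
    } else {
      const m = /^( +)\S/.exec(line);
      if (m) spaceWidths.push(m[1].length);
    }
  }
  const useTabs = tabLines > spaceWidths.length;
  // Heuristic: the smallest space indent seen is taken as one level;
  // 2 is an arbitrary default when no indented line exists.
  const indentSize = useTabs ? 1 : spaceWidths.length > 0 ? Math.min(...spaceWidths) : 2;

  // Quotes: compare raw character counts; ties fall back to single quotes.
  const singles = (code.match(/'/g) ?? []).length;
  const doubles = (code.match(/"/g) ?? []).length;

  // Semicolons: lines ending in ";" versus statement-like lines without one.
  const withSemi = lines.filter((l) => /;\s*$/.test(l)).length;
  const withoutSemi = lines.filter((l) => /[\w)\]'"]\s*$/.test(l)).length;

  return {
    indentation: useTabs ? "tabs" : "spaces",
    indentSize,
    quotes: doubles > singles ? "double" : "single",
    semicolons: withSemi >= withoutSemi,
  };
}
```

Simple counting heuristics like these are cheap to run on every code sample, at the cost of being fooled by quotes inside comments or strings.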
Key Data Structures
- CodingStyle: Defines preferences for code formatting.
- ToolPreference: Tracks usage statistics and preferred options for a specific tool.
- CommunicationStyle: Defines how the system should communicate with the user.
- UserPreferences: The top-level interface holding all user preferences.
- LearnedPattern: Represents a coding pattern identified from user code.
- CustomRule: User-defined rules for system behavior.
Example Flow (Learning from Code)
graph TD
A[Code String] --> B{learnFromCode}
B -- if learningEnabled --> C{Detect Indentation}
C --> D{Detect Quote Style}
D --> E{Detect Semicolons}
E --> F{Detect Brace Style}
F --> G{detectPatterns}
G --> H{updateOrAddPattern}
H --> I[Update UserPreferences]
I --> J{savePreferences}
Integration and Usage
The intelligence module is designed to be a central hub for analytical capabilities.
- Command Handlers: As seen with handleSuggest calling generateSuggestions, command handlers are a primary consumer, triggering analysis and displaying results to the user.
- Code Generation/Modification: The Anomaly Detector and Refactoring Recommender outputs can inform code generation or automated refactoring tools, ensuring generated code adheres to quality standards or suggesting improvements to existing code.
- Conversation Management: The Semantic Search Engine is crucial for retrieving past interactions, allowing the system to maintain context and provide more relevant responses in ongoing conversations.
- Personalization: The User Preferences Learning component influences how the system generates code, formats output, and communicates, ensuring a tailored experience.
- Task Management: The Task Complexity Estimator can be integrated into task creation workflows to provide immediate insights into effort and risks.
These components work together to create a more intelligent and adaptive development assistant, providing value across various stages of the software development lifecycle.