src — input

Module: src-input Cohesion: 0.80 Members: 0

The src/input module handles user input beyond plain text. It parses special @mentions for context injection, provides file path autocompletion, manages multimodal inputs such as images and screenshots, and supports voice control and text-to-speech.

Together, these features let users reference external information, supply visual context, and interact hands-free.

Module Overview

The src/input module is composed of seven sub-modules, each addressing a specific input modality or enhancement. Each one is described under Key Components.

Key Components

1. Context Mentions (src/input/context-mentions.ts)

This module is responsible for identifying and resolving special @mentions within a user's input message. These mentions allow users to easily pull in external information (like file contents, web pages, git status, or terminal history) directly into their prompt, providing rich context for the AI.

Core Functionality:

Data Structures:

Execution Flow for expandMentions:

graph TD
    A[User Input String] --> B{ContextMentionParser.expandMentions}
    B -->|Calls| C{processNewMentions}
    C -->|"Processes @web, @git extended, @terminal"| D[Resolve New Mentions]
    D -->|Resolved Contexts| E[Append to Contexts List]
    C -->|Returns Cleaned Text| F{Loop through Legacy Patterns}
    F -->|"For each @file:, @url:, @git:, @symbol:, @search:, @image:"| G{resolveMention}
    G -->|"Calls specific resolver (e.g., resolveFile, resolveUrl)"| H["Fetch Content / Execute Command"]
    H -->|Success| I[Create MentionContext]
    I --> E
    H -->|Error| J[Create Error MentionContext]
    J --> E
    E --> K["Return ExpandedInput {text, contexts}"]
    K --> L["processMentions (convenience function)"]
    L --> M["Filter & Format Contexts"]
    M --> N["Return MentionResult {cleanedMessage, contextBlocks}"]
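The expansion flow in the diagram can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the module's code: the regular expression, the Resolver type, the target field, and the function name expandMentionsSketch are hypothetical. Only the names MentionContext, ExpandedInput, text, contexts, and the error field come from this document.

```typescript
// Hypothetical shapes, modeled on the names used in this document.
interface MentionContext {
  type: string;      // e.g. "file" or "url"
  target: string;    // text after the colon (assumed field)
  content?: string;  // resolved content on success
  error?: string;    // set instead of content when resolution fails
}

interface ExpandedInput {
  text: string;
  contexts: MentionContext[];
}

// Assumed resolver signature: takes the mention target, returns its content.
type Resolver = (target: string) => Promise<string>;

// Assumed mention syntax: @type:target, where target contains no whitespace.
const MENTION_PATTERN = /@(\w+):(\S+)/g;

async function expandMentionsSketch(
  input: string,
  resolvers: Record<string, Resolver>,
): Promise<ExpandedInput> {
  const contexts: MentionContext[] = [];
  let text = input;

  for (const match of input.matchAll(MENTION_PATTERN)) {
    const [raw, type, target] = match;
    const resolver = resolvers[type];
    if (!resolver) continue; // unknown mention types stay in the text

    try {
      contexts.push({ type, target, content: await resolver(target) });
    } catch (err) {
      // A failed mention becomes an error context, not a rejected promise.
      const message = err instanceof Error ? err.message : String(err);
      contexts.push({ type, target, error: message });
    }
    text = text.replace(raw, "");
  }

  // Collapse the whitespace left behind by removed mentions.
  return { text: text.replace(/\s+/g, " ").trim(), contexts };
}
```

A caller would pass one resolver per mention type, for example a file reader for "file" and an HTTP fetcher for "url", and then format the returned contexts into context blocks.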

Integration:

The processMentions function is a key integration point, called by agent/execution/agent-executor.ts (processUserMessage and processUserMessageStream) to enrich the AI's understanding of user requests before they are processed.

2. File Autocomplete (src/input/file-autocomplete.ts)

This module provides intelligent file path completion, primarily for @file: references, enhancing the user experience by suggesting relevant files and directories.

Core Functionality:

Data Structures:

Integration:

This module is typically used by the CLI or UI components to provide real-time suggestions as the user types @file: mentions.
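As a rough illustration of how such completion might work, the sketch below extracts the partial path typed after @file: and ranks candidates by prefix match. The names extractFilePartial, suggestFiles, and FileSuggestion are hypothetical, and the candidate list is passed in directly. The real module presumably reads the file system, which is omitted here.

```typescript
// Hypothetical suggestion shape; the real module's types may differ.
interface FileSuggestion {
  path: string;
  isDirectory: boolean;
}

// Returns the partial path being typed after a trailing "@file:", or null
// when the input does not end in a @file: mention.
function extractFilePartial(input: string): string | null {
  const match = /@file:(\S*)$/.exec(input);
  return match ? match[1] : null;
}

// Case-insensitive prefix match over a known candidate list.
function suggestFiles(
  partial: string,
  candidates: FileSuggestion[],
  limit = 5,
): FileSuggestion[] {
  const needle = partial.toLowerCase();
  return candidates
    .filter((c) => c.path.toLowerCase().startsWith(needle))
    // Directories first, then alphabetical, so users can drill down quickly.
    .sort((a, b) =>
      a.isDirectory === b.isDirectory
        ? a.path.localeCompare(b.path)
        : a.isDirectory ? -1 : 1,
    )
    .slice(0, limit);
}
```

Listing directories first is a design choice for this sketch, not a documented behavior of the module.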

3. Multimodal Input (src/input/multimodal-input.ts)

This module enables the system to handle various image inputs, providing visual context to AI models.

Core Functionality:

Data Structures:

Integration:

The MultimodalInputManager is used by commands (e.g., /image load, /image screenshot) to acquire image data. The prepareForAPI method is crucial for formatting images before sending them to AI models that accept multimodal input.
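The payload produced by prepareForAPI depends on the target model's API, which this document does not specify. Purely as an illustration, the sketch below wraps an already base64-encoded image in a data URL inside an image_url content part, the shape accepted by OpenAI-style chat APIs. LoadedImage, ImageUrlPart, and prepareForAPISketch are hypothetical names and do not reflect the real method's signature.

```typescript
// Hypothetical input shape: image bytes are assumed to be base64-encoded already.
interface LoadedImage {
  fileName: string;
  base64: string;
}

// Content part in the style of OpenAI-compatible chat APIs (an assumption here).
interface ImageUrlPart {
  type: "image_url";
  image_url: { url: string };
}

const MIME_BY_EXTENSION = new Map<string, string>([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
]);

function prepareForAPISketch(image: LoadedImage): ImageUrlPart {
  // Use the text after the last dot as the extension, ignoring case.
  const ext = image.fileName.split(".").pop()?.toLowerCase() ?? "";
  const mime = MIME_BY_EXTENSION.get(ext);
  if (mime === undefined) {
    throw new Error(`Unsupported image type: ${image.fileName}`);
  }
  return {
    type: "image_url",
    image_url: { url: `data:${mime};base64,${image.base64}` },
  };
}
```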

4. Text-to-Speech (src/input/text-to-speech.ts)

This module provides text-to-speech (TTS) functionality, allowing the system to speak responses to the user.

Core Functionality:

Data Structures:

Integration:

The TextToSpeechManager is typically used by command handlers (e.g., handleTTS) or other parts of the system that need to provide audible feedback or responses to the user.

5. Voice Control (src/input/voice-control.ts)

This module implements an advanced voice command system, enabling hands-free interaction with the application.

Core Functionality:

Data Structures:

Integration:

The VoiceControl singleton is accessed via getVoiceControl() and is used by command handlers (e.g., handleVoice) to enable and manage voice interaction. It integrates with external WakeWordDetector and VoiceActivityDetector modules for advanced audio processing.
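A lazily created singleton behind an accessor function gives every command handler the same voice state. The sketch below shows that pattern with a simple wake-word gate. The class body, the wake word, and handleTranscript are hypothetical stand-ins: the real module delegates detection to WakeWordDetector and VoiceActivityDetector and does not match plain strings like this.

```typescript
type CommandHandler = (args: string) => void;

class VoiceControlSketch {
  private readonly wakeWord: string;
  private readonly commands = new Map<string, CommandHandler>();

  constructor(wakeWord: string) {
    this.wakeWord = wakeWord.toLowerCase();
  }

  registerCommand(phrase: string, handler: CommandHandler): void {
    this.commands.set(phrase.toLowerCase(), handler);
  }

  // Returns true when the transcript starts with the wake word and
  // the remainder matches a registered command phrase.
  handleTranscript(transcript: string): boolean {
    const spoken = transcript.trim().toLowerCase();
    if (!spoken.startsWith(this.wakeWord)) return false;

    const rest = spoken.slice(this.wakeWord.length).trim();
    for (const [phrase, handler] of this.commands) {
      if (rest.startsWith(phrase)) {
        handler(rest.slice(phrase.length).trim());
        return true;
      }
    }
    return false;
  }
}

let instance: VoiceControlSketch | null = null;

// Accessor in the style of getVoiceControl(): create once, then reuse.
function getVoiceControlSketch(): VoiceControlSketch {
  if (instance === null) {
    instance = new VoiceControlSketch("hey buddy"); // wake word is an assumption
  }
  return instance;
}
```

Because the instance is created on first access, modules that never use voice features pay no setup cost.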

6. Voice Input Enhanced (src/input/voice-input-enhanced.ts)

This module provides a more general and robust voice-to-text recording and transcription manager, often used for push-to-talk or single-utterance transcription.

Core Functionality:

Data Structures:

Integration:

The VoiceInputManager singleton is accessed via getVoiceInputManager() and is used by command handlers (e.g., handleVoice) for general voice input. It's designed for scenarios where a user might press a hotkey, speak, and have their utterance transcribed.

7. Voice Input (Legacy) (src/input/voice-input.ts)

This module provides basic voice recording and transcription capabilities. While functional, voice-input-enhanced.ts (VoiceInputManager) offers a more feature-rich and actively developed alternative.

Core Functionality:

Data Structures:

Integration:

This module might be used in specific contexts where a simpler voice input mechanism is preferred, or for compatibility with older integrations. For new features requiring voice input, VoiceInputManager (from voice-input-enhanced.ts) is recommended.

Integration and Usage

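The src/input module's components are integrated throughout the application. The Integration notes under each component name the specific callers, such as agent/execution/agent-executor.ts for processMentions and the handleTTS and handleVoice command handlers for speech features.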

Configuration

Most components in the src/input module manage their own configuration, typically loaded from ~/.codebuddy/-config.json files, with overrides available through environment variables or constructor parameters.
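One plausible reading of this layering is: built-in defaults, then the config file, then environment variables, then constructor parameters. The sketch below assumes that order, which the document does not state explicitly. The TtsConfig fields and the CODEBUDDY_TTS_* variable names are invented for illustration and may not match the real settings.

```typescript
// Hypothetical config shape for one component.
interface TtsConfig {
  enabled: boolean;
  voice: string;
  rate: number;
}

const DEFAULTS: TtsConfig = { enabled: false, voice: "default", rate: 1.0 };

// Reads overrides from an environment-like map (pass process.env in Node).
// Variable names are assumptions for this sketch.
function fromEnv(env: Record<string, string | undefined>): Partial<TtsConfig> {
  const out: Partial<TtsConfig> = {};

  const enabled = env["CODEBUDDY_TTS_ENABLED"];
  if (enabled !== undefined) out.enabled = enabled === "true";

  const voice = env["CODEBUDDY_TTS_VOICE"];
  if (voice !== undefined) out.voice = voice;

  const rawRate = env["CODEBUDDY_TTS_RATE"];
  if (rawRate !== undefined) {
    const rate = Number(rawRate);
    if (Number.isFinite(rate)) out.rate = rate; // ignore unparsable values
  }
  return out;
}

function resolveConfig(
  fileConfig: Partial<TtsConfig>,
  env: Record<string, string | undefined>,
  overrides: Partial<TtsConfig> = {},
): TtsConfig {
  // Later spreads win: defaults < file < environment < constructor overrides.
  return { ...DEFAULTS, ...fileConfig, ...fromEnv(env), ...overrides };
}
```

Taking the environment as a parameter keeps the function easy to test without mutating the real process environment.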

Common configuration aspects include:

Error Handling

Each component handles errors internally, often emitting error events or returning MentionContext objects with an error field. This prevents unhandled rejections and allows the calling code to gracefully handle failures (e.g., informing the user that a mention could not be resolved or that a voice command failed). The getErrorMessage utility from src/types/index.js is frequently used to standardize error messages.
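From the caller's side, this pattern means failures arrive as data, not as exceptions. The sketch below shows one way calling code might separate resolved contexts from user-facing warnings. getErrorMessageSketch is a stand-in for the real getErrorMessage utility, whose exact behavior is not shown in this document, and the type and target fields and the splitByOutcome helper are assumptions.

```typescript
// Hypothetical shape; only the error field is confirmed by this document.
interface MentionContext {
  type: string;
  target: string;
  content?: string;
  error?: string;
}

// Stand-in for getErrorMessage: normalizes any thrown value to a string.
function getErrorMessageSketch(err: unknown): string {
  if (err instanceof Error) return err.message;
  if (typeof err === "string") return err;
  return "Unknown error";
}

// Splits contexts into usable ones and warnings to show the user.
function splitByOutcome(contexts: MentionContext[]): {
  resolved: MentionContext[];
  warnings: string[];
} {
  const resolved: MentionContext[] = [];
  const warnings: string[] = [];
  for (const ctx of contexts) {
    if (ctx.error !== undefined) {
      warnings.push(`Could not resolve @${ctx.type}:${ctx.target}: ${ctx.error}`);
    } else {
      resolved.push(ctx);
    }
  }
  return { resolved, warnings };
}
```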

Contribution Guidelines

When contributing to the src/input module: