src — offline
The src/offline module provides Code Buddy with robust offline capabilities, allowing it to function and provide value even without an active internet connection. It achieves this through a combination of intelligent response caching, local Large Language Model (LLM) fallback, embedding storage for semantic search, and a resilient request queuing system.
The core of this module is the OfflineMode class, which manages all aspects of offline operation.
Core Concepts
The offline module is designed around several key features:
- Response Caching: Stores previous LLM responses to avoid re-querying for identical or highly similar prompts.
- Local LLM Fallback: Integrates with local LLM providers (Ollama, llama.cpp, node-llama-cpp, WebLLM) to generate responses when online models are unreachable.
- Embedding Cache: Stores vector embeddings of queries and responses, enabling semantic search for cached content.
- Request Queuing: When offline, requests that require internet connectivity (e.g., to remote LLMs or APIs) are queued and automatically processed once connectivity is restored.
- Automatic Sync: Monitors internet connectivity and triggers processing of queued requests when back online.
- Offline-Capable Tools: Provides the infrastructure for tools to leverage local LLMs and cached data.
Architecture Overview
The OfflineMode class acts as a central manager, orchestrating interactions between various internal components and external dependencies. It maintains its state (caches, queue, configuration) persistently on disk and actively monitors network status.
graph TD
subgraph OfflineMode Manager
OM[OfflineMode Class] --> Cfg(OfflineConfig)
OM --> RC(Response Cache: LRUCache<CachedResponse>)
OM --> EC(Embedding Cache: LRUCache<CachedEmbedding>)
OM --> RQ(Request Queue: QueuedRequest[])
OM --> LLM(Local LLM Integration)
OM --> Net(Internet Check)
OM --> Stats(OfflineStats)
end
LLM --> LPM(LocalProviderManager)
LLM --> Ax(Axios for Ollama/llama.cpp)
RC --> FS(fs-extra: Disk Persistence)
EC --> FS
RQ --> FS
Cfg --> FS
Net --> Ax
LPM --> LocalLLMProviders[Local LLM Providers (node-llama-cpp, WebLLM)]
Ax --> ExternalAPIs[External APIs (Ollama, llama.cpp, Health Checks)]
OM -- Emits Events --> EventBus[EventEmitter]
MainApp[Main Application] -- Uses Singleton --> OM
Key Components
OfflineMode Class
The OfflineMode class (src/offline/offline-mode.ts) is the primary entry point and manager for all offline functionalities. It extends EventEmitter to broadcast important status changes and events.
Constructor and Initialization:
The constructor new OfflineMode(config) initializes the module with a given configuration (or DEFAULT_CONFIG). It sets up data directories (~/.codebuddy/offline), initializes LRUCache instances for responses and embeddings, and then calls initialize().
The initialize() method performs critical setup:
- Ensures necessary directories exist (dataDir, cacheDir, cacheDir/responses, cacheDir/embeddings).
- Loads previously saved cache indexes (response-index.json, embedding-index.json) and the request queue (queue.json) from disk.
- Performs an initial internet connectivity check via checkInternet().
- Starts a periodic internet connectivity check using startInternetCheck().
Configuration (OfflineConfig)
The OfflineConfig interface defines the module's behavior:
- enabled: Master switch for offline mode.
- cacheEnabled, cacheMaxSize, cacheMaxAge: Control response caching.
- localLLMEnabled, localLLMProvider, localLLMModel: Configure local LLM usage.
- embeddingCacheEnabled: Enables/disables embedding storage.
- queueRequestsWhenOffline, autoSyncOnReconnect: Control request queuing behavior.
- checkInternetInterval: Frequency of internet checks.
Configuration can be updated at runtime using updateConfig(config: Partial<OfflineConfig>), which also persists the changes to config.json.
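As a rough illustration of how a partial update might be merged into the active configuration, the sketch below uses the field names listed above. The field types, the units, the example default values, and the helper name mergeConfig are assumptions made for illustration, not the module's actual definitions.

```typescript
// Hypothetical shape: field names come from the list above; types and units are assumed.
type LocalProvider = "ollama" | "llamacpp" | "local-llama" | "webllm";

interface OfflineConfig {
  enabled: boolean;
  cacheEnabled: boolean;
  cacheMaxSize: number; // assumed to be megabytes
  cacheMaxAge: number; // assumed to be milliseconds
  localLLMEnabled: boolean;
  localLLMProvider: LocalProvider;
  localLLMModel: string;
  embeddingCacheEnabled: boolean;
  queueRequestsWhenOffline: boolean;
  autoSyncOnReconnect: boolean;
  checkInternetInterval: number; // assumed to be milliseconds
}

// Example values only; the real DEFAULT_CONFIG may differ.
const EXAMPLE_DEFAULTS: OfflineConfig = {
  enabled: true,
  cacheEnabled: true,
  cacheMaxSize: 500,
  cacheMaxAge: 7 * 24 * 60 * 60 * 1000,
  localLLMEnabled: false,
  localLLMProvider: "ollama",
  localLLMModel: "llama3",
  embeddingCacheEnabled: true,
  queueRequestsWhenOffline: true,
  autoSyncOnReconnect: true,
  checkInternetInterval: 30_000,
};

// Shallow merge: only the keys present in the patch are overridden.
function mergeConfig(
  current: OfflineConfig,
  patch: Partial<OfflineConfig>,
): OfflineConfig {
  return { ...current, ...patch };
}

const updatedConfig = mergeConfig(EXAMPLE_DEFAULTS, {
  localLLMEnabled: true,
  cacheMaxSize: 1024,
});
```

A shallow merge is enough here because every field in the sketch is a primitive; nested settings would need a deeper merge.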
Response Caching
The module uses an LRUCache (responseCache) to store LLM responses.
- cacheResponse(query: string, response: string, model: string, tokensUsed: number): Stores a response. It generates a SHA256 hash of the query (hash()) as the cache key and saves the full CachedResponse object to a file in cacheDir/responses/ before adding it to the in-memory LRU cache. It then calls saveCacheIndexes() and cleanupCacheIfNeeded().
- getCachedResponse(query: string): Retrieves a response. It checks the LRU cache using the query hash. If found, it updates accessedAt and accessCount for LRU management and increments stats.cacheHits. It also checks cacheMaxAge and removes expired entries.
- removeCachedResponse(queryHash: string): Deletes a cached response from both the in-memory cache and disk.
- cleanupCacheIfNeeded(): Periodically called to ensure the cache size (stats.cacheSize) does not exceed config.cacheMaxSize. It removes the least recently accessed items until the cache is within limits.
- clearCache(): Empties both response and embedding caches from memory and disk.
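The hashing and expiry behavior described above can be sketched in memory as follows. This is a simplified illustration, not the module's code: it uses a plain Map instead of the LRUCache, skips disk persistence, and the class name ResponseCacheSketch, the createdAt field, and the miss counter are assumptions.

```typescript
import { createHash } from "node:crypto";

// Assumed entry shape; accessedAt and accessCount are named in the text above.
interface CachedResponse {
  query: string;
  response: string;
  model: string;
  tokensUsed: number;
  createdAt: number;
  accessedAt: number;
  accessCount: number;
}

// SHA-256 of the query text, hex-encoded, used as the cache key.
function hashQuery(query: string): string {
  return createHash("sha256").update(query, "utf8").digest("hex");
}

class ResponseCacheSketch {
  private readonly entries = new Map<string, CachedResponse>();
  private readonly maxAgeMs: number;
  hits = 0;
  misses = 0;

  constructor(maxAgeMs: number) {
    this.maxAgeMs = maxAgeMs;
  }

  set(query: string, response: string, model: string, tokensUsed: number, now = Date.now()): string {
    const key = hashQuery(query);
    this.entries.set(key, {
      query, response, model, tokensUsed,
      createdAt: now, accessedAt: now, accessCount: 0,
    });
    return key;
  }

  get(query: string, now = Date.now()): CachedResponse | undefined {
    const key = hashQuery(query);
    const entry = this.entries.get(key);
    if (!entry) {
      this.misses += 1;
      return undefined;
    }
    // Entries older than the maximum age are removed and treated as misses.
    if (now - entry.createdAt > this.maxAgeMs) {
      this.entries.delete(key);
      this.misses += 1;
      return undefined;
    }
    entry.accessedAt = now;
    entry.accessCount += 1;
    this.hits += 1;
    return entry;
  }
}
```

Hashing the query keeps keys at a fixed length and safe to use as file names, at the cost of matching only byte-identical prompts; near-duplicates are left to the embedding-based search.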
Embedding Caching & Semantic Search
An LRUCache (embeddingCache) stores text embeddings.
- cacheEmbedding(text: string, embedding: number[], model: string): Stores a text's embedding, similar to cacheResponse.
- getEmbedding(text: string): Retrieves a cached embedding.
- findSimilarResponses(query: string, threshold: number = 0.85): This is a key feature for semantic search.
  - It first gets or computes the embedding for the input query.
  - Then, it iterates through all cached responses. For each cached response, it retrieves its associated query embedding from embeddingCache.
  - It calculates the cosineSimilarity() between the input query's embedding and the cached query's embedding.
  - Responses exceeding the threshold are returned, sorted by similarity.
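The similarity search above reduces to a cosine-similarity computation followed by a filter and a sort. The sketch below shows one way to write it; the Candidate type and the function name findSimilar are illustrative assumptions, and treating the threshold as inclusive is also an assumption.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for mismatched or zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical pairing of a cached response with its query embedding.
interface Candidate {
  query: string;
  response: string;
  embedding: number[];
}

// Keep candidates at or above the threshold, most similar first.
function findSimilar(
  queryEmbedding: number[],
  candidates: Candidate[],
  threshold = 0.85,
): Array<Candidate & { similarity: number }> {
  return candidates
    .map((c) => ({ ...c, similarity: cosineSimilarity(queryEmbedding, c.embedding) }))
    .filter((c) => c.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity);
}
```

A linear scan like this costs one similarity computation per cached entry, which is reasonable for a local cache of modest size.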
Request Queuing
When offline, requests that cannot be fulfilled locally are added to a queue.
- queueRequest(type: QueuedRequest['type'], payload: unknown, priority: number = 0): Adds a request to requestQueue. Requests are sorted by priority. The queue is persisted to queue.json via saveQueue().
- processQueue(): Called when the system comes back online (if autoSyncOnReconnect is true) or manually. It iterates through the requestQueue, attempting to process each request via processRequest(). Failed requests are retried up to 3 times before being discarded.
- processRequest(request: QueuedRequest): A placeholder method that would integrate with the main application's request handling logic. It emits a request:execute event.
- clearQueue(): Empties the request queue.
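The queueing and retry behavior can be sketched as below. Several details are assumptions rather than facts from the module: higher priority numbers are processed first, "retried up to 3 times" is read as three attempts in total, the handler is synchronous to keep the example short, and the names RequestQueueSketch and processOnce are invented for illustration.

```typescript
// Assumed request shape; only id, type, payload and priority are implied by the text.
interface QueuedRequest {
  id: string;
  type: string;
  payload: unknown;
  priority: number;
  retries: number;
}

const MAX_ATTEMPTS = 3; // assumption: three attempts in total

class RequestQueueSketch {
  private items: QueuedRequest[] = [];
  private nextId = 1;

  enqueue(type: string, payload: unknown, priority = 0): string {
    const id = `req-${this.nextId++}`;
    this.items.push({ id, type, payload, priority, retries: 0 });
    // Array.prototype.sort is stable, so equal priorities keep insertion order.
    this.items.sort((a, b) => b.priority - a.priority);
    return id;
  }

  // One pass over the queue. A handler signals failure by throwing.
  processOnce(handler: (request: QueuedRequest) => void): { processed: string[]; discarded: string[] } {
    const processed: string[] = [];
    const discarded: string[] = [];
    const remaining: QueuedRequest[] = [];
    for (const request of this.items) {
      try {
        handler(request);
        processed.push(request.id);
      } catch {
        request.retries += 1;
        if (request.retries >= MAX_ATTEMPTS) {
          discarded.push(request.id); // gave up on this request
        } else {
          remaining.push(request); // keep it for the next pass
        }
      }
    }
    this.items = remaining;
    return { processed, discarded };
  }

  get size(): number {
    return this.items.length;
  }
}
```

In this sketch a failed request stays in the queue between passes, so a later reconnect gets another chance to deliver it before the attempt limit is reached.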
Local LLM Integration
The module supports various local LLM providers.
- callLocalLLM(prompt: string, options: {}): The main method for interacting with local LLMs. It dispatches to specific provider implementations based on config.localLLMProvider.
  - For local-llama (node-llama-cpp) and webllm, it uses the LocalProviderManager from ../providers/local-llm-provider.js.
  - For ollama and llamacpp, it makes direct axios calls to their respective local HTTP endpoints (callOllama, callLlamaCpp).
- streamLocalLLM(prompt: string, options: {}): Provides an AsyncIterable for streaming responses from providers that support it (currently LocalProviderManager based ones).
- isLocalLLMAvailable(): Checks if the configured local LLM provider is running and accessible.
- getAvailableProviders(): Dynamically checks for the presence of various local LLM providers (Ollama, llama.cpp server, node-llama-cpp bindings, WebGPU for WebLLM) and returns a list of available options.
- getLocalModels(): Specifically for Ollama, retrieves a list of installed models.
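To make the dispatch concrete, the sketch below maps each provider name to a description of the call that would be made, without performing any network request. The URLs are the usual defaults for an Ollama server and a llama.cpp server and are assumptions here, since the module may use different hosts, ports, or request fields. The function name planLocalCall and the two descriptor types are hypothetical.

```typescript
type LocalProvider = "ollama" | "llamacpp" | "local-llama" | "webllm";

// A direct HTTP request to a locally running server.
interface HttpCall {
  kind: "http";
  url: string;
  body: Record<string, unknown>;
}

// A call delegated to an in-process provider manager.
interface ManagedCall {
  kind: "managed";
  provider: "local-llama" | "webllm";
  prompt: string;
}

function planLocalCall(provider: LocalProvider, model: string, prompt: string): HttpCall | ManagedCall {
  switch (provider) {
    case "ollama":
      // Assumed default Ollama endpoint; stream: false asks for a single JSON reply.
      return {
        kind: "http",
        url: "http://localhost:11434/api/generate",
        body: { model, prompt, stream: false },
      };
    case "llamacpp":
      // Assumed default llama.cpp server endpoint.
      return {
        kind: "http",
        url: "http://localhost:8080/completion",
        body: { prompt },
      };
    case "local-llama":
    case "webllm":
      // These providers run in-process, so no URL is involved.
      return { kind: "managed", provider, prompt };
  }
}
```

Separating the decision from the I/O like this makes the routing easy to unit test; the real method presumably performs the request directly.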
Internet Connectivity Management
- checkInternet(): Attempts to ping https://api.x.ai/health or https://www.google.com/generate_204 to determine online status.
- startInternetCheck(): Sets up a setInterval to periodically call checkInternet(). It emits online and offline events when the status changes and triggers processQueue() on reconnection if autoSyncOnReconnect is enabled.
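The key detail in the polling loop is that events fire only on a status change, not on every check. A minimal sketch of that pattern follows; the class name ConnectivityMonitor, the applyProbeResult method, and the injected probe function are assumptions made so the example stays self-contained.

```typescript
import { EventEmitter } from "node:events";

class ConnectivityMonitor extends EventEmitter {
  private online: boolean;
  private timer: ReturnType<typeof setInterval> | null = null;

  constructor(initiallyOnline = false) {
    super();
    this.online = initiallyOnline;
  }

  get isOnline(): boolean {
    return this.online;
  }

  // Emits "online" or "offline" only when the status actually changes.
  applyProbeResult(isOnline: boolean): void {
    if (isOnline === this.online) return;
    this.online = isOnline;
    this.emit(isOnline ? "online" : "offline");
  }

  // probe is any function that resolves to true when the network is reachable.
  start(probe: () => Promise<boolean>, intervalMs: number): void {
    this.stop();
    this.timer = setInterval(() => {
      probe()
        .then((ok) => this.applyProbeResult(ok))
        .catch(() => this.applyProbeResult(false)); // a failed probe counts as offline
    }, intervalMs);
  }

  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }
}
```

A caller could subscribe to the online event and start queue processing from the handler, which mirrors the autoSyncOnReconnect behavior described above.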
Data Persistence
All critical data (configuration, cache indexes, request queue) is persisted to disk within the user's home directory (~/.codebuddy/offline).
- fs-extra is used for robust file system operations (e.g., ensureDir, readJSON, writeJSON, remove, emptyDir).
- saveCacheIndexes(): Writes response-index.json and embedding-index.json.
- saveQueue(): Writes queue.json.
- saveConfig(): Writes config.json.
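The save and load round trip can be illustrated with Node's built-in fs module, swapped in here for fs-extra so the example has no dependencies. The helper names saveJson and loadJson are hypothetical, and falling back to a default on a missing or unreadable file is an assumption about how a loader might behave.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write data as pretty-printed JSON, creating the directory if needed.
function saveJson(dir: string, name: string, data: unknown): void {
  mkdirSync(dir, { recursive: true });
  writeFileSync(join(dir, name), JSON.stringify(data, null, 2), "utf8");
}

// Read JSON back; return the fallback if the file is missing or malformed.
function loadJson<T>(dir: string, name: string, fallback: T): T {
  const file = join(dir, name);
  if (!existsSync(file)) return fallback;
  try {
    return JSON.parse(readFileSync(file, "utf8")) as T;
  } catch {
    return fallback;
  }
}

// Usage: round-trip a queue through a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "offline-sketch-"));
saveJson(demoDir, "queue.json", [{ id: "req-1", type: "chat" }]);
const restoredQueue = loadJson<Array<{ id: string; type: string }>>(demoDir, "queue.json", []);
const missingIndex = loadJson<Record<string, unknown>>(demoDir, "response-index.json", {});
rmSync(demoDir, { recursive: true, force: true });
```

Returning a fallback rather than throwing lets a first run, where no files exist yet, follow the same code path as later runs.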
Statistics & Monitoring
The OfflineStats interface tracks various metrics:
- cacheHits, cacheMisses
- localLLMCalls
- queuedRequests
- cacheSize
- isOnline, lastOnline
The getStats() method returns the current statistics. formatStatus() provides a human-readable summary, useful for debugging or user interfaces.
Usage
The OfflineMode instance is typically accessed as a singleton:
import { getOfflineMode, OfflineConfig } from './offline-mode.js';

// Get the singleton instance, optionally with initial config
const offlineManager = getOfflineMode({
  localLLMProvider: 'ollama',
  localLLMModel: 'llama3',
  cacheMaxSize: 1024, // 1GB
});

// Check internet status
console.log('Is online:', offlineManager.getStats().isOnline);

// Try to get a cached response
const cached = await offlineManager.getCachedResponse('What is Code Buddy?');
if (cached) {
  console.log('Cached response:', cached.response);
} else {
  // If offline, queue a request or use local LLM
  if (!offlineManager.getStats().isOnline && offlineManager.getConfig().localLLMEnabled) {
    const localResponse = await offlineManager.callLocalLLM('Explain offline mode.');
    console.log('Local LLM response:', localResponse);
  } else if (!offlineManager.getStats().isOnline && offlineManager.getConfig().queueRequestsWhenOffline) {
    const requestId = offlineManager.queueRequest('chat', { prompt: 'What is Code Buddy?' });
    console.log('Request queued:', requestId);
  }
}

// Listen for events
offlineManager.on('online', () => console.log('Back online! Processing queue...'));
offlineManager.on('request:processed', ({ request }) => console.log(`Processed queued request: ${request.id}`));

// Clean up on application exit
// offlineManager.dispose();
Integration Points
- ../utils/lru-cache.js: Provides the underlying LRU cache implementation for responseCache and embeddingCache.
- ../providers/local-llm-provider.js: This is a crucial dependency for modern local LLM integration. The OfflineMode class uses LocalProviderManager and autoConfigureLocalProvider to abstract away the complexities of interacting with node-llama-cpp and WebLLM.
- axios: Used for external HTTP requests, including internet checks and direct API calls to Ollama and llama.cpp servers.
- fs-extra: Handles all file system interactions for persistence.
- Main Application Logic: The main Code Buddy application is expected to:
  - Call getOfflineMode() to obtain the manager instance.
  - Use getCachedResponse() before making online LLM calls.
  - Call cacheResponse() after successful online LLM calls.
  - Call queueRequest() when an online request fails due to lack of connectivity.
  - Listen to online and offline events to adjust UI or behavior.
  - Call dispose() on application shutdown to ensure state is saved and resources are released.
Events
The OfflineMode class extends EventEmitter and emits the following events:
- online: When internet connectivity is restored.
- offline: When internet connectivity is lost.
- cache:evict: When an item is evicted from an LRU cache.
- cache:cleaned: After a cache cleanup operation.
- cache:cleared: After the cache has been completely cleared.
- request:queued: When a request is added to the queue.
- queue:processing: When the queue starts processing.
- request:processed: When a queued request is successfully processed.
- request:failed: When a queued request fails after retries.
- queue:processed: When the queue finishes processing.
- localLLM:error: When an error occurs during a local LLM call.
- localLLM:progress: Emitted by LocalProviderManager during model loading or generation.
Lifecycle
- Initialization: The getOfflineMode() function ensures a singleton instance is created and initialized. This involves setting up directories, loading persisted data, and starting background tasks like internet checks.
- Disposal: The dispose() method is critical for proper shutdown. It clears the internet check timer, attempts to kill any child processes started for local LLMs, disposes of the LocalProviderManager, and saves any pending cache indexes and queue state to disk. It also removes all event listeners to prevent memory leaks. The resetOfflineMode() function can be used in testing or specific scenarios to force a re-initialization.
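The singleton accessor and reset pattern described above can be sketched as follows. The names ManagerSketch, getManager, and resetManager are stand-ins for OfflineMode, getOfflineMode(), and resetOfflineMode(); the real functions may take different arguments and do considerably more work on disposal.

```typescript
import { EventEmitter } from "node:events";

class ManagerSketch extends EventEmitter {
  disposed = false;
  private timer: ReturnType<typeof setInterval> | null = null;

  // Start a background task (a stand-in for the periodic internet check).
  startBackgroundTask(task: () => void, intervalMs: number): void {
    this.timer = setInterval(task, intervalMs);
  }

  // Stop timers and drop listeners so the instance can be garbage collected.
  dispose(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
    this.removeAllListeners();
    this.disposed = true;
  }
}

let instance: ManagerSketch | null = null;

// Create the instance on first use; return the same one afterwards.
function getManager(): ManagerSketch {
  if (instance === null) {
    instance = new ManagerSketch();
  }
  return instance;
}

// Dispose the current instance so the next getManager() call builds a fresh one.
function resetManager(): void {
  if (instance !== null) {
    instance.dispose();
    instance = null;
  }
}
```

Disposing before dropping the reference matters in tests: a timer left running on a discarded instance would keep the Node.js process alive after the suite finishes.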