[ NATURAL VOICE CONVERSATIONS FOR AI ASSISTANTS VIA MCP ]
BBS v3.0 | DIAL-UP: 1-800-VOICE | 2400 BAUD SUPPORTED
ATDT 1-800-VOICE-MODE
CONNECT 2400
Welcome to VOICE MODE BBS
═[ SYSTEM MANIFESTO ]═
VOICE IS THE MOST NATURAL HUMAN INTERFACE.
CODE SHOULD SPEAK.
CODE SHOULD LISTEN.
                    
═[ THESIS ]═

Voice Mode transforms AI assistants from text-based tools into conversational partners. Through the Model Context Protocol, we enable Claude, ChatGPT, and other LLMs to engage in natural voice interactions.

No more typing. No more reading. Just conversation.

═[ CORE PRINCIPLES ]═
■ UNIVERSALITY: Works with any MCP-compatible client. No vendor lock-in.
■ SIMPLICITY: One command to install. One command to run. Zero configuration.
■ LOCALITY: Your voice never leaves your machine unless you choose cloud.
■ OPENNESS: MIT licensed. Fork it. Modify it. Make it yours.
═[ SYSTEM ARCHITECTURE ]═
┌─────────────────────────────────────────────────────────────────┐
│                        TRANSPORT LAYER                          │
├─────────────────────────────────────────────────────────────────┤
│ LOCAL MIC ──▶ AUDIO CAPTURE ──▶ STT SERVICE ──▶ TEXT            │
│ SPEAKER   ◀── AUDIO SYNTH   ◀── TTS SERVICE ◀── TEXT            │
├─────────────────────────────────────────────────────────────────┤
│                        PROTOCOL LAYER                           │
├─────────────────────────────────────────────────────────────────┤
│ MCP CLIENT ◀──▶ VOICE MODE SERVER ◀──▶ OPENAI-COMPATIBLE API    │
├─────────────────────────────────────────────────────────────────┤
│                        SERVICE LAYER                            │
├─────────────────────────────────────────────────────────────────┤
│ WHISPER.CPP (STT) │ KOKORO (TTS) │ LIVEKIT (RTC)                │
└─────────────────────────────────────────────────────────────────┘
                    
═[ SYSTEM REQUIREMENTS ]═
MINIMUM CONFIGURATION:
PLATFORM.....: Linux, macOS, Windows (WSL)
RUNTIME......: Python 3.10+
MEMORY.......: 512MB minimum
NETWORK......: Internet connection (for cloud services)
SOUND CARD...: SoundBlaster compatible
═[ DEPENDENCIES ]═
pyaudio............ >= 0.2.11
openai............. >= 1.0.0
mcp................ >= 1.0.0
livekit............ >= 0.17.5 (optional)
═[ API COMPATIBILITY ]═
STT.......: OpenAI Whisper API v1
TTS.......: OpenAI TTS API v1
PROTOCOL..: Model Context Protocol 2024.11
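Because the TTS side speaks the OpenAI API, pointing Voice Mode at a different backend is mostly a matter of changing the base URL. The sketch below builds (but does not send) a request against the OpenAI-style `/audio/speech` route using only the standard library. The helper name `build_speech_request` is hypothetical and not part of Voice Mode; the model and voice values are whatever your endpoint accepts.

```python
import json
import urllib.request


def build_speech_request(base_url, api_key, text, model, voice):
    """Build a POST request for an OpenAI-compatible text-to-speech endpoint.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    url = base_url.rstrip("/") + "/audio/speech"
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_speech_request(
        "https://api.openai.com/v1", "your-key-here", "Hello!", "tts-1", "alloy"
    )
    print(req.get_method(), req.full_url)
```

Swapping the first argument for a local endpoint leaves the rest of the call unchanged, which is the point of the compatibility layer.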
═[ AVAILABLE TOOLS ]═
converse(message, wait_for_response=True)
listen_for_speech(duration=15.0)
check_room_status()
check_audio_devices()
voice_status()
list_tts_voices(provider=None)
kokoro_start(models_dir=None)
kokoro_stop()
kokoro_status()
═[ ENVIRONMENT VARIABLES ]═
OPENAI_API_KEY          # Required for cloud services
STT_BASE_URL            # Custom STT endpoint
STT_API_KEY             # STT authentication
STT_MODEL               # Whisper model selection
TTS_BASE_URL            # Custom TTS endpoint
TTS_API_KEY             # TTS authentication
TTS_MODEL               # TTS model selection
TTS_VOICE               # Voice selection
VOICE_MODE_DEBUG        # Enable debug logging
VOICE_MODE_SAVE_AUDIO   # Save audio files
VOICE_MODE_AUDIO_DIR    # Audio save directory
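For anyone wiring these variables into their own tooling, here is a minimal sketch of collecting them into one object. The `VoiceConfig` class and `load_config` function are hypothetical names for illustration, not Voice Mode internals. Unset variables are left as `None` rather than given guessed defaults, and the set of values treated as "true" for the boolean flags is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: which strings count as "enabled" for the boolean flags.
TRUTHY = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class VoiceConfig:
    openai_api_key: Optional[str]
    stt_base_url: Optional[str]
    stt_api_key: Optional[str]
    stt_model: Optional[str]
    tts_base_url: Optional[str]
    tts_api_key: Optional[str]
    tts_model: Optional[str]
    tts_voice: Optional[str]
    debug: bool
    save_audio: bool
    audio_dir: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> VoiceConfig:
    """Read the documented variables from a mapping (os.environ by default)."""

    def flag(name: str) -> bool:
        return env.get(name, "").strip().lower() in TRUTHY

    return VoiceConfig(
        openai_api_key=env.get("OPENAI_API_KEY"),
        stt_base_url=env.get("STT_BASE_URL"),
        stt_api_key=env.get("STT_API_KEY"),
        stt_model=env.get("STT_MODEL"),
        tts_base_url=env.get("TTS_BASE_URL"),
        tts_api_key=env.get("TTS_API_KEY"),
        tts_model=env.get("TTS_MODEL"),
        tts_voice=env.get("TTS_VOICE"),
        debug=flag("VOICE_MODE_DEBUG"),
        save_audio=flag("VOICE_MODE_SAVE_AUDIO"),
        audio_dir=env.get("VOICE_MODE_AUDIO_DIR"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise without touching the real environment.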
═[ DOWNLOAD CENTER ]═

Select your preferred installation method:

[1] CLAUDE CODE (RECOMMENDED)
$ claude mcp add --scope user voice-mode uvx voice-mode
[2] UV PACKAGE MANAGER
$ uvx voice-mode
[3] PYTHON PIP
$ pip install voice-mode
═[ LOCAL VOICE STACK ]═

Run everything on your machine. No cloud dependencies.

WHISPER.CPP (PORT 2022)
$ make whisper-start
Local speech-to-text with OpenAI-compatible API
CPU optimized with AVX/NEON support
KOKORO TTS (PORT 8880)
$ make kokoro-start
Local text-to-speech with multiple voice options
Zero-shot voice cloning capable
LIVEKIT (PORT 7880)
$ make livekit-start
Real-time communication for room-based voice
WebRTC powered, low latency
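After starting the services, a quick way to confirm they are listening is a plain TCP connect against each port. This is a rough sketch, assuming the services run on this machine at 127.0.0.1 and use the ports listed above. It only checks that something accepts connections; it does not verify that the service behind the port is healthy. The function names are hypothetical.

```python
import socket

# Ports as listed in this section; the loopback host is an assumption.
LOCAL_SERVICES = {"whisper.cpp": 2022, "kokoro": 8880, "livekit": 7880}


def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def local_stack_status(host: str = "127.0.0.1", services=LOCAL_SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in local_stack_status().items():
        print(f"{name:<12} {'UP' if up else 'DOWN'}")
```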
═[ INTEGRATION GUIDE ]═
CLAUDE DESKTOP
1. Install Voice Mode via Claude Code
2. Start Claude Desktop application
3. Use /converse command in chat
4. Speak naturally when prompted
CUSTOM MCP CLIENT
1. Add voice-mode to MCP server list
2. Configure transport (stdio/sse)
3. Call voice tools via MCP protocol
4. Handle audio streams appropriately
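Step 1 usually comes down to a small JSON entry telling the client how to launch the server over stdio. The sketch below generates one in the `mcpServers` layout used by Claude Desktop's configuration file, reusing the `uvx voice-mode` command from the Download Center. Other clients may expect different key names, so treat the shape as an assumption and check your client's documentation. `mcp_server_entry` is a hypothetical helper.

```python
import json


def mcp_server_entry(name="voice-mode", command="uvx", args=("voice-mode",), env=None):
    """Build a stdio server entry in the "mcpServers" layout (assumed client format)."""
    entry = {"command": command, "args": list(args)}
    if env:
        # Optional environment passed to the server process, e.g. OPENAI_API_KEY.
        entry["env"] = dict(env)
    return {"mcpServers": {name: entry}}


if __name__ == "__main__":
    config = mcp_server_entry(env={"OPENAI_API_KEY": "your-key-here"})
    print(json.dumps(config, indent=2))
```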
═[ USAGE EXAMPLES ]═
CONVERSATIONAL MODE
converse("Hello, how are you?")
# Speaks message, waits for response
# Returns transcribed user response
STATEMENT MODE
converse("Goodbye!", wait_for_response=False)
# Speaks message without waiting
# Immediate return after speech
LISTENING MODE
response = listen_for_speech(duration=30)
# Pure listening mode
# Returns transcribed text after duration
EMOTIONAL SPEECH
converse("Great job!", tts_model="gpt-4o-mini-tts", tts_instructions="Sound excited")
# Requires VOICE_ALLOW_EMOTIONS=true
# Uses advanced TTS model
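The conversational and statement modes above combine naturally into a turn-taking loop. Here is a minimal sketch, assuming you have some Python callable that forwards to the `converse` tool with the same `message` and `wait_for_response` parameters. That callable, plus the `run_dialogue` and `reply_to` names and the stop-word list, are all hypothetical; in practice the assistant invokes the tool over MCP rather than through a direct function call.

```python
def run_dialogue(converse, reply_to, greeting, stop_words=("goodbye", "bye"), max_turns=10):
    """Alternate speaking and listening until the user signs off.

    converse: callable(message, wait_for_response=True) returning transcribed text.
    reply_to: callable(heard_text) returning the next message to speak.
    Returns a list of (spoken_prompt, heard_reply) pairs.
    """
    transcript = []
    prompt = greeting
    for _ in range(max_turns):
        heard = converse(prompt)  # conversational mode: speak, then wait
        transcript.append((prompt, heard))
        # Stop on silence or on a sign-off such as "Goodbye." or "bye!"
        if not heard or heard.strip().lower().rstrip(".!") in stop_words:
            break
        prompt = reply_to(heard)
    # Statement mode: say farewell without waiting for a reply.
    converse("Goodbye!", wait_for_response=False)
    return transcript
```

Injecting `converse` as a parameter keeps the loop testable with a fake that returns canned replies, with no microphone needed.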
═[ SYSTEM DIAGNOSTICS ]═
CHECK SYSTEM STATUS
voice_status()
# Returns comprehensive service health
# Shows all active voice services
# Displays configuration details
LIST AUDIO DEVICES
check_audio_devices()
# Shows available input devices
# Shows available output devices
# Displays current default devices
ENABLE DEBUG MODE
export VOICE_MODE_DEBUG=true
# Enables verbose logging
# Shows all API calls
# Displays timing information
═[ FILE AREA ]═
FILENAME              SIZE    DATE       DESCRIPTION
─────────────────────────────────────────────────────────
VOICEMODE.ZIP         1.2M    2025-06-21 Complete package
README.TXT            32K     2025-06-21 Documentation
DEMO.MP4              8.7M    2025-06-21 Video demonstration
WHISPER.EXE           4.5M    2025-06-21 STT binary
KOKORO.TAR            12M     2025-06-21 TTS models

[D]ownload  [V]iew  [Q]uit                    _
                    
═[ EXTERNAL LINKS ]═

■ Watch Demo: youtube.com/watch?v=aXRNWvpnwVs

■ Source Code: github.com/mbailey/voicemode

■ Join Chat: discord.gg/gVHPPK5U

│ ANSI COLOR │ 80x25 │ ALT-X TO EXIT │ F1-F4 NAVIGATION