
A modern, sci-fi themed web interface for the open-source Chatterbox TTS models by Resemble AI. Features a futuristic "camera wall" background with live webcam integration and a floating glassmorphism control panel.
Inline tags [laugh], [cough], and [sigh] for expressive speech.
git clone https://github.com/quyangminddock/chatterbox_demo.git
cd chatterbox_demo
# Install Python dependencies
pip install -e .
# Additional dependencies for API
pip install fastapi uvicorn python-multipart librosa soundfile
# Set your Hugging Face token (get one from https://huggingface.co/settings/tokens)
export HF_TOKEN=your_huggingface_token_here
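A missing or placeholder token is easy to overlook until a model download fails. The sketch below shows one way to check it up front. The helper name `get_hf_token` is hypothetical and not part of this repository, and how `api.py` actually reads the token is not shown here, so treat this as an illustration only.

```python
import os
from typing import Mapping, Optional

PLACEHOLDER = "your_huggingface_token_here"  # the example value used above


def get_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the HF_TOKEN value, or raise if it is unset or still the placeholder.

    Hypothetical helper for illustration; not part of this repository.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token or token == PLACEHOLDER:
        raise RuntimeError(
            "HF_TOKEN is not set. Create a token at "
            "https://huggingface.co/settings/tokens and export it first."
        )
    return token
```

Calling `get_hf_token()` with no arguments reads the current process environment.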
# Start the API server (loads both Turbo and Multilingual models)
python api.py
The server will start on http://localhost:8000. Model loading takes 2-3 minutes on first run.
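Because loading can take a few minutes, it may help to wait until the port accepts connections before opening the UI. The following is a minimal sketch using only the Python standard library. The function name `wait_for_port` and its default timeout are assumptions, and an open port is only a rough proxy for "models loaded", since that depends on how `api.py` starts up.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8000,
                  timeout: float = 300.0, interval: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper; defaults match the address given above.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port()` blocks for up to five minutes and returns `True` as soon as `localhost:8000` is reachable.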
cd ui
# Install dependencies
npm install
# Start development server
npm run dev
Open http://localhost:3000 in your browser.
Supported languages: Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Russian, Spanish, Swahili, Swedish, Turkish
chatterbox_demo/
├── api.py # Unified FastAPI server (both models)
├── api_multilingual.py # Standalone multilingual API (optional)
├── ui/ # Next.js frontend
│ ├── components/
│ │ ├── CameraBackground.tsx # Surveillance-style grid background
│ │ └── FloatingHUD.tsx # Main control interface
│ ├── app/
│ │ └── globals.css # Sci-fi theme styling
│ └── public/ # CCTV images and assets
├── src/chatterbox/ # Modified Chatterbox source (dtype fixes)
└── README.md
Modified tts_turbo.py and mtl_tts.py to ensure float32 throughout the pipeline
Added a .float() cast in the mel spectrogram computation
Used map_location='cpu' for model loading on non-CUDA devices
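The two PyTorch idioms behind these fixes can be sketched as follows. This is not the patched code from `src/chatterbox/`: the function names are hypothetical, and a plain magnitude spectrogram stands in for the mel spectrogram to keep the example short.

```python
import os
import tempfile

import torch


def load_checkpoint_on_cpu(path: str) -> dict:
    """Load a checkpoint so tensors saved from a CUDA device land on the CPU."""
    return torch.load(path, map_location="cpu")


def magnitude_spectrogram(wave: torch.Tensor, n_fft: int = 1024) -> torch.Tensor:
    """Cast audio to float32 before the STFT so the whole computation stays float32."""
    wave = wave.float()  # e.g. float64 or float16 input becomes float32
    window = torch.hann_window(n_fft, dtype=torch.float32)
    spec = torch.stft(wave, n_fft=n_fft, window=window, return_complex=True)
    return spec.abs()


def round_trip_demo() -> dict:
    """Save a tiny state dict to a temp file and load it back on the CPU."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.pt")
        torch.save({"w": torch.ones(4)}, path)
        return load_checkpoint_on_cpu(path)
```

Casting at the input keeps the window and the signal in the same dtype, which avoids mixed-precision errors on devices without CUDA.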
This demo interface is released under the MIT License. The underlying Chatterbox models are licensed under Apache 2.0 by Resemble AI.
Contributions are welcome! Please feel free to submit issues or pull requests.
Special thanks to Resemble AI for open-sourcing the amazing Chatterbox TTS models!