Transform text into beautiful speech with Google's premium Chirp 3 HD voices
Professional Text-to-Speech with 30+ Premium Voices and an Intuitive Interface
# Start the beautiful desktop experience
cd gui && npm run tauri dev
# Use the powerful CLI
uv run kiwi document.md --voice en-US-Chirp3-HD-Kore
# Launch the REST API server
uv run kiwi server --reload
Multi-Interface Features:
- Modern Desktop GUI - Beautiful Tauri app with drag-and-drop support
- 30+ Premium Voices - Google Chirp 3 HD voices across 28 languages
- Smart File Management - Custom output locations with OS integration
- Lightning Fast - Optimized HTTP architecture for instant response
- Real-time Progress - Live conversion updates and audio preview
- Format Flexibility - MP3 (24 kHz) and WAV output formats
- Developer Friendly - Full REST API with OpenAPI documentation
- Multi-tier Architecture - Clean separation of concerns across layers
- Type Safety Everywhere - TypeScript, Python hints, Rust memory safety
- Performance Optimized - HTTP keep-alive, connection pooling, lazy loading
- Cross-Platform Native - Windows, macOS, Linux with native performance
- Advanced Text Processing - Markdown support with front matter parsing
- Beautiful UI/UX - Modern React + shadcn/ui components
- Developer Experience - Hot reload, comprehensive testing, clean APIs
- Production Ready - Error handling, logging, monitoring capabilities
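The feature list mentions Markdown support with front matter parsing. As a rough illustration of what that step can involve, here is a minimal sketch using only the Python standard library. It is not taken from the project's `parsers.py`; the function name `split_front_matter` and the simple `key: value` handling are assumptions made for this example, and real front matter is usually full YAML.

```python
from typing import Dict, Tuple


def split_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown document into (front matter fields, body).

    Hypothetical helper: only flat "key: value" lines are understood.
    """
    lines = text.splitlines()
    # Front matter must open with a '---' delimiter on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}, text

    # Find the closing '---' delimiter.
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        # No closing delimiter: treat the whole text as body.
        return {}, text

    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    # Everything after the closing delimiter is the text to be spoken.
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return fields, body
```

Separating the metadata from the body this way keeps fields such as a title or a preferred voice out of the text that is sent for speech synthesis.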
# Clone the repository
git clone https://github.com/Abdulrahman-Elsmmany/kiwi.git
cd kiwi
# Install Python dependencies with UV (ultrafast)
uv sync
# Install GUI dependencies
cd gui && npm install
# Authenticate with Google Cloud
gcloud auth login
gcloud auth application-default login
# Set your project
gcloud config set project YOUR_PROJECT_ID
# Enable the Text-to-Speech API
gcloud services enable texttospeech.googleapis.com
# Launch the desktop app (recommended)
# Terminal 1: Start API server
uv run kiwi server
# Terminal 2: Launch GUI
cd gui && npm run tauri dev
# Or use the CLI directly
uv run kiwi document.md --voice en-US-Chirp3-HD-Kore
// Beautiful component architecture
<FileUpload onFileSelect={handleFile} />
<VoiceSettings language="en-US" voice={selectedVoice} />
<ConversionProgress status={conversionStatus} />
<AudioResult audioUrl={resultUrl} />
# Convert a markdown file
uv run kiwi README.md
# Use a specific voice
uv run kiwi document.txt --voice en-US-Chirp3-HD-Zephyr
# Custom output location
uv run kiwi file.md --output ~/Desktop/speech.mp3
# List all available voices
uv run kiwi voices --language en-US
# Generate WAV format
uv run kiwi text.txt --format wav
# Upload and convert a file
curl -X POST "http://localhost:8000/api/convert" \
-F "file=@document.md" \
-F "voice=en-US-Chirp3-HD-Kore" \
-F "format=MP3"
# Get available voices
curl "http://localhost:8000/api/voices?language=en-US"
# Check API health
curl "http://localhost:8000/health"
graph TB
subgraph "Presentation Layer"
GUI[Desktop GUI<br/>React + TypeScript]
CLI[CLI Interface<br/>Click + Rich]
API[REST API<br/>FastAPI + OpenAPI]
end
subgraph "Application Layer"
HANDLER[Request Handlers<br/>Validation + Orchestration]
SERVICE[Business Logic<br/>Text Processing + Conversion]
end
subgraph "Infrastructure Layer"
TTS[Google TTS Client<br/>Chirp 3 HD Voices]
STORAGE[File Management<br/>Cross-platform I/O]
HTTP[HTTP Client<br/>Reqwest + Tokio]
end
GUI --> HTTP
CLI --> SERVICE
API --> HANDLER
HTTP --> API
HANDLER --> SERVICE
SERVICE --> TTS
SERVICE --> STORAGE
style GUI fill:#e3f2fd
style CLI fill:#f3e5f5
style API fill:#fff3e0
style HANDLER fill:#e8f5e9
style SERVICE fill:#fce4ec
style TTS fill:#e1f5fe
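In the diagram above, the desktop GUI reaches the backend through the REST API, and the same endpoints can be called from a script. The following is a minimal sketch using only the Python standard library. The helper names `voices_url` and `fetch_json` are hypothetical, the base URL assumes the server is running locally on port 8000 as in the curl examples, and the shape of the JSON the server returns is not documented here, so it is returned as-is.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Assumption: the API server was started locally with `uv run kiwi server`.
BASE_URL = "http://localhost:8000"


def voices_url(language: Optional[str] = None, base_url: str = BASE_URL) -> str:
    """Build the URL for the voices endpoint, optionally filtered by language."""
    url = urljoin(base_url, "/api/voices")
    if language:
        url += "?" + urlencode({"language": language})
    return url


def fetch_json(url: str, timeout: float = 10.0) -> Any:
    """GET a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `fetch_json(voices_url("en-US"))` would request the same resource as the voices curl example, and `fetch_json(urljoin(BASE_URL, "/health"))` would mirror the health check.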
| Language | Voices Available | Popular Choices |
|---|---|---|
| 🇺🇸 English (US) | 8 voices | Kore (Warm), Zephyr (Dynamic), Achernar (News) |
| 🇬🇧 English (UK) | 4 voices | Hera (Refined), Oberon (Classic) |
| 🇫🇷 French | 3 voices | Sylvie (Elegant), Pierre (Professional) |
| 🇩🇪 German | 3 voices | Klaus (Authoritative), Emma (Friendly) |
| 🇪🇸 Spanish | 4 voices | Carmen (Expressive), Miguel (Clear) |
| 🇯🇵 Japanese | 2 voices | Yuki (Gentle), Haruto (Professional) |
| 🇰🇷 Korean | 2 voices | Min-ji (Warm), Jun-ho (Clear) |
| And 21 more languages! | 30+ total | Run `uv run kiwi voices` for the full list |
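The voice identifiers used throughout this README, such as `en-US-Chirp3-HD-Kore`, appear to combine a language code, a model family, and a voice name. The sketch below splits an identifier into those parts. It assumes the five-part `language-REGION-Chirp3-HD-Name` layout seen in the examples here; the `VoiceId` class and `parse_voice_id` function are hypothetical names, not part of the kiwi package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceId:
    language: str  # e.g. "en-US"
    model: str     # e.g. "Chirp3-HD"
    name: str      # e.g. "Kore"


def parse_voice_id(voice: str) -> VoiceId:
    """Split an identifier like 'en-US-Chirp3-HD-Kore' into its parts."""
    parts = voice.split("-")
    # Assumption: exactly five dash-separated parts, as in this README's examples.
    if len(parts) != 5:
        raise ValueError(f"unexpected voice identifier: {voice!r}")
    lang, region, family, tier, name = parts
    return VoiceId(language=f"{lang}-{region}", model=f"{family}-{tier}", name=name)
```

A helper like this can be handy for grouping voices by language or for validating a `--voice` argument before sending a request.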
kiwi/
├── src/kiwi/                 # Python Backend
│   ├── __init__.py           # Package initialization
│   ├── main.py               # CLI with Click framework
│   ├── api.py                # FastAPI REST server
│   ├── tts.py                # Google Cloud TTS client
│   ├── parsers.py            # Text/Markdown processing
│   └── utils.py              # Shared utilities
│
├── gui/                      # Tauri Desktop App
│   ├── src/                  # React Frontend
│   │   ├── App.tsx           # Main application
│   │   ├── components/       # UI Components
│   │   │   ├── ui/           # shadcn/ui components
│   │   │   └── *.tsx         # Feature components
│   │   └── lib/              # Utilities & hooks
│   │
│   ├── src-tauri/            # Rust Backend
│   │   ├── src/
│   │   │   ├── lib.rs        # Tauri commands
│   │   │   └── main.rs       # Application entry
│   │   └── Cargo.toml        # Rust dependencies
│   │
│   └── package.json          # Node dependencies
│
├── tests/                    # Test Suite
│   ├── unit/                 # Unit tests
│   ├── integration/          # Integration tests
│   └── fixtures/             # Test data
│
├── docs/                     # Documentation
├── pyproject.toml            # Python config
└── PLANNING.md               # Architecture docs
- Async Everything - Non-blocking I/O throughout the stack
- Connection Pooling - Reuse HTTP connections for speed
- Lazy Loading - Load resources only when needed
- Error Boundaries - Graceful error handling at every level
- Type Safety - Full typing in Python, TypeScript, and Rust
- Memory Efficiency - Stream large files, minimize allocations
- Cross-platform - Native performance on all platforms
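To make the "Lazy Loading" principle above concrete, here is a minimal sketch of deferring an expensive object until it is first needed and then reusing it. `SpeechClient` is a hypothetical stand-in for something costly to construct, such as a cloud API client; it is not the project's actual class, and the real code may handle this differently.

```python
from functools import lru_cache


class SpeechClient:
    """Hypothetical stand-in for a client that is expensive to create."""

    instances = 0  # Counts constructions so the lazy behavior is observable.

    def __init__(self) -> None:
        SpeechClient.instances += 1


@lru_cache(maxsize=1)
def get_client() -> SpeechClient:
    """Create the client on first use and return the same object afterwards."""
    return SpeechClient()
```

Nothing is constructed at import time, so commands that never touch the client pay no startup cost, and repeated calls to `get_client()` share one instance.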
# Python Testing
uv run pytest # Run all tests
uv run pytest --cov=kiwi # With coverage
uv run ruff check src/ tests/ # Lint code
uv run mypy src/ # Type checking
# Frontend Testing
cd gui && npm run test # Run tests
cd gui && npm run lint # ESLint
cd gui && npm run type-check # TypeScript
# Rust Testing
cd gui/src-tauri && cargo test # Run tests
cd gui/src-tauri && cargo clippy # Lint
# Build Python package
uv build
# Build desktop app
cd gui && npm run tauri build
# Build Docker image
docker build -t kiwi-tts .
# Production deployment
uv run kiwi server --host 0.0.0.0 --port 8000
Roadmap:
- Batch file processing with queue management
- SSML support for advanced speech control
- Audio post-processing (speed, pitch control)
- WebSocket support for real-time streaming
- Web application version
- Mobile apps (iOS/Android)
- Voice cloning capabilities
- Multi-cloud support (AWS Polly, Azure Speech)
- AI-powered voice selection
- Collaborative workspaces
- Plugin ecosystem
- Analytics dashboard
This project showcases expertise in:
- Modern React patterns with hooks and context
- TypeScript for bulletproof type safety
- Beautiful UI with shadcn/ui components
- Responsive design with Tailwind CSS
- Performance optimization techniques
- Async Python with FastAPI
- Secure API design with validation
- RESTful architecture patterns
- Comprehensive testing strategies
- Performance monitoring and optimization
- Rust for system-level performance
- Cross-platform native applications
- IPC communication patterns
- Native file system integration
- OS-specific optimizations
- Containerization with Docker
- Modern build tools (uv, Vite, Cargo)
- CI/CD pipeline design
- Monitoring and observability
- Production deployment strategies
- Google Cloud Platform expertise
- Secure authentication flows
- API rate limiting and quotas
- Retry strategies and resilience
- Cost optimization techniques
We welcome contributions! See our Contributing Guide for details.
# 1. Fork the repository
# 2. Create your feature branch
git checkout -b feature/amazing-feature
# 3. Make your changes and test
uv run pytest
# 4. Commit with conventional commits
git commit -m "feat: add amazing feature"
# 5. Push and create a Pull Request
git push origin feature/amazing-feature
This project is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments:
- Google Cloud for premium Chirp 3 HD voices
- Tauri Team for the amazing desktop framework
- FastAPI for the modern Python web framework
- shadcn/ui for beautiful React components
- uv for ultrafast Python package management
Transform Text into Beautiful Speech
Created with ❤️ by Abdulrahman Elsmmany
Star this repository if KIWI TTS helps you create amazing audio content!
Let's transform text into speech together!
