Abdulrahman-Elsmmany/KIWI-

🥝 KIWI TTS

🎯 Professional Multi-Interface Text-to-Speech Application

Transform text into beautiful speech with Google's premium Chirp 3 HD voices


🖼️ Experience KIWI TTS

✨ Beautiful Desktop Interface for Premium Text-to-Speech

KIWI TTS Interface

🎤 Professional Text-to-Speech with 30+ Premium Voices and Intuitive Interface

🎯 Launch Applications

# 🖥️ Start the beautiful desktop experience
cd gui && npm run tauri dev

# ⚡ Use the powerful CLI
uv run kiwi document.md --voice en-US-Chirp3-HD-Kore

# 🌐 Launch the REST API server
uv run kiwi server --reload

Multi-Interface Features:

  • 🖥️ Modern Desktop GUI - Beautiful Tauri app with drag-and-drop support
  • 🎤 30+ Premium Voices - Google Chirp 3 HD voices across 28 languages
  • 📁 Smart File Management - Custom output locations with OS integration
  • ⚡ Lightning Fast - Optimized HTTP architecture for instant response
  • 📊 Real-time Progress - Live conversion updates and audio preview
  • 🔄 Format Flexibility - MP3 (24 kHz) and WAV output formats
  • 🎯 Developer Friendly - Full REST API with OpenAPI documentation

🎯 Why KIWI TTS?

✨ Enterprise-Grade Architecture

  • 🏗️ Multi-tier Architecture - Clean separation of concerns across layers
  • 🔒 Type Safety Everywhere - TypeScript, Python hints, Rust memory safety
  • ⚡ Performance Optimized - HTTP keep-alive, connection pooling, lazy loading
  • 🌍 Cross-Platform Native - Windows, macOS, Linux with native performance

🚀 Built for Professionals

  • 📝 Advanced Text Processing - Markdown support with front matter parsing
  • 🎨 Beautiful UI/UX - Modern React + shadcn/ui components
  • 🔧 Developer Experience - Hot reload, comprehensive testing, clean APIs
  • 📊 Production Ready - Error handling, logging, monitoring capabilities
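
The front matter handling mentioned in the list above could look roughly like the following. This is a minimal sketch, not the project's actual `parsers.py`: the function name `split_front_matter` is hypothetical, and it assumes front matter is a leading block delimited by `---` lines. The block is treated as opaque text rather than parsed as YAML.

```python
from __future__ import annotations


def split_front_matter(text: str) -> tuple[str, str]:
    """Return (front_matter, body); front_matter is "" when none is present.

    Hypothetical helper: assumes a leading block delimited by '---' lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            front = "\n".join(lines[1:i])
            # Drop blank lines between the closing delimiter and the body
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return front, body
    # Unterminated block: treat the whole input as body text
    return "", text
```

Only the body would then be sent for synthesis, so metadata such as titles or tags is not read aloud.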

🚀 Quick Start Guide

📦 Installation

# Clone the repository
git clone https://github.com/Abdulrahman-Elsmmany/kiwi.git
cd kiwi

# Install Python dependencies with uv
uv sync

# Install GUI dependencies
cd gui && npm install

⚙️ Google Cloud Configuration

# 🔑 Authenticate with Google Cloud
gcloud auth login
gcloud auth application-default login

# 🎯 Set your project
gcloud config set project YOUR_PROJECT_ID

# 🎤 Enable the Text-to-Speech API for that project
gcloud services enable texttospeech.googleapis.com

🎯 Get Started in 30 Seconds

# 🖥️ Launch the desktop app (recommended)
# Terminal 1: Start API server
uv run kiwi server

# Terminal 2: Launch GUI
cd gui && npm run tauri dev

# ⚡ Or use CLI directly
uv run kiwi document.md --voice en-US-Chirp3-HD-Kore

✨ Multi-Interface Capabilities

🖥️ Desktop Application

  • Tauri 2.0 native performance
  • Drag & drop file interface
  • Real-time progress tracking
  • Audio preview with controls
  • Custom output directory selection

⚡ Command Line Interface

  • Simple commands for quick conversion
  • Batch processing support
  • Voice listing and filtering
  • Multiple formats (MP3/WAV)
  • Pipeline friendly design

🌐 REST API Server

  • FastAPI with async support
  • OpenAPI/Swagger documentation
  • File upload endpoints
  • Streaming responses available
  • CORS enabled for web apps

🎤 Premium Voice Quality

  • 30+ Chirp 3 HD voices
  • 28 languages supported
  • Natural intonation and emotion
  • 24 kHz MP3 or uncompressed WAV
  • Smart text preprocessing

🎯 Usage Examples

🖥️ Desktop GUI Features

// Beautiful component architecture
<FileUpload onFileSelect={handleFile} />
<VoiceSettings language="en-US" voice={selectedVoice} />
<ConversionProgress status={conversionStatus} />
<AudioResult audioUrl={resultUrl} />

⚡ CLI Commands

# 📄 Convert a markdown file
uv run kiwi README.md

# 🎤 Use a specific voice
uv run kiwi document.txt --voice en-US-Chirp3-HD-Zephyr

# 📁 Custom output location
uv run kiwi file.md --output ~/Desktop/speech.mp3

# 📋 List available voices for a language
uv run kiwi voices --language en-US

# 🔊 Generate WAV output
uv run kiwi text.txt --format wav
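
For converting several files in one go, the commands above can be wrapped in a small script. The sketch below is an illustration only: `build_command` and `convert_all` are hypothetical helper names, and it assumes the `--voice`, `--format`, and `--output` flags shown above can be combined in a single invocation, which this README does not confirm.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_command(source: Path, out_dir: Path, voice: str, fmt: str = "mp3") -> list[str]:
    """Build one `uv run kiwi` invocation (flag combination is an assumption)."""
    output = out_dir / f"{source.stem}.{fmt}"
    return [
        "uv", "run", "kiwi", str(source),
        "--voice", voice,
        "--format", fmt,
        "--output", str(output),
    ]


def convert_all(sources, out_dir: Path, voice: str, fmt: str = "mp3",
                dry_run: bool = False) -> list[list[str]]:
    """Run the CLI once per file; with dry_run=True only return the commands."""
    commands = [build_command(Path(s), out_dir, voice, fmt) for s in sources]
    if not dry_run:
        out_dir.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            # check=True stops the batch at the first failing conversion
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_all(["a.md", "b.md"], Path("audio"), "en-US-Chirp3-HD-Kore", dry_run=True)` returns the command lists without running anything, which is a convenient way to check them first.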

🌐 API Endpoints

# 📤 Upload and convert file
curl -X POST "http://localhost:8000/api/convert" \
  -F "file=@document.md" \
  -F "voice=en-US-Chirp3-HD-Kore" \
  -F "format=MP3"

# 📋 Get available voices
curl "http://localhost:8000/api/voices?language=en-US"

# 📊 Check API health
curl "http://localhost:8000/health"
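
The same requests can be issued from Python with only the standard library. This is a sketch under assumptions: the endpoint paths and form field names are taken from the curl examples above, while the helper names (`voices_url`, `convert_request`, `fetch_json`) are hypothetical and the response formats are not documented here, so `fetch_json` simply assumes a JSON body.

```python
from __future__ import annotations

import json
import uuid
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # same host and port as the curl examples


def voices_url(language: str | None = None, base_url: str = BASE_URL) -> str:
    """URL for the voice listing, optionally filtered by language."""
    query = "?" + urlencode({"language": language}) if language else ""
    return f"{base_url}/api/voices{query}"


def convert_request(filename: str, content: bytes, voice: str,
                    fmt: str = "MP3", base_url: str = BASE_URL) -> Request:
    """Build the multipart POST that mirrors `curl -F file=@... -F voice=... -F format=...`."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (("voice", voice), ("format", fmt)):
        field = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'
        parts.append(field.encode())
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return Request(
        f"{base_url}/api/convert",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body as JSON (assumes the server returns JSON)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `fetch_json(voices_url("en-US"))` would query the voice list, and `urlopen(convert_request(...))` would submit a file for conversion.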

🏗️ Architecture & Design Patterns

🎯 Clean Architecture Implementation

graph TB
    subgraph "🖥️ Presentation Layer"
        GUI[Desktop GUI<br/>React + TypeScript]
        CLI[CLI Interface<br/>Click + Rich]
        API[REST API<br/>FastAPI + OpenAPI]
    end
    
    subgraph "🔧 Application Layer"
        HANDLER[Request Handlers<br/>Validation + Orchestration]
        SERVICE[Business Logic<br/>Text Processing + Conversion]
    end
    
    subgraph "🎤 Infrastructure Layer"
        TTS[Google TTS Client<br/>Chirp 3 HD Voices]
        STORAGE[File Management<br/>Cross-platform I/O]
        HTTP[HTTP Client<br/>Reqwest + Tokio]
    end
    
    GUI --> HTTP
    CLI --> SERVICE
    API --> HANDLER
    HTTP --> API
    HANDLER --> SERVICE
    SERVICE --> TTS
    SERVICE --> STORAGE
    
    style GUI fill:#e3f2fd
    style CLI fill:#f3e5f5
    style API fill:#fff3e0
    style HANDLER fill:#e8f5e9
    style SERVICE fill:#fce4ec
    style TTS fill:#e1f5fe
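
The layering in the diagram can be illustrated with a small Python sketch in which the business logic depends on an interface rather than on the Google client directly. All names here (`SpeechClient`, `ConversionService`, `FakeSpeechClient`) are hypothetical and chosen for illustration; the project's real classes may be organized differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


class SpeechClient(Protocol):
    """Infrastructure boundary: anything that turns text into audio bytes."""

    def synthesize(self, text: str, voice: str) -> bytes: ...


@dataclass
class ConversionService:
    """Application layer: text clean-up plus orchestration, no cloud details."""

    tts: SpeechClient

    def convert(self, text: str, voice: str, output: Path) -> Path:
        cleaned = " ".join(text.split())  # collapse whitespace and newlines
        if not cleaned:
            raise ValueError("nothing to convert")
        output.write_bytes(self.tts.synthesize(cleaned, voice))
        return output


class FakeSpeechClient:
    """Stand-in for the real TTS client, useful in unit tests."""

    def synthesize(self, text: str, voice: str) -> bytes:
        return f"{voice}:{text}".encode()
```

Because the CLI and the API handlers would both call the same service, swapping the fake client for a real one changes only the infrastructure layer.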

🎤 Premium Voice Catalog

🌟 Chirp 3 HD Voices - Crystal Clear Quality

| Language | Voices Available | Popular Choices |
|---|---|---|
| 🇺🇸 English (US) | 8 voices | Kore (Warm), Zephyr (Dynamic), Achernar (News) |
| 🇬🇧 English (UK) | 4 voices | Hera (Refined), Oberon (Classic) |
| 🇫🇷 French | 3 voices | Sylvie (Elegant), Pierre (Professional) |
| 🇩🇪 German | 3 voices | Klaus (Authoritative), Emma (Friendly) |
| 🇪🇸 Spanish | 4 voices | Carmen (Expressive), Miguel (Clear) |
| 🇯🇵 Japanese | 2 voices | Yuki (Gentle), Haruto (Professional) |
| 🇰🇷 Korean | 2 voices | Min-ji (Warm), Jun-ho (Clear) |
| And 21 more languages! | 30+ total | Run `uv run kiwi voices` for the full list |
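
Voice identifiers such as `en-US-Chirp3-HD-Kore` appear to follow a `language-REGION-model-name` pattern. The sketch below splits an identifier on that assumption and filters a list by language. The helper names are hypothetical, the `de-DE` identifier in the usage note is illustrative only, and language codes with more than two segments would need extra handling.

```python
from __future__ import annotations

from typing import NamedTuple


class VoiceId(NamedTuple):
    language: str  # e.g. "en-US"
    model: str     # e.g. "Chirp3-HD"
    name: str      # e.g. "Kore"


def parse_voice_id(voice_id: str) -> VoiceId:
    """Split an identifier, assuming a two-part language code and a final name."""
    parts = voice_id.split("-")
    if len(parts) < 4:
        raise ValueError(f"unexpected voice id: {voice_id!r}")
    return VoiceId("-".join(parts[:2]), "-".join(parts[2:-1]), parts[-1])


def filter_by_language(voice_ids: list[str], language: str) -> list[str]:
    """Keep only the identifiers whose language code matches exactly."""
    return [v for v in voice_ids if parse_voice_id(v).language == language]
```

For example, filtering `["en-US-Chirp3-HD-Kore", "de-DE-Chirp3-HD-Kore"]` by `"en-US"` keeps only the first entry.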

🛠️ Advanced Development

📁 Professional Project Structure

kiwi/
├── 🐍 src/kiwi/              # Python Backend
│   ├── __init__.py           # Package initialization
│   ├── main.py               # CLI with Click framework
│   ├── api.py                # FastAPI REST server
│   ├── tts.py                # Google Cloud TTS client
│   ├── parsers.py            # Text/Markdown processing
│   └── utils.py              # Shared utilities
│
├── 🖥️ gui/                   # Tauri Desktop App
│   ├── src/                  # React Frontend
│   │   ├── App.tsx           # Main application
│   │   ├── components/       # UI Components
│   │   │   ├── ui/           # shadcn/ui components
│   │   │   └── *.tsx         # Feature components
│   │   └── lib/              # Utilities & hooks
│   │
│   ├── src-tauri/            # Rust Backend
│   │   ├── src/
│   │   │   ├── lib.rs        # Tauri commands
│   │   │   └── main.rs       # Application entry
│   │   └── Cargo.toml        # Rust dependencies
│   │
│   └── package.json          # Node dependencies
│
├── 🧪 tests/                 # Test Suite
│   ├── unit/                 # Unit tests
│   ├── integration/          # Integration tests
│   └── fixtures/             # Test data
│
├── 📚 docs/                  # Documentation
├── 🔧 pyproject.toml         # Python config
└── 📋 PLANNING.md            # Architecture docs

🚀 Performance & Best Practices

  • Async Everything - Non-blocking I/O throughout the stack
  • Connection Pooling - Reuse HTTP connections for speed
  • Lazy Loading - Load resources only when needed
  • Error Boundaries - Graceful error handling at every level
  • Type Safety - Full typing in Python, TypeScript, and Rust
  • Memory Efficiency - Stream large files, minimize allocations
  • Cross-platform - Native performance on all platforms
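
As one illustration of the lazy loading item above, an expensive client can be created on first use and then reused. This is a generic sketch using `functools.lru_cache`; `SpeechClientStub` is a hypothetical placeholder for whatever client the application actually constructs.

```python
from functools import lru_cache


class SpeechClientStub:
    """Placeholder for an expensive-to-create client (hypothetical)."""

    created = 0  # counts constructions so the laziness is observable

    def __init__(self) -> None:
        SpeechClientStub.created += 1


@lru_cache(maxsize=1)
def get_client() -> SpeechClientStub:
    """Create the client on the first call and return the same object afterwards."""
    return SpeechClientStub()
```

Nothing is constructed at import time, so commands that never synthesize audio (such as printing help text) would not pay the start-up cost.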

🔧 Development Commands

🧪 Testing & Quality

# 🐍 Python Testing
uv run pytest                        # Run all tests
uv run pytest --cov=kiwi            # With coverage
uv run ruff check src/ tests/       # Lint code
uv run mypy src/                    # Type checking

# 🖥️ Frontend Testing
cd gui && npm run test              # Run tests
cd gui && npm run lint              # ESLint
cd gui && npm run type-check        # TypeScript

# 🦀 Rust Testing
cd gui/src-tauri && cargo test      # Run tests
cd gui/src-tauri && cargo clippy    # Lint

🏗️ Building & Distribution

# 📦 Build Python package
uv build

# 🖥️ Build desktop app
cd gui && npm run tauri build

# 🐳 Build Docker image
docker build -t kiwi-tts .

# 🚀 Production deployment
uv run kiwi server --host 0.0.0.0 --port 8000

🗺️ Roadmap & Future Features

🎯 Version 2.0 (Q2 2025)

  • ✅ Batch file processing with queue management
  • ✅ SSML support for advanced speech control
  • ✅ Audio post-processing (speed, pitch control)
  • ✅ WebSocket support for real-time streaming

🎯 Version 3.0 (Q4 2025)

  • 📋 Web application version
  • 📋 Mobile apps (iOS/Android)
  • 📋 Voice cloning capabilities
  • 📋 Multi-cloud support (AWS Polly, Azure Speech)

🎯 Version 4.0 (2026)

  • 📋 AI-powered voice selection
  • 📋 Collaborative workspaces
  • 📋 Plugin ecosystem
  • 📋 Analytics dashboard

🏆 Skills Demonstrated

This project showcases expertise in:

Frontend Excellence

  • ⚛️ Modern React patterns with hooks and context
  • 📘 TypeScript for bulletproof type safety
  • 🎨 Beautiful UI with shadcn/ui components
  • 📱 Responsive design with Tailwind CSS
  • ⚡ Performance optimization techniques

Backend Mastery

  • 🐍 Async Python with FastAPI
  • 🔒 Secure API design with validation
  • 📊 RESTful architecture patterns
  • 🧪 Comprehensive testing strategies
  • 📈 Performance monitoring and optimization

Desktop Development

  • 🦀 Rust for system-level performance
  • 🖥️ Cross-platform native applications
  • 🔌 IPC communication patterns
  • 📁 Native file system integration
  • 🎯 OS-specific optimizations

DevOps & Tooling

  • 🐳 Containerization with Docker
  • 🔧 Modern build tools (uv, Vite, Cargo)
  • 🧪 CI/CD pipeline design
  • 📊 Monitoring and observability
  • 🚀 Production deployment strategies

Cloud Integration

  • ☁️ Google Cloud Platform expertise
  • 🔐 Secure authentication flows
  • 📈 API rate limiting and quotas
  • 🔄 Retry strategies and resilience
  • 💰 Cost optimization techniques
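
A retry strategy of the kind listed above is commonly implemented as exponential backoff with jitter. The following is a generic sketch of that technique, not code from this repository: the `retry` helper, its defaults, and the choice of retryable exceptions are all assumptions.

```python
from __future__ import annotations

import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(call: Callable[[], T], attempts: int = 4, base_delay: float = 0.5,
          retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `call`, retrying on transient errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** attempt
            # Jitter spreads out retries from concurrent clients
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so that tests can record delays instead of waiting.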

🤝 Contributing

We welcome contributions! See our Contributing Guide for details.

# 1. Fork the repository
# 2. Create your feature branch
git checkout -b feature/amazing-feature

# 3. Make your changes and test
uv run pytest

# 4. Commit with conventional commits
git commit -m "feat: add amazing feature"

# 5. Push and create a Pull Request
git push origin feature/amazing-feature

📞 Support & Community

🌟 Get Help & Connect

Issues Discussions Documentation

🎤 Share Your Creations

Show us what you've created with KIWI TTS! Showcase


📄 License & Attribution

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments:

  • 🎤 Google Cloud for premium Chirp 3 HD voices
  • 🖥️ Tauri Team for the amazing desktop framework
  • ⚡ FastAPI for the modern Python web framework
  • 🎨 shadcn/ui for beautiful React components
  • 🚀 uv for ultrafast Python package management

🥝 KIWI TTS

Transform Text into Beautiful Speech

Created with ❤️ by Abdulrahman Elsmmany

GitHub

LinkedIn


⭐ Star this repository if KIWI TTS helps you create amazing audio content!

Let's transform text into speech together 🚀

🎤📝🎵 Premium Voices • Beautiful Interface • Professional Quality

About

🎵 Modern multi-interface TTS app with 30+ premium Google Cloud voices. Built with Python/FastAPI + React/TypeScript + Rust/Tauri. Features a beautiful desktop GUI, CLI tools, and a REST API. Enterprise-grade architecture showcasing full-stack expertise across multiple technologies. 🚀✨
