This repository also includes example scripts for running fine-tuned models, such as lex-au/Orpheus-3b-Italian_Spanish-FT-Q8_0.gguf
.
To use a different language model:
- Stop the LM Studio server and load your desired model.
- Start the server again.
- Use the dedicated scripts (
italian_orpheus.py
,italian_structured_audiobook_generator.py
) which are pre-configured with the correct voice prompts. - EXAMPLE FOR ITALIAN: python italian_orpheus.py --text "Ciao mondo, questo è un test." --voice it-male
A lightweight client for running Orpheus TTS locally using LM Studio API.
- 🎧 High-quality Text-to-Speech using the Orpheus TTS model
- 💻 Completely local - no cloud API keys needed
- 🔊 Multiple voice options (tara, leah, jess, leo, dan, mia, zac, zoe)
- 💾 Save audio to WAV files
- 📚 Generate audiobooks from text, markdown, and HTML files
- 🧠 Intelligent text chunking using spaCy for natural-sounding speech
-
Install LM Studio
-
Download the Orpheus TTS model (orpheus-3b-0.1-ft-q4_k_m.gguf) in LM Studio
-
Load the Orpheus model in LM Studio
-
Start the local server in LM Studio (default: http://127.0.0.1:1234)
-
Install dependencies:
python3 -m venv venv source venv/bin/activate pip install -r requirements.txt
-
Run the script:
python gguf_orpheus.py --text "Hello, this is a test" --voice tara
Note: When running the audiobook generators for the first time, they will automatically download the required spaCy language model if it's not already installed.
python gguf_orpheus.py --text "Your text here" --voice tara --output "output.wav"
--text
: The text to convert to speech--voice
: The voice to use (default: tara)--output
: Output WAV file path (default: auto-generated filename)--list-voices
: Show available voices--temperature
: Temperature for generation (default: 0.6)--top_p
: Top-p sampling parameter (default: 0.9)--repetition_penalty
: Repetition penalty (default: 1.1)
python audiobook_generator.py mybook.txt --voice leo --output-dir "my_audiobooks"
input_file
: Input text, markdown, or HTML file--voice
: Voice to use (default: tara)--output-dir
: Output directory for audio files (default: outputs/audiobook)--separate-chunks
: Keep audio chunks as separate files--by-structure
: Chunk by document structure (for markdown/html)--list-voices
: List available voices and exit
For more advanced document processing with better structure handling:
python structured_audiobook_generator.py mybook.md --voice tara --max-length 600
input_file
: Input text, markdown, or HTML file--voice
: Voice to use (default: tara)--output-dir
: Output directory for audio files (default: outputs/audiobook)--separate-chunks
: Keep audio chunks as separate files--max-length
: Maximum characters per chunk (default: 500)--min-length
: Minimum characters per chunk (default: 50)--list-voices
: List available voices and exit
See example.py
for basic TTS usage and example_audiobook.py
for audiobook generation examples.
- tara - Best overall voice for general use (default)
- leah
- jess
- leo
- dan
- mia
- zac
- zoe
You can add emotion to the speech by adding the following tags:
<giggle>
<laugh>
<chuckle>
<sigh>
<cough>
<sniffle>
<groan>
<yawn>
<gasp>
Apache 2.0