[Bug] KeyError: 'default_speaker' #4258

apachexc · 2025-05-06T12:57:35Z

Describe the bug

@app.route("/api/tts", methods=["GET", "POST"])
def tts():
    with lock:
        text = request.headers.get("text") or request.values.get("text", "")
        speaker_idx = request.headers.get("speaker-id") or request.values.get("speaker_id", "")
        language_idx = request.headers.get("language-id") or request.values.get("language_id", "")
        style_wav = request.headers.get("style-wav") or request.values.get("style_wav", "")
        style_wav = style_wav_uri_to_dict(style_wav)

        print(f" > Model input: {text}")
        print(f" > Speaker Idx: {speaker_idx}")
        print(f" > Language Idx: {language_idx}")
        wavs = synthesizer.tts(text, speaker_name=speaker_idx, language_name=language_idx, style_wav=style_wav)
        out = io.BytesIO()
        synthesizer.save_wav(wavs, out)
    return send_file(out, mimetype="audio/wav")

or

    device = "cpu"
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
    tts.tts_to_file(text=text,
            file_path="output.wav",
            speaker_wav=style_wav,
            language="en")

ERROR:xtts_server_xc:Exception on /api/tts [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1455, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 869, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 867, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 852, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/root/TTS/server/xtts_server_xc.py", line 209, in tts
    wavs = synthesizer.tts(text, speaker_name=speaker_idx, language_name=language_idx, style_wav=style_wav)
  File "/root/TTS/utils/synthesizer.py", line 386, in tts
    outputs = self.tts_model.synthesize(
  File "/root/TTS/tts/models/xtts.py", line 411, in synthesize
    gpt_cond_latent, speaker_embedding = self.speaker_manager.speakers[speaker_id].values()
KeyError: 'default_speaker'

The error occurs whether or not I pass the speaker parameter, so the problem seems to lie in the speaker lookup.
What value should the speaker parameter have?
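The last frame of the traceback is a plain dictionary lookup on `speaker_manager.speakers`, so the `KeyError` means no entry named `'default_speaker'` exists in that mapping. The sketch below illustrates the failure mode with stand-in data; `lookup_speaker` and the sample `speakers` dict are hypothetical and not part of the TTS library.

```python
def lookup_speaker(speakers, speaker_id):
    """Return the entry for speaker_id, or raise a KeyError listing the valid names.

    `speakers` is any dict keyed by speaker name (a stand-in for the mapping
    indexed in the traceback); this helper is hypothetical.
    """
    try:
        return speakers[speaker_id]
    except KeyError:
        available = ", ".join(sorted(speakers)) or "<none>"
        raise KeyError(
            f"unknown speaker {speaker_id!r}; available speakers: {available}"
        ) from None


# Stand-in data: one stored speaker with the two values the traceback unpacks.
speakers = {
    "speaker_a": {"gpt_cond_latent": [0.1], "speaker_embedding": [0.2]},
}

# A known name unpacks cleanly, mirroring the line in xtts.py.
gpt_cond_latent, speaker_embedding = lookup_speaker(speakers, "speaker_a").values()
```

Calling `lookup_speaker(speakers, "default_speaker")` on this data raises a `KeyError` that names the speakers actually available, which makes it easier to see which values the parameter can take.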

To Reproduce

Call the /api/tts endpoint (or tts_to_file) as shown above, with or without the speaker parameter. Both cases fail with the KeyError.

Expected behavior

No response

Logs

Environment

docker

Additional context

No response


eginhard · 2025-05-06T13:55:16Z

The server from this repo doesn't support the XTTS model. You can use our fork (available via pip install coqui-tts) and corresponding docker images instead.

apachexc · 2025-05-17T13:42:46Z

@eginhard

Hello, I am already running the idiap fork using Docker. The instructions for the idiap fork say you can clone a voice with the following code:

# TTS with list of amplitude values as output, clone the voice from `speaker_wav`
wav = tts.tts(
    text="Hello world!",
    speaker_wav="my/cloning/audio.wav",
    language="en"
)

But server.py has no function that supports this way of cloning a voice. Is there a way to implement it in server.py?
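One way to approach this is sketched below, following the header-or-form-value pattern of the handler quoted in the bug report. The `speaker_wav` request parameter, the helper names, and the `"en"` default are all assumptions made for illustration; they are not part of the existing server.

```python
from typing import Mapping, Optional


def read_param(headers: Mapping[str, str], values: Mapping[str, str], name: str) -> Optional[str]:
    """Read `name` from the headers (dashed form) or, failing that, from the values."""
    header_name = name.replace("_", "-")  # e.g. "speaker_wav" -> "speaker-wav"
    value = headers.get(header_name) or values.get(name, "")
    return value or None


def parse_clone_request(headers: Mapping[str, str], values: Mapping[str, str]) -> dict:
    """Collect the arguments for a cloning call from a request.

    `speaker_wav` is a hypothetical parameter holding a path to the reference
    audio; the "en" default language is also an assumption.
    """
    text = read_param(headers, values, "text")
    if not text:
        raise ValueError("missing 'text'")
    return {
        "text": text,
        "speaker_wav": read_param(headers, values, "speaker_wav"),
        "language": read_param(headers, values, "language_id") or "en",
    }


# Plain dicts stand in for Flask's request.headers and request.values.
params = parse_clone_request({"text": "Hello world!"}, {"speaker_wav": "my/cloning/audio.wav"})
```

Inside a Flask view, the same function could presumably be called as `parse_clone_request(request.headers, request.values)` and the result passed on to `tts.tts(**params)`; that wiring is untested here.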

apachexc · 2025-05-18T07:45:02Z

XTTS requires a speaker_id.

If I provide the speaker id, the cloned output uses the voice of that speaker id, not the voice in the provided speaker wav.

If no speaker_id is provided, an error occurs.
For example:
wav = api.tts(
    text=text,
    speaker_wav=speaker_wav,
    language=language
)

How can I solve this?

apachexc · 2025-05-18T07:47:43Z

XTTS requires a speaker_id.
If I provide the speaker id, the cloned output uses the voice of that speaker id, not the voice in the provided speaker wav.
Example:
wav = api.tts(text, speaker=speaker_idx, language=language_idx, style_wav=style_wav)
If no speaker_id is provided, an error occurs.
Example:
wav = api.tts(
    text=text,
    speaker_wav=speaker_wav,
    language=language
)
How can I solve this?
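The two calls above differ only in which speaker argument is passed. A small helper can make that choice explicit, so that a reference recording is never silently overridden by a named speaker. This is only a sketch: `build_tts_kwargs` is a hypothetical helper, and it assumes `api.tts` accepts the `text`, `language`, `speaker`, and `speaker_wav` keywords used in the examples in this thread. Whether it avoids the error depends on the installed version and was not verified.

```python
def build_tts_kwargs(text, language, speaker_wav=None, speaker=None):
    """Build keyword arguments for a tts() call, preferring cloning over a named speaker.

    Hypothetical helper: if a reference recording is given, only `speaker_wav`
    is passed on; `speaker` is used only when no recording is available.
    """
    kwargs = {"text": text, "language": language}
    if speaker_wav:
        kwargs["speaker_wav"] = speaker_wav
    elif speaker:
        kwargs["speaker"] = speaker
    else:
        raise ValueError("provide either speaker_wav (to clone) or speaker (named voice)")
    return kwargs


# Cloning request: the named speaker is dropped so it cannot take precedence.
clone_kwargs = build_tts_kwargs(
    "Hello world!", "en", speaker_wav="my/cloning/audio.wav", speaker="some_speaker"
)
# Intended use (not executed here): wav = api.tts(**clone_kwargs)
```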

apachexc added the bug Something isn't working label May 6, 2025
