Environment:
- vox-box version: 0.0.2
- model: Hugging Face/FunAudioLLM/CosyVoice-300M-SFT
Issue 1: Failed to generate speech after selecting opus or pcm format
View the logs:
2024-11-27T21:00:23+08:00 - libav.libopus - ERROR - Specified sample rate 11025 is not supported by the libopus encoder
2024-11-27T21:00:23+08:00 - libav.libopus - ERROR - Supported sample rates:
2024-11-27T21:00:23+08:00 - libav.libopus - ERROR - 48000
2024-11-27T21:00:23+08:00 - libav.libopus - ERROR - 24000
2024-11-27T21:00:23+08:00 - libav.libopus - ERROR - 16000
2024-11-27T21:00:23+08:00 - libav.libopus - ERROR - 12000
2024-11-27T21:00:23+08:00 - libav.libopus - ERROR - 8000
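One possible workaround, sketched under assumptions: resample the audio to the closest rate the encoder accepts before encoding. The helper name `nearest_supported_rate` is hypothetical and not part of vox-box; the rate list is copied from the log above.

```python
# Sample rates reported as supported by libopus in the log above.
OPUS_RATES = (48000, 24000, 16000, 12000, 8000)


def nearest_supported_rate(rate: int, supported=OPUS_RATES) -> int:
    """Return the supported rate closest to `rate` (hypothetical helper)."""
    if rate in supported:
        return rate
    # On a tie, prefer the higher rate so no bandwidth is lost.
    return min(supported, key=lambda r: (abs(r - rate), -r))


if __name__ == "__main__":
    # 11025 is the rate rejected in the log above.
    print(nearest_supported_rate(11025))
```

The audio would still need to be resampled to the returned rate; changing only the declared rate would alter pitch and duration.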
Issue 2: Failed to generate speech after selecting 0.25x or 4x speed
0%| | 0/1 [00:00<?, ?it/s]
2024-11-27T21:01:35+08:00 - root - INFO - synthesis text Hello.
2024-11-27T21:01:36+08:00 - root - INFO - yield speech len 1.1145578231292517, rtf 0.7314251182833686
100%|██████████| 1/1 [00:00<00:00, 1.22it/s]
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - Specified sample rate 5512 is not supported by the libmp3lame encoder
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - Supported sample rates:
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 44100
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 48000
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 32000
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 22050
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 24000
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 16000
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 11025
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 12000
2024-11-27T21:01:36+08:00 - libav.libmp3lame - ERROR - 8000
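The rejected rate 5512 looks consistent with the speed factor being applied by scaling the sample rate. Assuming a native output rate of 22050 Hz (my assumption; the logs do not state it), 22050 / 4 = 5512.5, which truncates to 5512. The sketch below checks which speed factors would still produce a rate libmp3lame accepts under that assumed scheme. `rate_for_speed` and `encodable_speeds` are hypothetical names, not vox-box functions.

```python
# Sample rates reported as supported by libmp3lame in the log above.
LAME_RATES = {44100, 48000, 32000, 22050, 24000, 16000, 11025, 12000, 8000}

# Assumed native output rate of the model; not confirmed by the logs.
NATIVE_RATE = 22050


def rate_for_speed(speed: float, native: int = NATIVE_RATE) -> int:
    """Rate that results if speed is applied by dividing the sample rate."""
    return int(native / speed)


def encodable_speeds(speeds, supported=LAME_RATES):
    """Keep only the speeds whose scaled rate the encoder accepts."""
    return [s for s in speeds if rate_for_speed(s) in supported]


if __name__ == "__main__":
    print(rate_for_speed(4))
    print(encodable_speeds([0.25, 0.5, 1, 2, 4]))
```

Under this assumption only 0.25x and 4x fall outside the supported list, which would match the failures reported here. Whether the rate is divided or multiplied by the speed, the same two factors land on 5512 and 88200.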