Dev Blog for v1.2.0 #158

cyberofficial · 2025-05-29T15:42:20Z

cyberofficial
May 29, 2025
Maintainer

This week flew by in a flurry of coding for Synthalingua! v1.2.0 is almost here, and I'm excited to share what I've been working on. It's been a mix of tackling some long-standing frustrations and adding some quality-of-life features.

One of my biggest personal wins was squashing the "hanging server" bug. It was incredibly frustrating to have the Flask server refuse to shut down properly, especially when hitting Ctrl+C. The solution? A PID file/watchdog system. Now, when the server starts, it creates a server.pid file (added to .gitignore). If a watchdog thread detects that this file has been deleted (like when I hit Ctrl+C), the server gets force-killed. No more hanging! This also fixes the hang during a "graceful shutdown." And for those edge cases, I made a kill_server.py utility. This was a major stability improvement, and honestly, a huge relief.
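
As a rough illustration of the PID file/watchdog idea, here is a minimal sketch. It is not Synthalingua's actual code: the function names write_pid_file and watch_pid_file, the polling interval, and the on_missing callback are all assumptions made up for this example.

```python
import os
import threading
from pathlib import Path


def write_pid_file(pid_path: Path) -> None:
    """Record the current process ID so other tools can find the server."""
    pid_path.write_text(str(os.getpid()))


def watch_pid_file(pid_path: Path, on_missing, poll_interval: float = 0.5):
    """Start a daemon thread that calls on_missing() once the PID file is gone.

    Hypothetical helper: a real server would likely pass something like
    `lambda: os._exit(0)` as on_missing to force the process down.
    """
    def _watch():
        while True:
            if not pid_path.exists():
                on_missing()
                return
            # Sleep between checks; an Event keeps the wait interruptible-friendly.
            threading.Event().wait(poll_interval)

    thread = threading.Thread(target=_watch, daemon=True)
    thread.start()
    return thread
```

In this sketch, deleting the file is the shutdown signal, so a separate script (in the spirit of kill_server.py) only needs to remove server.pid to trigger the forced exit.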

Speaking of stability, I also focused on the stream transcription module. I added retry logic and, even better, rate limiting. I got tired of getting those pesky 429 errors, so now Synthalingua handles them gracefully, backing off and trying again later. This should make stream transcription much more robust. I also added better error handling and messages all around. You'll see messages like [DEBUG] Downloading segment... to help you diagnose if a stream isn't working.
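
The "back off and try again" behavior can be sketched roughly like this. It is a minimal example under stated assumptions: the fetch callable, the retry count, and the exponential delay schedule are hypothetical and not taken from the Synthalingua source.

```python
import time
from typing import Callable, Tuple


def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, bytes]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it succeeds, backing off whenever it returns HTTP 429.

    fetch is an assumed callable returning (status_code, body).
    """
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status == 200:
            return body
        # Only rate-limit responses are retried; anything else fails fast.
        if status != 429 or attempt == max_retries:
            raise RuntimeError(f"segment download failed with HTTP {status}")
        # Exponential backoff: 1s, 2s, 4s, ... capped at max_delay.
        sleep(min(max_delay, base_delay * 2 ** attempt))
    raise RuntimeError("unreachable")
```

Passing sleep in as a parameter keeps the sketch easy to test without actually waiting.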

Next up was a pet peeve of mine: duplicate transcriptions. It was annoying to see the same output over and over when the model got stuck. Now, difflib checks for similarity between transcriptions, suppressing near-duplicates, making the output cleaner. This was particularly noticeable with the Turbo model, which was prone to very long segments, so I refined that logic as well. Subtitle timings are also much improved due to using word timestamps for splitting.
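
For a sense of how difflib can catch near-duplicates, here is a small sketch. The 0.9 threshold, the text normalization, and the function names are assumptions for illustration; the real suppression logic may compare against more history or use different cutoffs.

```python
from difflib import SequenceMatcher


def is_near_duplicate(new_text: str, previous_text: str, threshold: float = 0.9) -> bool:
    """Return True when two transcriptions are similar enough to count as repeats."""
    # Normalize case and whitespace so trivial differences don't hide a repeat.
    a = " ".join(new_text.lower().split())
    b = " ".join(previous_text.lower().split())
    return SequenceMatcher(None, a, b).ratio() >= threshold


def suppress_repeats(lines, threshold: float = 0.9):
    """Drop each line that nearly matches the last line that was kept."""
    kept = []
    for line in lines:
        if kept and is_near_duplicate(line, kept[-1], threshold):
            continue
        kept.append(line)
    return kept
```

A higher threshold lets more variation through, while a lower one is more aggressive about dropping lines that merely resemble the previous output.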

I also added some cool new stuff:

Interactive Stream Selection (--selectsource): Tired of manually finding the right stream format? Just use --selectsource for an interactive menu of available streams from yt-dlp, or specify your format directly! Makes stream setup way easier.
Auto Blocklist (--auto_blocklist): Frequently blocked phrases? Let Synthalingua handle it! If a phrase gets blocked 3+ times, it's automatically added to the ignore list (--ignorelist).
Enhanced Captions/Subtitles: You can now compare the output of all the different RAM settings with the command --makecaptions compare! This creates separate .srt files for each model size, so you can choose the best one for your needs.
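
The auto blocklist rule (a phrase blocked 3+ times gets added to the ignore list) could look something like the sketch below. The AutoBlocklist class and its record_blocked method are hypothetical names, and how a phrase comes to be counted as "blocked" in the first place is not covered here; the sketch simply assumes some caller reports each blocked phrase.

```python
from collections import Counter


class AutoBlocklist:
    """Hypothetical tracker that promotes often-blocked phrases to an ignore set."""

    def __init__(self, ignore_phrases=(), threshold: int = 3):
        self.ignore = {p.strip().lower() for p in ignore_phrases}
        self.threshold = threshold
        self.block_counts = Counter()

    def record_blocked(self, phrase: str) -> bool:
        """Count one block; return True if the phrase was just added to the ignore set."""
        key = phrase.strip().lower()
        if key in self.ignore:
            return False  # Already ignored, nothing to promote.
        self.block_counts[key] += 1
        if self.block_counts[key] >= self.threshold:
            self.ignore.add(key)
            return True
        return False
```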

And let's not forget about the documentation! I completely overhauled the wiki with detailed explanations for every argument, examples, troubleshooting tips, and more. It was a big project, but it was worth it!

Finally, some quality-of-life changes: I updated microphone settings for higher quality input (48kHz, 24-bit), cleaned up some old unused code, and moved the GUI wrapper to its own repository. It's private for now: I need to adjust the repo settings and get the licensing set up properly, since I don't want people taking it without giving proper credit or trying to resell it when it should be free.

Here's a summary of the v1.2.0 changes:

Added

Force Shutdown (PID file, kill_server.py utility)
Interactive Stream Selection (--selectsource)
Similarity/Repetition Suppression (--condition_on_previous_text)
Auto Blocklist (--auto_blocklist)
Enhanced Documentation
Captions Compare Mode (--makecaptions compare)

Improved

Stream Transcription Stability (Retries, Rate Limiting)
Microphone Input Quality (48kHz, 24-bit)
Subtitle Segmentation (Word Timestamps, Turbo Model Handling)

Changed

12GB Model Renamed to 11GB
Default RAM Model (3gb)
GUI Wrapper Moved to Separate Repository

Removed

Deprecated --stream_target_language

Fixed

FFmpeg stdin Initialization

That's it for this update list. I'll share some previews of the web UI soon; I'm just making final adjustments to it.
