Dev Blog for v1.2.0 #158
cyberofficial announced in Announcements
This week flew by in a flurry of coding for Synthalingua! v1.2.0 is almost here, and I'm excited to share what I've been working on. It's been a mix of tackling some long-standing frustrations and adding some quality-of-life features.
One of my biggest personal wins was squashing the "hanging server" bug. It was incredibly frustrating to have the Flask server refuse to shut down properly, especially when hitting Ctrl+C. The solution? A PID file/watchdog system. Now, when the server starts, it creates a `server.pid` file (added to `.gitignore`). If a watchdog thread detects this file's deletion (like when I Ctrl+C), the server gets force-killed. No more hanging! And for those edge cases, I made a `kill_server.py` utility. This was a major stability improvement, and honestly, a huge relief. This also resolves hanging when doing a "graceful shutdown."
Speaking of stability, I also focused on the stream transcription module. I added retry logic and, even better, rate limiting. I got tired of getting those pesky 429 errors, so now Synthalingua handles them gracefully, backing off and trying again later. This should make stream transcription much more robust. I also added better error handling and messages all around. You'll see messages like `[DEBUG] Downloading segment...` to help you diagnose if a stream isn't working.
Next up was a pet peeve of mine: duplicate transcriptions. It was annoying to see the same output over and over when the model got stuck. Now, `difflib` checks for similarity between transcriptions, suppressing near-duplicates, making the output cleaner. This was particularly noticeable with the Turbo model, which was prone to very long segments, so I refined that logic as well. Subtitle timings are also much improved due to using word timestamps for splitting.
I also added some cool new stuff:
- `--selectsource`: Tired of manually finding the right stream format? Just use `--selectsource` for an interactive menu of available streams from yt-dlp, or specify your format directly! Makes stream setup way easier.
- `--auto_blocklist`: Frequently blocked phrases? Let Synthalingua handle it! If a phrase gets blocked 3+ times, it's automatically added to the ignore list (`--ignorelist`).
- `--makecaptions compare`: This creates separate `.srt` files for each model size, so you can choose the best one for your needs.

And let's not forget about the documentation! I completely overhauled the wiki with detailed explanations for every argument, examples, troubleshooting tips, and more. It was a big project, but it was worth it!
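
For anyone curious how a PID file/watchdog setup like the one behind the hanging-server fix can work, here's a minimal sketch. It is not Synthalingua's actual code: the function names, the polling interval, and the use of `os._exit` as the "force-kill" are all assumptions made for illustration. Only the `server.pid` file name comes from the post.

```python
import os
import threading
import time
from pathlib import Path

# File name taken from the post; where it lives on disk is an assumption.
PID_FILE = Path("server.pid")


def write_pid_file(pid_file=PID_FILE):
    """Record the current process ID so other tools can find the server."""
    pid_file.write_text(str(os.getpid()))


def start_watchdog(pid_file=PID_FILE, on_missing=None, poll_seconds=0.5):
    """Start a daemon thread that calls on_missing once the PID file is gone.

    Hypothetical helper: by default it hard-exits the process, which skips
    normal cleanup so a stuck server thread cannot block shutdown.
    """
    if on_missing is None:
        def on_missing():
            os._exit(0)

    def watch():
        # Poll until the file disappears, then trigger the shutdown action.
        while pid_file.exists():
            time.sleep(poll_seconds)
        on_missing()

    thread = threading.Thread(target=watch, daemon=True, name="pid-watchdog")
    thread.start()
    return thread
```

With something like this in place, a Ctrl+C handler only has to delete the PID file. The watchdog notices on its next poll and ends the process, even if the server's own shutdown path is stuck.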
Finally, some quality-of-life changes: I updated microphone settings for higher quality input (48 kHz, 24-bit), cleaned up some old unused code, and moved the GUI wrapper to its own repository. It's private for now: I need to adjust the repo settings and get the licensing set up properly, since I don't want people taking it without giving proper credit or trying to resell it when it should be free.
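
The near-duplicate suppression mentioned earlier can be sketched with `difflib.SequenceMatcher` from the standard library. This is an illustration under assumptions, not the actual Synthalingua logic: the `DuplicateFilter` class, the 0.9 similarity threshold, and the five-item history are all made up for the example.

```python
from difflib import SequenceMatcher


class DuplicateFilter:
    """Suppress text that is nearly identical to recently shown text."""

    def __init__(self, threshold=0.9, history=5):
        self.threshold = threshold  # assumed similarity cutoff (0.0 to 1.0)
        self.history = history      # assumed number of past lines to compare
        self.recent = []

    def is_duplicate(self, text):
        """Return True if text closely matches any recently accepted line."""
        candidate = text.strip().lower()
        return any(
            SequenceMatcher(None, candidate, previous).ratio() >= self.threshold
            for previous in self.recent
        )

    def accept(self, text):
        """Return True and remember the text if it is new enough to show."""
        if self.is_duplicate(text):
            return False
        self.recent.append(text.strip().lower())
        # Keep only the most recent lines so comparisons stay cheap.
        self.recent = self.recent[-self.history:]
        return True
```

Each new transcription would go through `accept()` before being printed. A lower threshold suppresses more aggressively but risks hiding lines that are legitimately similar.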
Here's a summary of the v1.2.0 changes:
Added
- `kill_server.py` utility
- `--selectsource`
- `--condition_on_previous_text`
- `--auto_blocklist`
- `--makecaptions compare`
Improved
Changed
Removed
- `--stream_target_language`
Fixed
That's it for this update list. I'll share some previews of the web UI soon; I'm making the final adjustments to it now.