@cyberofficial cyberofficial commented Sep 27, 2025

Synthalingua Version 1.2.5 Change Log

Description of Changes

This update introduces a major new feature: an intelligent adaptive batch processing system that dynamically allocates transcription jobs between the GPU and CPU for maximum performance. A comprehensive bug report generator and an SRT subtitle fixer have also been added to improve troubleshooting and usability. Additionally, the setup process for vocal isolation has been significantly simplified by replacing Miniconda with a more lightweight, portable Python environment, and the packaged CUDA version has been updated.

Noticeable Changes

  • Intelligent Adaptive Batch Processing: A new --adaptive_batch mode for caption generation intelligently distributes work between your GPU and CPU. It learns your system's performance and allocates longer tasks to the GPU and shorter ones to the CPU, significantly speeding up processing for large files.
  • Comprehensive Bug Report Generator: A new --bugreport command has been added. It generates a detailed bugreportinfo.txt file with system, Python, and hardware information to make troubleshooting and reporting issues much easier.
  • SRT Subtitle Fixer: A new --fixsrt utility has been added to repair SRT files with out-of-order timestamps, ensuring subtitles are always displayed chronologically.
  • Simplified Vocal Isolation Setup: The --using_vocal_isolation feature no longer uses Miniconda. It now sets up a portable, embedded Python environment, reducing the installation footprint, complexity, and potential for environment conflicts.
  • Cross-Platform Setup Script: The set_up_env.py script is now fully cross-platform, supporting Windows, Linux, and macOS. Linux users are given a choice between using their system Python or the new embedded Python for vocal isolation.
  • Interactive Setup Experience: The setup script is more interactive, guiding users through choices for using existing tools (like FFmpeg, yt-dlp) or downloading fresh copies.
  • New About Screen: The --about command now displays a modern, animated interface with project details, features, and contributor acknowledgements.
  • HTTPS Server Option: A new --https command-line argument allows running a secure web server alongside the existing HTTP one.
  • Redesigned Player Page: The player.html interface has been completely overhauled with a modern design, a two-column layout for settings, and the ability to change the video source directly from the UI. If no video source is provided in the URL, the page now interactively prompts the user for one.
  • Enhanced Device Listing: The --list_microphones command now provides a much more detailed and user-friendly table with device ID, channels, and sample rate, along with clear instructions to use the ID for selection.
  • Enhanced Remote Microphone Server: The remote_microphone.py script now supports binding the server to a specific IP address using the --serverip argument, allowing for network access beyond localhost.
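
The adaptive batch allocation described in the list above (longer tasks to the GPU, shorter ones to the CPU, guided by measured performance) can be illustrated with a minimal sketch. This is not the code in modules/adaptive_batch.py. The names DeviceStats and split_jobs, the two-pointer balancing strategy, and the speed unit (audio seconds per wall-clock second) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceStats:
    """Running throughput estimate for one device (hypothetical helper)."""
    audio_seconds: float = 0.0  # total audio duration processed so far
    wall_seconds: float = 0.0   # total wall-clock time spent on it

    def record(self, audio_len, elapsed):
        """Add one finished job to the running totals."""
        self.audio_seconds += audio_len
        self.wall_seconds += elapsed

    def speed(self, default=1.0):
        """Audio seconds processed per wall-clock second."""
        if self.wall_seconds <= 0:
            return default  # nothing measured yet, fall back to a guess
        return self.audio_seconds / self.wall_seconds


def split_jobs(durations, gpu_speed, cpu_speed):
    """Split segment durations into (gpu_jobs, cpu_jobs).

    The GPU takes segments from the long end and the CPU from the short
    end; each step goes to whichever device would finish its next
    candidate sooner, so both queues end at roughly the same time.
    """
    if gpu_speed <= 0 or cpu_speed <= 0:
        raise ValueError("speeds must be positive")
    jobs = sorted(durations)
    lo, hi = 0, len(jobs) - 1
    gpu_jobs, cpu_jobs = [], []
    gpu_time = cpu_time = 0.0
    while lo <= hi:
        gpu_next = gpu_time + jobs[hi] / gpu_speed
        cpu_next = cpu_time + jobs[lo] / cpu_speed
        if gpu_next <= cpu_next:
            gpu_jobs.append(jobs[hi])
            gpu_time = gpu_next
            hi -= 1
        else:
            cpu_jobs.append(jobs[lo])
            cpu_time = cpu_next
            lo += 1
    return gpu_jobs, cpu_jobs
```

In this sketch, feeding each finished job back through DeviceStats.record and re-running split_jobs on the remaining segments is what stands in for "learning" the system's performance.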

Hidden Changes

  • Adaptive Batch Processing Backend: A new modules/adaptive_batch.py module has been added to handle the logic for GPU/CPU detection, performance tracking, and dynamic job scheduling.
  • Web Server Auto-Port Selection: The web server will now automatically search for and use an available port if the one specified is already in use.
  • Web Server Security: The Flask web server now includes security enhancements, such as IP blocking after multiple failed requests for static files and improved protection against path traversal attacks.
  • Robust Demucs Path Finding: The demucs_path_helper.py script has been rewritten to be more intelligent, actively searching multiple common locations (embedded Python, virtual environments, system paths) to find a valid demucs installation.
  • Interactive Timeout Handling: When using --batchmode, if a transcription job times out, the system now interactively prompts the user to either retry with an increased timeout or skip the problematic segment.
  • Build Script Simplification: The main build.bat script has been simplified to directly use pyinstaller with the corresponding .spec files, streamlining the build process.
  • Discord Message Formatting: The icons in Discord notifications have been replaced with text-based labels (e.g., [ORIGINAL], [TRANSLATION]) for better clarity.
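
As an illustration of the path traversal protection mentioned in the list above, here is a minimal sketch of one common approach: resolve the requested path and confirm it still sits under the static root. The function name resolve_static_path and the "static" root are hypothetical, and the project's Flask server may implement its checks differently.

```python
from pathlib import Path


def resolve_static_path(static_root, requested):
    """Return the resolved file path, or None if it escapes static_root."""
    # Null bytes are never valid in a filename and can confuse OS calls.
    if "\x00" in requested:
        return None
    root = Path(static_root).resolve()
    # Joining with an absolute path discards the root, and ".." segments
    # are collapsed by resolve(), so both cases land outside the root.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        return None
    return candidate
```

Checking the resolved path, rather than scanning the raw string for "..", also covers symlinks and mixed separators on platforms that support them.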

Technical Changes

  • Version Update: The local version in modules/version_checker.py has been updated to 1.2.5.
  • Argument Parser Update: modules/parser_args.py was updated to include the new --bugreport, --fixsrt, --https, --adaptive_batch, --batchjobsize, --cpu_batches, --max_cpu_time, and --stop_cpu_at arguments.
  • New Modules: Added modules/adaptive_batch.py for the new processing scheduler, modules/bug_report.py for generating system reports, and modules/srt_fix.py for the subtitle repair utility.
  • Major set_up_env.py Overhaul: The script was rewritten to manage a portable, embedded Python installation instead of Miniconda, handle platform-specific downloads, and generate a bugreportinfo.txt file upon completion.
  • Demucs Environment Fix: The set_up_env.py script now sets environment variables (TORIO_USE_FFMPEG=0) to force torchaudio to use the soundfile backend, preventing conflicts with FFmpeg/torchcodec when running demucs.
  • CUDA Version Update: The setup scripts (setup.bat, setup.sh) have been updated to use the PyTorch index URL for CUDA 12.9 (cu129) instead of 12.8.
  • Transcription Worker Enhancements: The transcribe_worker.py script now accepts a --debug flag for more verbose logging and has improved UTF-8 handling for stability.
  • Build Specification Updates: The synthalingua.spec file now includes torchcodec in its data collection and changes the final output directory name to release.
  • Transcription Core Stability: The stop_queue_processing method in modules/transcription_core.py was improved to be more robust and prevent race conditions on shutdown.
  • Project Organization: Utility scripts have been moved into a misc/ directory, and a new plan.md file has been added, outlining the design of the adaptive batching system.

…d notification functions, and update webhook usage in transcription processes
…nscription and translation messages to improve readability
…n; enhance `--stream_transcribe` usage and notifications in Discord integration
…warning display; adjust argument handling in transcription logic
…hance transcription process with improved segment handling and logging
…splitting; improve handling of long segments for turbo model
…ry, and cookies folder; update documentation and examples accordingly
…h all RAM models; update documentation and examples accordingly
…RAM; update compare mode to process RAM models in reverse order
…detected language during transcription and translation
…h-specific and multilingual models based on RAM settings
…ent background processing queue for transcription
…st file if not found and improve warning messages
…ractive stream format and implement shutdown flag checks during segment download

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 14 out of 22 changed files in this pull request and generated 4 comments.


Replaced emoji icons with plain text or bracketed labels in user-facing status and log messages across modules/discord.py, modules/sub_gen.py, and remote_microphone.py. This change standardizes output formatting and improves compatibility with environments where emoji rendering may be inconsistent.
@cyberofficial cyberofficial requested a review from Copilot October 1, 2025 06:10

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 16 out of 24 changed files in this pull request and generated 5 comments.


# Additional security checks on filename
# Reject filenames with null bytes or that contain absolute path indicators
# Check for Windows drive letters (C:, D:, etc.) on any platform for consistency
if '\x00' in filename or (len(filename) >= 2 and filename[1] == ':'):

Copilot AI Oct 1, 2025

The Windows drive letter check filename[1] == ':' could cause an IndexError if filename has only 1 character. Consider using len(filename) >= 2 and filename[1] == ':' within a try-except block or check length first.
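
For reference, Python's `and` short-circuits, so in a condition of the form `len(filename) >= 2 and filename[1] == ':'` the index is only evaluated once the length check has passed. A minimal sketch of a standalone helper in that spirit follows. The name has_windows_drive_letter matches the function named elsewhere in this conversation, but the body is an assumption, not the code in modules/file_handlers.py, and the ASCII-letter test on the first character is stricter than the condition under review.

```python
def has_windows_drive_letter(filename):
    """Return True if `filename` starts with a drive prefix such as 'C:'."""
    # Non-string input (None, bytes, ...) is treated as "no drive letter".
    if not isinstance(filename, str):
        return False
    # The length check runs first, so filename[1] is never read
    # for empty or one-character strings.
    return (
        len(filename) >= 2
        and filename[0].isascii()
        and filename[0].isalpha()
        and filename[1] == ":"
    )
```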

- videoContainer.src = `https://player.twitch.tv/?channel=${videoId}&parent=${parent}`;
+ // Validate Twitch username and parent domain
+ if (isValidTwitchUsername(videoId) && isValidDomain(parent)) {
+   videoContainer.src = `https://player.twitch.tv/?channel=${encodeURIComponent(videoId)}&parent=${encodeURIComponent(parent)}`;

Copilot AI Oct 1, 2025

Good use of input validation and URL encoding for security. The validation functions properly check for valid Twitch usernames and domain formats before constructing URLs.

Suggested change
- videoContainer.src = `https://player.twitch.tv/?channel=${encodeURIComponent(videoId)}&parent=${encodeURIComponent(parent)}`;
+ videoContainer.src = `https://player.twitch.tv/?channel=${encodeURIComponent(videoId)}&parent=${parent}`;

Copilot AI added a commit that referenced this pull request Oct 1, 2025
- Added has_windows_drive_letter() function to modules/file_handlers.py
- Implements safe checking for Windows drive letters (C:, D:, etc.) without IndexError risk
- Includes comprehensive error handling and input validation
- Tested with 18 test cases covering all edge cases
- Created documentation in update_notes/WINDOWS_DRIVE_LETTER_FIX.md
- Addresses review comment from PR #187 by providing reusable defensive programming solution

Co-authored-by: cyberofficial <[email protected]>
cyberofficial added a commit that referenced this pull request Oct 1, 2025
…ent IndexError (#196)

* Add scam warning to README

Added a warning about scams related to crypto and NFTs.

* Update README.md

Co-authored-by: Copilot <[email protected]>

* Initial plan

* Add defensive utility function for Windows drive letter check

- Added has_windows_drive_letter() function to modules/file_handlers.py
- Implements safe checking for Windows drive letters (C:, D:, etc.) without IndexError risk
- Includes comprehensive error handling and input validation
- Tested with 18 test cases covering all edge cases
- Created documentation in update_notes/WINDOWS_DRIVE_LETTER_FIX.md
- Addresses review comment from PR #187 by providing reusable defensive programming solution

Co-authored-by: cyberofficial <[email protected]>

---------

Co-authored-by: Cyber Official <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: copilot-swe-agent[bot] <[email protected]>
* Initial plan

* Remove unused console_settings.py module and its references

Co-authored-by: cyberofficial <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: cyberofficial <[email protected]>
@cyberofficial cyberofficial linked an issue Oct 2, 2025 that may be closed by this pull request
Replaces Miniconda-based environment setup for Demucs/vocal isolation with a portable Python embedded installation. Updates all related logic, prompts, and paths in set_up_env.py and modules/demucs_path_helper.py to use Python embedded instead of Miniconda, and adjusts build.bat to activate the correct environment. This change reduces disk space requirements and simplifies installation for end users.
@cyberofficial cyberofficial linked an issue Oct 14, 2025 that may be closed by this pull request
Expanded demucs_path_helper.py to support OS-specific Python path detection, system Python with demucs, and improved validation. Updated yt-dlp URLs in set_up_env.py for all platforms. Changed PyInstaller output folder in build.bat and updated release name in synthalingua.spec.

- Fixes #199 -
The Flask server now automatically finds an available port if the specified one is in use. The setup script tracks installation sources and configuration details, and generates a bug report info file to assist with troubleshooting. Also updated version numbers and improved diffq installation reporting.
Updated port selection logic to prefer ports >=8000 and limit attempts to 1000 for better reliability. Enhanced error handling and debug output when binding to ports. Improved argument help and validation to recommend ports >=8000 for Windows compatibility, and added user-facing messages indicating web interface URLs.
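
The port selection behavior described above could look roughly like the following sketch. It assumes the requested port is tried first and that the fallback scan moves upward from 8000 or above. The function names, the scan direction, and the injectable is_free probe are illustrative assumptions, not the project's implementation.

```python
import socket


def port_is_free(host, port):
    """Probe a port by trying to bind a TCP socket to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def find_available_port(requested, host="127.0.0.1", max_attempts=1000,
                        is_free=port_is_free):
    """Return `requested` if it is free, else the first free port found.

    The fallback scan starts at 8000 or just above the requested port,
    whichever is higher, and gives up after `max_attempts` candidates.
    """
    if is_free(host, requested):
        return requested
    start = max(requested + 1, 8000)
    for port in range(start, min(start + max_attempts, 65536)):
        if is_free(host, port):
            return port
    raise RuntimeError(f"no free port found after {max_attempts} attempts")
```

A probe like this is inherently racy: another process can take the port between the check and the server's own bind, so the caller should still handle a bind failure at startup.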
Changed setup scripts and environment checks to require Python 3.12.x instead of 3.10.x. Updated CUDA-enabled PyTorch installation to use cu129 wheels. Improved torch/torchaudio/torchcodec installation logic and error handling in set_up_env.py. Added environment variable workaround for torchaudio/torchcodec issues in sub_gen.py. Updated .gitignore and spec file for new files and torchcodec package.

Possible Fix for #200
…d tidy build.bat

- Set PYTHONIOENCODING, TORCHAUDIO_USE_BACKEND_DISPATCHER=1 and TORIO_USE_FFMPEG=0
  when launching Demucs (subprocess.run / Popen) to avoid torchcodec/FFmpeg issues
  and ensure UTF-8 encoding.
- Add EnvironmentSetup._write_demucs_config to write demucs_python_path.txt and
  call it after Python/Demucs installation so demucs_path_helper can locate the
  Demucs executable.
- Reorder build.bat to run main spec builds first and move the remote_microphone
  pyinstaller invocation after a goto :eof (kept disabled) for clarity.
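
A minimal sketch of the environment handling described in the first item above: copy the current environment, set the three variables, and pass the result to subprocess.run. The helper names are hypothetical, and the value "utf-8" for PYTHONIOENCODING is an assumption, since the commit message only says the variable is set to ensure UTF-8 encoding.

```python
import os
import subprocess


def build_demucs_env(base_env=None):
    """Return a copy of the environment with the Demucs workarounds set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONIOENCODING"] = "utf-8"  # assumed value
    env["TORCHAUDIO_USE_BACKEND_DISPATCHER"] = "1"
    env["TORIO_USE_FFMPEG"] = "0"
    return env


def run_with_demucs_env(cmd):
    """Run `cmd` (a list of arguments) with the adjusted environment."""
    return subprocess.run(
        cmd,
        env=build_demucs_env(),
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=False,
    )
```

Building a copy rather than mutating os.environ keeps the workaround scoped to the child process.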
Introduces adaptive batch processing with dynamic GPU/CPU job allocation, performance tracking, and optimization suggestions. Adds a bug report generator for comprehensive system diagnostics. Updates argument parser to support new batch and bug report options, and integrates adaptive batch logic into speech region processing.
- Document new CLI flags: --batchmode, --adaptive_batch, --batchjobsize
- Add usage examples for parallel batch and adaptive batch modes
- Expand Adaptive Batch section with how it works, config guidance, and best-use scenarios
- Add Parallel Batch section and "Ultimate Performance" combo example
- Update processing workflow and pro tips to include batch allocation/parallel processing
Expand the --timeout argument description to specify it is a maximum time (seconds)
for individual transcription jobs, default behavior (0 = no timeout), which modes
it affects (file processing, batchmode and adaptive_batch), and provide an example
and expected job handling when the limit is exceeded.
When a speech region times out, users can now choose to retry, insert a placeholder, or leave an empty SRT spot. For final failed regions, users can opt to remove timeout restrictions or replace with empty SRT spots, improving flexibility and error handling in transcription.
Introduces a dynamic progress bar and estimated time remaining display during adaptive batch processing. Enhances user feedback by showing job completion stats, average time per job, and ETA, replacing verbose per-job print statements with concise progress updates.
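
The ETA arithmetic behind such a display is simple: average time per finished job multiplied by the number of jobs left. The sketch below shows one way to format it. The function name, bar width, and output layout are assumptions and do not reproduce the project's actual progress output.

```python
def format_progress(done, total, elapsed, width=20):
    """Return a one-line progress string with average job time and ETA."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))
    filled = int(width * done / total)
    bar = "#" * filled + "-" * (width - filled)
    # Average wall-clock seconds per finished job; unknown until one is done.
    avg = elapsed / done if done else 0.0
    remaining = avg * (total - done)
    if done:
        eta = f"{int(remaining // 60):02d}:{int(remaining % 60):02d}"
    else:
        eta = "--:--"
    return f"[{bar}] {done}/{total} jobs | avg {avg:.1f}s/job | ETA {eta}"
```

For example, 5 of 20 jobs finished in 50 seconds gives an average of 10 seconds per job and an ETA of 02:30 for the remaining 15.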
Introduces a new --fixsrt argument to repair SRT files with incorrect timestamp ordering. Adds modules/srt_fix.py for sorting and rewriting SRT subtitles chronologically, updates README and argument parser, and integrates the feature into synthalingua.py. Also ensures speech segments are sorted by start time in sub_gen.py.
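
To make the repair concrete, here is a minimal sketch of sorting SRT cues chronologically and renumbering them. It is not the contents of modules/srt_fix.py. The function name fix_srt, the regular expression, and the choice to sort by start time only are assumptions.

```python
import re

# Matches a cue timing line such as "00:00:05,000 --> 00:00:07,000".
TIMING = re.compile(
    r"(\d{2,}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d{2,}):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(hours, minutes, seconds, millis):
    """Convert timestamp fields (as strings) to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def fix_srt(text):
    """Return SRT text with cues sorted by start time and renumbered."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                start = _to_ms(*match.groups()[:4])
                cues.append((start, line.strip(), lines[i + 1:]))
                break  # blocks without a timing line are dropped
    if not cues:
        return ""
    # sort() is stable, so cues with equal start times keep their order.
    cues.sort(key=lambda cue: cue[0])
    out = []
    for number, (_, timing, body) in enumerate(cues, start=1):
        out.append("\n".join([str(number), timing, *body]))
    return "\n\n".join(out) + "\n"
```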