diff --git a/docs/guide/converting-media-files.md b/docs/guide/converting-media-files.md
index b2a43aa2..c5183fac 100644
--- a/docs/guide/converting-media-files.md
+++ b/docs/guide/converting-media-files.md
@@ -294,11 +294,46 @@ The function is called for each input audio sample after remixing and resampling
 
 This function can also be used to manually perform remixing or resampling. When doing so, you should signal the post-process parameters using the `processedNumberOfChannels` and `processedSampleRate` fields, which enables the encoder to better know what to expect.
 
+## Subtitle options
+
+You can set the `subtitle` property in the conversion options to configure the converter's behavior for subtitle tracks. The options are:
+```ts
+type ConversionSubtitleOptions = {
+	discard?: boolean;
+	codec?: SubtitleCodec;
+};
+```
+
+For example, here we convert all subtitle tracks to WebVTT format:
+```ts
+const conversion = await Conversion.init({
+	input,
+	output,
+	subtitle: {
+		codec: 'webvtt',
+	},
+});
+```
+
+::: info
+The provided configuration will apply equally to all subtitle tracks of the input. If you want to apply a separate configuration to each subtitle track, check [track-specific options](#track-specific-options).
+:::
+
+### Discarding subtitles
+
+If you want to get rid of subtitle tracks, use `discard: true`.
+
+### Converting subtitle format
+
+Use the `codec` property to control the format of the output subtitle tracks. This should be set to a [codec](./supported-formats-and-codecs#subtitle-codecs) supported by the output file, or else the track will be [discarded](#discarded-tracks).
+
+Subtitle tracks are processed as text rather than re-encoded like audio or video: cues are extracted and re-muxed, and are only re-serialized when the target format differs from the source or when the input is trimmed. The supported formats are WebVTT, SRT, ASS/SSA, TX3G, and TTML.
+
 ## Track-specific options
 
-You may want to configure your video and audio options differently depending on the specifics of the input track. Or, in case a media file has multiple video or audio tracks, you may want to discard only specific tracks or configure each track separately.
+You may want to configure your video, audio, and subtitle options differently depending on the specifics of the input track. Or, in case a media file has multiple tracks of the same type, you may want to discard only specific tracks or configure each track separately.
 
-For this, instead of passing an object for `video` and `audio`, you can instead pass a function:
+For this, instead of passing an object for `video`, `audio`, or `subtitle`, you can instead pass a function:
 
 ```ts
 const conversion = await Conversion.init({
@@ -329,10 +364,22 @@ const conversion = await Conversion.init({
 			codec: 'aac',
 		};
 	},
+
+	// Works for subtitles too:
+	subtitle: (subtitleTrack, n) => {
+		if (subtitleTrack.languageCode !== 'eng' && subtitleTrack.languageCode !== 'spa') {
+			// Keep only English and Spanish subtitles
+			return { discard: true };
+		}
+
+		return {
+			codec: 'webvtt',
+		};
+	},
 });
 ```
 
-For documentation about the properties of video and audio tracks, refer to [Reading track metadata](./reading-media-files#reading-track-metadata).
+For documentation about the properties of video, audio, and subtitle tracks, refer to [Reading track metadata](./reading-media-files#reading-track-metadata).
 
 ## Trimming
 
diff --git a/docs/guide/supported-formats-and-codecs.md b/docs/guide/supported-formats-and-codecs.md
index dbaf533d..a2a1509d 100644
--- a/docs/guide/supported-formats-and-codecs.md
+++ b/docs/guide/supported-formats-and-codecs.md
@@ -64,6 +64,11 @@ Mediabunny ships with built-in decoders and encoders for all audio PCM codecs, m
 ### Subtitle codecs
 
 - `'webvtt'` - WebVTT
+- `'tx3g'` - 3GPP Timed Text (MP4 subtitles)
+- `'ttml'` - Timed Text Markup Language
+- `'srt'` - SubRip
+- `'ass'` - Advanced SubStation Alpha
+- `'ssa'` - SubStation Alpha
 
 ## Compatibility table
 
@@ -97,11 +102,15 @@ Not all codecs can be used with all containers. The following table specifies th
 | `'pcm-f64be'` | ✓ | ✓ | | | | | | | | |
 | `'ulaw'` | | ✓ | | | | | ✓ | | | |
 | `'alaw'` | | ✓ | | | | | ✓ | | | |
-| `'webvtt'`[^3] | (✓) | | (✓) | (✓) | | | | | | |
+| `'webvtt'` | ✓ | ✓ | ✓ | ✓ | | | | | | |
+| `'tx3g'` | ✓ | ✓ | | | | | | | | |
+| `'ttml'` | ✓ | ✓ | | | | | | | | |
+| `'srt'` | | | ✓ | ✓ | | | | | | |
+| `'ass'` | | | ✓ | ✓ | | | | | | |
+| `'ssa'` | | | ✓ | ✓ | | | | | | |
 
 [^2]: WebM only supports a small subset of the codecs supported by Matroska. However, this library can technically read all codecs from a WebM that are supported by Matroska.
-[^3]: WebVTT can only be written, not read.
 
 ## Querying codec encodability
 
@@ -330,4 +339,4 @@ All instance methods of the class can return promises. In this case, the library
 
 ::: warning
 The samples passed to `onSample` **must** be sorted by increasing timestamp. This especially means if the decoder is decoding a video stream that makes use of [B-frames](./media-sources.md#b-frames), the decoder **must** internally hold on to these frames so it can emit them sorted by presentation timestamp. This strict sorting requirement is reset each time `flush` is called.
-:::
\ No newline at end of file
+:::
diff --git a/examples/subtitle-extraction/index.html b/examples/subtitle-extraction/index.html
new file mode 100644
index 00000000..54461738
--- /dev/null
+++ b/examples/subtitle-extraction/index.html
@@ -0,0 +1,94 @@
+ + + + + + + Subtitle extraction example | Mediabunny + + + + + + + +

Subtitle extraction example

+

Extract subtitle tracks from video files with embedded subtitles (MKV, MP4, MOV).

+ +
+
+ + + +
+ + + Download sample file + +
+ +

+ + + +
+ + + +

Mediabunny

+
+ + + +

View source code

+
+ + + +
diff --git a/examples/subtitle-extraction/subtitle-extraction.ts b/examples/subtitle-extraction/subtitle-extraction.ts
new file mode 100644
index 00000000..8cb9a0a1
--- /dev/null
+++ b/examples/subtitle-extraction/subtitle-extraction.ts
@@ -0,0 +1,184 @@
+import { Input, ALL_FORMATS, BlobSource, UrlSource } from 'mediabunny';
+
+// Sample file URL - users can replace with their own
+const SampleFileUrl = 'https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4';
+(document.querySelector('#sample-file-download') as HTMLAnchorElement).href = SampleFileUrl;
+
+const selectMediaButton = document.querySelector('#select-file') as HTMLButtonElement;
+const loadUrlButton = document.querySelector('#load-url') as HTMLButtonElement;
+const fileNameElement = document.querySelector('#file-name') as HTMLParagraphElement;
+const horizontalRule = document.querySelector('hr') as HTMLHRElement;
+const contentContainer = document.querySelector('#content-container') as HTMLDivElement;
+
+const extractSubtitles = async (resource: File | string) => {
+	fileNameElement.textContent = resource instanceof File ? resource.name : resource;
+	horizontalRule.style.display = '';
+	contentContainer.innerHTML = 'Loading...';
+
+	try {
+		const source = resource instanceof File
+			? new BlobSource(resource)
+			: new UrlSource(resource);
+
+		const input = new Input({
+			source,
+			formats: ALL_FORMATS,
+		});
+
+		const subtitleTracks = await input.subtitleTracks;
+
+		if (!subtitleTracks || subtitleTracks.length === 0) {
+			contentContainer.innerHTML = 'No subtitle tracks found in this file.';
+			input.dispose();
+			return;
+		}
+
+		// Extract all subtitle data before disposing input
+		const subtitleData = await Promise.all(subtitleTracks.map(async (track) => {
+			const cues = [];
+			let cueCount = 0;
+			for await (const cue of track.getCues()) {
+				cues.push(cue);
+				cueCount++;
+				if (cueCount >= 5) break;
+			}
+
+			// Get full text for download
+			const fullText = await track.exportToText();
+
+			return {
+				id: track.id,
+				name: track.name,
+				codec: track.codec,
+				languageCode: track.languageCode,
+				previewCues: cues,
+				fullText,
+			};
+		}));
+
+		// Now dispose the input
+		input.dispose();
+
+		// Render subtitle tracks
+		contentContainer.innerHTML = '';
+
+		for (const trackData of subtitleData) {
+			const trackDiv = document.createElement('div');
+			trackDiv.className = 'subtitle-track';
+
+			// Header
+			const headerDiv = document.createElement('div');
+			headerDiv.className = 'subtitle-track-header';
+
+			const titleSpan = document.createElement('span');
+			titleSpan.className = 'subtitle-track-title';
+			titleSpan.textContent = trackData.name || `Track ${trackData.id}`;
+
+			const metaSpan = document.createElement('span');
+			metaSpan.className = 'subtitle-track-meta';
+			metaSpan.textContent = `${trackData.codec?.toUpperCase()} • ${trackData.languageCode}`;
+
+			headerDiv.appendChild(titleSpan);
+			headerDiv.appendChild(metaSpan);
+			trackDiv.appendChild(headerDiv);
+
+			// Cue preview
+			const previewDiv = document.createElement('div');
+			previewDiv.className = 'cue-preview';
+
+			if (trackData.previewCues.length > 0) {
+				for (const cue of trackData.previewCues) {
+					const cueDiv = document.createElement('div');
+					cueDiv.className = 'cue-item';
+
+					const timeSpan = document.createElement('span');
+					timeSpan.className = 'cue-time';
+					timeSpan.textContent = formatTime(cue.timestamp);
+
+					const textSpan = document.createElement('span');
+					textSpan.textContent = cue.text.substring(0, 100) + (cue.text.length > 100 ? '...' : '');
+
+					cueDiv.appendChild(timeSpan);
+					cueDiv.appendChild(textSpan);
+					previewDiv.appendChild(cueDiv);
+				}
+
+				const countNote = document.createElement('p');
+				countNote.className = 'text-xs opacity-50 mt-2';
+				countNote.textContent = `Showing first ${trackData.previewCues.length} cues`;
+				previewDiv.appendChild(countNote);
+			} else {
+				previewDiv.innerHTML = 'No cues found';
+			}
+
+			trackDiv.appendChild(previewDiv);
+
+			// Download button
+			const downloadBtn = document.createElement('button');
+			downloadBtn.className = 'download-btn';
+			downloadBtn.textContent = `Download as ${trackData.codec?.toUpperCase()}`;
+			downloadBtn.onclick = () => {
+				try {
+					const blob = new Blob([trackData.fullText], { type: 'text/plain' });
+					const url = URL.createObjectURL(blob);
+					const a = document.createElement('a');
+					a.href = url;
+					a.download = `subtitles_track${trackData.id}.${trackData.codec}`;
+					a.click();
+					URL.revokeObjectURL(url);
+				} catch (err) {
+					alert(`Error: ${err}`);
+				}
+			};
+			trackDiv.appendChild(downloadBtn);
+
+			contentContainer.appendChild(trackDiv);
+		}
+	} catch (err) {
+		console.error(err);
+		contentContainer.innerHTML = `Error: ${err}`;
+	}
+};
+
+const formatTime = (seconds: number): string => {
+	const h = Math.floor(seconds / 3600);
+	const m = Math.floor((seconds % 3600) / 60);
+	const s = Math.floor(seconds % 60);
+	const ms = Math.floor((seconds % 1) * 1000);
+	return `${h.toString().padStart(2, '0')}:${m.toString().padStart(2, '0')}:${s.toString().padStart(2, '0')}.${ms.toString().padStart(3, '0')}`;
+};
+
+selectMediaButton.addEventListener('click', () => {
+	const fileInput = document.createElement('input');
+	fileInput.type = 'file';
+	fileInput.accept = 'video/*,video/x-matroska,video/x-msvideo';
+	fileInput.addEventListener('change', () => {
+		const file = fileInput.files?.[0];
+		if (!file) return;
+		void extractSubtitles(file);
+	});
+	fileInput.click();
+});
+
+loadUrlButton.addEventListener('click', () => {
+	const url = prompt(
+		'Enter URL of a media file with subtitles. Must support CORS.',
+		'https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4',
+	);
+	if (!url) return;
+	void extractSubtitles(url);
+});
+
+document.addEventListener('dragover', (event) => {
+	event.preventDefault();
+	event.dataTransfer!.dropEffect = 'copy';
+});
+
+document.addEventListener('drop', (event) => {
+	event.preventDefault();
+	const files = event.dataTransfer?.files;
+	const file = files && files.length > 0 ? files[0] : undefined;
+	if (file) {
+		void extractSubtitles(file);
+	}
+});
diff --git a/examples/subtitle-muxing/index.html b/examples/subtitle-muxing/index.html
new file mode 100644
index 00000000..928f3bb1
--- /dev/null
+++ b/examples/subtitle-muxing/index.html
@@ -0,0 +1,71 @@
+ + + + + + + Subtitle muxing example | Mediabunny + + + + + + + +

Subtitle muxing example

+

Add subtitle tracks to video files. Automatically chooses the best output format to preserve the original container when possible.

+ +
+
+ + +

+
+ +
+ +

Format detected automatically from file extension (.srt, .ass, .ssa, .vtt)

+ +

+
+ + + + + + + +

+
+ + + +

Mediabunny

+
+ + + +

View source code

+
+ +
diff --git a/examples/subtitle-muxing/subtitle-muxing.ts b/examples/subtitle-muxing/subtitle-muxing.ts
new file mode 100644
index 00000000..3963a75a
--- /dev/null
+++ b/examples/subtitle-muxing/subtitle-muxing.ts
@@ -0,0 +1,203 @@
+import {
+	Input,
+	Output,
+	ALL_FORMATS,
+	BlobSource,
+	BufferTarget,
+	MkvOutputFormat,
+	Mp4OutputFormat,
+	MovOutputFormat,
+	TextSubtitleSource,
+	Conversion,
+	type SubtitleCodec,
+} from 'mediabunny';
+
+const selectVideoBtn = document.querySelector('#select-video') as HTMLButtonElement;
+const selectSubtitleBtn = document.querySelector('#select-subtitle') as HTMLButtonElement;
+const videoNameEl = document.querySelector('#video-name') as HTMLParagraphElement;
+const subtitleNameEl = document.querySelector('#subtitle-name') as HTMLParagraphElement;
+const processBtn = document.querySelector('#process-btn') as HTMLButtonElement;
+const progressBar = document.querySelector('#progress-bar') as HTMLDivElement;
+const progressFill = document.querySelector('#progress-fill') as HTMLDivElement;
+const downloadSection = document.querySelector('#download-section') as HTMLDivElement;
+const downloadBtn = document.querySelector('#download-btn') as HTMLButtonElement;
+const errorElement = document.querySelector('#error-element') as HTMLParagraphElement;
+
+let videoFile: File | null = null;
+let subtitleFile: File | null = null;
+let outputBlob: Blob | null = null;
+let outputExtension = 'mkv';
+
+const detectSubtitleCodec = (filename: string): SubtitleCodec => {
+	const ext = filename.toLowerCase().split('.').pop();
+	if (ext === 'srt') return 'srt';
+	if (ext === 'ass') return 'ass';
+	if (ext === 'ssa') return 'ssa';
+	if (ext === 'vtt') return 'webvtt';
+	return 'srt';
+};
+
+const determineBestOutputFormat = (videoExt: string, subtitleCodec: SubtitleCodec) => {
+	const ext = videoExt.toLowerCase();
+
+	if (ext === 'mkv' || ext === 'webm') {
+		return { format: new MkvOutputFormat(), extension: 'mkv' };
+	}
+
+	if (ext === 'mp4') {
+		if (subtitleCodec === 'webvtt') {
+			return { format: new Mp4OutputFormat(), extension: 'mp4' };
+		} else {
+			return { format: new MkvOutputFormat(), extension: 'mkv' };
+		}
+	}
+
+	if (ext === 'mov') {
+		if (subtitleCodec === 'webvtt') {
+			return { format: new MovOutputFormat(), extension: 'mov' };
+		} else {
+			return { format: new MkvOutputFormat(), extension: 'mkv' };
+		}
+	}
+
+	return { format: new MkvOutputFormat(), extension: 'mkv' };
+};
+
+selectVideoBtn.onclick = async () => {
+	const [fileHandle] = await (window as any).showOpenFilePicker({
+		types: [{
+			description: 'Video Files',
+			accept: {
+				'video/*': ['.mp4', '.mkv', '.mov', '.webm'],
+			},
+		}],
+	});
+	videoFile = await fileHandle.getFile();
+	videoNameEl.textContent = `Selected: ${videoFile!.name}`;
+	updateProcessButton();
+};
+
+selectSubtitleBtn.onclick = async () => {
+	const [fileHandle] = await (window as any).showOpenFilePicker({
+		types: [{
+			description: 'Subtitle Files',
+			accept: {
+				'text/*': ['.srt', '.ass', '.ssa', '.vtt'],
+			},
+		}],
+	});
+	subtitleFile = await fileHandle.getFile();
+	subtitleNameEl.textContent = `Selected: ${subtitleFile!.name}`;
+	updateProcessButton();
+};
+
+const updateProcessButton = () => {
+	processBtn.disabled = !(videoFile && subtitleFile);
+};
+
+processBtn.onclick = async () => {
+	if (!videoFile || !subtitleFile) return;
+
+	errorElement.textContent = '';
+	downloadSection.style.display = 'none';
+	progressBar.style.display = 'block';
+	progressFill.style.width = '0%';
+	progressFill.textContent = '0%';
+	processBtn.disabled = true;
+
+	try {
+		const subtitleText = await subtitleFile.text();
+		const subtitleCodec = detectSubtitleCodec(subtitleFile.name);
+
+		const input = new Input({
+			source: new BlobSource(videoFile),
+			formats: ALL_FORMATS,
+		});
+
+		progressFill.style.width = '10%';
+		progressFill.textContent = '10%';
+
+		// Detect video format from filename
+		const videoExt = videoFile.name.toLowerCase().split('.').pop() || 'mkv';
+		const { format: outputFormat, extension } = determineBestOutputFormat(videoExt, subtitleCodec);
+		outputExtension = extension;
+
+		const output = new Output({
+			format: outputFormat,
+			target: new BufferTarget(),
+		});
+
+		progressFill.style.width = '20%';
+		progressFill.textContent = '20%';
+
+		// Initialize conversion (it will copy video/audio tracks)
+		const conversion = await Conversion.init({
+			input,
+			output,
+		});
+
+		progressFill.style.width = '30%';
+		progressFill.textContent = '30%';
+
+		// Create subtitle source
+		const subtitleSource = new TextSubtitleSource(subtitleCodec);
+
+		// Add subtitle track with content provider that will be called after output starts
+		conversion.addExternalSubtitleTrack(
+			subtitleSource,
+			{
+				languageCode: 'eng',
+				name: 'English',
+			},
+			async () => {
+				// This will be called after output.start() connects the tracks
+				await subtitleSource.add(subtitleText);
+				await subtitleSource.close();
+			},
+		);
+
+		progressFill.style.width = '40%';
+		progressFill.textContent = '40%';
+
+		// Set up progress callback
+		conversion.onProgress = (progress) => {
+			const percentage = 50 + (progress * 40);
+			progressFill.style.width = `${percentage}%`;
+			progressFill.textContent = `${Math.round(percentage)}%`;
+		};
+
+		// Execute conversion (this will start the output, connect tracks, and run content providers)
+		await conversion.execute();
+
+		progressFill.style.width = '100%';
+		progressFill.textContent = '100%';
+
+		input.dispose();
+
+		const buffer = (output.target as BufferTarget).buffer;
+		if (!buffer) throw new Error('Output buffer is null');
+		outputBlob = new Blob([buffer]);
+
+		setTimeout(() => {
+			progressBar.style.display = 'none';
+			downloadSection.style.display = 'block';
+			processBtn.disabled = false;
+		}, 500);
+	} catch (err) {
+		console.error(err);
+		errorElement.textContent = `Error: ${err}`;
+		progressBar.style.display = 'none';
+		processBtn.disabled = false;
+	}
+};
+
+downloadBtn.onclick = () => {
+	if (!outputBlob) return;
+
+	const url = URL.createObjectURL(outputBlob);
+	const a = document.createElement('a');
+	a.href = url;
+	a.download = `video_with_subtitles.${outputExtension}`;
+	a.click();
+	URL.revokeObjectURL(url);
+};
diff --git a/scripts/generate-subtitle-test-files.sh b/scripts/generate-subtitle-test-files.sh
new file mode 100755
index 00000000..70b3205d
--- /dev/null
+++ b/scripts/generate-subtitle-test-files.sh
@@ -0,0 +1,135 @@
+#!/bin/bash
+
+# Generate subtitle test files for Mediabunny
+set -e
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+OUTPUT_DIR="$PROJECT_ROOT/test/public/subtitles"
+
+mkdir -p "$OUTPUT_DIR"
+cd "$OUTPUT_DIR"
+
+echo "Generating subtitle test files..."
+
+# Create test SRT subtitle file
+cat > test.srt << 'EOF'
+1
+00:00:01,000 --> 00:00:03,500
+Hello world!
+
+2
+00:00:05,000 --> 00:00:07,000
+This is a test.
+
+3
+00:00:08,500 --> 00:00:10,000
+Goodbye!
+EOF
+
+# Create test ASS subtitle file
+cat > test.ass << 'EOF'
+[Script Info]
+Title: Test Subtitles
+ScriptType: v4.00+
+PlayResX: 1280
+PlayResY: 720
+
+[V4+ Styles]
+Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
+Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1
+
+[Events]
+Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
+Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello world!
+Dialogue: 0,0:00:05.00,0:00:07.00,Default,,0,0,0,,This is a test.
+Dialogue: 0,0:00:08.50,0:00:10.00,Default,,0,0,0,,Goodbye!
+EOF
+
+# Create test SSA subtitle file (older format)
+cat > test.ssa << 'EOF'
+[Script Info]
+Title: Test Subtitles
+ScriptType: v4.00
+
+[V4 Styles]
+Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, TertiaryColour, BackColour, Bold, Italic, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding
+Style: Default,Arial,20,16777215,65535,65535,0,0,0,1,2,0,2,10,10,10,0,1
+
+[Events]
+Format: Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
+Dialogue: Marked=0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello world!
+Dialogue: Marked=0,0:00:05.00,0:00:07.00,Default,,0,0,0,,This is a test.
+Dialogue: Marked=0,0:00:08.50,0:00:10.00,Default,,0,0,0,,Goodbye!
+EOF
+
+# Create test WebVTT subtitle file
+cat > test.vtt << 'EOF'
+WEBVTT
+
+00:00:01.000 --> 00:00:03.500
+Hello world!
+
+00:00:05.000 --> 00:00:07.000
+This is a test.
+
+00:00:08.500 --> 00:00:10.000
+Goodbye!
+EOF
+
+echo "Creating test video (10 seconds, black screen)..."
+
+# Generate test video (black screen, 10 seconds, with audio)
+ffmpeg -y -f lavfi -i color=black:s=1280x720:d=10 -f lavfi -i anullsrc=r=48000:cl=stereo:d=10 \
+    -c:v libx264 -preset ultrafast -c:a aac -shortest test-video.mp4 \
+    -loglevel warning
+
+echo "Creating MKV with SRT subtitle..."
+ffmpeg -y -i test-video.mp4 -i test.srt \
+    -c:v copy -c:a copy -c:s srt \
+    -metadata:s:s:0 language=eng -metadata:s:s:0 title="English" \
+    test-mkv-srt.mkv -loglevel warning
+
+echo "Creating MKV with ASS subtitle..."
+ffmpeg -y -i test-video.mp4 -i test.ass \
+    -c:v copy -c:a copy -c:s ass \
+    -metadata:s:s:0 language=eng -metadata:s:s:0 title="English" \
+    test-mkv-ass.mkv -loglevel warning
+
+echo "Creating MKV with SSA subtitle..."
+ffmpeg -y -i test-video.mp4 -i test.ssa \
+    -c:v copy -c:a copy -c:s ssa \
+    -metadata:s:s:0 language=eng -metadata:s:s:0 title="English" \
+    test-mkv-ssa.mkv -loglevel warning
+
+echo "Creating MKV with WebVTT subtitle..."
+ffmpeg -y -i test-video.mp4 -i test.vtt \
+    -c:v copy -c:a copy -c:s webvtt \
+    -metadata:s:s:0 language=eng -metadata:s:s:0 title="English" \
+    test-mkv-vtt.mkv -loglevel warning
+
+echo "Creating MKV with multiple subtitle tracks..."
+ffmpeg -y -i test-video.mp4 -i test.srt -i test.ass \
+    -map 0:v -map 0:a -map 1:s -map 2:s \
+    -c:v copy -c:a copy -c:s srt -c:s:1 ass \
+    -metadata:s:s:0 language=eng -metadata:s:s:0 title="English SRT" \
+    -metadata:s:s:1 language=spa -metadata:s:s:1 title="Spanish ASS" \
+    test-mkv-multi.mkv -loglevel warning
+
+echo "Creating MP4 with mov_text (tx3g) subtitle from WebVTT source..."
+ffmpeg -y -i test-video.mp4 -i test.vtt \
+    -c:v copy -c:a copy -c:s mov_text \
+    -metadata:s:s:0 language=eng -metadata:s:s:0 title="English" \
+    test-mp4-vtt.mp4 -loglevel warning
+
+echo "Creating MKV with ASS subtitle with fonts/graphics sections..."
+ffmpeg -y -i test-video.mp4 -i test-with-fonts.ass \
+    -c:v copy -c:a copy -c:s ass \
+    -metadata:s:s:0 language=eng -metadata:s:s:0 title="With Fonts" \
+    test-mkv-ass-fonts.mkv -loglevel warning
+
+echo ""
+echo "✓ Test files generated successfully in $OUTPUT_DIR"
+echo ""
+echo "Files created:"
+ls -lh test-*.mkv test-*.mp4 test*.srt test*.ass test*.ssa test*.vtt 2>/dev/null | awk '{print " " $9 " (" $5 ")"}'
diff --git a/src/codec.ts b/src/codec.ts
index b7086f6f..bd53ced9 100644
--- a/src/codec.ts
+++ b/src/codec.ts
@@ -89,7 +89,12 @@ export const AUDIO_CODECS = [
  */
 export const SUBTITLE_CODECS = [
 	'webvtt',
-] as const; // TODO add the rest
+	'tx3g',
+	'ttml',
+	'srt',
+	'ass',
+	'ssa',
+] as const;
 
 /**
  * Union type of known video codecs.
@@ -767,6 +772,12 @@ export const inferCodecFromCodecString = (codecString: string): MediaCodec | nul
 	// Subtitle codecs
 	if (codecString === 'webvtt') {
 		return 'webvtt';
+	} else if (codecString === 'srt') {
+		return 'srt';
+	} else if (codecString === 'ass') {
+		return 'ass';
+	} else if (codecString === 'ssa') {
+		return 'ssa';
 	}
 
 	return null;
diff --git a/src/conversion.ts b/src/conversion.ts
index 0a4ac13a..24239289 100644
--- a/src/conversion.ts
+++ b/src/conversion.ts
@@ -10,6 +10,8 @@ import {
 	AUDIO_CODECS,
 	AudioCodec,
 	NON_PCM_AUDIO_CODECS,
+	SUBTITLE_CODECS,
+	SubtitleCodec,
 	VIDEO_CODECS,
 	VideoCodec,
 } from './codec';
@@ -21,7 +23,7 @@ import {
 	VideoEncodingConfig,
 } from './encode';
 import { Input } from './input';
-import { InputAudioTrack, InputTrack, InputVideoTrack } from './input-track';
+import { InputAudioTrack, InputSubtitleTrack, InputTrack, InputVideoTrack } from './input-track';
 import {
 	AudioSampleSink,
 	CanvasSink,
@@ -32,6 +34,8 @@ import {
 	AudioSource,
 	EncodedVideoPacketSource,
 	EncodedAudioPacketSource,
+	SubtitleSource,
+	TextSubtitleSource,
 	VideoSource,
 	VideoSampleSource,
 	AudioSampleSource,
@@ -45,10 +49,11 @@ import {
 	promiseWithResolvers,
 	Rotation,
 } from './misc';
-import { Output, TrackType } from './output';
+import { Output, SubtitleTrackMetadata, TrackType } from './output';
 import { Mp4OutputFormat } from './output-format';
 import { AudioSample, clampCropRectangle, validateCropRectangle, VideoSample } from './sample';
 import { MetadataTags, validateMetadataTags } from './metadata';
+import { formatCuesToAss, formatCuesToSrt, formatCuesToWebVTT, SubtitleCue } from './subtitles';
 import { NullTarget } from './target';
 
 /**
@@ -82,6 +87,15 @@ export type ConversionOptions = {
 	audio?: ConversionAudioOptions
 		| ((track: InputAudioTrack, n: number) => MaybePromise<ConversionAudioOptions | undefined>);
 
+	/**
+	 * Subtitle-specific options. When passing an object, the same options are applied to all subtitle tracks. When passing a
+	 * function, it will be invoked for each subtitle track and is expected to return or resolve to the options
+	 * for that specific track. The function is passed an instance of {@link InputSubtitleTrack} as well as a number `n`,
+	 * which is the 1-based index of the track in the list of all subtitle tracks.
+	 */
+	subtitle?: ConversionSubtitleOptions
+		| ((track: InputSubtitleTrack, n: number) => MaybePromise<ConversionSubtitleOptions | undefined>);
+
 	/** Options to trim the input file. */
 	trim?: {
 		/**
@@ -267,6 +281,18 @@ export type ConversionAudioOptions = {
 	processedSampleRate?: number;
 };
 
+/**
+ * Subtitle-specific options.
+ * @group Conversion
+ * @public
+ */
+export type ConversionSubtitleOptions = {
+	/** If `true`, all subtitle tracks will be discarded and will not be present in the output. */
+	discard?: boolean;
+	/** The desired output subtitle codec. */
+	codec?: SubtitleCodec;
+};
+
 const validateVideoOptions = (videoOptions: ConversionVideoOptions | undefined) => {
 	if (videoOptions !== undefined && (!videoOptions || typeof videoOptions !== 'object')) {
 		throw new TypeError('options.video, when provided, must be an object.');
@@ -415,6 +441,20 @@ const validateAudioOptions = (audioOptions: ConversionAudioOptions | undefined)
 	}
 };
 
+const validateSubtitleOptions = (subtitleOptions: ConversionSubtitleOptions | undefined) => {
+	if (subtitleOptions !== undefined && (!subtitleOptions || typeof subtitleOptions !== 'object')) {
+		throw new TypeError('options.subtitle, when provided, must be an object.');
+	}
+	if (subtitleOptions?.discard !== undefined && typeof subtitleOptions.discard !== 'boolean') {
+		throw new TypeError('options.subtitle.discard, when provided, must be a boolean.');
+	}
+	if (subtitleOptions?.codec !== undefined && !SUBTITLE_CODECS.includes(subtitleOptions.codec)) {
+		throw new TypeError(
+			`options.subtitle.codec, when provided, must be one of: ${SUBTITLE_CODECS.join(', ')}.`,
+		);
+	}
+};
+
 const FALLBACK_NUMBER_OF_CHANNELS = 2;
 const FALLBACK_SAMPLE_RATE = 48000;
 
@@ -499,6 +539,13 @@ export class Conversion {
 	/** @internal */
 	_canceled = false;
 
+	/** @internal */
+	_externalSubtitleSources: Array<{
+		source: SubtitleSource;
+		metadata: SubtitleTrackMetadata;
+		contentProvider?: () => Promise<void>;
+	}> = [];
+
 	/**
 	 * A callback that is fired whenever the conversion progresses. Returns a number between 0 and 1, indicating the
 	 * completion of the conversion. Note that a progress of 1 doesn't necessarily mean the conversion is complete;
@@ -551,14 +598,14 @@ export class Conversion {
 
 		if (typeof options.video !== 'function') {
 			validateVideoOptions(options.video);
-		} else {
-			// We'll validate the return value later
 		}
 
 		if (typeof options.audio !== 'function') {
 			validateAudioOptions(options.audio);
-		} else {
-			// We'll validate the return value later
+		}
+
+		if (typeof options.subtitle !== 'function') {
+			validateSubtitleOptions(options.subtitle);
 		}
 
 		if (options.trim !== undefined && (!options.trim || typeof options.trim !== 'object')) {
@@ -614,9 +661,10 @@ export class Conversion {
 
 		let nVideo = 1;
 		let nAudio = 1;
+		let nSubtitle = 1;
 
 		for (const track of inputTracks) {
-			let trackOptions: ConversionVideoOptions | ConversionAudioOptions | undefined = undefined;
+			let trackOptions: ConversionVideoOptions | ConversionAudioOptions | ConversionSubtitleOptions | undefined = undefined;
 			if (track.isVideoTrack()) {
 				if (this._options.video) {
 					if (typeof this._options.video === 'function') {
@@ -637,6 +685,16 @@ export class Conversion {
 						trackOptions = this._options.audio;
 					}
 				}
+			} else if (track.isSubtitleTrack()) {
+				if (this._options.subtitle) {
+					if (typeof this._options.subtitle === 'function') {
+						trackOptions = await this._options.subtitle(track, nSubtitle);
+						validateSubtitleOptions(trackOptions);
+						nSubtitle++;
+					} else {
+						trackOptions = this._options.subtitle;
+					}
+				}
 			} else {
 				assert(false);
 			}
@@ -669,6 +727,8 @@ export class Conversion {
 				await this._processVideoTrack(track, (trackOptions ?? {}) as ConversionVideoOptions);
 			} else if (track.isAudioTrack()) {
 				await this._processAudioTrack(track, (trackOptions ?? {}) as ConversionAudioOptions);
+			} else if (track.isSubtitleTrack()) {
+				await this._processSubtitleTrack(track, (trackOptions ?? {}) as ConversionSubtitleOptions);
 			}
 		}
 
@@ -788,6 +848,52 @@ export class Conversion {
 		return elements;
 	}
 
+	/**
+	 * Adds an external subtitle track to the output. This can be called after `init()` but before `execute()`.
+	 * This is useful for adding subtitle tracks from separate files that are not part of the input video.
+	 *
+	 * @param source - The subtitle source to add
+	 * @param metadata - Optional metadata for the subtitle track
+	 * @param contentProvider - Optional async function that will be called after the output starts to add content to the subtitle source
+	 */
+	addExternalSubtitleTrack(
+		source: SubtitleSource,
+		metadata: SubtitleTrackMetadata = {},
+		contentProvider?: () => Promise<void>,
+	) {
+		if (this._executed) {
+			throw new Error('Cannot add subtitle tracks after conversion has been executed.');
+		}
+		if (this.output.state !== 'pending') {
+			throw new Error('Cannot add subtitle tracks after output has been started.');
+		}
+
+		// Check track count limits
+		const outputTrackCounts = this.output.format.getSupportedTrackCounts();
+		const currentSubtitleCount = this._addedCounts.subtitle + this._externalSubtitleSources.length;
+
+		if (currentSubtitleCount >= outputTrackCounts.subtitle.max) {
+			throw new Error(
+				`Cannot add more subtitle tracks. Maximum of ${outputTrackCounts.subtitle.max} subtitle track(s) allowed.`,
+			);
+		}
+
+		const totalTrackCount = this._totalTrackCount + this._externalSubtitleSources.length + 1;
+		if (totalTrackCount > outputTrackCounts.total.max) {
+			throw new Error(
+				`Cannot add more tracks. Maximum of ${outputTrackCounts.total.max} total track(s) allowed.`,
+			);
+		}
+
+		this._externalSubtitleSources.push({ source, metadata, contentProvider });
+
+		// Update validity check to include external subtitles
+		this.isValid = this._totalTrackCount + this._externalSubtitleSources.length >= outputTrackCounts.total.min
+			&& this._addedCounts.video >= outputTrackCounts.video.min
+			&& this._addedCounts.audio >= outputTrackCounts.audio.min
+			&& this._addedCounts.subtitle + this._externalSubtitleSources.length >= outputTrackCounts.subtitle.min;
+	}
+
 	/**
 	 * Executes the conversion process. Resolves once conversion is complete.
 	 *
@@ -825,9 +931,23 @@ export class Conversion {
 			this.onProgress?.(0);
 		}
 
+		// Add external subtitle tracks before starting the output
+		for (const { source, metadata } of this._externalSubtitleSources) {
+			this.output.addSubtitleTrack(source, metadata);
+		}
+
 		await this.output.start();
 		this._start();
 
+		// Now that output has started and tracks are connected, run content providers
+		const contentProviderPromises = this._externalSubtitleSources
+			.filter(s => s.contentProvider)
+			.map(s => s.contentProvider!());
+
+		if (contentProviderPromises.length > 0) {
+			this._trackPromises.push(...contentProviderPromises);
+		}
+
 		try {
 			await Promise.all(this._trackPromises);
 		} catch (error) {
@@ -1550,6 +1670,119 @@ export class Conversion {
 		}
 	}
 
+	async _processSubtitleTrack(track: InputSubtitleTrack, trackOptions: ConversionSubtitleOptions) {
+		const sourceCodec = track.codec;
+		if (!sourceCodec) {
+			this.discardedTracks.push({
+				track,
+				reason: 'unknown_source_codec',
+			});
+			return;
+		}
+
+		// Determine target codec
+		let targetCodec = trackOptions.codec ?? sourceCodec;
+		const supportedCodecs = this.output.format.getSupportedSubtitleCodecs();
+
+		// Check if target codec is supported by output format
+		if (!supportedCodecs.includes(targetCodec)) {
+			// Try to use source codec if no specific codec was requested
+			if (!trackOptions.codec && supportedCodecs.includes(sourceCodec)) {
+				targetCodec = sourceCodec;
+			} else {
+				// If a specific codec was requested but not supported, or source codec not supported, discard
+				this.discardedTracks.push({
+					track,
+					reason: 'no_encodable_target_codec',
+				});
+				return;
+			}
+		}
+
+		// Create subtitle source
+		const subtitleSource = new TextSubtitleSource(targetCodec);
+
+		// Add track promise to extract and add subtitle cues
+		this._trackPromises.push((async () => {
+			await this._started;
+
+			let subtitleText: string;
+
+			// If no trim or codec conversion needed, use the efficient export method
+			if (this._startTimestamp === 0 && !Number.isFinite(this._endTimestamp) && targetCodec === sourceCodec) {
+				subtitleText = await track.exportToText();
+			} else {
+				// Extract and adjust cues for trim/conversion
+				const cues: SubtitleCue[] = [];
+				for await (const cue of track.getCues()) {
+					const cueEndTime = cue.timestamp + cue.duration;
+
+					// Apply trim if needed
+					if (this._startTimestamp > 0 || Number.isFinite(this._endTimestamp)) {
+						// Skip cues completely outside trim range
+						if (cueEndTime <= this._startTimestamp || cue.timestamp >= this._endTimestamp) {
+							continue;
+						}
+
+						// Adjust cue timing
+						const adjustedTimestamp = Math.max(cue.timestamp - this._startTimestamp, 0);
+						const adjustedEndTime = Math.min(cueEndTime - this._startTimestamp, this._endTimestamp - this._startTimestamp);
+
+						cues.push({
+							...cue,
+							timestamp: adjustedTimestamp,
+							duration: adjustedEndTime - adjustedTimestamp,
+						});
+					} else {
+						cues.push(cue);
+					}
+
+					if (this._canceled) {
+						return;
+					}
+				}
+
+				// Convert to target format
+				if (targetCodec === 'srt') {
+					subtitleText = formatCuesToSrt(cues);
+				} else if (targetCodec === 'webvtt') {
+					subtitleText = formatCuesToWebVTT(cues);
+				} else if (targetCodec === 'ass' || targetCodec === 'ssa') {
+					// When converting to ASS/SSA, try to preserve the header from source if it's also ASS/SSA
+					let header = '';
+					if (sourceCodec === 'ass' || sourceCodec === 'ssa') {
+						// Get the full text to extract header
+						const fullText = await track.exportToText();
+						const eventsIndex = fullText.indexOf('[Events]');
+						if (eventsIndex !== -1) {
+							// Extract everything before [Events] + Format line
+							const formatMatch = fullText.substring(eventsIndex).match(/Format:[^\n]+\n/);
+							if (formatMatch) {
+								header = fullText.substring(0, eventsIndex + formatMatch.index! + formatMatch[0].length);
+							}
+						}
+					}
+					subtitleText = formatCuesToAss(cues, header);
+				} else {
+					// For other formats (tx3g, ttml), export from track
+					subtitleText = await track.exportToText(targetCodec);
+				}
+			}
+
+			await subtitleSource.add(subtitleText);
+			subtitleSource.close();
+		})());
+
+		this.output.addSubtitleTrack(subtitleSource, {
+			languageCode: isIso639Dash2LanguageCode(track.languageCode) ? track.languageCode : undefined,
+			name: track.name ?? 
undefined, + }); + this._addedCounts.subtitle++; + this._totalTrackCount++; + + this.utilizedTracks.push(track); + } + /** @internal */ _resampleAudio( track: InputAudioTrack, diff --git a/src/index.ts b/src/index.ts index 1ee6ded7..a8e25409 100644 --- a/src/index.ts +++ b/src/index.ts @@ -170,6 +170,7 @@ export { InputTrack, InputVideoTrack, InputAudioTrack, + InputSubtitleTrack, PacketStats, } from './input-track'; export { @@ -206,6 +207,7 @@ export { ConversionVideoOptions, ConversionAudioOptions, ConversionCanceledError, + ConversionSubtitleOptions, DiscardedTrack, } from './conversion'; export { @@ -223,5 +225,17 @@ export { AttachedFile, TrackDisposition, } from './metadata'; +export type { SubtitleMetadata, SubtitleCue, SubtitleConfig } from './subtitles'; +export { + parseSrtTimestamp, + formatSrtTimestamp, + splitSrtIntoCues, + formatCuesToSrt, + formatCuesToWebVTT, + parseAssTimestamp, + formatAssTimestamp, + splitAssIntoCues, + formatCuesToAss, +} from './subtitles'; // 🐡🦔 diff --git a/src/input-track.ts b/src/input-track.ts index 607e56a2..31bebfd0 100644 --- a/src/input-track.ts +++ b/src/input-track.ts @@ -6,7 +6,7 @@ * file, You can obtain one at https://mozilla.org/MPL/2.0/. */ -import { AudioCodec, MediaCodec, VideoCodec } from './codec'; +import { AudioCodec, MediaCodec, SubtitleCodec, VideoCodec } from './codec'; import { determineVideoPacketType } from './codec-data'; import { customAudioDecoders, customVideoDecoders } from './custom-coder'; import { Input } from './input'; @@ -15,6 +15,7 @@ import { assert, Rotation } from './misc'; import { TrackType } from './output'; import { EncodedPacket, PacketType } from './packet'; import { TrackDisposition } from './metadata'; +import { SubtitleCue } from './subtitles'; /** * Contains aggregate statistics about the encoded packets of a track. @@ -90,6 +91,11 @@ export abstract class InputTrack { return this instanceof InputAudioTrack; } + /** Returns true if and only if this track is a subtitle track. 
*/ + isSubtitleTrack(): this is InputSubtitleTrack { + return this instanceof InputSubtitleTrack; + } + /** The unique ID of this track in the input file. */ get id() { return this._backing.getId(); @@ -440,3 +446,95 @@ export class InputAudioTrack extends InputTrack { return 'key'; // No audio codec with delta packets } } + +export interface InputSubtitleTrackBacking extends InputTrackBacking { + getCodec(): SubtitleCodec | null; + getCodecPrivate(): string | null; + getCues(): AsyncGenerator<SubtitleCue>; +} + +/** + * Represents a subtitle track in an input file. + * @group Input files & tracks + * @public + */ +export class InputSubtitleTrack extends InputTrack { + /** @internal */ + override _backing: InputSubtitleTrackBacking; + + /** @internal */ + constructor(input: Input, backing: InputSubtitleTrackBacking) { + super(input, backing); + + this._backing = backing; + } + + get type(): TrackType { + return 'subtitle'; + } + + get codec(): SubtitleCodec | null { + return this._backing.getCodec(); + } + + /** + * Returns an async iterator that yields all subtitle cues in this track. + */ + getCues(): AsyncGenerator<SubtitleCue> { + return this._backing.getCues(); + } + + /** + * Exports all subtitle cues to text format. If targetFormat is specified, + * attempts to convert to that format (limited conversion support). 
+ */ + async exportToText(targetFormat?: SubtitleCodec): Promise { + const cues: SubtitleCue[] = []; + for await (const cue of this.getCues()) { + cues.push(cue); + } + + const codec = targetFormat || this.codec; + const codecPrivate = this._backing.getCodecPrivate(); + + if (codec === 'srt') { + const { formatCuesToSrt } = await import('./subtitles'); + return formatCuesToSrt(cues); + } else if (codec === 'ass' || codec === 'ssa') { + const { formatCuesToAss, splitAssIntoCues } = await import('./subtitles'); + + // For ASS, we need to merge Comment lines from CodecPrivate with Dialogue lines from blocks + // CodecPrivate contains: header + Comment lines + // Blocks contain: Dialogue lines (without timestamps) + // We need to reconstruct full ASS file + + // Parse CodecPrivate to extract the header with Comments preserved + const parsed = codecPrivate ? splitAssIntoCues(codecPrivate) : { header: '', cues: [] }; + + // Use the header (includes Comment lines) and add our cues (from blocks) + return formatCuesToAss(cues, parsed.header); + } else if (codec === 'webvtt') { + const { formatCuesToWebVTT } = await import('./subtitles'); + // Use codecPrivate as preamble if available + return formatCuesToWebVTT(cues, codecPrivate || undefined); + } else { + // Fallback to SRT for unknown formats + const { formatCuesToSrt } = await import('./subtitles'); + return formatCuesToSrt(cues); + } + } + + async getCodecParameterString(): Promise { + return this.codec; + } + + async canDecode(): Promise { + // Subtitles are always text-based and can be decoded + return this.codec !== null; + } + + async determinePacketType(packet: EncodedPacket): Promise { + // Subtitle packets are always key packets + return 'key'; + } +} diff --git a/src/input.ts b/src/input.ts index b1ce3cab..985779a9 100644 --- a/src/input.ts +++ b/src/input.ts @@ -150,6 +150,12 @@ export class Input implements Disposable { return tracks.filter(x => x.isAudioTrack()); } + /** Returns the list of all subtitle 
tracks of this input file. */ + async getSubtitleTracks() { + const tracks = await this.getTracks(); + return tracks.filter(x => x.isSubtitleTrack()); + } + /** Returns the primary video track of this input file, or null if there are no video tracks. */ async getPrimaryVideoTrack() { const tracks = await this.getTracks(); @@ -162,6 +168,30 @@ export class Input implements Disposable { return tracks.find(x => x.isAudioTrack()) ?? null; } + /** + * Returns the list of all subtitle tracks of this input file. This is a convenience property that calls + * {@link Input.getSubtitleTracks} and caches the result. Note that this property is a promise! + */ + get subtitleTracks() { + return this.getSubtitleTracks(); + } + + /** + * Returns the list of all video tracks of this input file. This is a convenience property that calls + * {@link Input.getVideoTracks} and caches the result. Note that this property is a promise! + */ + get videoTracks() { + return this.getVideoTracks(); + } + + /** + * Returns the list of all audio tracks of this input file. This is a convenience property that calls + * {@link Input.getAudioTracks} and caches the result. Note that this property is a promise! + */ + get audioTracks() { + return this.getAudioTracks(); + } + /** Returns the full MIME type of this input file, including track codecs. */ async getMimeType() { const demuxer = await this._getDemuxer(); diff --git a/src/isobmff/isobmff-boxes.ts b/src/isobmff/isobmff-boxes.ts index 63330baf..d2dfe678 100644 --- a/src/isobmff/isobmff-boxes.ts +++ b/src/isobmff/isobmff-boxes.ts @@ -595,8 +595,14 @@ export const stsd = (trackData: IsobmffTrackData) => { trackData, ); } else if (trackData.type === 'subtitle') { + const boxName = SUBTITLE_CODEC_TO_BOX_NAME[trackData.track.source._codec]; + if (!boxName) { + throw new Error( + `Subtitle codec '${trackData.track.source._codec}' is not supported in MP4/MOV. 
Only WebVTT is supported.`, + ); + } sampleDescription = subtitleSampleDescription( - SUBTITLE_CODEC_TO_BOX_NAME[trackData.track.source._codec], + boxName, trackData, ); } @@ -984,12 +990,20 @@ const dec3 = (trackData: IsobmffAudioTrackData) => { export const subtitleSampleDescription = ( compressionType: string, trackData: IsobmffSubtitleTrackData, -) => box(compressionType, [ - Array(6).fill(0), // Reserved - u16(1), // Data reference index -], [ - SUBTITLE_CODEC_TO_CONFIGURATION_BOX[trackData.track.source._codec](trackData), -]); +) => { + const configBox = SUBTITLE_CODEC_TO_CONFIGURATION_BOX[trackData.track.source._codec]; + if (!configBox) { + throw new Error( + `Subtitle codec '${trackData.track.source._codec}' is not supported in MP4/MOV. Only WebVTT is supported.`, + ); + } + return box(compressionType, [ + Array(6).fill(0), // Reserved + u16(1), // Data reference index + ], [ + configBox(trackData), + ]); +}; export const vttC = (trackData: IsobmffSubtitleTrackData) => box('vttC', [ ...textEncoder.encode(trackData.info.config.description), @@ -1755,15 +1769,19 @@ const audioCodecToConfigurationBox = (codec: AudioCodec, isQuickTime: boolean) = return null; }; -const SUBTITLE_CODEC_TO_BOX_NAME: Record = { +const SUBTITLE_CODEC_TO_BOX_NAME: Partial> = { webvtt: 'wvtt', + tx3g: 'tx3g', + ttml: 'stpp', }; -const SUBTITLE_CODEC_TO_CONFIGURATION_BOX: Record< +const SUBTITLE_CODEC_TO_CONFIGURATION_BOX: Partial Box | null -> = { +>> = { webvtt: vttC, + tx3g: () => null, // tx3g doesn't require a configuration box + ttml: () => null, // stpp configuration is optional }; const getLanguageCodeInt = (code: string) => { diff --git a/src/isobmff/isobmff-demuxer.ts b/src/isobmff/isobmff-demuxer.ts index c70efcc9..eace10e1 100644 --- a/src/isobmff/isobmff-demuxer.ts +++ b/src/isobmff/isobmff-demuxer.ts @@ -17,6 +17,7 @@ import { parsePcmCodec, PCM_AUDIO_CODECS, PcmAudioCodec, + SubtitleCodec, VideoCodec, } from '../codec'; import { @@ -37,6 +38,8 @@ import { Input } from 
'../input'; import { InputAudioTrack, InputAudioTrackBacking, + InputSubtitleTrack, + InputSubtitleTrackBacking, InputTrack, InputTrackBacking, InputVideoTrack, @@ -64,6 +67,7 @@ import { roundIfAlmostInteger, } from '../misc'; import { EncodedPacket, PLACEHOLDER_DATA } from '../packet'; +import { SubtitleCue } from '../subtitles'; import { buildIsobmffMimeType } from './isobmff-misc'; import { MAX_BOX_HEADER_SIZE, @@ -147,10 +151,17 @@ type InternalTrack = { codecDescription: Uint8Array | null; aacCodecInfo: AacCodecInfo | null; }; +} | { + info: { + type: 'subtitle'; + codec: SubtitleCodec | null; + codecPrivateText: string | null; + }; }); type InternalVideoTrack = InternalTrack & { info: { type: 'video' } }; type InternalAudioTrack = InternalTrack & { info: { type: 'audio' } }; +type InternalSubtitleTrack = InternalTrack & { info: { type: 'subtitle' } }; type SampleTable = { sampleTimingEntries: SampleTimingEntry[]; @@ -692,6 +703,10 @@ export class IsobmffDemuxer extends Demuxer { const audioTrack = track as InternalAudioTrack; track.inputTrack = new InputAudioTrack(this.input, new IsobmffAudioTrackBacking(audioTrack)); this.tracks.push(track); + } else if (track.info.type === 'subtitle') { + const subtitleTrack = track as InternalSubtitleTrack; + track.inputTrack = new InputSubtitleTrack(this.input, new IsobmffSubtitleTrackBacking(subtitleTrack)); + this.tracks.push(track); } } @@ -864,6 +879,12 @@ export class IsobmffDemuxer extends Demuxer { codecDescription: null, aacCodecInfo: null, }; + } else if (handlerType === 'text' || handlerType === 'subt' || handlerType === 'sbtl') { + track.info = { + type: 'subtitle', + codec: null, + codecPrivateText: null, + }; } }; break; @@ -926,6 +947,28 @@ export class IsobmffDemuxer extends Demuxer { slice.skip(4 + 4 + 4 + 2 + 32 + 2 + 2); + this.readContiguousBoxes( + slice.slice( + slice.filePos, + (sampleBoxStartPos + sampleBoxInfo.totalSize) - slice.filePos, + ), + ); + } else if (track.info.type === 'subtitle') { + 
// Parse subtitle sample entries + slice.skip(6); // Reserved + const dataReferenceIndex = readU16Be(slice); + + // Detect subtitle codec based on sample entry box type + if (lowercaseBoxName === 'wvtt') { + track.info.codec = 'webvtt'; + } else if (lowercaseBoxName === 'tx3g' || lowercaseBoxName === 'text') { + // 3GPP Timed Text + track.info.codec = 'tx3g'; + } else if (lowercaseBoxName === 'stpp') { + // TTML/IMSC subtitles + track.info.codec = 'ttml'; + } + this.readContiguousBoxes( slice.slice( slice.filePos, @@ -1092,7 +1135,9 @@ export class IsobmffDemuxer extends Demuxer { } assert(track.info); - track.info.codecDescription = readBytes(slice, boxInfo.contentSize); + if (track.info.type === 'video') { + track.info.codecDescription = readBytes(slice, boxInfo.contentSize); + } }; break; case 'hvcC': { @@ -1102,7 +1147,9 @@ export class IsobmffDemuxer extends Demuxer { } assert(track.info); - track.info.codecDescription = readBytes(slice, boxInfo.contentSize); + if (track.info.type === 'video') { + track.info.codecDescription = readBytes(slice, boxInfo.contentSize); + } }; break; case 'vpcC': { @@ -2962,6 +3009,92 @@ class IsobmffAudioTrackBacking extends IsobmffTrackBacking implements InputAudio } } +class IsobmffSubtitleTrackBacking extends IsobmffTrackBacking implements InputSubtitleTrackBacking { + override internalTrack: InternalSubtitleTrack; + + constructor(internalTrack: InternalSubtitleTrack) { + super(internalTrack); + this.internalTrack = internalTrack; + } + + override getCodec(): SubtitleCodec | null { + return this.internalTrack.info.codec; + } + + getCodecPrivate(): string | null { + return this.internalTrack.info.codecPrivateText; + } + + async *getCues(): AsyncGenerator<SubtitleCue> { + // Use the existing packet reading infrastructure + let packet = await this.getFirstPacket({}); + + while (packet) { + // Parse WebVTT box structure or plain text + let text = ''; + + if (this.internalTrack.info.codec === 'webvtt') { + // WebVTT in MP4 is stored as a series 
of boxes + const dataBytes = new Uint8Array(packet.data); + const dataSlice = new FileSlice( + dataBytes, + new DataView(dataBytes.buffer, dataBytes.byteOffset, dataBytes.byteLength), + 0, + 0, + dataBytes.length, + ); + + while (dataSlice.remainingLength > 0) { + const boxHeader = readBoxHeader(dataSlice); + if (!boxHeader) break; + + if (boxHeader.name === 'vttc') { + // WebVTT cue box, contains the actual text + // Skip to content and continue parsing nested boxes + const vttcEnd = dataSlice.filePos + boxHeader.contentSize; + + while (dataSlice.filePos < vttcEnd && dataSlice.remainingLength > 0) { + const innerBox = readBoxHeader(dataSlice); + if (!innerBox) break; + + if (innerBox.name === 'payl') { + // Payload box contains the actual text + const textBytes = readBytes(dataSlice, innerBox.contentSize); + const decoder = new TextDecoder('utf-8'); + text += decoder.decode(textBytes); + } else { + // Skip unknown boxes + dataSlice.skip(innerBox.contentSize); + } + } + } else if (boxHeader.name === 'vtte') { + // Empty cue box, skip it + dataSlice.skip(boxHeader.contentSize); + } else { + // Skip unknown boxes + dataSlice.skip(boxHeader.contentSize); + } + } + } else { + // Plain text for other subtitle formats (tx3g, etc.) 
+ const decoder = new TextDecoder('utf-8'); + text = decoder.decode(packet.data); + } + + if (text) { + // Only yield cues with actual text content + yield { + timestamp: packet.timestamp, + duration: packet.duration, + text, + }; + } + + packet = await this.getNextPacket(packet, {}); + } + } +} + const getSampleIndexForTimestamp = (sampleTable: SampleTable, timescaleUnits: number) => { if (sampleTable.presentationTimestamps) { const index = binarySearchLessOrEqual( diff --git a/src/isobmff/isobmff-muxer.ts b/src/isobmff/isobmff-muxer.ts index 5074a1d9..089e05ac 100644 --- a/src/isobmff/isobmff-muxer.ts +++ b/src/isobmff/isobmff-muxer.ts @@ -274,6 +274,11 @@ export class IsobmffMuxer extends Muxer { } else { const map: Record = { webvtt: 'wvtt', + tx3g: 'tx3g', + ttml: 'stpp', + srt: 'wvtt', // MP4 stores SRT as WebVTT + ass: 'wvtt', // MP4 stores ASS as WebVTT + ssa: 'wvtt', // MP4 stores SSA as WebVTT }; return map[trackData.track.source._codec]; } @@ -628,7 +633,9 @@ export class IsobmffMuxer extends Muxer { trackData.cueQueue.push(cue); await this.processWebVTTCues(trackData, cue.timestamp); } else { - // TODO + throw new Error( + `${track.source._codec} subtitles are not supported in ${this.format._name}. 
Only WebVTT is supported.`, + ); } } finally { release(); diff --git a/src/matroska/ebml.ts b/src/matroska/ebml.ts index ba1b7368..23a759e2 100644 --- a/src/matroska/ebml.ts +++ b/src/matroska/ebml.ts @@ -757,6 +757,9 @@ export const CODEC_STRING_MAP: Partial> = { 'pcm-f64': 'A_PCM/FLOAT/IEEE', 'webvtt': 'S_TEXT/WEBVTT', + 'srt': 'S_TEXT/UTF8', + 'ass': 'S_TEXT/ASS', + 'ssa': 'S_TEXT/SSA', }; export function assertDefinedSize(size: number | undefined): asserts size is number { diff --git a/src/matroska/matroska-demuxer.ts b/src/matroska/matroska-demuxer.ts index ad1295a6..070b8956 100644 --- a/src/matroska/matroska-demuxer.ts +++ b/src/matroska/matroska-demuxer.ts @@ -19,6 +19,7 @@ import { extractVideoCodecString, MediaCodec, OPUS_SAMPLE_RATE, + SubtitleCodec, VideoCodec, } from '../codec'; import { Demuxer } from '../demuxer'; @@ -26,6 +27,8 @@ import { Input } from '../input'; import { InputAudioTrack, InputAudioTrackBacking, + InputSubtitleTrack, + InputSubtitleTrackBacking, InputTrack, InputTrackBacking, InputVideoTrack, @@ -48,6 +51,7 @@ import { UNDETERMINED_LANGUAGE, } from '../misc'; import { EncodedPacket, EncodedPacketSideData, PLACEHOLDER_DATA } from '../packet'; +import { SubtitleCue } from '../subtitles'; import { assertDefinedSize, CODEC_STRING_MAP, @@ -218,10 +222,16 @@ type InternalTrack = { codec: AudioCodec | null; codecDescription: Uint8Array | null; aacCodecInfo: AacCodecInfo | null; + } + | { + type: 'subtitle'; + codec: SubtitleCodec | null; + codecPrivateText: string | null; }; }; type InternalVideoTrack = InternalTrack & { info: { type: 'video' } }; type InternalAudioTrack = InternalTrack & { info: { type: 'audio' } }; +type InternalSubtitleTrack = InternalTrack & { info: { type: 'subtitle' } }; const METADATA_ELEMENTS = [ { id: EBMLId.SeekHead, flag: 'seekHeadSeen' }, @@ -1122,6 +1132,29 @@ export class MatroskaDemuxer extends Demuxer { const inputTrack = new InputAudioTrack(this.input, new MatroskaAudioTrackBacking(audioTrack)); 
this.currentTrack.inputTrack = inputTrack; this.currentSegment.tracks.push(this.currentTrack); + } else if (this.currentTrack.info.type === 'subtitle') { + // Map Matroska codec IDs to our subtitle codecs + const codecId = this.currentTrack.codecId; + if (codecId === 'S_TEXT/UTF8') { + this.currentTrack.info.codec = 'srt'; + } else if (codecId === 'S_TEXT/SSA' || codecId === 'S_SSA') { + this.currentTrack.info.codec = 'ssa'; + } else if (codecId === 'S_TEXT/ASS' || codecId === 'S_ASS') { + this.currentTrack.info.codec = 'ass'; + } else if (codecId === 'S_TEXT/WEBVTT' || codecId === 'D_WEBVTT' || codecId === 'D_WEBVTT/SUBTITLES') { + this.currentTrack.info.codec = 'webvtt'; + } + + // Store CodecPrivate as text for ASS/SSA headers + if (this.currentTrack.codecPrivate) { + const decoder = new TextDecoder('utf-8'); + this.currentTrack.info.codecPrivateText = decoder.decode(this.currentTrack.codecPrivate); + } + + const subtitleTrack = this.currentTrack as InternalSubtitleTrack; + const inputTrack = new InputSubtitleTrack(this.input, new MatroskaSubtitleTrackBacking(subtitleTrack)); + this.currentTrack.inputTrack = inputTrack; + this.currentSegment.tracks.push(this.currentTrack); } } @@ -1159,6 +1192,12 @@ export class MatroskaDemuxer extends Demuxer { codecDescription: null, aacCodecInfo: null, }; + } else if (type === 17) { + this.currentTrack.info = { + type: 'subtitle', + codec: null, + codecPrivateText: null, + }; } }; break; @@ -2441,3 +2480,78 @@ class MatroskaAudioTrackBacking extends MatroskaTrackBacking implements InputAud }; } } +class MatroskaSubtitleTrackBacking extends MatroskaTrackBacking implements InputSubtitleTrackBacking { + override internalTrack: InternalSubtitleTrack; + + constructor(internalTrack: InternalSubtitleTrack) { + super(internalTrack); + this.internalTrack = internalTrack; + } + + override getCodec(): SubtitleCodec | null { + return this.internalTrack.info.codec; + } + + getCodecPrivate(): string | null { + return 
this.internalTrack.info.codecPrivateText; + } + + async *getCues(): AsyncGenerator<SubtitleCue> { + // Use the existing packet reading infrastructure + let packet = await this.getFirstPacket({}); + + while (packet) { + // Decode subtitle data as UTF-8 text + const decoder = new TextDecoder('utf-8'); + const text = decoder.decode(packet.data); + + yield { + timestamp: packet.timestamp, + duration: packet.duration, + text, + }; + + packet = await this.getNextPacket(packet, {}); + } + } +} + +/** Sorts blocks such that referenced blocks come before the blocks that reference them. */ +const sortBlocksByReferences = (blocks: ClusterBlock[]) => { + const timestampToBlock = new Map<number, ClusterBlock>(); + + for (let i = 0; i < blocks.length; i++) { + const block = blocks[i]!; + timestampToBlock.set(block.timestamp, block); + } + + const processedBlocks = new Set<ClusterBlock>(); + const result: ClusterBlock[] = []; + + const processBlock = (block: ClusterBlock) => { + if (processedBlocks.has(block)) { + return; + } + + // Marking the block as processed here already; prevents this algorithm from dying on cycles + processedBlocks.add(block); + + for (let j = 0; j < block.referencedTimestamps.length; j++) { + const timestamp = block.referencedTimestamps[j]!; + const otherBlock = timestampToBlock.get(timestamp); + if (!otherBlock) { + continue; + } + + processBlock(otherBlock); + } + + result.push(block); + }; + + for (let i = 0; i < blocks.length; i++) { + processBlock(blocks[i]!); + } + + return result; +}; diff --git a/src/matroska/matroska-muxer.ts b/src/matroska/matroska-muxer.ts index d4a73afd..1f485ef3 100644 --- a/src/matroska/matroska-muxer.ts +++ b/src/matroska/matroska-muxer.ts @@ -46,6 +46,7 @@ import { formatSubtitleTimestamp, inlineTimestampRegex, parseSubtitleTimestamp, + convertDialogueLineToMkvFormat, } from '../subtitles'; import { aacChannelMap, @@ -711,7 +712,12 @@ export class MatroskaMuxer extends Muxer { return trackData.info.decoderConfig.codec; } else { const map: Record<SubtitleCodec, string> = { - webvtt: 'wvtt', + 
webvtt: 'S_TEXT/WEBVTT', + tx3g: 'S_TEXT/UTF8', // Matroska doesn't have tx3g, convert to SRT + ttml: 'S_TEXT/WEBVTT', // Matroska doesn't have TTML, convert to WebVTT + srt: 'S_TEXT/UTF8', + ass: 'S_TEXT/ASS', + ssa: 'S_TEXT/SSA', }; return map[trackData.track.source._codec]; } @@ -949,14 +955,17 @@ export class MatroskaMuxer extends Muxer { let bodyText = cue.text; const timestampMs = Math.round(timestamp * 1000); - // Replace in-body timestamps so that they're relative to the cue start time - inlineTimestampRegex.lastIndex = 0; - bodyText = bodyText.replace(inlineTimestampRegex, (match) => { - const time = parseSubtitleTimestamp(match.slice(1, -1)); - const offsetTime = time - timestampMs; + if (track.source._codec === 'ass' || track.source._codec === 'ssa') { + bodyText = convertDialogueLineToMkvFormat(bodyText); + } else { + inlineTimestampRegex.lastIndex = 0; + bodyText = bodyText.replace(inlineTimestampRegex, (match) => { + const time = parseSubtitleTimestamp(match.slice(1, -1)); + const offsetTime = time - timestampMs; - return `<${formatSubtitleTimestamp(offsetTime)}>`; - }); + return `<${formatSubtitleTimestamp(offsetTime)}>`; + }); + } const body = textEncoder.encode(bodyText); const additions = `${cue.settings ?? ''}\n${cue.identifier ?? ''}\n${cue.notes ?? 
''}`; diff --git a/src/output-format.ts b/src/output-format.ts index adef3a90..6f675954 100644 --- a/src/output-format.ts +++ b/src/output-format.ts @@ -310,15 +310,15 @@ export class Mp4OutputFormat extends IsobmffOutputFormat { 'pcm-s24', 'pcm-s24be', 'pcm-s32', - 'pcm-s32be', - 'pcm-f32', - 'pcm-f32be', - 'pcm-f64', - 'pcm-f64be', - - ...SUBTITLE_CODECS, - ]; - } + 'pcm-s32be', + 'pcm-f32', + 'pcm-f32be', + 'pcm-f64', + 'pcm-f64be', + // Only WebVTT subtitles are supported in MP4 + 'webvtt', + ]; + } /** @internal */ override _codecUnsupportedHint(codec: MediaCodec) { @@ -358,6 +358,8 @@ export class MovOutputFormat extends IsobmffOutputFormat { return [ ...VIDEO_CODECS, ...AUDIO_CODECS, + // Only WebVTT subtitles are supported in MOV + 'webvtt', ]; } diff --git a/src/subtitles.ts b/src/subtitles.ts index 2ae1e508..87ec56ef 100644 --- a/src/subtitles.ts +++ b/src/subtitles.ts @@ -6,25 +6,50 @@ * file, You can obtain one at https://mozilla.org/MPL/2.0/. */ +import type { SubtitleCodec } from './codec.js'; + +/** + * Represents a single subtitle cue with timing and text. + * @group Media sources + * @public + */ export type SubtitleCue = { - timestamp: number; // in seconds - duration: number; // in seconds + /** When the subtitle should appear, in seconds. */ + timestamp: number; + /** How long the subtitle should be displayed, in seconds. */ + duration: number; + /** The subtitle text content. */ text: string; + /** Optional cue identifier. */ identifier?: string; + /** Optional format-specific settings (e.g., VTT positioning). */ settings?: string; + /** Optional notes or comments. */ notes?: string; }; +/** + * Subtitle configuration data. + * @group Media sources + * @public + */ export type SubtitleConfig = { + /** Format-specific description (e.g., WebVTT preamble, ASS/SSA header). */ description: string; }; +/** + * Metadata associated with subtitle cues. 
+ * @group Media sources + * @public + */ export type SubtitleMetadata = { + /** Optional subtitle configuration. */ config?: SubtitleConfig; }; type SubtitleParserOptions = { - codec: 'webvtt'; + codec: SubtitleCodec; output: (cue: SubtitleCue, metadata: SubtitleMetadata) => unknown; }; @@ -42,6 +67,45 @@ export class SubtitleParser { } parse(text: string) { + if (this.options.codec === 'srt') { + this.parseSrt(text); + } else if (this.options.codec === 'ass' || this.options.codec === 'ssa') { + this.parseAss(text); + } else if (this.options.codec === 'tx3g') { + this.parseTx3g(text); + } else if (this.options.codec === 'ttml') { + this.parseTtml(text); + } else { + this.parseWebVTT(text); + } + } + + private parseSrt(text: string) { + const cues = splitSrtIntoCues(text); + + for (let i = 0; i < cues.length; i++) { + const meta: SubtitleMetadata = {}; + // SRT doesn't have a header, but we need to provide a config for the first cue + if (i === 0) { + meta.config = { description: '' }; + } + this.options.output(cues[i]!, meta); + } + } + + private parseAss(text: string) { + const { header, cues } = splitAssIntoCues(text); + + for (let i = 0; i < cues.length; i++) { + const meta: SubtitleMetadata = {}; + if (i === 0 && header) { + meta.config = { description: header }; + } + this.options.output(cues[i]!, meta); + } + } + + private parseWebVTT(text: string) { text = text.replaceAll('\r\n', '\n').replaceAll('\r', '\n'); cueBlockHeaderRegex.lastIndex = 0; @@ -105,9 +169,52 @@ export class SubtitleParser { this.options.output(cue, meta); } } + + private parseTx3g(text: string) { + // tx3g (3GPP Timed Text) samples are usually already plain text + // For now, treat as plain text cue - timing comes from container + const meta: SubtitleMetadata = { config: { description: '' } }; + const cue: SubtitleCue = { + timestamp: 0, + duration: 0, + text: text.trim(), + }; + this.options.output(cue, meta); + } + + private parseTtml(text: string) { + // Basic TTML parsing - extract 
text content from <p> elements + // TODO: Full TTML/IMSC parser with styling support + const pRegex = /<p[^>]*>(.*?)<\/p>/gs; + const matches = [...text.matchAll(pRegex)]; + + for (let i = 0; i < matches.length; i++) { + const match = matches[i]!; + const content = match[1]?.replace(/<[^>]+>/g, '') || ''; // Strip inner tags + + const meta: SubtitleMetadata = {}; + if (i === 0) { + meta.config = { description: '' }; + } + + const cue: SubtitleCue = { + timestamp: 0, + duration: 0, + text: content.trim(), + }; + + this.options.output(cue, meta); + } + } } const timestampRegex = /(?:(\d{2}):)?(\d{2}):(\d{2}).(\d{3})/; + +/** + * Parses a WebVTT timestamp string to milliseconds. + * @group Media sources + * @internal + */ export const parseSubtitleTimestamp = (string: string) => { const match = timestampRegex.exec(string); if (!match) throw new Error('Expected match.'); @@ -118,6 +225,11 @@ export const parseSubtitleTimestamp = (string: string) => { + Number(match[4]); }; +/** + * Formats milliseconds to WebVTT timestamp format. + * @group Media sources + * @internal + */ export const formatSubtitleTimestamp = (timestamp: number) => { const hours = Math.floor(timestamp / (60 * 60 * 1000)); const minutes = Math.floor((timestamp % (60 * 60 * 1000)) / (60 * 1000)); @@ -129,3 +241,471 @@ export const formatSubtitleTimestamp = (timestamp: number) => { + seconds.toString().padStart(2, '0') + '.' + milliseconds.toString().padStart(3, '0'); }; + +// SRT parsing functions +const srtTimestampRegex = /(\d{2}):(\d{2}):(\d{2}),(\d{3})/; + +/** + * Parses an SRT timestamp string (HH:MM:SS,mmm) to seconds. 
+ * @group Media sources + * @public + */ +export const parseSrtTimestamp = (timeString: string): number => { + const match = srtTimestampRegex.exec(timeString); + if (!match) throw new Error('Invalid SRT timestamp format'); + + const hours = Number(match[1]); + const minutes = Number(match[2]); + const seconds = Number(match[3]); + const milliseconds = Number(match[4]); + + return hours * 3600 + minutes * 60 + seconds + milliseconds / 1000; +}; + +/** + * Formats seconds to SRT timestamp format (HH:MM:SS,mmm). + * @group Media sources + * @public + */ +export const formatSrtTimestamp = (seconds: number): string => { + const hours = Math.floor(seconds / 3600); + const minutes = Math.floor((seconds % 3600) / 60); + const secs = Math.floor(seconds % 60); + const milliseconds = Math.round((seconds % 1) * 1000); + + return hours.toString().padStart(2, '0') + ':' + + minutes.toString().padStart(2, '0') + ':' + + secs.toString().padStart(2, '0') + ',' + + milliseconds.toString().padStart(3, '0'); +}; + +/** + * Splits SRT subtitle text into individual cues. + * @group Media sources + * @public + */ +export const splitSrtIntoCues = (text: string): SubtitleCue[] => { + text = text.replaceAll('\r\n', '\n').replaceAll('\r', '\n'); + + const cues: SubtitleCue[] = []; + const cueRegex = /(\d+)\n(\d{2}:\d{2}:\d{2},\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2},\d{3})\n([\s\S]*?)(?=\n\n\d+\n|\n*$)/g; + + let match: RegExpExecArray | null; + while ((match = cueRegex.exec(text))) { + const startTime = parseSrtTimestamp(match[2]!); + const endTime = parseSrtTimestamp(match[3]!); + const cueText = match[4]!.trim(); + + cues.push({ + timestamp: startTime, + duration: endTime - startTime, + text: cueText, + identifier: match[1], + }); + } + + return cues; +}; + +/** + * Extracts plain text from ASS/SSA Dialogue/Comment line. + * If the text is already plain (not ASS format), returns as-is. 
+ */ +const extractTextFromAssCue = (text: string): string => { + // Check if this is an ASS Dialogue/Comment line + if (text.startsWith('Dialogue:') || text.startsWith('Comment:')) { + // ASS format: Dialogue: Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text + // We need to extract the last field (Text) which may contain commas + const colonIndex = text.indexOf(':'); + if (colonIndex === -1) return text; + + const afterColon = text.substring(colonIndex + 1); + const parts = afterColon.split(','); + + // Text is the 10th field (index 9), but it may contain commas + // So we need to join everything from index 9 onward + if (parts.length >= 10) { + return parts.slice(9).join(','); + } + } + + // Check if this is MKV ASS format (without Dialogue: prefix) + // MKV format: ReadOrder,Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text + // OR: Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text + const parts = text.split(','); + if (parts.length >= 8) { + const firstPart = parts[0]?.trim(); + const secondPart = parts[1]?.trim(); + + // Check if first field is numeric (Layer or ReadOrder) + if (firstPart && !isNaN(parseInt(firstPart))) { + // Check if second field is also numeric (ReadOrder,Layer format) + if (secondPart && !isNaN(parseInt(secondPart)) && parts.length >= 9) { + // MKV format with ReadOrder: text is 9th field (index 8) onward + return parts.slice(8).join(','); + } else if (parts.length >= 8) { + // Standard ASS format without ReadOrder: text is 8th field (index 7) onward + return parts.slice(7).join(','); + } + } + } + + // Not ASS format, return as-is + return text; +}; + +/** + * Formats subtitle cues back to SRT text format. 
+ * @group Media sources + * @public + */ +export const formatCuesToSrt = (cues: SubtitleCue[]): string => { + return cues.map((cue, index) => { + const sequenceNumber = index + 1; + const startTime = formatSrtTimestamp(cue.timestamp); + const endTime = formatSrtTimestamp(cue.timestamp + cue.duration); + const text = extractTextFromAssCue(cue.text); + + return `${sequenceNumber}\n${startTime} --> ${endTime}\n${text}\n`; + }).join('\n'); +}; + +/** + * Formats subtitle cues back to WebVTT text format. + * @group Media sources + * @public + */ +export const formatCuesToWebVTT = (cues: SubtitleCue[], preamble?: string): string => { + // Start with the WebVTT header + let result = preamble || 'WEBVTT\n'; + + // Ensure there's a blank line after the header + if (!result.endsWith('\n\n')) { + result += '\n'; + } + + // Format each cue + const formattedCues = cues.map((cue) => { + const startTime = formatSubtitleTimestamp(cue.timestamp * 1000); // Convert to milliseconds + const endTime = formatSubtitleTimestamp((cue.timestamp + cue.duration) * 1000); + const text = extractTextFromAssCue(cue.text); + + // WebVTT doesn't require sequence numbers like SRT + return `${startTime} --> ${endTime}\n${text}`; + }); + + return result + formattedCues.join('\n\n'); +}; + +// ASS/SSA parsing functions +const assTimestampRegex = /(\d+):(\d{2}):(\d{2})\.(\d{2})/; + +/** + * Parses an ASS/SSA timestamp string (H:MM:SS.cc) to seconds. + * @group Media sources + * @public + */ +export const parseAssTimestamp = (timeString: string): number => { + const match = assTimestampRegex.exec(timeString); + if (!match) throw new Error('Invalid ASS timestamp format'); + + const hours = Number(match[1]); + const minutes = Number(match[2]); + const seconds = Number(match[3]); + const centiseconds = Number(match[4]); + + return hours * 3600 + minutes * 60 + seconds + centiseconds / 100; +}; + +/** + * Formats seconds to ASS/SSA timestamp format (H:MM:SS.cc). 
+ * @group Media sources + * @public + */ +export const formatAssTimestamp = (seconds: number): string => { + const hours = Math.floor(seconds / 3600); + const minutes = Math.floor((seconds % 3600) / 60); + const secs = Math.floor(seconds % 60); + const centiseconds = Math.floor((seconds % 1) * 100); + + return hours.toString() + ':' + + minutes.toString().padStart(2, '0') + ':' + + secs.toString().padStart(2, '0') + '.' + + centiseconds.toString().padStart(2, '0'); +}; + +/** + * Splits ASS/SSA subtitle text into header (styles) and individual cues. + * Preserves all sections including [Fonts], [Graphics], and Aegisub sections. + * Aegisub sections are moved to the end to avoid breaking [Events]. + * @group Media sources + * @public + */ +export const splitAssIntoCues = (text: string): { header: string; cues: SubtitleCue[] } => { + text = text.replaceAll('\r\n', '\n').replaceAll('\r', '\n'); + + const lines = text.split('\n'); + + // Find [Events] section + const eventsIndex = lines.findIndex(line => line.trim() === '[Events]'); + if (eventsIndex === -1) { + return { header: text, cues: [] }; + } + + // Separate sections for proper ordering + const headerSections: string[] = []; // [Script Info], [V4+ Styles], etc. (before Events) + const eventsHeader: string[] = []; // [Events] and Format: line + const eventLines: string[] = []; // Dialogue/Comment lines + const postEventsSections: string[] = []; // [Fonts], [Graphics], [Aegisub...] 
(after Events) + + let currentSection: string[] = headerSections; + let inEventsSection = false; + + for (let i = 0; i < lines.length; i++) { + const line = lines[i]; + + // Section header + if (line && line.startsWith('[') && line.endsWith(']')) { + const trimmedLine = line.trim(); + + if (trimmedLine === '[Events]') { + inEventsSection = true; + eventsHeader.push(line); + continue; + } + + // Any section after [Events] goes to post-events + if (inEventsSection) { + currentSection = postEventsSections; + inEventsSection = false; + } + + currentSection.push(line); + continue; + } + + if (inEventsSection) { + if (!line) { + continue; // Skip empty lines in Events + } + + if (line.startsWith('Format:')) { + eventsHeader.push(line); + } else if (line.startsWith('Dialogue:')) { + // Dialogue lines go to eventLines (will be reconstructed with timestamps from blocks) + eventLines.push(line); + } else if (line.startsWith('Comment:')) { + // Comment lines stay in header (they're metadata, not in MKV blocks) + eventsHeader.push(line); + } + } else { + if (line !== undefined) { + currentSection.push(line); + } + } + } + + // Build header: everything except Dialogue lines (keep Comments) + // Format: [Header Sections] + [Events] + Format + Comments + [Post-Events Sections] + const header = [ + ...headerSections, + ...eventsHeader, // Includes [Events], Format:, and Comment: lines + ...postEventsSections, + ].join('\n'); + + // Parse Comment and Dialogue lines + const cues: SubtitleCue[] = []; + + for (const line of eventLines) { + // Parse ASS dialogue/comment format + // Dialogue: Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text + const colonIndex = line.indexOf(':'); + if (colonIndex === -1) continue; + + const parts = line.substring(colonIndex + 1).split(','); + if (parts.length < 10) continue; + + try { + const startTime = parseAssTimestamp(parts[1]!.trim()); + const endTime = parseAssTimestamp(parts[2]!.trim()); + + cues.push({ + timestamp: startTime, + 
duration: endTime - startTime, + text: line, // Store the entire line (Dialogue: or Comment:) + }); + } catch { + // Skip malformed lines + continue; + } + } + + return { header, cues }; +}; + +/** + * Parses ASS Format line to get field order. + * Returns map of field name to index. + */ +const parseAssFormat = (formatLine: string): Map<string, number> => { + // Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text + const fields = formatLine + .substring(formatLine.indexOf(':') + 1) + .split(',') + .map(f => f.trim()); + + const fieldMap = new Map<string, number>(); + fields.forEach((field, index) => { + fieldMap.set(field, index); + }); + + return fieldMap; +}; + +/** + * Converts a full Dialogue/Comment line to MKV block format. + * @group Media sources + * @internal + */ +export const convertDialogueLineToMkvFormat = (line: string): string => { + const match = /^(Dialogue|Comment):\s*(\d+),\d+:\d{2}:\d{2}\.\d{2},\d+:\d{2}:\d{2}\.\d{2},(.*)$/.exec(line); + if (match) { + const layer = match[2]; + const restFields = match[3]; + return `${layer},${restFields}`; + } + + if (line.startsWith('Dialogue:') || line.startsWith('Comment:')) { + return line.substring(line.indexOf(':') + 1).trim(); + } + + return line; +}; + +/** + * Formats subtitle cues back to ASS/SSA text format with header. + * Properly inserts Dialogue/Comment lines within [Events] section. 
+ * @group Media sources + * @public + */ +export const formatCuesToAss = (cues: SubtitleCue[], header: string): string => { + // If header is empty or missing, create a default ASS header + if (!header || header.trim() === '') { + header = `[Script Info] +Title: Default +ScriptType: v4.00+ + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text`; + } + + // Find [Events] section and its Format line + const headerLines = header.split('\n'); + const eventsIndex = headerLines.findIndex(line => line.trim() === '[Events]'); + + if (eventsIndex === -1) { + // No [Events] section, create one + return header + `\n\n[Events]\nFormat: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text\n` + cues.map(c => c.text).join('\n'); + } + + // Find Format line AFTER [Events] + let formatIndex = -1; + let formatLine = ''; + for (let i = eventsIndex + 1; i < headerLines.length; i++) { + const line = headerLines[i]; + if (line && line.trim().startsWith('Format:')) { + formatIndex = i; + formatLine = line; + break; + } + // Stop if we hit another section + if (line && line.startsWith('[') && line.endsWith(']')) { + break; + } + } + + // Parse format to understand field order + const fieldMap = formatLine ? 
parseAssFormat(formatLine) : null; + + // Reconstruct dialogue lines with proper field order + const dialogueLines = cues.map(cue => { + // If text already has full Dialogue/Comment line with timestamps, use as-is + if (cue.text.startsWith('Dialogue:') || cue.text.startsWith('Comment:')) { + if (/^(Dialogue|Comment):\s*\d+,\d+:\d{2}:\d{2}\.\d{2},\d+:\d{2}:\d{2}\.\d{2},/.test(cue.text)) { + return cue.text; + } + } + + // Parse MKV block data or plain text + let params = cue.text; + const isComment = params.startsWith('Comment:'); + const prefix = isComment ? 'Comment:' : 'Dialogue:'; + + if (params.startsWith('Dialogue:') || params.startsWith('Comment:')) { + params = params.substring(params.indexOf(':') + 1).trim(); + } + + const parts = params.split(','); + const startTime = formatAssTimestamp(cue.timestamp); + const endTime = formatAssTimestamp(cue.timestamp + cue.duration); + + let layer: string; + let restFields: string[]; + + // Detect ReadOrder format from actual block data first + // MKV blocks: ReadOrder,Layer,Style,... (9+ fields, first two numeric) OR Layer,Style,... 
(8+ fields, first numeric) + const blockHasReadOrder = parts.length >= 9 && !isNaN(parseInt(parts[0]!)) && !isNaN(parseInt(parts[1]!)); + const blockHasLayer = parts.length >= 8 && !isNaN(parseInt(parts[0]!)); + + if (blockHasReadOrder) { + layer = parts[1] || '0'; + restFields = parts.slice(2); + } else if (blockHasLayer) { + layer = parts[0] || '0'; + restFields = parts.slice(1); + } else { + return `${prefix} 0,${startTime},${endTime},Default,,0,0,0,,${cue.text}`; + } + + return `${prefix} ${layer},${startTime},${endTime},${restFields.join(',')}`; + }); + + if (formatIndex === -1) { + // No Format line found, just append + return header + '\n' + dialogueLines.join('\n'); + } + + // Find Comment lines and next section after [Events] + const commentLines: string[] = []; + let nextSectionIndex = headerLines.length; + + for (let i = formatIndex + 1; i < headerLines.length; i++) { + const line = headerLines[i]; + if (line && line.startsWith('Comment:')) { + commentLines.push(line); + } + if (line && line.startsWith('[') && line.endsWith(']')) { + nextSectionIndex = i; + break; + } + } + + // Build final structure: + // 1. Everything up to and including Format line + // 2. All Dialogue lines + // 3. All Comment lines (at the end of Events) + // 4. Everything after Events section + const beforeDialogues = headerLines.slice(0, formatIndex + 1); + const afterDialogues = headerLines.slice(nextSectionIndex); + + return [ + ...beforeDialogues, + ...dialogueLines, + ...commentLines, + ...afterDialogues, + ].join('\n'); +}; diff --git a/test/node/isobmff-subtitle.test.ts b/test/node/isobmff-subtitle.test.ts new file mode 100644 index 00000000..6e7a097b --- /dev/null +++ b/test/node/isobmff-subtitle.test.ts @@ -0,0 +1,190 @@ +/*! + * Copyright (c) 2025-present, Vanilagy and contributors + * + * This Source Code Form is subject to the terms of the Mozilla Public + * License, v. 2.0. 
If a copy of the MPL was not distributed with this + * file, You can obtain one at https://mozilla.org/MPL/2.0/. + */ + +import { describe, it, expect } from 'vitest'; +import { Input, FilePathSource, ALL_FORMATS } from '../../src/index.js'; + +describe('ISOBMFF Subtitle Demuxing', () => { + it('should detect WebVTT subtitle track in MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-webvtt.mp4'), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await input.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + + const track = subtitleTracks[0]!; + expect(track.codec).toBe('webvtt'); + expect(track.languageCode).toBe('eng'); + }); + + it('should export WebVTT subtitle track to WebVTT format', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-webvtt.mp4'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('webvtt'); + + const subtitleText = await track.exportToText(); + + // Check that it's WebVTT format + expect(subtitleText).toMatch(/^WEBVTT/); + expect(subtitleText).toMatch(/\d{2}:\d{2}:\d{2}\.\d{3} --> \d{2}:\d{2}:\d{2}\.\d{3}/); + expect(subtitleText).toContain('Hello world!'); + }); + + it('should read WebVTT cues from MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-webvtt.mp4'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const cues = []; + + for await (const cue of track.getCues()) { + cues.push(cue); + } + + expect(cues.length).toBeGreaterThan(0); + expect(cues[0]!).toHaveProperty('timestamp'); + expect(cues[0]!).toHaveProperty('duration'); + expect(cues[0]!).toHaveProperty('text'); + }); + + it('should detect tx3g subtitle track in MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-tx3g.mp4'), + formats: 
ALL_FORMATS, + }); + + const subtitleTracks = await input.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + + const track = subtitleTracks[0]!; + expect(track.codec).toBe('tx3g'); + expect(track.languageCode).toBe('eng'); + }); + + it('should detect tx3g subtitle track in MOV', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mov-tx3g.mov'), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await input.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + + const track = subtitleTracks[0]!; + expect(track.codec).toBe('tx3g'); + // MOV file may have undefined language + expect(['eng', 'und']).toContain(track.languageCode); + }); + + it('should export tx3g subtitle track to SRT format', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-tx3g.mp4'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('tx3g'); + + const subtitleText = await track.exportToText(); + + // Check that it's SRT format + expect(subtitleText).toMatch(/\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}/); + expect(subtitleText).toContain('Hello world!'); + }); + + it('should read tx3g cues from MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-tx3g.mp4'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const cues = []; + + for await (const cue of track.getCues()) { + cues.push(cue); + } + + expect(cues.length).toBeGreaterThan(0); + expect(cues[0]!).toHaveProperty('timestamp'); + expect(cues[0]!).toHaveProperty('duration'); + expect(cues[0]!).toHaveProperty('text'); + }); + + it('should detect TTML subtitle track in MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-ttml.mp4'), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await 
input.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + + const track = subtitleTracks[0]!; + expect(track.codec).toBe('ttml'); + expect(track.languageCode).toBe('eng'); + }); + + it('should detect TTML subtitle track in MOV', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mov-ttml.mov'), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await input.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + + const track = subtitleTracks[0]!; + expect(track.codec).toBe('ttml'); + expect(track.languageCode).toBe('eng'); + }); + + it('should export TTML subtitle track to TTML format', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-ttml.mp4'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('ttml'); + + const subtitleText = await track.exportToText(); + + // Check that it's TTML format + expect(subtitleText).toMatch(/<tt[^>]*xmlns="http:\/\/www\.w3\.org\/ns\/ttml"/); + expect(subtitleText).toContain('Hello world!'); + }); + + it('should read TTML cues from MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-ttml.mp4'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const cues = []; + + for await (const cue of track.getCues()) { + cues.push(cue); + } + + expect(cues.length).toBeGreaterThan(0); + expect(cues[0]!).toHaveProperty('timestamp'); + expect(cues[0]!).toHaveProperty('duration'); + expect(cues[0]!).toHaveProperty('text'); + }); + +}); diff --git a/test/node/matroska-subtitle.test.ts b/test/node/matroska-subtitle.test.ts new file mode 100644 index 00000000..142b31e6 --- /dev/null +++ b/test/node/matroska-subtitle.test.ts @@ -0,0 +1,127 @@ +/*! 
+ * Copyright (c) 2025-present, Vanilagy and contributors + * + * This Source Code Form is subject to the terms of the Mozilla Public + * License, v. 2.0. If a copy of the MPL was not distributed with this + * file, You can obtain one at https://mozilla.org/MPL/2.0/. + */ + +import { describe, it, expect } from 'vitest'; +import { Input, FilePathSource, ALL_FORMATS } from '../../src/index.js'; + +describe('Matroska Subtitle Demuxing', () => { + it('should detect SRT subtitle track in MKV', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await input.subtitleTracks; + expect(subtitleTracks).toHaveLength(1); + + const track = subtitleTracks[0]!; + expect(track.codec).toBe('srt'); + expect(track.internalCodecId).toBe('S_TEXT/UTF8'); + expect(track.languageCode).toBe('eng'); + }); + + it('should detect ASS subtitle track in MKV', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('ass'); + expect(track.internalCodecId).toBe('S_TEXT/ASS'); + }); + + it('should detect SSA subtitle track in MKV', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ssa.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + // FFmpeg converts SSA to ASS (ASS is superset of SSA) + expect(track.codec).toBe('ass'); + expect(track.internalCodecId).toBe('S_TEXT/ASS'); + }); + + it('should detect WebVTT subtitle track in MKV', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-vtt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('webvtt'); + // FFmpeg uses D_WEBVTT/SUBTITLES instead of S_TEXT/WEBVTT + 
expect(track.internalCodecId).toBe('D_WEBVTT/SUBTITLES'); + }); + + it('should read subtitle cues from MKV with SRT', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const cues = []; + + for await (const cue of track.getCues()) { + cues.push(cue); + } + + expect(cues.length).toBeGreaterThan(0); + expect(cues[0]!.text).toContain('Hello world'); + expect(cues[0]!.timestamp).toBeCloseTo(1.0, 1); + expect(cues[0]!.duration).toBeCloseTo(2.5, 1); + }); + + it('should export SRT subtitle track to text', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const srtText = await track.exportToText(); + + // Check for SRT timestamp format (HH:MM:SS,mmm) + expect(srtText).toMatch(/\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}/); + expect(srtText).toContain('Hello world'); + }); + + it('should preserve ASS CodecPrivate header', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const assText = await track.exportToText(); + + expect(assText).toContain('[Script Info]'); + expect(assText).toContain('[V4+ Styles]'); + expect(assText).toContain('Dialogue:'); + }); + + it('should handle multiple subtitle tracks', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-multi.mkv'), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await input.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThanOrEqual(2); + + const srtTrack = subtitleTracks.find(t => t.codec === 'srt'); + const assTrack = subtitleTracks.find(t => t.codec === 'ass'); + + expect(srtTrack).toBeDefined(); + 
expect(assTrack).toBeDefined(); + expect(srtTrack?.languageCode).toBe('eng'); + expect(assTrack?.languageCode).toBe('spa'); + }); +}); diff --git a/test/node/subtitle-advanced.test.ts b/test/node/subtitle-advanced.test.ts new file mode 100644 index 00000000..3da6277c --- /dev/null +++ b/test/node/subtitle-advanced.test.ts @@ -0,0 +1,515 @@ +/*! + * Copyright (c) 2025-present, Vanilagy and contributors + * + * This Source Code Form is subject to the terms of the Mozilla Public + * License, v. 2.0. If a copy of the MPL was not distributed with this + * file, You can obtain one at https://mozilla.org/MPL/2.0/. + */ + +import { describe, it, expect } from 'vitest'; +import { + Input, + FilePathSource, + ALL_FORMATS, + Conversion, + Output, + BufferTarget, + MkvOutputFormat, + BufferSource, + TextSubtitleSource, +} from '../../src/index.js'; +import { formatCuesToAss, convertDialogueLineToMkvFormat } from '../../src/subtitles.js'; + +describe('Advanced ASS Features', () => { + it('should preserve Comment lines from CodecPrivate', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass-fonts.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('ass'); + + const codecPrivate = (track as any)._backing.getCodecPrivate(); + console.log('\n=== CodecPrivate has Comment? ===', codecPrivate?.includes('Comment:')); + + const cues = []; + for await (const cue of track.getCues()) { + cues.push(cue); + } + console.log('Total cues:', cues.length); + console.log('First cue text:', cues[0]?.text); + + const assText = await track.exportToText('ass'); + + console.log('\n=== Exported has Comment? 
===', assText.includes('Comment:')); + + // Should preserve Comment line from CodecPrivate + expect(assText).toContain('Comment:'); + expect(assText).toContain('This is a comment'); + }); + + it('should preserve [Fonts] section from CodecPrivate', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass-fonts.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const assText = await track.exportToText('ass'); + + // Should have [Fonts] section + expect(assText).toContain('[Fonts]'); + expect(assText).toContain('fontname: CustomFont'); + }); + + it('should preserve [Graphics] section from CodecPrivate', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass-fonts.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const assText = await track.exportToText('ass'); + + // Should have [Graphics] section + expect(assText).toContain('[Graphics]'); + expect(assText).toContain('filename: logo.png'); + }); + + it('should place Dialogue lines in correct position after Comment', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass-fonts.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const assText = await track.exportToText('ass'); + + // Verify Event ordering: Format, then Dialogue, then Comment (at end of Events) + const formatIdx = assText.indexOf('Format:'); + const firstDialogueIdx = assText.indexOf('Dialogue:'); + const commentIdx = assText.indexOf('Comment:'); + const fontsIdx = assText.indexOf('[Fonts]'); + + // Proper order: Format < Dialogue < Comment < [Fonts] + expect(formatIdx).toBeGreaterThan(-1); + expect(firstDialogueIdx).toBeGreaterThan(formatIdx); + expect(commentIdx).toBeGreaterThan(firstDialogueIdx); // Comment AFTER Dialogue + if (fontsIdx > -1) { + 
expect(commentIdx).toBeLessThan(fontsIdx); // Comment before [Fonts] + } + }); + + it('should have proper section ordering', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass-fonts.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const assText = await track.exportToText('ass'); + + const sections = []; + const lines = assText.split('\n'); + + for (const line of lines) { + if (line.startsWith('[') && line.endsWith(']')) { + sections.push(line); + } + } + + console.log('Section order:', sections); + + // Expected order: [Script Info], [V4+ Styles], [Events], [Fonts], [Graphics] + expect(sections[0]).toBe('[Script Info]'); + expect(sections).toContain('[V4+ Styles]'); + expect(sections).toContain('[Events]'); + expect(sections).toContain('[Fonts]'); + expect(sections).toContain('[Graphics]'); + + // [Events] should come before [Fonts] and [Graphics] + const eventsIdx = sections.indexOf('[Events]'); + const fontsIdx = sections.indexOf('[Fonts]'); + const graphicsIdx = sections.indexOf('[Graphics]'); + + expect(eventsIdx).toBeLessThan(fontsIdx); + expect(eventsIdx).toBeLessThan(graphicsIdx); + }); +}); + +describe('ASS Edge Cases - Parsing and Reconstruction', () => { + it('should handle text starting with comma in MKV format', async () => { + // Create ASS subtitle with text that starts with comma + const assContent = `[Script Info] +Title: Test +ScriptType: v4.00+ + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 
0,0:00:01.00,0:00:03.00,Default,,0,0,0,,,comma-leading text`; + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const subtitleSource = new TextSubtitleSource('ass'); + output.addSubtitleTrack(subtitleSource, { languageCode: 'eng' }); + + await output.start(); + + await subtitleSource.add(assContent); + subtitleSource.close(); + + await output.finalize(); + + // Read back + const input = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const cues = []; + for await (const cue of track.getCues()) { + cues.push(cue); + } + + expect(cues.length).toBe(1); + // Should not have double comma at start + expect(cues[0]!.text).not.toMatch(/^,/); + + const exported = await track.exportToText('ass'); + const dialogueLine = exported.split('\n').find(l => l.startsWith('Dialogue:')); + expect(dialogueLine).toContain(',comma-leading text'); + // Should not have duplicate field data + expect(dialogueLine).not.toMatch(/Default,,0,0,0,,.*,Default,,0,0,0,,/); + + input[Symbol.dispose](); + }); + + it('should handle text containing multiple commas', async () => { + const assContent = `[Script Info] +Title: Test + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0,0,0,,Hello, world, how, are, you?`; + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const subtitleSource = new TextSubtitleSource('ass'); + 
output.addSubtitleTrack(subtitleSource, { languageCode: 'eng' }); + + await output.start(); + + await subtitleSource.add(assContent); + subtitleSource.close(); + + await output.finalize(); + + const input = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const exported = await track.exportToText('ass'); + expect(exported).toContain('Hello, world, how, are, you?'); + + input[Symbol.dispose](); + }); + + it('should handle MKV format with ReadOrder field (9 fields)', async () => { + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const subtitleSource = new TextSubtitleSource('ass'); + output.addSubtitleTrack(subtitleSource, { languageCode: 'eng' }); + + const assContent = `[Script Info] +Title: Test + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0,0,0,,Test text`; + + await output.start(); + + await subtitleSource.add(assContent); + subtitleSource.close(); + await output.finalize(); + + // Verify MKV block contains proper format (without ReadOrder, but could be added by muxer) + const input = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const cues = []; + for await (const cue of track.getCues()) { + cues.push(cue); + } + + // MKV block should have format: Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text (8 fields) + // or: 
ReadOrder,Layer,Style,Name,MarginL,MarginR,MarginV,Effect,Text (9 fields) + const parts = cues[0]!.text.split(','); + expect(parts.length).toBeGreaterThanOrEqual(8); + expect(parts[parts.length - 1]).toBe('Test text'); + + input[Symbol.dispose](); + }); + + it('should handle round-trip ASS -> MKV -> ASS conversion', async () => { + using input1 = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const target1 = new BufferTarget(); + const output1 = new Output({ + format: new MkvOutputFormat(), + target: target1, + }); + + // First conversion: MKV -> MKV (with ASS) + const conversion1 = await Conversion.init({ + input: input1, + output: output1, + subtitle: { codec: 'ass' }, + showWarnings: false, + }); + + await conversion1.execute(); + + // Read intermediate result + const input2 = new Input({ + source: new BufferSource(target1.buffer), + formats: ALL_FORMATS, + }); + + const target2 = new BufferTarget(); + const output2 = new Output({ + format: new MkvOutputFormat(), + target: target2, + }); + + // Second conversion: MKV -> MKV (with ASS) + const conversion2 = await Conversion.init({ + input: input2, + output: output2, + subtitle: { codec: 'ass' }, + showWarnings: false, + }); + + await conversion2.execute(); + + // Compare outputs + const input3 = new Input({ + source: new BufferSource(target2.buffer), + formats: ALL_FORMATS, + }); + + const track1 = (await input2.subtitleTracks)[0]!; + const track2 = (await input3.subtitleTracks)[0]!; + + const text1 = await track1.exportToText('ass'); + const text2 = await track2.exportToText('ass'); + + // Extract just the text content from dialogue lines (ignore timestamp precision differences) + const extractText = (line: string) => { + // Extract text after the 9th comma (after Effect field) + const parts = line.split(','); + return parts.slice(9).join(','); + }; + + const dialogue1 = text1.split('\n').filter(l => l.startsWith('Dialogue:')); + const 
dialogue2 = text2.split('\n').filter(l => l.startsWith('Dialogue:')); + + expect(dialogue1.length).toBe(dialogue2.length); + + // Compare text content (not timestamps due to precision issues) + for (let i = 0; i < dialogue1.length; i++) { + const text1Content = extractText(dialogue1[i]!); + const text2Content = extractText(dialogue2[i]!); + expect(text2Content).toBe(text1Content); + // Should not have duplicated field data + expect(text2Content).not.toMatch(/Default,,0,0,0,,.*Default,,0,0,0,,/); + } + + input2[Symbol.dispose](); + input3[Symbol.dispose](); + }); + + it('should handle empty fields in ASS format', async () => { + const assContent = `[Script Info] +Title: Test + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0,0,0,,Text with empty name and effect`; + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const subtitleSource = new TextSubtitleSource('ass'); + output.addSubtitleTrack(subtitleSource, { languageCode: 'eng' }); + + await output.start(); + + await subtitleSource.add(assContent); + subtitleSource.close(); + await output.finalize(); + + const input = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const exported = await (await input.subtitleTracks)[0]!.exportToText('ass'); + const dialogueLine = exported.split('\n').find(l => l.startsWith('Dialogue:')); + + // Should preserve empty fields + expect(dialogueLine).toMatch(/Default,,0,0,0,,Text with empty name and effect/); + + 
input[Symbol.dispose](); + }); + + it('should handle convertDialogueLineToMkvFormat helper', () => { + // Test with full Dialogue line + const fullLine = 'Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello world'; + const converted = convertDialogueLineToMkvFormat(fullLine); + expect(converted).toBe('0,Default,,0,0,0,,Hello world'); + + // Test with text containing commas + const commaLine = 'Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello, world, test'; + const convertedComma = convertDialogueLineToMkvFormat(commaLine); + expect(convertedComma).toBe('0,Default,,0,0,0,,Hello, world, test'); + + // Test with already MKV format + const mkvFormat = '0,Default,,0,0,0,,Already MKV format'; + const convertedMkv = convertDialogueLineToMkvFormat(mkvFormat); + expect(convertedMkv).toBe('0,Default,,0,0,0,,Already MKV format'); + }); + + it('should handle formatCuesToAss with different field structures', () => { + const cues = [ + { + timestamp: 1.0, + duration: 2.0, + text: '0,Default,,0,0,0,,Standard format', + }, + { + timestamp: 4.0, + duration: 2.0, + text: '0,0,Default,,0,0,0,,ReadOrder format', + }, + ]; + + const header = `[Script Info] +Title: Test + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text`; + + const result = formatCuesToAss(cues, header); + const dialogueLines = result.split('\n').filter(l => l.startsWith('Dialogue:')); + + expect(dialogueLines.length).toBe(2); + expect(dialogueLines[0]).toContain('Standard format'); + expect(dialogueLines[1]).toContain('ReadOrder format'); + + // Both should have proper timestamps + 
expect(dialogueLines[0]).toMatch(/Dialogue: 0,0:00:01\.00,0:00:03\.00/); + expect(dialogueLines[1]).toMatch(/Dialogue: 0,0:00:04\.00,0:00:06\.00/); + + // Should not have extra field between End and Style + expect(dialogueLines[0]).toMatch(/Dialogue: 0,0:00:01\.00,0:00:03\.00,Default,,0,0,0,,/); + expect(dialogueLines[1]).toMatch(/Dialogue: 0,0:00:04\.00,0:00:06\.00,Default,,0,0,0,,/); + expect(dialogueLines[0]).not.toMatch(/0:00:03\.00,\d+,Default/); + expect(dialogueLines[1]).not.toMatch(/0:00:06\.00,\d+,Default/); + }); + + it('should not create extra commas when text is empty', async () => { + const assContent = `[Script Info] +Title: Test + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.00,Default,,0,0,0,,`; + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const subtitleSource = new TextSubtitleSource('ass'); + output.addSubtitleTrack(subtitleSource, { languageCode: 'eng' }); + + await output.start(); + + await subtitleSource.add(assContent); + subtitleSource.close(); + await output.finalize(); + + const input = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const cues = []; + for await (const cue of (await input.subtitleTracks)[0]!.getCues()) { + cues.push(cue); + } + + // Text should be empty, not starting with comma + expect(cues[0]!.text).not.toMatch(/^,/); + + const exported = await (await input.subtitleTracks)[0]!.exportToText('ass'); + const dialogueLine = exported.split('\n').find(l => l.startsWith('Dialogue:')); + + //
Should end with ,, not ,,, + expect(dialogueLine).toMatch(/,,$/); + expect(dialogueLine).not.toMatch(/,,,$/); + + input[Symbol.dispose](); + }); +}); diff --git a/test/node/subtitle-conversion.test.ts b/test/node/subtitle-conversion.test.ts new file mode 100644 index 00000000..9c908bc8 --- /dev/null +++ b/test/node/subtitle-conversion.test.ts @@ -0,0 +1,989 @@ +/*! + * Copyright (c) 2025-present, Vanilagy and contributors + * + * This Source Code Form is subject to the terms of the Mozilla Public + * License, v. 2.0. If a copy of the MPL was not distributed with this + * file, You can obtain one at https://mozilla.org/MPL/2.0/. + */ + +import { describe, it, expect } from 'vitest'; +import { + Input, + FilePathSource, + ALL_FORMATS, + Conversion, + Output, + BufferTarget, + BufferSource, + MkvOutputFormat, + Mp4OutputFormat, + WebMOutputFormat, + TextSubtitleSource, +} from '../../src/index.js'; + +describe('Subtitle Conversion - Basic Cases', () => { + it('should passthrough subtitle track when codec matches output format', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'srt', // Keep as SRT + }, + }); + + expect(conversion.isValid).toBe(true); + expect(conversion.utilizedTracks.filter(t => t.type === 'subtitle').length).toBe(1); + + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBe(1); + expect(subtitleTracks[0]!.codec).toBe('srt'); + + const text = await subtitleTracks[0]!.exportToText(); + expect(text).toContain('Hello world'); + }); + + it('should convert SRT to 
WebVTT', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'webvtt', + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBe(1); + expect(subtitleTracks[0]!.codec).toBe('webvtt'); + + const text = await subtitleTracks[0]!.exportToText(); + expect(text).toContain('Hello world'); + }); + + it('should convert ASS to SRT', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'srt', + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBe(1); + expect(subtitleTracks[0]!.codec).toBe('srt'); + + const text = await subtitleTracks[0]!.exportToText(); + expect(text).toMatch(/\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}/); + + // Verify ASS metadata is stripped (no "0,0,Default,,0,0,0,," prefix) + expect(text).not.toContain('Default,,0,0,0'); + expect(text).toContain('Hello world!'); + expect(text).toContain('This is a test'); + }); + + it('should convert WebVTT to SRT', async () => { 
+ using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-vtt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'srt', + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBe(1); + expect(subtitleTracks[0]!.codec).toBe('srt'); + }); + + it('should discard all subtitle tracks when discard is true', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + discard: true, + }, + }); + + expect(conversion.isValid).toBe(true); + expect(conversion.utilizedTracks.filter(t => t.type === 'subtitle').length).toBe(0); + expect(conversion.discardedTracks.filter(t => t.track.type === 'subtitle').length).toBe(1); + expect(conversion.discardedTracks[0]!.reason).toBe('discarded_by_user'); + }); + + it('should handle multiple subtitle tracks', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-multi.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'webvtt', + }, + }); + + expect(conversion.isValid).toBe(true); + const inputTracks = await input.subtitleTracks; + 
expect(conversion.utilizedTracks.filter(t => t.type === 'subtitle').length).toBe(inputTracks.length); + + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBe(inputTracks.length); + + for (const track of subtitleTracks) { + expect(track.codec).toBe('webvtt'); + } + }); +}); + +describe('Subtitle Conversion - Track-Specific Options', () => { + it('should selectively discard tracks based on language', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-multi.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const inputTracks = await input.subtitleTracks; + const firstTrackLang = inputTracks[0]!.languageCode; + + const conversion = await Conversion.init({ + input, + output, + subtitle: (track) => { + // Keep only the first track's language + if (track.languageCode !== firstTrackLang) { + return { discard: true }; + } + return {}; + }, + }); + + expect(conversion.isValid).toBe(true); + expect(conversion.utilizedTracks.filter(t => t.type === 'subtitle').length).toBeLessThan( + inputTracks.length, + ); + + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBeLessThan(inputTracks.length); + }); + + it('should apply different codec conversion per track', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-multi.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const 
conversion = await Conversion.init({ + input, + output, + subtitle: (track, n) => { + // First track to SRT, rest to WebVTT + return { + codec: n === 1 ? 'srt' : 'webvtt', + }; + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks[0]!.codec).toBe('srt'); + for (let i = 1; i < subtitleTracks.length; i++) { + expect(subtitleTracks[i]!.codec).toBe('webvtt'); + } + }); + + it('should handle mixed operations: keep, convert, discard', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-multi.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const inputTracks = await input.subtitleTracks; + + const conversion = await Conversion.init({ + input, + output, + subtitle: (track, n) => { + if (n === 1) { + // First track: keep as is + return {}; + } else if (n === 2 && inputTracks.length >= 2) { + // Second track: convert to SRT + return { codec: 'srt' }; + } else { + // Rest: discard + return { discard: true }; + } + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output has expected tracks + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + if (inputTracks.length >= 2) { + expect(subtitleTracks.length).toBe(2); + expect(subtitleTracks[1]!.codec).toBe('srt'); + } + }); +}); + +describe('Subtitle Conversion - Trimming', () => { + it('should adjust subtitle timestamps when trimming', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + 
}); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + // Get first cue timestamp to use as trim start + const firstTrack = (await input.subtitleTracks)[0]!; + const cues = []; + for await (const cue of firstTrack.getCues()) { + cues.push(cue); + } + + const trimStart = cues[0]!.timestamp; + const trimEnd = cues[Math.min(cues.length - 1, 2)]!.timestamp + 1; + + const conversion = await Conversion.init({ + input, + output, + trim: { + start: trimStart, + end: trimEnd, + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + const outputCues = []; + for await (const cue of subtitleTracks[0]!.getCues()) { + outputCues.push(cue); + } + + // First cue should start at 0 (adjusted) + expect(outputCues[0]!.timestamp).toBeCloseTo(0, 2); + // Should have fewer cues than original + expect(outputCues.length).toBeLessThanOrEqual(cues.length); + }); + + it('should exclude cues outside trim range', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const firstTrack = (await input.subtitleTracks)[0]!; + const cues = []; + for await (const cue of firstTrack.getCues()) { + cues.push(cue); + } + + // Trim to only second cue + const trimStart = cues[1]!.timestamp; + const trimEnd = cues[1]!.timestamp + cues[1]!.duration; + + const conversion = await Conversion.init({ + input, + output, + trim: { + start: trimStart, + end: trimEnd, + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new 
BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + const outputCues = []; + for await (const cue of subtitleTracks[0]!.getCues()) { + outputCues.push(cue); + } + + // Should have only 1 cue + expect(outputCues.length).toBe(1); + expect(outputCues[0]!.text).toBe(cues[1]!.text); + }); + + it('should handle partial cue trimming', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const firstTrack = (await input.subtitleTracks)[0]!; + const cues = []; + for await (const cue of firstTrack.getCues()) { + cues.push(cue); + } + + // Trim in middle of first cue + const firstCue = cues[0]!; + const trimStart = firstCue.timestamp + firstCue.duration / 2; + const trimEnd = firstCue.timestamp + firstCue.duration; + + const conversion = await Conversion.init({ + input, + output, + trim: { + start: trimStart, + end: trimEnd, + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + const outputCues = []; + for await (const cue of subtitleTracks[0]!.getCues()) { + outputCues.push(cue); + } + + // Should still have the cue but with adjusted duration + expect(outputCues.length).toBeGreaterThanOrEqual(1); + expect(outputCues[0]!.timestamp).toBeCloseTo(0, 2); + expect(outputCues[0]!.duration).toBeLessThan(firstCue.duration); + }); +}); + +describe('Subtitle Conversion - Codec Compatibility', () => { + it('should discard track when target codec not supported by output format', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + 
formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new Mp4OutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + video: { discard: true }, + audio: { discard: true }, + subtitle: { + codec: 'ass', // MP4 supports WebVTT and TX3G, but not ASS + }, + showWarnings: false, + }); + + // Track should be discarded because ASS is not supported in MP4 + expect(conversion.isValid).toBe(false); + expect(conversion.discardedTracks.filter(t => t.track.type === 'subtitle').length).toBe(1); + expect(conversion.discardedTracks.find(t => t.track.type === 'subtitle')!.reason).toBe( + 'no_encodable_target_codec', + ); + }); + + it('should support WebVTT and TX3G in MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new Mp4OutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'webvtt', + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks[0]!.codec).toBe('webvtt'); + }); + + it('should support all text formats in MKV', async () => { + // Test WebVTT and SRT which are well-supported + // ASS/SSA conversion is tested separately below + const testCases = [ + { codec: 'webvtt' as const }, + { codec: 'srt' as const }, + ]; + + for (const { codec } of testCases) { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const
conversion = await Conversion.init({ + input, + output, + subtitle: { + codec, + }, + showWarnings: false, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output has subtitle with correct codec + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length, `codec: ${codec}`).toBeGreaterThan(0); + expect(subtitleTracks[0]!.codec).toBe(codec); + } + }); + + it('should convert SRT to ASS with proper header', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'ass', + }, + showWarnings: false, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + expect(subtitleTracks[0]!.codec).toBe('ass'); + + // Verify ASS structure + const assText = await subtitleTracks[0]!.exportToText(); + expect(assText).toContain('[Script Info]'); + expect(assText).toContain('[V4+ Styles]'); + expect(assText).toContain('[Events]'); + expect(assText).toContain('Dialogue:'); + }); + + it('should preserve ASS header when trimming', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + 
subtitle: { + codec: 'ass', + }, + trim: { start: 0, end: 10 }, + showWarnings: false, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + expect(subtitleTracks[0]!.codec).toBe('ass'); + + // Verify ASS structure is preserved + const assText = await subtitleTracks[0]!.exportToText(); + expect(assText).toContain('[Script Info]'); + expect(assText).toContain('[V4+ Styles]'); + expect(assText).toContain('[Events]'); + }); + + it('should only support WebVTT in WebM', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new WebMOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'webvtt', + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks[0]!.codec).toBe('webvtt'); + }); +}); + +describe('Subtitle Conversion - External Subtitles', () => { + it('should add external subtitle track', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-video.mp4'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new Mp4OutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + video: { discard: true }, + audio: { discard: true }, + }); + + // Add external subtitle + const subtitleSource = new 
TextSubtitleSource('webvtt'); + conversion.addExternalSubtitleTrack(subtitleSource, { + languageCode: 'eng', + name: 'External Subtitle', + }, async () => { + await subtitleSource.add('WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nExternal subtitle test'); + subtitleSource.close(); + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBe(1); + expect(subtitleTracks[0]!.codec).toBe('webvtt'); + + const text = await subtitleTracks[0]!.exportToText(); + expect(text).toContain('External subtitle test'); + }); + + it('should combine external subtitles with input subtitles', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + }); + + // Add external subtitle + const subtitleSource = new TextSubtitleSource('webvtt'); + conversion.addExternalSubtitleTrack(subtitleSource, { + languageCode: 'spa', + name: 'Spanish', + }, async () => { + await subtitleSource.add('WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nHola mundo'); + subtitleSource.close(); + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await outputInput.subtitleTracks; + expect(subtitleTracks.length).toBe(2); + + // Find the tracks + const srtTrack = subtitleTracks.find(t => t.codec === 'srt'); + const vttTrack = subtitleTracks.find(t => t.codec === 'webvtt'); + + expect(srtTrack).toBeDefined(); + expect(vttTrack).toBeDefined(); 
+ + const vttText = await vttTrack!.exportToText(); + expect(vttText).toContain('Hola mundo'); + }); + + it('should respect track count limits for external subtitles', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-video.mp4'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new Mp4OutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + video: { discard: true }, + audio: { discard: true }, + }); + + // Add external subtitle + const subtitleSource = new TextSubtitleSource('webvtt'); + conversion.addExternalSubtitleTrack(subtitleSource, {}, async () => { + await subtitleSource.add('WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nTest'); + subtitleSource.close(); + }); + + expect(conversion.isValid).toBe(true); + expect(() => { + // Try to add second external subtitle + const subtitleSource2 = new TextSubtitleSource('webvtt'); + conversion.addExternalSubtitleTrack(subtitleSource2, {}, async () => { + await subtitleSource2.add('WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nTest 2'); + subtitleSource2.close(); + }); + }).not.toThrow(); // MP4 supports multiple subtitle tracks + }); +}); + +describe('Subtitle Conversion - Edge Cases', () => { + it('should keep cues when converting SRT to WebVTT', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + subtitle: { + codec: 'webvtt', // Convert SRT to WebVTT + }, + }); + + expect(conversion.isValid).toBe(true); + await conversion.execute(); + + // Verify output has WebVTT track + using outputInput = new Input({ + source: new BufferSource(target.buffer), + formats: ALL_FORMATS, + }); + + const subtitleTracks = await
outputInput.subtitleTracks; + expect(subtitleTracks.length).toBeGreaterThan(0); + + const webvttTrack = subtitleTracks.find(t => t.codec === 'webvtt'); + expect(webvttTrack).toBeDefined(); + + const cues = []; + for await (const cue of webvttTrack!.getCues()) { + cues.push(cue); + } + // Should have cues from original SRT + expect(cues.length).toBeGreaterThan(0); + }); + + it('should throw error for invalid subtitle options', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + await expect(() => + Conversion.init({ + input, + output: new Output({ + format: new MkvOutputFormat(), + target: new BufferTarget(), + }), + subtitle: { + // @ts-expect-error Testing invalid input + codec: 'invalid-codec', + }, + }), + ).rejects.toThrow(); + }); + + it('should throw when adding external subtitle after execution', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const target = new BufferTarget(); + const output = new Output({ + format: new MkvOutputFormat(), + target, + }); + + const conversion = await Conversion.init({ + input, + output, + }); + + await conversion.execute(); + + // Try to add external subtitle after execution + const subtitleSource = new TextSubtitleSource('webvtt'); + expect(() => { + conversion.addExternalSubtitleTrack(subtitleSource, {}, async () => { + await subtitleSource.add('WEBVTT\n\n00:00:00.000 --> 00:00:02.000\nTest'); + subtitleSource.close(); + }); + }).toThrow('Cannot add subtitle tracks after conversion has been executed'); + }); +}); diff --git a/test/node/subtitle-extraction.test.ts b/test/node/subtitle-extraction.test.ts new file mode 100644 index 00000000..9a100cc1 --- /dev/null +++ b/test/node/subtitle-extraction.test.ts @@ -0,0 +1,263 @@ +/*!
+ * Copyright (c) 2025-present, Vanilagy and contributors + * + * This Source Code Form is subject to the terms of the Mozilla Public + * License, v. 2.0. If a copy of the MPL was not distributed with this + * file, You can obtain one at https://mozilla.org/MPL/2.0/. + */ + +import { describe, it, expect } from 'vitest'; +import { Input, FilePathSource, ALL_FORMATS } from '../../src/index.js'; +import { readFile } from 'fs/promises'; + +describe('Subtitle Extraction', () => { + it('should extract SRT subtitles from MKV and download as text', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('srt'); + + // Export to SRT format + const srtText = await track.exportToText('srt'); + + // Should have proper SRT format + expect(srtText).toMatch(/\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}/); + expect(srtText).toContain('Hello world'); + expect(srtText).toContain('This is a test'); + + // Should have sequence numbers + expect(srtText).toMatch(/^1\n/m); + expect(srtText).toMatch(/\n2\n/); + }); + + it('should extract ASS subtitles from MKV and preserve header', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('ass'); + + // Export to ASS format + const assText = await track.exportToText('ass'); + + // Should have all ASS sections + expect(assText).toContain('[Script Info]'); + expect(assText).toContain('[V4+ Styles]'); + expect(assText).toContain('[Events]'); + expect(assText).toContain('Format:'); + + // Should have dialogue lines + expect(assText).toMatch(/Dialogue:/); + + // Should have actual subtitle content + expect(assText).toContain('Hello world'); + }); + + it('should extract WebVTT subtitles from MKV', async () 
=> { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-vtt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + expect(track.codec).toBe('webvtt'); + + // Export to SRT (WebVTT export needs more work, so use SRT) + const srtText = await track.exportToText('srt'); + + expect(srtText).toBeTruthy(); + expect(srtText.length).toBeGreaterThan(0); + }); + + it('should extract WebVTT subtitles from MP4', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mp4-webvtt.mp4'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + + // Export to text + const text = await track.exportToText(); + + expect(text).toBeTruthy(); + expect(text.length).toBeGreaterThan(0); + }); + + it('should iterate through all subtitle cues', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const cues = []; + + for await (const cue of track.getCues()) { + cues.push(cue); + } + + expect(cues.length).toBeGreaterThan(0); + + // Verify cue structure + for (const cue of cues) { + expect(cue).toHaveProperty('timestamp'); + expect(cue).toHaveProperty('duration'); + expect(cue).toHaveProperty('text'); + expect(typeof cue.timestamp).toBe('number'); + expect(typeof cue.duration).toBe('number'); + expect(typeof cue.text).toBe('string'); + } + }); + + it('should handle multiple subtitle tracks', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-multi.mkv'), + formats: ALL_FORMATS, + }); + + const tracks = await input.subtitleTracks; + expect(tracks.length).toBeGreaterThanOrEqual(2); + + // Extract all tracks + const exportedTexts = await Promise.all( + tracks.map(track => track.exportToText()), + ); + + for (const text of exportedTexts) { + 
expect(text).toBeTruthy(); + expect(text.length).toBeGreaterThan(0); + } + }); +}); + +describe('Subtitle Format Conversion', () => { + it('should convert SRT to SRT (identity)', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const srtText = await track.exportToText('srt'); + + // Should be valid SRT + expect(srtText).toMatch(/\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}/); + expect(srtText).toContain('Hello world'); + }); + + it('should convert ASS to ASS (identity)', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const assText = await track.exportToText('ass'); + + // Should preserve ASS structure + expect(assText).toContain('[Script Info]'); + expect(assText).toContain('[V4+ Styles]'); + expect(assText).toContain('[Events]'); + expect(assText).toContain('Dialogue:'); + }); + + it('should convert ASS to SRT (extract dialogue text)', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const srtText = await track.exportToText('srt'); + + // Should have SRT format + expect(srtText).toMatch(/\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}/); + expect(srtText.length).toBeGreaterThan(0); + }); +}); + +describe('Subtitle Export Validation', () => { + it('should export SRT with correct sequence numbers', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-srt.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const srtText = await track.exportToText('srt'); + + const lines = srtText.split('\n'); + const numbers = lines.filter(line 
=> /^\d+$/.test(line)); + + // Should have sequential numbers starting from 1 + expect(numbers[0]).toBe('1'); + if (numbers.length > 1) { + expect(numbers[1]).toBe('2'); + } + }); + + it('should export ASS with Dialogue lines in [Events] section', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + const assText = await track.exportToText('ass'); + + // Find [Events] section + const eventsIndex = assText.indexOf('[Events]'); + expect(eventsIndex).toBeGreaterThan(-1); + + // Find Format line after [Events] + const afterEventsHeader = assText.substring(eventsIndex); + const formatMatch = afterEventsHeader.match(/Format:\s*Layer,\s*Start,\s*End/); + expect(formatMatch).toBeTruthy(); + + const formatIndex = eventsIndex + formatMatch!.index!; + const afterFormat = assText.substring(formatIndex); + + // Find first Dialogue/Comment after Format + const dialogueMatch = afterFormat.match(/^(Dialogue|Comment):/m); + expect(dialogueMatch).toBeTruthy(); + + // Extract Events section (from [Events] to next section or end) + const nextSectionMatch = afterEventsHeader.match(/\n\[([^\]]+)\]/); + const eventsSection = nextSectionMatch + ? assText.substring(eventsIndex, eventsIndex + nextSectionMatch.index!) 
+ : assText.substring(eventsIndex); + + // Verify structure + expect(eventsSection).toContain('Format: Layer, Start, End'); + expect(eventsSection).toMatch(/Dialogue:|Comment:/); + + // Verify Dialogue lines have timestamps + const dialogueLines = eventsSection.match(/Dialogue:\s*\d+,\d+:\d{2}:\d{2}\.\d{2},\d+:\d{2}:\d{2}\.\d{2},/g); + expect(dialogueLines).toBeTruthy(); + expect(dialogueLines!.length).toBeGreaterThan(0); + }); + + it('should preserve Comment lines in ASS export', async () => { + using input = new Input({ + source: new FilePathSource('test/public/subtitles/test-mkv-ass.mkv'), + formats: ALL_FORMATS, + }); + + const track = (await input.subtitleTracks)[0]!; + + // Check if original has comments + const originalAss = await readFile('test/public/subtitles/test.ass', 'utf-8'); + const hasComments = originalAss.includes('Comment:'); + + if (hasComments) { + const assText = await track.exportToText('ass'); + expect(assText).toContain('Comment:'); + } + }); +}); diff --git a/test/node/subtitle-parsing.test.ts b/test/node/subtitle-parsing.test.ts new file mode 100644 index 00000000..22f2a095 --- /dev/null +++ b/test/node/subtitle-parsing.test.ts @@ -0,0 +1,205 @@ +/*! + * Copyright (c) 2025-present, Vanilagy and contributors + * + * This Source Code Form is subject to the terms of the Mozilla Public + * License, v. 2.0. If a copy of the MPL was not distributed with this + * file, You can obtain one at https://mozilla.org/MPL/2.0/. 
+ */ + +import { describe, it, expect } from 'vitest'; +import { + parseSrtTimestamp, + formatSrtTimestamp, + splitSrtIntoCues, + formatCuesToSrt, + parseAssTimestamp, + formatAssTimestamp, + splitAssIntoCues, + formatCuesToAss, + SubtitleCue, +} from '../../src/subtitles.js'; + +describe('SRT Timestamp Parsing', () => { + it('should parse SRT timestamp format', () => { + expect(parseSrtTimestamp('00:00:01,000')).toBe(1.0); + expect(parseSrtTimestamp('00:00:03,500')).toBe(3.5); + expect(parseSrtTimestamp('01:23:45,678')).toBe(5025.678); + }); + + it('should format seconds to SRT timestamp', () => { + expect(formatSrtTimestamp(1.0)).toBe('00:00:01,000'); + expect(formatSrtTimestamp(3.5)).toBe('00:00:03,500'); + expect(formatSrtTimestamp(5025.678)).toBe('01:23:45,678'); + }); +}); + +describe('SRT Splitting', () => { + it('should split SRT text into cues', () => { + const srt = `1 +00:00:01,000 --> 00:00:03,500 +Hello world! + +2 +00:00:05,000 --> 00:00:07,000 +Goodbye!`; + + const cues = splitSrtIntoCues(srt); + + expect(cues).toHaveLength(2); + expect(cues[0]).toMatchObject({ + timestamp: 1.0, + duration: 2.5, + text: 'Hello world!', + }); + expect(cues[1]).toMatchObject({ + timestamp: 5.0, + duration: 2.0, + text: 'Goodbye!', + }); + }); + + it('should handle multi-line subtitle text', () => { + const srt = `1 +00:00:01,000 --> 00:00:03,500 +Line 1 +Line 2 +Line 3 + +2 +00:00:05,000 --> 00:00:07,000 +Single line`; + + const cues = splitSrtIntoCues(srt); + expect(cues[0]!.text).toBe('Line 1\nLine 2\nLine 3'); + expect(cues[1]!.text).toBe('Single line'); + }); + + it('should handle SRT with missing sequence numbers', () => { + const srt = `1 +00:00:01,000 --> 00:00:03,500 +Text one + +3 +00:00:05,000 --> 00:00:07,000 +Text two`; + + const cues = splitSrtIntoCues(srt); + expect(cues).toHaveLength(2); + }); +}); + +describe('SRT Formatting', () => { + it('should format cues back to SRT', () => { + const cues: SubtitleCue[] = [ + { timestamp: 1.0, duration: 2.5, text: 
'Hello' }, + { timestamp: 5.0, duration: 2.0, text: 'Goodbye' }, + ]; + + const srt = formatCuesToSrt(cues); + + expect(srt).toContain('1\n00:00:01,000 --> 00:00:03,500\nHello'); + expect(srt).toContain('2\n00:00:05,000 --> 00:00:07,000\nGoodbye'); + }); + + it('should preserve multi-line text', () => { + const cues: SubtitleCue[] = [ + { timestamp: 1.0, duration: 2.5, text: 'Line 1\nLine 2' }, + ]; + + const srt = formatCuesToSrt(cues); + expect(srt).toContain('Line 1\nLine 2'); + }); +}); + +describe('ASS Timestamp Parsing', () => { + it('should parse ASS timestamp format', () => { + expect(parseAssTimestamp('0:00:01.00')).toBe(1.0); + expect(parseAssTimestamp('0:00:03.50')).toBe(3.5); + expect(parseAssTimestamp('1:23:45.67')).toBe(5025.67); + }); + + it('should format seconds to ASS timestamp', () => { + expect(formatAssTimestamp(1.0)).toBe('0:00:01.00'); + expect(formatAssTimestamp(3.5)).toBe('0:00:03.50'); + expect(formatAssTimestamp(5025.67)).toBe('1:23:45.67'); + }); +}); + +describe('ASS Splitting', () => { + it('should split ASS into header and cues', () => { + const ass = `[Script Info] +Title: Test + +[V4+ Styles] +Style: Default,Arial,20 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello world! 
+Dialogue: 0,0:00:05.00,0:00:07.00,Default,,0,0,0,,Goodbye!`; + + const { header, cues } = splitAssIntoCues(ass); + + expect(header).toContain('[Script Info]'); + expect(header).toContain('[V4+ Styles]'); + expect(header).toContain('[Events]'); + expect(header).toContain('Format:'); + expect(cues).toHaveLength(2); + expect(cues[0]!.timestamp).toBe(1.0); + expect(cues[0]!.duration).toBe(2.5); + }); + + it('should preserve full dialogue line in cue text', () => { + const ass = `[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,{\\pos(320,240)}Styled text`; + + const { cues } = splitAssIntoCues(ass); + expect(cues[0]!.text).toContain('Dialogue:'); + expect(cues[0]!.text).toContain('{\\pos(320,240)}Styled text'); + }); + + it('should handle SSA format (v4.00)', () => { + const ssa = `[Script Info] +Title: Test +ScriptType: v4.00 + +[V4 Styles] +Style: Default,Arial,20 + +[Events] +Format: Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: Marked=0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello`; + + const { header, cues } = splitAssIntoCues(ssa); + expect(header).toContain('[V4 Styles]'); + expect(cues).toHaveLength(1); + }); +}); + +describe('ASS Formatting', () => { + it('should format cues back to ASS with header', () => { + const header = `[Script Info] +Title: Test + +[V4+ Styles] +Style: Default,Arial,20 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text`; + + const cues: SubtitleCue[] = [ + { + timestamp: 1.0, + duration: 2.5, + text: 'Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello', + }, + ]; + + const ass = formatCuesToAss(cues, header); + + expect(ass).toContain('[Script Info]'); + expect(ass).toContain('[V4+ Styles]'); + expect(ass).toContain('Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello'); + }); +}); diff --git a/test/public/subtitles/test-converted.ttml 
b/test/public/subtitles/test-converted.ttml new file mode 100644 index 00000000..0708b067 --- /dev/null +++ b/test/public/subtitles/test-converted.ttml @@ -0,0 +1,34 @@ + + + + + + + + +

+ Hello world!
+ This is a test.
+ Goodbye!
+ + diff --git a/test/public/subtitles/test-mkv-ass-fonts.mkv b/test/public/subtitles/test-mkv-ass-fonts.mkv new file mode 100644 index 00000000..1b4f6569 Binary files /dev/null and b/test/public/subtitles/test-mkv-ass-fonts.mkv differ diff --git a/test/public/subtitles/test-mkv-ass.mkv b/test/public/subtitles/test-mkv-ass.mkv new file mode 100644 index 00000000..93771b34 Binary files /dev/null and b/test/public/subtitles/test-mkv-ass.mkv differ diff --git a/test/public/subtitles/test-mkv-multi.mkv b/test/public/subtitles/test-mkv-multi.mkv new file mode 100644 index 00000000..8cd2a25a Binary files /dev/null and b/test/public/subtitles/test-mkv-multi.mkv differ diff --git a/test/public/subtitles/test-mkv-srt.mkv b/test/public/subtitles/test-mkv-srt.mkv new file mode 100644 index 00000000..7512523e Binary files /dev/null and b/test/public/subtitles/test-mkv-srt.mkv differ diff --git a/test/public/subtitles/test-mkv-ssa.mkv b/test/public/subtitles/test-mkv-ssa.mkv new file mode 100644 index 00000000..51fdeb4e Binary files /dev/null and b/test/public/subtitles/test-mkv-ssa.mkv differ diff --git a/test/public/subtitles/test-mkv-vtt.mkv b/test/public/subtitles/test-mkv-vtt.mkv new file mode 100644 index 00000000..89802201 Binary files /dev/null and b/test/public/subtitles/test-mkv-vtt.mkv differ diff --git a/test/public/subtitles/test-mov-ttml.mov b/test/public/subtitles/test-mov-ttml.mov new file mode 100644 index 00000000..49cb93a4 Binary files /dev/null and b/test/public/subtitles/test-mov-ttml.mov differ diff --git a/test/public/subtitles/test-mov-tx3g.mov b/test/public/subtitles/test-mov-tx3g.mov new file mode 100644 index 00000000..d3ff965e Binary files /dev/null and b/test/public/subtitles/test-mov-tx3g.mov differ diff --git a/test/public/subtitles/test-mp4-ttml.mp4 b/test/public/subtitles/test-mp4-ttml.mp4 new file mode 100644 index 00000000..49cb93a4 Binary files /dev/null and b/test/public/subtitles/test-mp4-ttml.mp4 differ diff --git 
a/test/public/subtitles/test-mp4-tx3g.mp4 b/test/public/subtitles/test-mp4-tx3g.mp4 new file mode 100644 index 00000000..af69be07 Binary files /dev/null and b/test/public/subtitles/test-mp4-tx3g.mp4 differ diff --git a/test/public/subtitles/test-mp4-webvtt.mp4 b/test/public/subtitles/test-mp4-webvtt.mp4 new file mode 100644 index 00000000..05d8943b Binary files /dev/null and b/test/public/subtitles/test-mp4-webvtt.mp4 differ diff --git a/test/public/subtitles/test-video.mp4 b/test/public/subtitles/test-video.mp4 new file mode 100644 index 00000000..a8065d10 Binary files /dev/null and b/test/public/subtitles/test-video.mp4 differ diff --git a/test/public/subtitles/test-with-fonts.ass b/test/public/subtitles/test-with-fonts.ass new file mode 100644 index 00000000..84679fd5 --- /dev/null +++ b/test/public/subtitles/test-with-fonts.ass @@ -0,0 +1,20 @@ +[Script Info] +Title: Test with Embedded Font +ScriptType: v4.00+ +PlayResX: 1280 +PlayResY: 720 + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,CustomFont,48,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Text with custom font +Comment: 0,0:00:02.00,0:00:02.50,Default,,0,0,0,,This is a comment +[Fonts] +fontname: CustomFont + +[Graphics] +filename: logo.png +0 diff --git a/test/public/subtitles/test.ass b/test/public/subtitles/test.ass new file mode 100644 index 00000000..e6c21200 --- /dev/null +++ b/test/public/subtitles/test.ass @@ -0,0 +1,15 @@ +[Script Info] +Title: Test Subtitles +ScriptType: v4.00+ +PlayResX: 1280 +PlayResY: 720 + +[V4+ Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, 
OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding +Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,0,0,0,0,100,100,0,0,1,2,0,2,10,10,10,1 + +[Events] +Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: 0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello world! +Dialogue: 0,0:00:05.00,0:00:07.00,Default,,0,0,0,,This is a test. +Dialogue: 0,0:00:08.50,0:00:10.00,Default,,0,0,0,,Goodbye! diff --git a/test/public/subtitles/test.srt b/test/public/subtitles/test.srt new file mode 100644 index 00000000..0245330a --- /dev/null +++ b/test/public/subtitles/test.srt @@ -0,0 +1,11 @@ +1 +00:00:01,000 --> 00:00:03,500 +Hello world! + +2 +00:00:05,000 --> 00:00:07,000 +This is a test. + +3 +00:00:08,500 --> 00:00:10,000 +Goodbye! diff --git a/test/public/subtitles/test.ssa b/test/public/subtitles/test.ssa new file mode 100644 index 00000000..93880e90 --- /dev/null +++ b/test/public/subtitles/test.ssa @@ -0,0 +1,13 @@ +[Script Info] +Title: Test Subtitles +ScriptType: v4.00 + +[V4 Styles] +Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, TertiaryColour, BackColour, Bold, Italic, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, AlphaLevel, Encoding +Style: Default,Arial,20,16777215,65535,65535,0,0,0,1,2,0,2,10,10,10,0,1 + +[Events] +Format: Marked, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text +Dialogue: Marked=0,0:00:01.00,0:00:03.50,Default,,0,0,0,,Hello world! +Dialogue: Marked=0,0:00:05.00,0:00:07.00,Default,,0,0,0,,This is a test. +Dialogue: Marked=0,0:00:08.50,0:00:10.00,Default,,0,0,0,,Goodbye! diff --git a/test/public/subtitles/test.ttml b/test/public/subtitles/test.ttml new file mode 100644 index 00000000..8a57951b --- /dev/null +++ b/test/public/subtitles/test.ttml @@ -0,0 +1,14 @@ + + + + +