parakeet-tdt_ctc-110m model rnnt loss scale is strange #14707
AI generated, please verify. You're right to question this configuration. In the parakeet-tdt_ctc-110m model, using different reduction methods for the two loss components creates an imbalance: the TDT (RNNT) loss is reduced with "mean-volume" while the CTC loss is reduced with "mean-batch", so the two terms are normalized on different scales.
This discrepancy likely gives the CTC component more influence than intended, potentially undermining TDT performance. Standard practice is to use the same reduction method for both components to keep training balanced. This appears to be unusual compared to other hybrid TDT-CTC models in the NeMo repository, which typically use consistent reduction methods. Without knowing the specific training objectives, it's hard to determine whether this was intentional or an oversight, but your concern about sub-optimal TDT performance is valid.
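For context, here is a minimal sketch of what the two reduction modes plausibly compute, assuming "mean-batch" averages the per-utterance losses over the batch while "mean-volume" divides the summed losses by the total number of target tokens (a common reading of these names; the exact NeMo semantics may differ):

```python
import torch

# Hypothetical per-utterance losses and target lengths for a batch of 3.
per_utt_loss = torch.tensor([120.0, 300.0, 60.0])  # summed per-utterance losses
target_lens = torch.tensor([40, 100, 20])          # target tokens per utterance

# "mean-batch": average loss per utterance.
mean_batch = per_utt_loss.mean()                      # 480 / 3 = 160.0

# "mean-volume": average loss per target token over the whole batch.
mean_volume = per_utt_loss.sum() / target_lens.sum()  # 480 / 160 = 3.0

print(mean_batch.item(), mean_volume.item())
```

Under these assumptions the two reductions differ by roughly the average target length per utterance, so a loss reduced with "mean-volume" comes out one to two orders of magnitude smaller than the same loss reduced with "mean-batch".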
model_config.yaml
The model config for nvidia/parakeet-tdt_ctc-110m looks a bit odd to me. The attached model_config.yaml was extracted from the .nemo file.
I think the scaling between the TDT loss and the CTC loss is off.
Currently, rnnt_reduction is set to "mean-volume", while ctc_reduction is "mean-batch".
That setup looks like it would push training toward a pure CTC model rather than a TDT model.
This may not be a critical issue, but the TDT performance could be sub-optimal.
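To make the concern concrete, here is a hedged sketch of how the mismatch could skew a weighted hybrid objective; the ctc_loss_weight name and all values are assumptions for illustration, not read from the actual config:

```python
# Illustrative magnitudes, reusing the scales from the sketch above:
tdt_loss = 3.0    # TDT/RNNT loss after "mean-volume" (per-token scale)
ctc_loss = 160.0  # CTC loss after "mean-batch" (per-utterance scale)

# Hypothetical hybrid combination with an assumed weight:
ctc_loss_weight = 0.3
total = (1.0 - ctc_loss_weight) * tdt_loss + ctc_loss_weight * ctc_loss

# TDT contribution: 0.7 * 3.0   =  2.1
# CTC contribution: 0.3 * 160.0 = 48.0
print(total)  # 50.1 -- dominated by the CTC term despite the smaller weight
```

If the scales really do differ like this, the CTC term would dominate the gradient even with a modest weight, which is the imbalance I am asking about.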
Is there a specific reason why the losses are defined this way?
(parakeet-tdt_ctc-1.1b was also trained with rnnt_reduction set to "mean-volume".)