Open
Labels
bug: Something isn't working
Description
Bug Description
I tried building ASR systems on the standard LibriSpeech-100h task using the torchaudio CTC decoder, which uses the flashlight/text library as its decoding backend. My subword (BPE) based setups worked fine, but the phoneme-based ones did not.
The standard LibriSpeech lexicon includes, for example, these seven words, which all get the same phone sequence in ARPAbet notation:
BAE B AY#
BAI B AY#
BI B AY#
BUY B AY#
BY B AY#
BY' B AY#
BYE B AY#
As a result, the word BY, for example, was no longer recognized.
In the log I get the message:
[Trie] Trie label number reached limit: 6
The message correctly reports that the limit was applied, but the limit is very low and cannot be changed without recompiling. The message also did not look like a serious issue to me at first.
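To check whether a given lexicon is affected before decoding, something like the following could be used. It is only a sketch: it groups words by phone sequence and reports every group with more than 6 spellings. The function name `find_overfull_homophones` and the constant `TRIE_MAX_LABEL` are made up for illustration; the value 6 is taken from the log message, and I am assuming the limit applies per phone sequence.

```python
from collections import defaultdict

# Limit taken from the "[Trie] Trie label number reached limit: 6" message.
# Assumption: it counts the words that share one phone sequence.
TRIE_MAX_LABEL = 6


def find_overfull_homophones(lexicon_lines, limit=TRIE_MAX_LABEL):
    """Group words by pronunciation and return the groups larger than `limit`."""
    words_by_phones = defaultdict(list)
    for line in lexicon_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip empty lines and entries without a pronunciation
        word, phones = parts[0], tuple(parts[1:])
        if word not in words_by_phones[phones]:
            words_by_phones[phones].append(word)
    return {p: w for p, w in words_by_phones.items() if len(w) > limit}


# The seven entries quoted in this issue.
lexicon = [
    "BAE B AY#",
    "BAI B AY#",
    "BI B AY#",
    "BUY B AY#",
    "BY B AY#",
    "BY' B AY#",
    "BYE B AY#",
]

overfull = find_overfull_homophones(lexicon)
for phones, words in overfull.items():
    extra = len(words) - TRIE_MAX_LABEL
    print(" ".join(phones), "->", words, f"({extra} over the limit)")
```

For a real lexicon file, the lines can be passed in directly, e.g. `find_overfull_homophones(open(path, encoding="utf-8"))`, where `path` is wherever the lexicon is stored.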
Reproduction Steps
- Use the torchaudio ctc_decoder with a phoneme-based lexicon in which more than 6 words share the same phone sequence.
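A minimal setup for this step might look like the sketch below. It writes a lexicon with the seven homophones quoted in the description plus a matching tokens file, and then builds a lexicon-based decoder without a language model. The file names and the helpers `write_repro_files` and `build_decoder` are hypothetical; `-` and `|` are the default blank and silence tokens of `ctc_decoder`. I assume the trie message is printed while the decoder is constructed, but have not verified this with exactly these files.

```python
import tempfile
from pathlib import Path

# The seven words from the issue that all map to "B AY#".
HOMOPHONES = ["BAE", "BAI", "BI", "BUY", "BY", "BY'", "BYE"]


def write_repro_files(directory):
    """Write a tiny lexicon and tokens file (hypothetical file names)."""
    directory = Path(directory)
    lexicon_path = directory / "lexicon.txt"
    tokens_path = directory / "tokens.txt"
    lexicon_path.write_text("".join(f"{word} B AY#\n" for word in HOMOPHONES))
    # "-" (blank) and "|" (silence) are the ctc_decoder defaults.
    tokens_path.write_text("\n".join(["-", "|", "B", "AY#"]) + "\n")
    return lexicon_path, tokens_path


def build_decoder(lexicon_path, tokens_path):
    """Build a lexicon-constrained decoder without a language model."""
    # Imported lazily so the file-writing part works without torchaudio.
    from torchaudio.models.decoder import ctc_decoder

    return ctc_decoder(lexicon=str(lexicon_path), tokens=str(tokens_path), lm=None)


with tempfile.TemporaryDirectory() as tmp:
    lexicon_path, tokens_path = write_repro_files(tmp)
    lexicon_text = lexicon_path.read_text()
    tokens_text = tokens_path.read_text()
    # With torchaudio and flashlight-text installed, uncomment to build the decoder:
    # decoder = build_decoder(lexicon_path, tokens_path)
```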
tshmak