When detecting multiple languages in a trivial mixed-language example where one language is Chinese, lingua sometimes splits the text correctly and sometimes does not. I expected lingua to use a rule-based approach here, since hanzi always indicate a non-English language.
Is this expected behavior? In my testing with other languages that do not use the Latin script, such as Russian, I have not run into this issue.
Example for reproducing
    use lingua::DetectionResult;
    use lingua::Language::{Chinese, English};
    use lingua::LanguageDetectorBuilder;

    fn main() {
        let languages = vec![English, Chinese];
        let detector = LanguageDetectorBuilder::from_languages(&languages).build();

        // Both languages are detected here.
        let sentence = "Hello world. 你好世界";
        let results: Vec<DetectionResult> = detector.detect_multiple_languages_of(sentence);
        assert_eq!(results.len(), 2);

        // Only one result is returned here, although the sentence also contains Chinese.
        let sentence2 = "Hello my name is bob. 你好世界";
        let results2: Vec<DetectionResult> = detector.detect_multiple_languages_of(sentence2);
        assert_eq!(results2.len(), 1);
    }
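For illustration, here is a minimal sketch of the kind of rule-based check described above: splitting the input into Han and non-Han runs before any statistical detection. It uses only the standard library. The names `Script`, `is_han`, and `split_by_script` are hypothetical helpers, not part of lingua's API, and the code point ranges cover only the most common CJK ideograph blocks.

```rust
/// Coarse script classes for the split (hypothetical type, not part of lingua).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Script {
    Han,
    Other,
}

/// Returns true for code points in the most common CJK ideograph blocks.
/// Assumption: rarer extension blocks are ignored in this sketch.
fn is_han(c: char) -> bool {
    matches!(
        c as u32,
        0x4E00..=0x9FFF        // CJK Unified Ideographs
        | 0x3400..=0x4DBF      // CJK Unified Ideographs Extension A
        | 0x20000..=0x2A6DF    // CJK Unified Ideographs Extension B
        | 0xF900..=0xFAFF      // CJK Compatibility Ideographs
    )
}

/// Splits `text` into maximal runs of Han and non-Han characters.
/// Whitespace and ASCII punctuation never start a new run, so they stay
/// attached to the run that precedes them.
fn split_by_script(text: &str) -> Vec<(Script, &str)> {
    let mut runs = Vec::new();
    let mut start = 0;
    let mut current: Option<Script> = None;

    for (idx, c) in text.char_indices() {
        if c.is_whitespace() || c.is_ascii_punctuation() {
            continue;
        }
        let script = if is_han(c) { Script::Han } else { Script::Other };
        match current {
            // Same script as the current run: keep extending it.
            Some(s) if s == script => {}
            // Script changed: close the current run and open a new one.
            Some(s) => {
                runs.push((s, &text[start..idx]));
                start = idx;
                current = Some(script);
            }
            // First non-neutral character: the first run starts at offset 0.
            None => current = Some(script),
        }
    }

    // Close the last open run, if the text contained any non-neutral character.
    if let Some(s) = current {
        runs.push((s, &text[start..]));
    }
    runs
}
```

Each run could then be passed to the detector separately. Treating a Han run as Chinese outright is only safe when Chinese is the sole configured language written with Han characters, because Japanese text also contains them.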
Related to #463
Environment
I am running
lingua = "1.7.1"