Tokenizer Warning when Loading DPR #2935
-
So, I'm building a LFQA pipeline, and when the load the models inside DPR it gives some tokenizer warning(mismatch tokenizer). I wanted to ask if it's something that might cause problem in the retrieving process? Code:
Warnings:
|
Beta Was this translation helpful? Give feedback.
Replies: 2 comments
-
Hi, @Zeni69 I'm not entirely sure what is causing this error to be thrown. It is an error that comes from the HuggingFace library and it's unclear why there is a mismatch. You can check that the Tokenizer classes for both the type(retriever.query_tokenizer)
# transformers.models.dpr.tokenization_dpr_fast.DPRQuestionEncoderTokenizerFast gives the expected type(retriever.passage_tokenizer)
# transformers.models.dpr.tokenization_dpr_fast.DPRContextEncoderTokenizerFast gives the expected So to answer your question you are doing everything correctly and everything is being loaded properly despite the warning being thrown by HuggingFace. We will have to investigate further why this might be occurring. |
Beta Was this translation helpful? Give feedback.
-
You've probably hit this known issue in HF Transformers: huggingface/transformers#12926. It's just a misleading log message, nothing wrong is actually happening behind the scenes. |
Beta Was this translation helpful? Give feedback.
You've probably hit this known issue in HF Transformers: huggingface/transformers#12926. It's just a misleading log message, nothing wrong is actually happening behind the scenes.