
Fix {Bert,DistilBert}SpladeHead when loading from Safetensors #564


Merged
merged 5 commits into main from patch-splade-from-safetensors
Apr 8, 2025

Conversation

alvarobartt
Member

@alvarobartt alvarobartt commented Apr 7, 2025

What does this PR do?

This PR fixes an issue that prevented loading BERT and DistilBERT models with SPLADE pooling. When a pytorch_model.bin file is converted into a model.safetensors file, tensors whose contents share memory with another tensor are removed for safety, so the weights required for the SPLADE head were missing. Support for SPLADE was originally introduced for the models at https://huggingface.co/naver, which are indeed distributed as pytorch_model.bin files.

To work around this, this PR adds a check on whether the required tensors are present and, if not, falls back to the tensor that shares memory with them instead.
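A minimal sketch of that fallback logic in Python (the actual fix lives in the Rust model-loading code of text-embeddings-inference; the helper function and the hard-coded weight value below are purely illustrative, while the tensor names come from the error message in this PR):

```python
# Illustrative sketch of the fallback: if the SPLADE head tensor was dropped
# during safetensors conversion, use the tensor it shared memory with.
def get_tensor_with_fallback(tensors: dict, name: str, fallback: str):
    """Return tensors[name] if present, else the surviving shared-memory alias."""
    if name in tensors:
        return tensors[name]
    # safetensors drops tensors that share memory, so only the alias
    # (e.g. the tied embedding weights) survives in the file.
    return tensors[fallback]

# The converted file only keeps the embedding weights, not the projector.
weights = {"distilbert.embeddings.word_embeddings.weight": "W"}
w = get_tensor_with_fallback(
    weights,
    "vocab_projector.weight",
    "distilbert.embeddings.word_embeddings.weight",
)
```

Here `w` resolves to the shared embedding weights, which is exactly what the tied `vocab_projector.weight` pointed at before conversion.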

To reproduce the issue, grab any model under https://huggingface.co/naver, e.g. naver/efficient-splade-V-large-query, download the pytorch_model.bin file, and then convert it into a model.safetensors file with the following script:

import torch
from safetensors.torch import save_file

# Load the original PyTorch checkpoint on CPU and make every tensor
# contiguous before serializing the state dict to safetensors.
model_state_dict = torch.load("pytorch_model.bin", map_location=torch.device("cpu"))
contiguous_state_dict = {k: v.contiguous() for k, v in model_state_dict.items()}

save_file(contiguous_state_dict, "model.safetensors")

Then the following error will be raised:

Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'distilbert.embeddings.word_embeddings.weight', 'vocab_projector.weight'}].
A potential way to correctly save your model is to use `save_model`.
More information at https://huggingface.co/docs/safetensors/torch_shared_tensors

And indeed, if we inspect the model.safetensors metadata, we'll see that the tensor that shares memory with a previous tensor is not there.
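The metadata check can be done with nothing but the standard library, since a safetensors file starts with an 8-byte little-endian header length followed by a JSON header mapping tensor names to their metadata (normally you would just call `safetensors.safe_open(...).keys()`). The sketch below builds a tiny in-memory file containing only the surviving tensor, then reads the names back from its header; the tensor names are taken from the error message above:

```python
import json
import struct

# Build a minimal safetensors blob by hand: 8-byte little-endian header
# length, JSON header (name -> dtype/shape/offsets), then raw tensor bytes.
# Only the surviving tensor is present, because safetensors drops aliases
# that share memory with it.
data = bytes(8 * 4)  # eight little-endian f32 zeros, i.e. a 4x2 tensor
header = {
    "distilbert.embeddings.word_embeddings.weight": {
        "dtype": "F32",
        "shape": [4, 2],
        "data_offsets": [0, len(data)],
    }
}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + data

# Read the tensor names back from the header, as safe_open would.
(n,) = struct.unpack("<Q", blob[:8])
names = set(json.loads(blob[8 : 8 + n]))

print("vocab_projector.weight" in names)  # False: the tied tensor is gone
```

Running the same key listing against a real converted naver checkpoint shows the same thing: the tied projector weight is absent while the embedding weights remain.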

Fixes #548

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@Narsil or @McPatate

Collaborator

@Narsil Narsil left a comment


LGTM. These loads from many different locations are getting quite annoying; at some point we should figure out a nicer way to abstract them.

But this looks good.

@Narsil Narsil merged commit 3c50308 into main Apr 8, 2025
14 checks passed
@Narsil Narsil deleted the patch-splade-from-safetensors branch April 8, 2025 08:44
Development

Successfully merging this pull request may close these issues.

cannot find tensor cls.predictions.decoder.weight