Add mean pooling strategy for Modernbert classifier #616

kwnath · 2025-05-27T16:39:01Z

What does this PR do?

Apologies in advance, I'm new to this area and to Rust 😅.

I noticed differences between Transformers and TEI libraries when running rerankers with a mean classifier pooling strategy. More details here.
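For readers unfamiliar with the two strategies, here is a minimal sketch of how CLS and mean pooling differ on the same hidden states. It is not the TEI or Transformers implementation: `cls_pool`, `mean_pool`, and the plain `Vec<Vec<f32>>` layout are hypothetical names and types chosen for illustration, and the input is assumed to be non-empty.

```rust
/// Hidden states for one sequence: one row per token, one column per hidden dimension.
/// (Illustrative layout only; real backends use tensors.)
type Hidden = Vec<Vec<f32>>;

/// CLS pooling: take the hidden state of the first token.
fn cls_pool(hidden: &Hidden) -> Vec<f32> {
    hidden[0].clone()
}

/// Mean pooling: average the hidden states of all tokens whose mask entry is 1.
fn mean_pool(hidden: &Hidden, attention_mask: &[u8]) -> Vec<f32> {
    let dim = hidden[0].len();
    let mut sum = vec![0.0f32; dim];
    let mut count = 0.0f32;
    for (row, &m) in hidden.iter().zip(attention_mask) {
        if m == 1 {
            for (s, v) in sum.iter_mut().zip(row) {
                *s += v;
            }
            count += 1.0;
        }
    }
    // Guard against an all-zero mask to avoid dividing by zero.
    sum.iter().map(|s| s / count.max(1.0)).collect()
}

fn main() {
    // Three real tokens followed by one padding token (mask = 0).
    let hidden = vec![
        vec![1.0, 2.0],
        vec![3.0, 4.0],
        vec![5.0, 6.0],
        vec![100.0, 100.0],
    ];
    let mask = [1, 1, 1, 0];

    println!("cls  = {:?}", cls_pool(&hidden)); // cls  = [1.0, 2.0]
    println!("mean = {:?}", mean_pool(&hidden, &mask)); // mean = [3.0, 4.0]
}
```

Because the classifier head consumes the pooled vector, a model trained with mean pooling would be expected to score differently when the backend silently uses CLS pooling instead.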

Fixes #615

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Not sure who might be best to review 😄 @OlivierDehaene OR @Narsil

kwnath · 2025-05-27T16:40:03Z

backends/candle/tests/test_modernbert.rs

@@ -202,3 +202,61 @@ fn test_modernbert_classification() -> Result<()> {

    Ok(())
 }
+
+#[test]


I'm having a bit of trouble running this test (unable to download the HF model, hitting a 429; I'm just lacking an HF token). I'm not super pleased with this test, but it's only there to check that cls and mean pooling produce different outputs, as expected.

How about just testing with the reranker-ModernBERT-large-gooaq-bce model directly, which uses mean classifier pooling, instead of the gte-reranker model with mean pooling?

I think we can check whether mean classifier pooling works correctly by running the reranker-ModernBERT-gooaq-bce model, since cls classifier pooling has already been verified in the test above (implicitly assuming cls and mean behave differently).

Regarding "unable to download the HF model, hitting a 429": when I ran into that issue, I manually downloaded the model via git or huggingface-cli to my local machine and then set HUGGINGFACE_HUB_CACHE to point to that local path.

That sounds like a good idea, will try this!

It couldn't find the model in the cache, but I managed to get it working by exporting a Hugging Face token, all good!

successes:
    test_modernbert_classification_mean_pooling

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 3 filtered out; finished in 3.94s

kwnath · 2025-05-28T11:46:48Z

backends/candle/src/models/flash_modernbert.rs

@@ -260,7 +260,15 @@ impl FlashModernBertModel {

        let (pool, classifier) = match model_type {
            ModelType::Classifier => {
-                let pool = Pool::Cls;
+                let pool = if let Some(pooling) = &config.classifier_pooling {


It might make more sense to do this as shown below:

    ModelType::Classifier(pool) => {
        if pool == Pool::Splade {
            candle::bail!("`splade` is not supported for ModernBert")
        }
        if pool == Pool::LastToken {
            candle::bail!("`LastToken` is not supported for ModernBert")
        }
    }

But this would be a bigger change that I'm not as confident about.

I'm also not confident about this. Maybe we could consider implementing std::str::FromStr for the Pool enum, like below. It would be great if the maintainers could provide some guidance or feedback on this :)

    impl std::str::FromStr for Pool {
        // FromStr requires an associated error type.
        type Err = ();

        fn from_str(s: &str) -> Result<Self, Self::Err> {
            match s {
                "cls" => Ok(Pool::Cls),
                "mean" => Ok(Pool::Mean),
                "splade" => Ok(Pool::Splade),
                "last_token" => Ok(Pool::LastToken),
                _ => Err(()),
            }
        }
    }
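To make the two suggestions in this thread concrete, here is a self-contained sketch that combines them: parse an optional pooling string via `FromStr`, default to `Pool::Cls` when it is absent, and reject strategies the thread describes as unsupported for ModernBert. This is not the code that was merged. The local `Pool` enum, the `classifier_pool` helper, and the `String` error type are assumptions made so the example compiles on its own; the real backend uses its own types and `candle::bail!`.

```rust
use std::str::FromStr;

// Stand-in for the backend's pooling enum (assumed variants, taken from the thread).
#[derive(Debug, Clone, PartialEq)]
enum Pool {
    Cls,
    Mean,
    Splade,
    LastToken,
}

impl FromStr for Pool {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "cls" => Ok(Pool::Cls),
            "mean" => Ok(Pool::Mean),
            "splade" => Ok(Pool::Splade),
            "last_token" => Ok(Pool::LastToken),
            other => Err(format!("unknown pooling strategy `{other}`")),
        }
    }
}

/// Hypothetical helper: choose the classifier pooling strategy from an
/// optional config value, falling back to CLS when none is set.
fn classifier_pool(classifier_pooling: Option<&str>) -> Result<Pool, String> {
    let pool = match classifier_pooling {
        Some(s) => s.parse::<Pool>()?,
        None => Pool::Cls,
    };
    // Mirror the validation suggested in the thread for unsupported strategies.
    if matches!(pool, Pool::Splade | Pool::LastToken) {
        return Err(format!("`{pool:?}` is not supported for ModernBert"));
    }
    Ok(pool)
}

fn main() {
    println!("{:?}", classifier_pool(Some("mean")));
    println!("{:?}", classifier_pool(None));
    println!("{:?}", classifier_pool(Some("splade")));
}
```

Using a descriptive error type instead of `()` keeps the failure message available to the caller, which may be useful when a model config contains an unexpected value.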

kozistr

I'm just a passerby, lol. I left some comments in the hope that they help in some way :) Thanks!

Add different pooling stategies for Modernbert classifier

dbbbf24

kozistr reviewed May 29, 2025

kwnath added 7 commits May 29, 2025 11:34

update test to use reranker model

683d0db

Add test snapshot

9832f20

update snapshots

665294d

Remove unneeded snapshot

95f90ac

Add new snapshot for classifier mean pooling strat

b07060d

Update pooling method

aed7457

remove use std::str::FromStr;

3ab78be
