Commit d1d65cf

Default to Qwen3 in README.md and docs/ examples (#641)
1 parent a69cc2e commit d1d65cf

File tree: 7 files changed (+30 −29 lines)


README.md

Lines changed: 6 additions & 6 deletions
@@ -110,7 +110,7 @@ Below are some examples of the currently supported models:
 ### Docker
 
 ```shell
-model=BAAI/bge-large-en-v1.5
+model=Qwen/Qwen3-Embedding-0.6B
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
 
 docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.7 --model-id $model
@@ -369,13 +369,13 @@ cd models
 
 # Make sure you have git-lfs installed (https://git-lfs.com)
 git lfs install
-git clone https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5
+git clone https://huggingface.co/Qwen/Qwen3-Embedding-0.6B
 
 # Set the models directory as the volume path
 volume=$PWD
 
 # Mount the models directory inside the container with a volume and set the model ID
-docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.7 --model-id /data/gte-base-en-v1.5
+docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.7 --model-id /data/Qwen3-Embedding-0.6B
 ```
 
 ### Using Re-rankers models
@@ -458,7 +458,7 @@ found [here](https://github.com/huggingface/text-embeddings-inference/blob/main/
 You can use the gRPC API by adding the `-grpc` tag to any TEI Docker image. For example:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
+model=Qwen/Qwen3-Embedding-0.6B
 volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run
 
 docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.7-grpc --model-id $model
@@ -494,7 +494,7 @@ cargo install --path router -F metal
 You can now launch Text Embeddings Inference on CPU with:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
+model=Qwen/Qwen3-Embedding-0.6B
 
 text-embeddings-router --model-id $model --port 8080
 ```
@@ -532,7 +532,7 @@ cargo install --path router -F candle-cuda -F http --no-default-features
 You can now launch Text Embeddings Inference on GPU with:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
+model=Qwen/Qwen3-Embedding-0.6B
 
 text-embeddings-router --model-id $model --port 8080
 ```
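Note on exercising the gRPC hunk above: once the `:1.7-grpc` container is running with the new default model, a single call confirms it end to end. This is a minimal sketch, assuming `grpcurl` is installed and that the service method is `tei.v1.Embed/Embed` as defined in the proto file shipped in the TEI repository; adjust the path if your proto differs.

```shell
# Send one embedding request to the gRPC server started above.
# Assumes plaintext (no TLS) and the 8080:80 port mapping from the docker run command.
grpcurl -d '{"inputs": "What is Deep Learning?"}' -plaintext 127.0.0.1:8080 tei.v1.Embed/Embed
```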

docs/source/en/intel_container.md

Lines changed: 3 additions & 3 deletions
@@ -35,7 +35,7 @@ docker build . -f Dockerfile-intel --build-arg PLATFORM=$platform -t tei_cpu_ipex
 To deploy your model on an Intel® CPU, use the following command:
 
 ```shell
-model='BAAI/bge-large-en-v1.5'
+model='Qwen/Qwen3-Embedding-0.6B'
 volume=$PWD/data
 
 docker run -p 8080:80 -v $volume:/data tei_cpu_ipex --model-id $model
@@ -58,7 +58,7 @@ docker build . -f Dockerfile-intel --build-arg PLATFORM=$platform -t tei_xpu_ipex
 To deploy your model on an Intel® XPU, use the following command:
 
 ```shell
-model='BAAI/bge-large-en-v1.5'
+model='Qwen/Qwen3-Embedding-0.6B'
 volume=$PWD/data
 
 docker run -p 8080:80 -v $volume:/data --device=/dev/dri -v /dev/dri/by-path:/dev/dri/by-path tei_xpu_ipex --model-id $model --dtype float16
@@ -81,7 +81,7 @@ docker build . -f Dockerfile-intel --build-arg PLATFORM=$platform -t tei_hpu
 To deploy your model on an Intel® HPU (Gaudi), use the following command:
 
 ```shell
-model='BAAI/bge-large-en-v1.5'
+model='Qwen/Qwen3-Embedding-0.6B'
 volume=$PWD/data
 
 docker run -p 8080:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e MAX_WARMUP_SEQUENCE_LENGTH=512 tei_hpu --model-id $model --dtype bfloat16
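A quick way to confirm that any of the three containers picked up the new default model is TEI's `/info` route, which reports the loaded `model_id`. A minimal sketch, assuming the `8080:80` port mapping used in the commands above:

```shell
# The JSON response should contain "model_id": "Qwen/Qwen3-Embedding-0.6B"
# if the container started and loaded the model correctly.
curl 127.0.0.1:8080/info
```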

docs/source/en/local_cpu.md

Lines changed: 2 additions & 3 deletions
@@ -47,10 +47,9 @@ cargo install --path router -F metal
 Once the installation is successfully complete, you can launch Text Embeddings Inference on CPU with the following command:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Qwen/Qwen3-Embedding-0.6B
 
-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --port 8080
 ```
 
 <Tip>

docs/source/en/local_gpu.md

Lines changed: 2 additions & 3 deletions
@@ -58,8 +58,7 @@ cargo install --path router -F candle-cuda -F http --no-default-features
 You can now launch Text Embeddings Inference on GPU with:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Qwen/Qwen3-Embedding-0.6B
 
-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --dtype float16 --port 8080
 ```
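Note that the new GPU launch line also adds `--dtype float16`, which halves weight memory relative to float32. To verify the router is serving, the example below is a sketch that assumes TEI's OpenAI-compatible `/v1/embeddings` route and the port chosen above; the `model` field is assumed to be informational here, since the router serves a single model:

```shell
# Request an embedding through the OpenAI-compatible route.
curl 127.0.0.1:8080/v1/embeddings \
    -H 'Content-Type: application/json' \
    -d '{"model": "Qwen/Qwen3-Embedding-0.6B", "input": "What is Deep Learning?"}'
```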

docs/source/en/local_metal.md

Lines changed: 2 additions & 3 deletions
@@ -38,10 +38,9 @@ cargo install --path router -F metal
 Once the installation is successfully complete, you can launch Text Embeddings Inference with Metal with the following command:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Qwen/Qwen3-Embedding-0.6B
 
-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --port 8080
 ```
 
 Now you are ready to use `text-embeddings-inference` locally on your machine.

docs/source/en/quick_tour.md

Lines changed: 2 additions & 2 deletions
@@ -28,10 +28,10 @@ Next, install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/
 
 ## Deploy
 
-Next it's time to deploy your model. Let's say you want to use [`BAAI/bge-large-en-v1.5`](https://huggingface.co/BAAI/bge-large-en-v1.5). Here's how you can do this:
+Next it's time to deploy your model. Let's say you want to use [`Qwen/Qwen3-Embedding-0.6B`](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B). Here's how you can do this:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
+model=Qwen/Qwen3-Embedding-0.6B
 volume=$PWD/data
 
 docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.7 --model-id $model
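An end-to-end check for the quick-tour deployment: a single POST to TEI's native `/embed` route returns one embedding vector per input. A minimal sketch, assuming the container above is up and mapped to port 8080:

```shell
# Embed one sentence; the response is a JSON array containing one vector.
curl 127.0.0.1:8080/embed \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs": "What is Deep Learning?"}'
```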

docs/source/en/supported_models.md

Lines changed: 13 additions & 9 deletions
@@ -21,21 +21,24 @@ We are continually expanding our support for other model types and plan to include
 ## Supported embeddings models
 
 Text Embeddings Inference currently supports Nomic, BERT, CamemBERT, XLM-RoBERTa models with absolute positions, JinaBERT
-model with Alibi positions and Mistral, Alibaba GTE, Qwen2 models with Rope positions, MPNet, and ModernBERT.
+model with Alibi positions and Mistral, Alibaba GTE, Qwen2 models with Rope positions, MPNet, ModernBERT, and Qwen3.
 
 Below are some examples of the currently supported models:
 
 | MTEB Rank | Model Size | Model Type | Model ID |
 |-----------|---------------------|-------------|--------------------------------------------------------------------------------------------------|
-| 3 | 7B (Very Expensive) | Qwen2 | [Alibaba-NLP/gte-Qwen2-7B-instruct](https://hf.co/Alibaba-NLP/gte-Qwen2-7B-instruct) |
-| 11 | 1.5B (Expensive) | Qwen2 | [Alibaba-NLP/gte-Qwen2-1.5B-instruct](https://hf.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct) |
-| 14 | 7B (Very Expensive) | Mistral | [Salesforce/SFR-Embedding-2_R](https://hf.co/Salesforce/SFR-Embedding-2_R) |
-| 20 | 0.3B | Bert | [WhereIsAI/UAE-Large-V1](https://hf.co/WhereIsAI/UAE-Large-V1) |
-| 31 | 0.5B | XLM-RoBERTa | [Snowflake/snowflake-arctic-embed-l-v2.0](https://hf.co/Snowflake/snowflake-arctic-embed-l-v2.0) |
-| 37 | 0.3B | Alibaba GTE | [Snowflake/snowflake-arctic-embed-m-v2.0](https://hf.co/Snowflake/snowflake-arctic-embed-m-v2.0) |
-| 49 | 0.5B | XLM-RoBERTa | [intfloat/multilingual-e5-large-instruct](https://hf.co/intfloat/multilingual-e5-large-instruct) |
+| 2 | 8B (Very Expensive) | Qwen3 | [Qwen/Qwen3-Embedding-8B](https://hf.co/Qwen/Qwen3-Embedding-8B) |
+| 4 | 0.6B | Qwen3 | [Qwen/Qwen3-Embedding-0.6B](https://hf.co/Qwen/Qwen3-Embedding-0.6B) |
+| 6 | 7B (Very Expensive) | Qwen2 | [Alibaba-NLP/gte-Qwen2-7B-instruct](https://hf.co/Alibaba-NLP/gte-Qwen2-7B-instruct) |
+| 7 | 0.5B | XLM-RoBERTa | [intfloat/multilingual-e5-large-instruct](https://hf.co/intfloat/multilingual-e5-large-instruct) |
+| 14 | 1.5B (Expensive) | Qwen2 | [Alibaba-NLP/gte-Qwen2-1.5B-instruct](https://hf.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct) |
+| 17 | 7B (Very Expensive) | Mistral | [Salesforce/SFR-Embedding-2_R](https://hf.co/Salesforce/SFR-Embedding-2_R) |
+| 34 | 0.5B | XLM-RoBERTa | [Snowflake/snowflake-arctic-embed-l-v2.0](https://hf.co/Snowflake/snowflake-arctic-embed-l-v2.0) |
+| 40 | 0.3B | Alibaba GTE | [Snowflake/snowflake-arctic-embed-m-v2.0](https://hf.co/Snowflake/snowflake-arctic-embed-m-v2.0) |
+| 51 | 0.3B | Bert | [WhereIsAI/UAE-Large-V1](https://hf.co/WhereIsAI/UAE-Large-V1) |
 | N/A | 0.4B | Alibaba GTE | [Alibaba-NLP/gte-large-en-v1.5](https://hf.co/Alibaba-NLP/gte-large-en-v1.5) |
-| N/A | 0.4B | ModernBERT | [answerdotai/ModernBERT-large](https://hf.co/answerdotai/ModernBERT-large) |
+| N/A | 0.4B | ModernBERT | [answerdotai/ModernBERT-large](https://hf.co/answerdotai/ModernBERT-large) |
+| N/A | 0.3B | NomicBert | [nomic-ai/nomic-embed-text-v2-moe](https://hf.co/nomic-ai/nomic-embed-text-v2-moe) |
 | N/A | 0.1B | NomicBert | [nomic-ai/nomic-embed-text-v1](https://hf.co/nomic-ai/nomic-embed-text-v1) |
 | N/A | 0.1B | NomicBert | [nomic-ai/nomic-embed-text-v1.5](https://hf.co/nomic-ai/nomic-embed-text-v1.5) |
 | N/A | 0.1B | JinaBERT | [jinaai/jina-embeddings-v2-base-en](https://hf.co/jinaai/jina-embeddings-v2-base-en) |
@@ -56,6 +59,7 @@ Below are some examples of the currently supported models:
 | Re-Ranking | XLM-RoBERTa | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) |
 | Re-Ranking | XLM-RoBERTa | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) |
 | Re-Ranking | GTE | [Alibaba-NLP/gte-multilingual-reranker-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-reranker-base) |
+| Re-Ranking | ModernBert | [Alibaba-NLP/gte-reranker-modernbert-base](https://huggingface.co/Alibaba-NLP/gte-reranker-modernbert-base) |
 | Sentiment Analysis | RoBERTa | [SamLowe/roberta-base-go_emotions](https://huggingface.co/SamLowe/roberta-base-go_emotions) |
 
 ## Supported hardware
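The re-ranker row added here can be exercised through TEI's `/rerank` route, which scores candidate texts against a query. A minimal sketch, assuming a container started with `--model-id Alibaba-NLP/gte-reranker-modernbert-base` and the usual 8080 port mapping:

```shell
# Rank two candidate passages against a query; the response lists
# an index and a relevance score per text.
curl 127.0.0.1:8080/rerank \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"query": "What is Deep Learning?", "texts": ["Deep Learning is a subfield of machine learning.", "cheese is made from milk"]}'
```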
