@@ -56,6 +56,7 @@ Ember, GTE and E5. TEI implements many features such as:
 [Candle](https://github.com/huggingface/candle)
 and [cuBLASLt](https://docs.nvidia.com/cuda/cublas/#using-the-cublaslt-api)
 * [Safetensors](https://github.com/huggingface/safetensors) weight loading
+* [ONNX](https://github.com/onnx/onnx) weight loading
 * Production ready (distributed tracing with Open Telemetry, Prometheus metrics)
 
 ## Get Started
@@ -478,7 +479,9 @@ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 Then run:
 
 ```shell
-# On x86
+# On x86 with ONNX backend (recommended)
+cargo install --path router -F ort
+# On x86 with Intel backend
 cargo install --path router -F mkl
 # On M1 or M2
 cargo install --path router -F metal
@@ -498,11 +501,11 @@ text-embeddings-router --model-id $model --port 8080
 sudo apt-get install libssl-dev gcc -y
 ```
 
-### Cuda
+### CUDA
 
-GPUs with Cuda compute capabilities < 7.5 are not supported (V100, Titan V, GTX 1000 series, ...).
+GPUs with CUDA compute capabilities < 7.5 are not supported (V100, Titan V, GTX 1000 series, ...).
 
-Make sure you have Cuda and the nvidia drivers installed. NVIDIA drivers on your device need to be compatible with CUDA
+Make sure you have CUDA and the NVIDIA drivers installed. Your NVIDIA drivers need to be compatible with CUDA
 version 12.2 or higher.
 
 You also need to add the nvidia binaries to your path:
@@ -538,7 +541,7 @@ You can build the CPU container with:
 docker build .
 ```
 
-To build the Cuda containers, you need to know the compute cap of the GPU you will be using
+To build the CUDA containers, you need to know the compute cap of the GPU you will be using
 at runtime.
 
 Then you can build the container with:
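As a sketch of the step above: recent `nvidia-smi` versions can report the compute cap of the local GPU directly, and the value is then passed to the build. The `Dockerfile-cuda` name and `CUDA_COMPUTE_CAP` build arg below are assumptions about the repository's layout; check them against your checkout:

```shell
# Query the compute cap of the GPU you will run on (requires a recent nvidia-smi)
nvidia-smi --query-gpu=compute_cap --format=csv,noheader
# e.g. for a compute cap of 8.0, pass 80 to the build:
runtime_compute_cap=80
docker build . -f Dockerfile-cuda --build-arg CUDA_COMPUTE_CAP=$runtime_compute_cap
```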