triton+vllm embeddings serving does not support list[str] input #8655

@carloscao0928

Description

Hi team, I followed this guide, https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/client_guide/openai_readme.html#embedding-models, to serve the BAAI/bge-large-zh-v1.5 model (https://huggingface.co/BAAI/bge-large-zh-v1.5). Sending a single request works, but when I run a benchmark with aiperf it fails with "Input should be a valid string", so the frontend appears not to support list[str] inputs.
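For context, a minimal sketch of the two request shapes involved (the endpoint URL and the example texts are assumptions based on the setup below, not taken from aiperf's actual traffic):

```python
import json

# Assumed endpoint for the OpenAI-compatible frontend started below.
URL = "http://localhost:8000/v1/embeddings"

# A single-string `input`, which works per this report:
single = {"model": "bge-large-zh-v1.5", "input": "你好世界"}

# A batched list[str] `input`, which aiperf sends and the frontend
# rejects with "Input should be a valid string":
batched = {"model": "bge-large-zh-v1.5", "input": ["你好世界", "hello world"]}

# Posting them requires a running server, e.g.:
#   import requests
#   requests.post(URL, json=single)   # succeeds
#   requests.post(URL, json=batched)  # fails per this report
print(json.dumps(batched, ensure_ascii=False))
```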

Triton Information
What version of Triton are you using?
nvcr.io/nvidia/tritonserver:26.01-vllm-python-py3

Are you using the Triton container or did you build it yourself?
nvcr.io/nvidia/tritonserver:26.01-vllm-python-py3

To Reproduce
Steps to reproduce the behavior.
I use the above image and create a deployment with the following commands:

```shell
cd /opt/tritonserver/python/openai
python3 openai_frontend/main.py --model-repository xxx --openai-port 8000
```

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).

model: BAAI/bge-large-zh-v1.5

config.pbtxt:

```
backend: "vllm"
instance_group [{kind: KIND_MODEL}]
```

model.json:

```json
{"model": "xxx", "gpu_memory_utilization": 0.9}
```

Expected behavior
It should accept list[str] for `input`; the OpenAI embeddings spec allows it.
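For reference, a hedged sketch of a validator that accepts every `input` shape the OpenAI embeddings spec permits (a string, a list of strings, a list of token ids, or a list of token-id lists); the function name is hypothetical and this is not Triton's actual code:

```python
from typing import List, Union

# All `input` shapes allowed by the OpenAI embeddings spec.
EmbeddingInput = Union[str, List[str], List[int], List[List[int]]]

def normalize_embedding_input(raw: EmbeddingInput) -> list:
    """Coerce any spec-allowed `input` shape into a batch (list) of prompts."""
    if isinstance(raw, str):
        return [raw]              # single string -> batch of one
    if isinstance(raw, list):
        if all(isinstance(x, str) for x in raw):
            return list(raw)      # list[str] -> already a batch (aiperf's case)
        if all(isinstance(x, int) for x in raw):
            return [raw]          # one prompt given as token ids
        if all(isinstance(x, list) for x in raw):
            return list(raw)      # batch of token-id prompts
    raise TypeError("input must be str, list[str], list[int], or list[list[int]]")
```

A frontend that validates `input` as `str` only, as the error message suggests, would reject the second shape even though the spec allows it.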

Metadata

Labels

Enhancement (New feature or request), openai (OpenAI related)
