Deployment of local MultiLoRA model using TGI #2564

@ashwincv0112

Description

Hi Team,

I was trying to deploy a multi-LoRA adapter model with Starcoder2-3B as the base.

I was following this blog:
https://huggingface.co/blog/multi-lora-serving

Please correct me if I am wrong, but it appears that the Starcoder2 model is not supported for multi-LoRA deployment using TGI. We are getting the following error while deploying:

AttributeError: 'TensorParallelColumnLinear' object has no attribute 'base_layer' rank=0

Also, could you suggest how to deploy a model and adapters saved in a local directory using TGI?
Every time I run the docker command below, it downloads the files from the Hugging Face Hub instead.

docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD:/data \
    ghcr.io/huggingface/text-generation-inference:3.0.1 \
    --model-id bigcode/starcoder2-3b \
    --lora-adapters=<local_adapter_path>
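For context, this is the variant I expected to work (a sketch, assuming TGI resolves container-local paths for both `--model-id` and `--lora-adapters`; the host paths and the adapter name `my_adapter` are placeholders, not my actual layout):

```shell
# Mount the host directory holding the base model and adapter checkpoints
# into the container at /data, then point TGI at the container-side paths.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v /path/to/local/models:/data \
    ghcr.io/huggingface/text-generation-inference:3.0.1 \
    --model-id /data/starcoder2-3b \
    --lora-adapters=my_adapter=/data/adapters/my_adapter
```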

Please let me know if any additional information is required.

Thanks,
Ashwin.
