Deployment of local MultiLoRA model using TGI #2564

@ashwincv0112

Description

Hi Team,

I was trying to deploy a multi-LoRA adapter model with Starcoder2-3B as the base.

I was following this blog:
https://huggingface.co/blog/multi-lora-serving

Please correct me if I am wrong, but it appears that the Starcoder2 model is not supported for multi-LoRA deployment using TGI. We are getting the following error while deploying:

AttributeError: 'TensorParallelColumnLinear' object has no attribute 'base_layer' rank=0

Also, could you suggest how to deploy a model and adapters saved in a local directory using TGI?
Every time I run the docker command below, it downloads the files from the Hugging Face Hub instead.

docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD:/data \
    ghcr.io/huggingface/text-generation-inference:3.0.1 \
    --model-id bigcode/starcoder2-3b \
    --lora-adapters=<local_adapter_path>
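For context, this is the variant I expected to work (a sketch, assuming TGI resolves container-local paths for both `--model-id` and `--lora-adapters`; the host paths and the adapter name `my_adapter` are placeholders, not my actual layout):

```shell
# Mount the host directory holding the base model and adapter checkpoints
# into the container at /data, then point TGI at the container-side paths.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v /path/to/local/models:/data \
    ghcr.io/huggingface/text-generation-inference:3.0.1 \
    --model-id /data/starcoder2-3b \
    --lora-adapters=my_adapter=/data/adapters/my_adapter
```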

Please let me know if any additional information is required.

Thanks,
Ashwin.
