Motivation
Currently, only AWQ models quantized with the 'AutoAWQ' framework [1] are supported by 'lmdeploy'.
The AutoAWQ framework is deprecated, so many recent AWQ models on the HF model hub are quantized with its successor, 'llm-compressor' [2].
For example, the Qwen3-VL models by 'cyankiwi' (e.g. the 4B model [3]) are all quantized with 'llm-compressor'.
One can see which framework was used in the 'config.json' of the downloaded model, via the key 'quant_method' inside the 'quantization_config' section.
For the newer 'llm-compressor' framework, its value is 'compressed-tensors'.
So it would be great if 'lmdeploy' could also support AWQ models quantized with the 'llm-compressor' framework.
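As a quick illustration, the framework used for quantization can be detected with a small script like the one below (a sketch; the helper name and the local model path are hypothetical, and it assumes the standard HF layout where quantization details live under 'quantization_config' in 'config.json'):

```python
import json

def detect_quant_framework(config_path: str) -> str:
    """Return the 'quant_method' declared in a model's config.json."""
    with open(config_path) as f:
        config = json.load(f)
    # HF quantized models store quantization details under 'quantization_config'.
    quant_cfg = config.get("quantization_config", {})
    return quant_cfg.get("quant_method", "none")

# Example (path is illustrative):
#   detect_quant_framework("Qwen3-VL-4B-Instruct-AWQ-4bit/config.json")
#   'compressed-tensors'  -> quantized with llm-compressor
#   'awq'                 -> quantized with AutoAWQ
```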
References:
[1] https://docs.vllm.ai/en/latest/features/quantization/auto_awq/
[2] https://github.com/vllm-project/llm-compressor
[3] https://huggingface.co/cyankiwi/Qwen3-VL-4B-Instruct-AWQ-4bit/
Related resources
No response
Additional context
No response