Integrate vLLM Evaluator #23

@adivekar-utexas

Description

vLLM is a high-throughput LLM inference engine that runs HuggingFace models, sharding them across the GPUs of a node (tensor parallelism) via a Ray backend.
Even in its basic form, vLLM should be a large speedup over AccelerateEvaluator, which is quite slow.
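
For reference, a minimal sketch of the speedup path using vLLM's offline batch API (the model id and GPU count here are illustrative assumptions):

```python
from vllm import LLM, SamplingParams

# Load any HuggingFace checkpoint; tensor_parallel_size shards the model
# across the GPUs of a single node (vLLM launches Ray workers when > 1).
llm = LLM(model="meta-llama/Llama-2-7b-hf", tensor_parallel_size=4)

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)
prompts = ["Summarize the plot of Hamlet in one sentence."]

# generate() continuously batches prompts on the GPU, which is where
# the throughput gain over AccelerateEvaluator comes from.
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```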

Basic requirements:

  1. Should be compatible with RayEvaluator (and GenerativeLM if needed).
  2. Should support only single-node models; scaling to larger models should mean using a larger node (a design choice that favors execution speed; see the sketch after this list).
  3. Should integrate with all HF transformers LLMs.
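
Below is a hypothetical sketch of how these requirements might shape the integration; VLLMEvaluator and its evaluate() interface are assumptions for illustration, not this project's actual Evaluator API:

```python
from typing import List

import torch
from vllm import LLM, SamplingParams


class VLLMEvaluator:
    """Hypothetical wrapper around vLLM (name and interface are assumed)."""

    def __init__(self, model_name: str, tensor_parallel_size: int = 1):
        # Requirement 2: single-node only. Refuse to shard beyond the GPUs
        # visible on this node; larger models should get a larger node.
        if tensor_parallel_size > torch.cuda.device_count():
            raise ValueError(
                "tensor_parallel_size exceeds the GPUs on this node; "
                "use a larger node rather than multi-node sharding."
            )
        # Requirement 3: any HF transformers checkpoint that vLLM supports
        # can be passed by hub id or local path.
        self.llm = LLM(model=model_name,
                       tensor_parallel_size=tensor_parallel_size)

    def evaluate(self, prompts: List[str], max_tokens: int = 128) -> List[str]:
        # Greedy decoding keeps evaluation runs deterministic.
        params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
        outputs = self.llm.generate(prompts, params)
        return [o.outputs[0].text for o in outputs]
```

Requirement 1 (plugging this into RayEvaluator / GenerativeLM) is left out of the sketch, since it depends on the project's own interfaces.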
