# LLM-perf Backend 🏋️

The official backend system powering the [LLM-perf Leaderboard](https://huggingface.co/spaces/optimum/llm-perf-leaderboard). This repository contains the infrastructure and tools needed to run standardized benchmarks for Large Language Models (LLMs) across different hardware configurations and optimization backends.

## About 📝

LLM-perf Backend is designed to:
- Run automated benchmarks for the LLM-perf leaderboard
- Ensure consistent and reproducible performance measurements
- Support multiple hardware configurations and optimization backends
- Generate standardized performance metrics for latency, throughput, memory usage, and energy consumption

## Key Features 🔑

- Standardized benchmarking pipeline using [Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark)
- Support for multiple hardware configurations (CPU, GPU)
- Multiple backend implementations (PyTorch, ONNX Runtime, etc.)
- Automated metric collection:
  - Latency and throughput measurements
  - Memory usage tracking
  - Energy consumption monitoring
  - Quality metrics integration with the Open LLM Leaderboard

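To illustrate the kind of latency and throughput figures collected, here is a minimal sketch (not the repository's actual code) of how such metrics can be derived from raw per-request timings:

```python
import statistics

def summarize_latencies(latencies_s, tokens_per_request):
    """Derive common benchmark metrics from per-request latencies (seconds)."""
    sorted_lat = sorted(latencies_s)
    p50 = statistics.median(sorted_lat)
    # Simple nearest-rank p90 over the sorted samples.
    p90 = sorted_lat[int(0.9 * (len(sorted_lat) - 1))]
    total_time = sum(latencies_s)
    # Throughput in generated tokens per second over the whole run.
    throughput = tokens_per_request * len(latencies_s) / total_time
    return {"p50_s": p50, "p90_s": p90, "throughput_tok_s": throughput}

metrics = summarize_latencies([0.5, 0.4, 0.6, 0.5], tokens_per_request=128)
print(metrics)  # → {'p50_s': 0.5, 'p90_s': 0.5, 'throughput_tok_s': 256.0}
```

The real pipeline delegates this to Optimum-Benchmark; the sketch only shows what the reported numbers mean.
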
## Installation 🛠️

1. Clone the repository:
```bash
git clone https://github.com/huggingface/llm-perf-backend
cd llm-perf-backend
```

2. Create and activate a Python virtual environment:
```bash
python -m venv .venv
source .venv/bin/activate
```

3. Install the package with the required dependencies:
```bash
pip install -e .
# or
pip install -e ".[all]"  # to install optional dependencies such as ONNX Runtime
```

## Usage 📋

### Command Line Interface

Run benchmarks using the CLI tool:

```bash
llm-perf run-benchmark --hardware cpu --backend pytorch
```

### Configuration Options

View all available options with:
```bash
llm-perf run-benchmark --help
```

- `--hardware`: Target hardware platform (cpu, cuda)
- `--backend`: Backend framework to use (pytorch, onnxruntime, etc.)
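
The two flags can also be combined to sweep several configurations in one go. The snippet below is a hedged sketch that only builds the corresponding shell commands; which combinations actually run depends on the hardware and optional dependencies you have installed:

```python
import itertools

# Hypothetical sweep; adjust to the hardware and backends you actually have.
hardware_options = ["cpu", "cuda"]
backend_options = ["pytorch", "onnxruntime"]

commands = [
    f"llm-perf run-benchmark --hardware {hw} --backend {be}"
    for hw, be in itertools.product(hardware_options, backend_options)
]
for cmd in commands:
    print(cmd)
```

Each printed line can then be executed directly in the shell, or dispatched through a scheduler if you run benchmarks on several machines.
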

## Benchmark Dataset 📊

Results are published to the official dataset:
[optimum-benchmark/llm-perf-leaderboard](https://huggingface.co/datasets/optimum-benchmark/llm-perf-leaderboard)

## Benchmark Specifications 📑

All benchmarks follow these standardized settings:
- Single GPU usage to avoid communication-dependent results
- Energy monitoring via CodeCarbon
- Memory tracking:
  - Maximum allocated memory
  - Maximum reserved memory
  - Maximum used memory (via PyNVML for GPU)
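
As a rough CPU-side analogue of this peak-memory tracking (the backend itself relies on PyTorch allocator statistics and PyNVML for GPU memory, not this module), Python's standard `tracemalloc` can report peak allocated memory around a workload:

```python
import tracemalloc

tracemalloc.start()
buffer = [0] * 1_000_000  # stand-in for a model inference workload
current, peak = tracemalloc.get_traced_memory()  # bytes (current, peak)
tracemalloc.stop()

print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
```

"Peak" here plays the same role as "maximum used memory" in the specifications above: the high-water mark over the whole measurement window, not the value at the end of the run.
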