Commit 08361b7

dsikka and mgoin authored
Update Speculators v0.3.0 Blog Links (#141)
Signed-off-by: Michael Goin <[email protected]>
Co-authored-by: Michael Goin <[email protected]>
1 parent 0c0484e commit 08361b7

File tree

1 file changed: +6 −5 lines changed


_posts/2025-12-13-speculators-v030.md

Lines changed: 6 additions & 5 deletions
@@ -11,6 +11,7 @@ image: /assets/figures/2025-12-13-speculators-v030/cropped_workflow.png
 - [Speculators v0.3.0](https://github.com/vllm-project/speculators/releases/tag/v0.3.0) provides end-to-end training support for Eagle3 draft models that can seamlessly run with vLLM
 - Support for training includes offline data generation using vLLM as well as training capabilities for single- and multi-layer draft models, for both MoE and non-MoE verifiers
 
+## Inference at scale
 Over the past decade, LLMs have expanded rapidly in both scale and capability, bringing with it increasing demands on inference performance. As LLMs generate tokens sequentially—with each token requiring a full forward pass through billions of parameters—the cost of generation scales quickly. As model sizes continue to rise, this sequential computation becomes a significant bottleneck, making today’s LLMs incredibly capable yet often slow.
 
 One promising optimization to alleviate this challenge is speculative decoding, which accelerates generation by allowing smaller draft models to propose tokens that the larger model can quickly verify.
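The draft-propose / target-verify loop that speculative decoding relies on can be sketched with toy stand-ins for both models. Everything below (the lookup "models", the function names, the acceptance rule) is a minimal illustration of the control flow, not the Speculators or vLLM API; real implementations verify sampled tokens against the target model's logits.

```python
# Toy illustration of speculative decoding's propose/verify loop.
# Both "models" are lookups into fixed character sequences; the point is
# that one target pass can accept several draft tokens at once.

TARGET_SEQ = list("the quick brown fox jumps")  # pretend target token stream
DRAFT_SEQ  = list("the quick brown cat jumps")  # draft disagrees in one word

def draft_propose(pos: int, k: int) -> list[str]:
    """Cheap draft model: guess the next k tokens after position pos."""
    return DRAFT_SEQ[pos:pos + k]

def target_verify(pos: int, proposed: list[str]) -> list[str]:
    """One 'forward pass' of the big model: score all proposed tokens at
    once, keep the longest correct prefix, and append the target's own
    next token (guaranteeing at least one token of progress)."""
    accepted = []
    for i, tok in enumerate(proposed):
        if pos + i < len(TARGET_SEQ) and tok == TARGET_SEQ[pos + i]:
            accepted.append(tok)
        else:
            break
    nxt = pos + len(accepted)
    if nxt < len(TARGET_SEQ):
        accepted.append(TARGET_SEQ[nxt])
    return accepted

def speculative_decode(k: int = 4) -> tuple[list[str], int]:
    out: list[str] = []
    target_passes = 0
    while len(out) < len(TARGET_SEQ):
        proposed = draft_propose(len(out), k)
        target_passes += 1
        out.extend(target_verify(len(out), proposed))
    return out, target_passes

tokens, passes = speculative_decode()
print("".join(tokens))  # identical to the target sequence
print(passes, "target passes for", len(tokens), "tokens")  # far fewer than one pass per token
```

Because the draft is right most of the time, the 25-token output needs only 8 target passes here instead of 25; when the draft diverges ("cat" vs. "fox"), the target pays one pass per corrected token, which is exactly the sequential baseline.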
@@ -100,7 +101,7 @@ Together these components make Speculators Eagle3 model training fast and memory
 
 ## Running Speculators models in vLLM
 
-Once training is complete, the library generates a complete model artifact with an extended config.json file that includes a speculators_config. Models can then be run seamlessly in vLLM using a simple vllm serve command:
+Once training is complete, the library generates a complete model artifact with an extended config.json file that includes a `speculators_config`. Models can then be run seamlessly in vLLM using a simple vllm serve command:
 
 ```bash
 vllm serve RedHatAI/Llama-3.1-8B-Instruct-speculator.eagle3
@@ -146,13 +147,13 @@ Speculators will be focusing on the following next set of features:
 
 ## Get involved!
 
-Interested in learning more about speculative decoding? Check out the [Speculators repository](https://github.com/vllm-project/speculators) and help grow the repository by checking out [First Good Issues](https://github.com/vllm-project/speculators/issues)!
+Interested in learning more about speculative decoding? Check out the [Speculators repository](https://github.com/vllm-project/speculators) and help grow the repository by checking out [Good First Issues](https://github.com/vllm-project/speculators/issues)!
 
 For additional resources, documentation, and slack channels, check out:
-- **Speculators Documentation**: https://docs.vllm.ai/projects/Speculators/en/latest/
+- **Speculators Documentation**: [https://docs.vllm.ai/projects/speculators/en/latest/](https://docs.vllm.ai/projects/speculators/en/latest/)
 - **vLLM slack channels**: `#speculators`, `#feat-spec-decode`
-- **Data Generation and Training Scripts**: https://github.com/vllm-project/speculators/blob/main/scripts/README.md
-- **End-to-end examples**: https://github.com/vllm-project/Speculators/tree/main/examples/data_generation_and_training
+- **Data Generation and Training Scripts**: [https://github.com/vllm-project/speculators/blob/main/scripts/README.md](https://github.com/vllm-project/speculators/blob/main/scripts/README.md)
+- **End-to-end examples**: [https://github.com/vllm-project/Speculators/tree/main/examples/data_generation_and_training](https://github.com/vllm-project/Speculators/tree/main/examples/data_generation_and_training)
 - For a list of already trained Speculators models, check out the [Red Hat AI Hub](https://huggingface.co/collections/RedHatAI/speculator-models)
 
 ## Appendix
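As an aside on the `speculators_config` extension mentioned in the diff: the changed post says only that training produces a model artifact whose config.json gains a `speculators_config` section. Purely to visualize where that section sits, a sketch might look like the following. All field names inside `speculators_config` are invented placeholders, not the actual Speculators schema; see the Speculators documentation for the real format.

```json
{
  "model_type": "llama",
  "speculators_config": {
    "algorithm": "eagle3",
    "verifier": "hypothetical/verifier-model-id",
    "num_draft_layers": 1
  }
}
```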
