Here you can find examples for large language models (LLM) text generation.
> [!NOTE]
> New Llama models such as Llama3.2-1B, Llama3.2-3B, and Llama3.3-7B are supported starting from release v2.7.10+xpu.
- Include inference, fine-tuning (LoRA), and bitsandbytes (QLoRA fine-tuning).
- Include both single-instance and distributed (DeepSpeed) use cases for FP16 optimization.
- Support the Llama, GPT-J, Qwen, OPT, and Bloom model families, as well as other models such as Baichuan2-13B and Phi3-mini.
- Cover low-precision text generation inference (fp16 AMP and weight-only quantization) with the best performance and accuracy for different models.
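
The last bullet mentions weight-only quantization. As a conceptual sketch only (not this repository's actual implementation, which uses its own quantization recipes and device kernels), the core idea of symmetric int8 weight-only quantization can be illustrated in plain NumPy: weights are stored as int8 plus a scale, and dequantized back to floating point at compute time.

```python
import numpy as np

def quantize_weights_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a floating-point approximation of the original weights."""
    return q.astype(np.float32) * scale

# Toy weight matrix; real LLM layers would be quantized per-channel/per-group.
w = np.array([[0.5, -1.2], [0.03, 0.9]], dtype=np.float32)
q, scale = quantize_weights_int8(w)
w_hat = dequantize(q, scale)

# Per-element reconstruction error is bounded by roughly scale / 2.
print(np.max(np.abs(w - w_hat)))
```

This storage scheme cuts weight memory 4x versus fp32 (2x versus fp16); accuracy is preserved because the rounding error per element stays below half the quantization step.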