Intel® Extension for PyTorch\* extends PyTorch\* with up-to-date features and optimizations for an extra performance boost on Intel hardware. Optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel X<sup>e</sup> Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, Intel® Extension for PyTorch\* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch\* `xpu` device.
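As a minimal sketch of what this looks like in practice (assuming the extension and torchvision are installed; the model here is only a placeholder), enabling the optimizations takes an extra import and one `ipex.optimize` call, with the `xpu` device used on discrete GPUs:

```python
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(weights=None).eval()
data = torch.rand(1, 3, 224, 224)

# Apply the extension's CPU optimizations (operator fusion, memory layout, etc.)
model = ipex.optimize(model, dtype=torch.bfloat16)

# On an Intel discrete GPU, move model and data to the xpu device instead:
# model = model.to("xpu")
# data = data.to("xpu")

with torch.no_grad(), torch.cpu.amp.autocast(enabled=True):
    output = model(data)
```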
docs/tutorials/getting_started.md (+2 -2)

@@ -1,6 +1,6 @@
# Quick Start
-The following instructions assume you have installed the Intel® Extension for PyTorch\*. For installation instructions, refer to [Installation](../../../index.html#installation?platform=cpu&version=v2.7.0%2Bcpu).
+The following instructions assume you have installed the Intel® Extension for PyTorch\*. For installation instructions, refer to [Installation](../../../index.html#installation?platform=cpu&version=v2.8.0%2Bcpu).
To start using the Intel® Extension for PyTorch\* in your code, you need to make the following changes:
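In most eager-mode scripts, the changes referred to above amount to a couple of lines. A sketch, assuming a CPU installation of the extension and an already-constructed `model` (and, for training, `optimizer`):

```python
import torch
import intel_extension_for_pytorch as ipex

# Inference: optimize the model for the target dtype.
model.eval()
model = ipex.optimize(model, dtype=torch.bfloat16)

# Training: pass the optimizer as well so both are optimized together.
# model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)
```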
@@ -157,4 +157,4 @@ with torch.inference_mode(), torch.cpu.amp.autocast(enabled=amp_enabled):
print(gen_text, total_new_tokens, flush=True)
```
-More LLM examples, including usage of low precision data types, are available in the [LLM Examples](https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/llm) section.
+More LLM examples, including usage of low precision data types, are available in the [LLM Examples](https://github.com/intel/intel-extension-for-pytorch/tree/release/2.8/examples/cpu/llm) section.
-Select your preferences and follow the installation instructions provided on the [Installation page](../../../index.html#installation?platform=cpu&version=v2.7.0%2Bcpu).
+Select your preferences and follow the installation instructions provided on the [Installation page](../../../index.html#installation?platform=cpu&version=v2.8.0%2Bcpu).
After successful installation, refer to the [Quick Start](getting_started.md) and [Examples](examples.md) sections to start using the extension in your code.
-**NOTE:** For detailed instructions on installing and setting up the environment for Large Language Models (LLM), as well as example scripts, refer to the [LLM best practices](https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/llm).
+**NOTE:** For detailed instructions on installing and setting up the environment for Large Language Models (LLM), as well as example scripts, refer to the [LLM best practices](https://github.com/intel/intel-extension-for-pytorch/tree/release/2.8/examples/cpu/llm).
docs/tutorials/llm.rst (+1 -1)
@@ -30,7 +30,7 @@ Verified for distributed inference mode via DeepSpeed
*Note*: The above verified models (including other models in the same model family, like "codellama/CodeLlama-7b-hf" from the LLAMA family) are well supported with all optimizations, such as indirect access KV cache, fused ROPE, and customized linear kernels. Work is in progress to better support the models in the tables with various data types. In addition, more models will be optimized in the future.
-Please check the `LLM best known practice <https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/llm>`_ for instructions to install and set up the environment, as well as example scripts.
+Please check the `LLM best known practice <https://github.com/intel/intel-extension-for-pytorch/tree/release/2.8/examples/cpu/llm>`_ for instructions to install and set up the environment, as well as example scripts.
Module Level Optimization API for customized LLM (Prototype)
docs/tutorials/llm/llm_optimize.md (+2 -4)
@@ -9,12 +9,10 @@ This API currently supports inference workloads of certain models.
API documentation is available at [API Docs page](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/api_doc.html#ipex.llm.optimize),
and supported model list can be found at [this page](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/llm.html#ipexllm-optimized-model-list-for-inference).
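In its simplest form the API is invoked as below. This is a sketch that assumes a Hugging Face Transformers causal-LM `model` on the supported model list has already been loaded:

```python
import torch
import intel_extension_for_pytorch as ipex

# Apply LLM-specific optimizations (fused kernels, optimized KV cache, etc.)
model = ipex.llm.optimize(model, dtype=torch.bfloat16, inplace=True)
```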
-For LLM fine-tuning, please check the [LLM fine-tuning tutorial](https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/llm/fine-tuning).
-
## Pseudocode of Common Usage Scenarios
The following sections show pseudocode snippets to invoke Intel® Extension for PyTorch\* APIs to work with LLM models.
-Complete examples can be found at [the Example directory](https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/llm/inference).
+Complete examples can be found at [the Example directory](https://github.com/intel/intel-extension-for-pytorch/tree/release/2.8/examples/cpu/llm/inference).
### FP32/BF16
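A pseudocode sketch of the FP32/BF16 flow (the model name is illustrative; any model on the supported list applies, and generation arguments are kept minimal):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import intel_extension_for_pytorch as ipex

model_id = "meta-llama/Llama-2-7b-hf"  # example only
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Apply the LLM optimizations for the chosen dtype.
model = ipex.llm.optimize(model, dtype=torch.bfloat16)

with torch.inference_mode(), torch.cpu.amp.autocast(enabled=True):
    inputs = tokenizer("Hello, my name is", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32)
```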
@@ -59,7 +57,7 @@ model = ipex.llm.optimize(model, quantization_config=qconfig, low_precision_chec
Distributed inference can be performed with `DeepSpeed`. Based on the original Intel® Extension for PyTorch\* scripts, the following code changes are required.
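A pseudocode sketch of the shape these changes take (the `init_inference` arguments shown are illustrative and depend on the installed DeepSpeed version; `model` is assumed to be already loaded as above):

```python
import torch
import deepspeed
import intel_extension_for_pytorch as ipex

# Shard the model across ranks first, then apply the extension's
# LLM optimizations to the sharded module on each rank.
model = deepspeed.init_inference(
    model,
    dtype=torch.bfloat16,
    replace_with_kernel_inject=False,
).module
model = ipex.llm.optimize(model, dtype=torch.bfloat16)
```

The script would then be launched with a multi-process launcher (e.g. `deepspeed` or `mpirun`), one rank per device or socket.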
-Check the [LLM distributed inference examples](https://github.com/intel/intel-extension-for-pytorch/tree/main/examples/cpu/llm/inference/distributed) for complete code.
+Check the [LLM distributed inference examples](https://github.com/intel/intel-extension-for-pytorch/tree/release/2.8/examples/cpu/llm/inference/distributed) for complete code.