docs/tutorials/known_issues.md (+9 −36)
@@ -4,7 +4,7 @@ Troubleshooting
 ## General Usage
 
 **Problem**: FP64 data type is unsupported on current platform.
 
-**Cause**: FP64 is not natively supported by the [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/data-center-gpu/flex-series/overview.html) and [Intel® Arc™ A-Series Graphics](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/arc.html) platforms.
+**Cause**: FP64 is not natively supported by the [Intel® Arc™ A-Series Graphics](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/arc.html) platforms.
 
 If you run any AI workload on that platform and receive this error message, it means a kernel requires FP64 instructions that are not supported and the execution is stopped.
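
For illustration, a minimal, hypothetical sketch (assuming an `xpu` device without native FP64) of the failure mode and the usual fallback of keeping tensors in FP32:

```python
import torch
import intel_extension_for_pytorch  # noqa: F401  # registers the xpu backend

# Hypothetical: FP64 kernels fail on platforms without native FP64 support.
x64 = torch.randn(4, 4, device="xpu", dtype=torch.float64)  # may raise the FP64 error

# Keeping tensors in FP32 avoids the unsupported instructions.
x32 = torch.randn(4, 4, device="xpu", dtype=torch.float32)
y = x32 @ x32
```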
 
 **Problem**: Runtime error `invalid device pointer` if `import horovod.torch as hvd` before `import intel_extension_for_pytorch`.
 
 **Cause**: Intel® Optimization for Horovod\* uses utilities provided by Intel® Extension for PyTorch\*. The improper import order causes Intel® Extension for PyTorch\* to be unloaded before Intel®
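
The cause line is cut off at this hunk boundary, but the stated import-order problem already implies the fix; a minimal sketch:

```python
# Import Intel® Extension for PyTorch* before Horovod so its utilities
# stay loaded for the whole run.
import intel_extension_for_pytorch  # noqa: F401
import horovod.torch as hvd

hvd.init()
```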
@@ -23,12 +23,6 @@ Troubleshooting
-**Problem**: Some workloads terminate with an error `CL_DEVICE_NOT_FOUND` after running for some time on WSL2.
-
-**Cause**: This issue is due to the [TDR feature](https://learn.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys#tdrdelay) on Windows.
-
-**Solution**: Try increasing `TDRDelay` in your Windows Registry to a larger value, such as 20 (the default is 2 seconds), and reboot.
-
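
One way to apply this, as an assumption, from an elevated Windows command prompt (the value name and key path follow the linked TDR registry documentation); reboot afterwards:

```
reg add "HKLM\System\CurrentControlSet\Control\GraphicsDrivers" /v TdrDelay /t REG_DWORD /d 20 /f
```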
-**Problem**: RuntimeError: Can't add devices across platforms to a single context. -33 (PI_ERROR_INVALID_DEVICE).
-
-**Cause**: This occurs when Intel® Extension for PyTorch\* runs in a Windows environment where an Intel® discrete GPU and an integrated GPU co-exist, and the integrated GPU, which is not supported by Intel® Extension for PyTorch\*, is wrongly identified as the first GPU platform.
-
-**Solution**: Disable the integrated GPU in your environment as a workaround. In the long term, the Intel® Graphics Driver will always enumerate the discrete GPU as the first device, so that Intel® Extension for PyTorch\* can provide the fastest device to framework users in such co-existence scenarios.
-
-**Problem**: RuntimeError: Failed to load the backend extension: intel_extension_for_pytorch. You can disable extension auto-loading with TORCH_DEVICE_BACKEND_AUTOLOAD=0.
-
-**Cause**: Importing a third-party library such as Transformers before `import torch`, where that library depends on torch, implicitly autoloads intel_extension_for_pytorch and introduces a circular import.
-
-**Solution**: Disable extension auto-loading with TORCH_DEVICE_BACKEND_AUTOLOAD=0.
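
For reference, a minimal sketch of applying the workaround for a single run (script name hypothetical):

```bash
# Disable backend extension auto-loading for this invocation only.
TORCH_DEVICE_BACKEND_AUTOLOAD=0 python your_script.py
```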
 
 ## Library Dependencies
@@ -84,13 +78,8 @@ Troubleshooting
 ```
 
 - **Problem**: Runtime error related to the C++ compiler with `torch.compile`: `RuntimeError: Failed to find C++ compiler. Please specify via CXX environment variable.`
-
-- **Cause**: The DPC++/C++ Compiler is not installed and activated correctly.
-
-- **Solution**: [Install the DPC++/C++ Compiler](https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler-download.html) and activate it with the following commands.
-
-```bash
-# {dpcpproot} is the DPC++ root path, i.e. where you installed oneAPI DPC++, usually /opt/intel/oneapi/compiler/latest or ~/intel/oneapi/compiler/latest
-source {dpcpproot}/env/vars.sh
-```
+
+- **Cause**: The C++ compiler environment is not activated. `torch.compile` needs to find the correct `cl.exe` path.
+
+- **Solution**: Open "Developer Command Prompt for VS 2022", or follow [Visual Studio Developer Command Prompt and Developer PowerShell](https://learn.microsoft.com/en-us/visualstudio/ide/reference/command-prompt-powershell?view=vs-2022#developer-command-prompt) to activate the Visual Studio environment.
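
From such a prompt, a quick check that the compiler is on `PATH` before invoking `torch.compile`:

```
where cl
```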
 
 - **Problem**: LoweringException: ImportError: cannot import name 'intel' from 'triton._C.libtriton'
 
 - **Cause**: Installing Triton causes pytorch-triton-xpu to stop working.
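
The entry's solution line is cut off in this view; as an assumption consistent with the stated cause, a recovery would be to remove the conflicting plain `triton` package and reinstall the XPU build (index URL assumed):

```bash
# Assumption: restore the XPU Triton backend clobbered by a plain `triton` install.
pip uninstall -y triton pytorch-triton-xpu
pip install pytorch-triton-xpu --index-url https://download.pytorch.org/whl/xpu
```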
-- **Problem**: ERROR: Cannot install dpcpp-cpp-rt and torch==2.6.0 because these package versions have conflicting dependencies.
-
-- **Cause**: intel-extension-for-pytorch v2.6.10+xpu uses Intel® DPC++ Compiler 2025.0.4 to get a crucial bug fix in the unified runtime, while torch v2.6.0+xpu is pinned to 2025.0.2, so PyTorch and intel-extension-for-pytorch cannot be installed in one pip command.
-
-- **Solution**: Install PyTorch and intel-extension-for-pytorch with separate commands.
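
A hedged sketch of those separate commands (package pins from this entry; index URLs assumed from the respective install guides):

```bash
# Install PyTorch first, then Intel® Extension for PyTorch*, in two pip invocations.
pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/xpu
pip install intel-extension-for-pytorch==2.6.10+xpu \
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```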
 
-- **Problem**: ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
-
-```
-torch 2.6.0+xpu requires intel-cmplr-lib-rt==2025.0.2, but you have intel-cmplr-lib-rt 2025.0.4 which is incompatible.
-torch 2.6.0+xpu requires intel-cmplr-lib-ur==2025.0.2, but you have intel-cmplr-lib-ur 2025.0.4 which is incompatible.
-torch 2.6.0+xpu requires intel-cmplr-lic-rt==2025.0.2, but you have intel-cmplr-lic-rt 2025.0.4 which is incompatible.
-torch 2.6.0+xpu requires intel-sycl-rt==2025.0.2, but you have intel-sycl-rt 2025.0.4 which is incompatible.
-```
-
-- **Cause**: intel-extension-for-pytorch v2.6.10+xpu uses Intel® DPC++ Compiler 2025.0.4 to get a crucial bug fix in the unified runtime, while torch v2.6.0+xpu is pinned to 2025.0.2.
-
-- **Solution**: Ignore the error, since torch v2.6.0+xpu is in fact compatible with Intel® Compiler 2025.0.4.
 
 - **Problem**: RuntimeError: oneCCL: ze_handle_manager.cpp:226 get_ptr: EXCEPTION: unknown memory type, when executing DLRMv2 BF16 training on a 4-card Intel® Data Center GPU Max platform.
 
 - **Cause**: The issue exists in the default SYCL path of oneCCL 2021.14, which uses two IPC exchanges.
 
 - **Solution**: Use `export CCL_ATL_TRANSPORT=ofi` to work around it.
 
+- **Problem**: Segmentation fault when executing LLaMa2-70B inference with online quantization on the Intel® Data Center GPU Max platform.
+
+- **Cause**: The issue exists in Intel® Neural Compressor (INC) v3.3: during the initial import of INC, the accelerator is cached with `lru_cache`, so setting `INC_TARGET_DEVICE` afterwards in the INC transformers-like API does not take effect. This results in two devices being present in the model, leading to memory-related errors as seen in the error messages.
+
+- **Solution**: Run the workload with `INC_TARGET_DEVICE="cpu" python` to work around it when using online quantization.
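
For reference, a sketch of the workaround invocation (script name hypothetical):

```bash
# Pin INC's cached accelerator to CPU before any GPU device is probed.
INC_TARGET_DEVICE="cpu" python run_llama_inference.py
```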

docs/tutorials/releases.md (+35 −0)
@@ -1,6 +1,41 @@
 Releases
 =============
 
+## 2.7.10+xpu
+
+Intel® Extension for PyTorch\* v2.7.10+xpu is the new release which supports Intel® GPU platforms (Intel® Arc™ Graphics family, Intel® Core™ Ultra Processors with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 Mobile Processors and Intel® Data Center GPU Max Series) based on PyTorch\* 2.7.0.
+
+### Highlights
+
+- Intel® oneDNN v3.7.1 integration
+
+- Large Language Model (LLM) optimization
+
+  Intel® Extension for PyTorch\* optimizes typical LLM models like Llama 2, Llama 3, Phi-3-mini, Qwen2, and GLM-4 on the Intel® Arc™ Graphics family. Moreover, new LLM inference models such as Llama 3.3, Phi-3.5-mini, Qwen2.5, and Mistral-7B are also optimized on Intel® Data Center GPU Max Series platforms compared to the previous release. A full list of optimized models can be found in the [LLM Optimizations Overview](https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/llm.html), with the supported transformers version updated to [4.48.3](https://github.com/huggingface/transformers/releases/tag/v4.48.3).
+
+- Serving framework support
+
+  Intel® Extension for PyTorch\* offers extensive support for various ecosystems, including [vLLM](https://github.com/vllm-project/vllm) and [TGI](https://github.com/huggingface/text-generation-inference), with the goal of enhancing performance and flexibility for LLM workloads on Intel® GPU platforms (intensively verified on Intel® Data Center GPU Max Series and Intel® Arc™ B-Series graphics on Linux). The vLLM/TGI features, such as chunked prefill and MoE (Mixture of Experts), are supported by the backend kernels provided in Intel® Extension for PyTorch\*. In this release, Intel® Extension for PyTorch\* adds sliding-window support in `ipex.llm.modules.PagedAttention.flash_attn_varlen_func` to meet the needs of models like Phi3 and Mistral, which enable sliding windows by default.
+
+- [Prototype] QLoRA/LoRA finetuning using BitsAndBytes
+
+  Intel® Extension for PyTorch\* supports QLoRA/LoRA finetuning with [BitsAndBytes](https://github.com/bitsandbytes-foundation/bitsandbytes) on Intel® GPU platforms. This release includes several enhancements for better performance and functionality (see the sketch after this list):
+
+  - The performance of the NF4 dequantize kernel has been improved by approximately 4.4× to 5.6× across different shapes compared to the previous release.
+  - `_int_mm` support in INT8 has been added to enable INT8 LoRA finetuning in PEFT (with float optimizers like `adamw_torch`).
+
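
For orientation, a minimal, hedged sketch of what NF4-based QLoRA loading looks like through transformers plus BitsAndBytes (model name hypothetical; assumes an XPU-enabled bitsandbytes build):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization config; dequantization uses the NF4 kernel noted above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # hypothetical model choice
    quantization_config=bnb_config,
    device_map="auto",            # assumes the XPU device is discoverable
)
```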
+- Codegen support removal
+
+  Removes codegen support from Intel® Extension for PyTorch\* and reuses the codegen capability from [Torch XPU Operators](https://github.com/intel/torch-xpu-ops), to ensure that code changes in codegen interoperate with their usages in Intel® Extension for PyTorch\*.
+
+- [Prototype] Python 3.13t support
+
+  Adds prototype support for Python 3.13t and provides prebuilt binaries on the [download server](https://pytorch-extension.intel.com/release-whl/stable/xpu/us/).
+
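
A hypothetical install command against that download server (exact package spec assumed, following the project's usual `--extra-index-url` pattern):

```bash
pip install intel-extension-for-pytorch==2.7.10+xpu \
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
```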
+### Known Issues
+
+Please refer to [Known Issues webpage](./known_issues.md).
+
 ## 2.6.10+xpu
 
 Intel® Extension for PyTorch\* v2.6.10+xpu is the new release which supports Intel® GPU platforms (Intel® Data Center GPU Max Series, Intel® Arc™ Graphics family, Intel® Core™ Ultra Processors with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 Mobile Processors and Intel® Data Center GPU Flex Series) based on PyTorch\* 2.6.0.