Commit 3f1ac30

tye1, dbkinder, and ganyi1996ppo authored

[doc] Add releases.md (#1706)

* Add releases.md
* Add features
* add onemkl issues

Co-authored-by: David Kinder <[email protected]>
Co-authored-by: Pleaplusone <[email protected]>

1 parent 825031c commit 3f1ac30

File tree: 2 files changed, +125 -1 lines changed


docs/tutorials/features.rst

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ Check the `API Documentation <api_doc.html>`_ for details of API functions and `
 DPC++ Extension
 ---------------

-Intel® Extension for PyTorch\* provides C++ APIs to get DPCPP queue and configure floating-point math mode.
+Intel® Extension for PyTorch\* provides C++ APIs to get SYCL queue and configure floating-point math mode.

 Check the `API Documentation`_ for the details of API functions. `DPC++ Extension <features/DPC++_Extension.md>`_ describes how to write customized DPC++ kernels with a practical example and build it with setuptools and CMake.

docs/tutorials/releases.md

Lines changed: 124 additions & 0 deletions

@@ -0,0 +1,124 @@
Releases
=============

## 1.10.200+gpu

Intel® Extension for PyTorch\* v1.10.200+gpu extends PyTorch\* 1.10 with up-to-date features and optimizations on XPU for an extra performance boost on Intel Graphics cards. XPU is a user-visible device that is a counterpart of the well-known CPU and CUDA devices in the PyTorch\* community. XPU represents Intel-specific kernel and graph optimizations for various “concrete” devices; the XPU runtime chooses the actual device when executing AI workloads on the XPU device, and the default selected device is an Intel GPU. XPU kernels from Intel® Extension for PyTorch\* are written in [DPC++](https://github.com/intel/llvm#oneapi-dpc-compiler), which supports the [SYCL language](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html) as well as a number of [DPC++ extensions](https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions).
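For orientation, here is a minimal sketch of what the XPU device looks like from Python. This is illustrative, not from the release notes; the `ipex` import name is an assumption based on the installed package path shown in the traceback under Known Issues below.

```python
# Minimal sketch: importing the extension registers the "xpu" device with
# PyTorch. "ipex" is the package name suggested by the traceback below.
import torch
import ipex  # assumption: import name for this release

model = torch.nn.Linear(4, 4).to("xpu")   # move a module to the default XPU device
data = torch.randn(2, 4, device="xpu")    # allocate a tensor directly on XPU
print(model(data).device)                 # -> xpu:0
```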
### Highlights

This release introduces XPU-specific optimizations on Intel Graphics cards. Optimized operators and kernels are implemented and registered through the PyTorch\* dispatching mechanism for the XPU device and XPU backend. These operators and kernels are accelerated on Intel GPU hardware by the corresponding native vectorization and matrix-calculation features. In graph mode, additional operator fusions are supported to reduce operator/kernel invocation overhead and thus increase performance.
This release provides the following features (a short sketch combining the first two follows the list):

- Auto Mixed Precision (AMP)
  - support of AMP with BFloat16 and Float16 optimization of GPU operators
- Channels Last
  - support of channels_last (NHWC) memory format for most key GPU operators
- DPC++ Extension
  - mechanism to create PyTorch\* operators with custom DPC++ kernels running on the XPU backend
- Optimized Fusion
  - support of SGD/AdamW fusion for both FP32 and BF16 precision
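A minimal illustration of AMP together with channels_last. This is a sketch, assuming a CUDA-style `torch.xpu.amp.autocast` entry point; verify the exact AMP API against this release's API documentation.

```python
# Illustrative sketch: channels_last (NHWC) memory format plus BF16 AMP on the
# xpu device. torch.xpu.amp.autocast is an assumption modeled on the CUDA API.
import torch
import ipex  # assumption: import name for this release

model = torch.nn.Conv2d(3, 8, kernel_size=3).to("xpu")
model = model.to(memory_format=torch.channels_last)      # NHWC weight layout
x = torch.randn(1, 3, 32, 32, device="xpu")
x = x.to(memory_format=torch.channels_last)              # NHWC activation layout

with torch.xpu.amp.autocast(dtype=torch.bfloat16):       # BF16 AMP region
    y = model(x)
```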
This release supports the following fusion patterns in PyTorch\* JIT mode (a traced Conv2D + ReLU sketch follows the list):

- Conv2D + ReLU
- Conv2D + Sum
- Conv2D + Sum + ReLU
- Pad + Conv2d
- Conv2D + SiLu
- Permute + Contiguous
- Conv3D + ReLU
- Conv3D + Sum
- Conv3D + Sum + ReLU
- Linear + ReLU
- Linear + Sigmoid
- Linear + Div(scalar)
- Linear + GeLu
- Linear + GeLu_
- T + Addmm
- T + Addmm + ReLu
- T + Addmm + Sigmoid
- T + Addmm + Dropout
- T + Matmul
- T + Matmul + Add
- T + Matmul + Add + GeLu
- T + Matmul + Add + Dropout
- Transpose + Matmul
- Transpose + Matmul + Div
- Transpose + Matmul + Div + Add
- MatMul + Add
- MatMul + Div
- Dequantize + PixelShuffle
- Dequantize + PixelShuffle + Quantize
- Mul + Add
- Add + ReLU
- Conv2D + Leaky_relu
- Conv2D + Leaky_relu_
- Conv2D + Sigmoid
- Conv2D + Dequantize
- Softplus + Tanh
- Softplus + Tanh + Mul
- Conv2D + Dequantize + Softplus + Tanh + Mul
- Conv2D + Dequantize + Softplus + Tanh + Mul + Quantize
- Conv2D + Dequantize + Softplus + Tanh + Mul + Quantize + Add
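A minimal sketch of how one of these patterns is typically exercised, using stock PyTorch\* JIT APIs. The fusion itself is applied by the extension's graph passes, so this is illustrative only.

```python
# Illustrative sketch: trace a model whose graph contains the Conv2D + ReLU
# pattern listed above, producing a TorchScript graph the extension can fuse.
import torch
import ipex  # assumption: import name for this release

class ConvReLU(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x):
        return torch.relu(self.conv(x))   # Conv2D + ReLU fusion candidate

model = ConvReLU().eval().to("xpu")
example = torch.randn(1, 3, 32, 32, device="xpu")
with torch.no_grad():
    traced = torch.jit.trace(model, example)  # build the TorchScript graph
    out = traced(example)                     # fused kernel may execute here
```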
### Known Issues

- #### [CRITICAL ERROR] Kernel 'XXX' removed due to usage of FP64 instructions unsupported by the targeted hardware

  FP64 is not natively supported by the [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/data-center-gpu/flex-series/overview.html) platform. If you run an AI workload on that platform and receive this error message, it means a kernel requiring FP64 instructions was removed and not executed, so the accuracy of the whole workload is compromised.
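  One way to sidestep this in user code is to avoid requesting FP64 kernels at all. This is an illustrative workaround, and downcasting can itself change numerical results:

  ```python
  # Illustrative workaround: downcast FP64 tensors before moving them to the
  # xpu device, so no FP64 kernels are requested on hardware without FP64.
  import torch

  x = torch.randn(4, 4, dtype=torch.float64)   # e.g., data loaded as double
  x = x.to(torch.float32).to("xpu")            # cast first, then move to XPU
  ```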
- #### symbol undefined caused by _GLIBCXX_USE_CXX11_ABI

  Error info:

  ```bash
  File "/root/.local/lib/python3.9/site-packages/ipex/__init__.py", line 4, in <module>
      from . import _C
  ImportError: /root/.local/lib/python3.9/site-packages/ipex/lib/libipex_gpu_core.so: undefined symbol: _ZNK5torch8autograd4Node4nameB5cxx11Ev
  ```

  DPC++ does not support \_GLIBCXX_USE_CXX11_ABI=0, so Intel® Extension for PyTorch\* is always compiled with \_GLIBCXX_USE_CXX11_ABI=1. This undefined-symbol issue appears when PyTorch\* is compiled with \_GLIBCXX_USE_CXX11_ABI=0. Update the PyTorch\* CMake file to set \_GLIBCXX_USE_CXX11_ABI=1 and compile PyTorch\* with a compiler that supports \_GLIBCXX_USE_CXX11_ABI=1. We recommend using GCC 9.4.0 on Ubuntu 20.04.
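  You can check which ABI an installed PyTorch\* build uses before rebuilding, via the standard `torch` helper:

  ```python
  # Report whether the installed PyTorch build uses the CXX11 ABI; it must be
  # True (_GLIBCXX_USE_CXX11_ABI=1) to link cleanly against this extension.
  import torch
  print(torch.compiled_with_cxx11_abi())  # expect: True
  ```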
- #### Can't find the oneMKL library when building Intel® Extension for PyTorch\* without oneMKL

  Error info:

  ```bash
  /usr/bin/ld: cannot find -lmkl_sycl
  /usr/bin/ld: cannot find -lmkl_intel_ilp64
  /usr/bin/ld: cannot find -lmkl_core
  /usr/bin/ld: cannot find -lmkl_tbb_thread
  dpcpp: error: linker command failed with exit code 1 (use -v to see invocation)
  ```

  When PyTorch\* is built with the oneMKL library and Intel® Extension for PyTorch\* is built without it, this linker issue may occur. Resolve it by setting:

  ```bash
  export USE_ONEMKL=OFF
  export MKL_DPCPP_ROOT=${PATH_To_Your_oneMKL}/__release_lnx/mkl
  ```

  Then clean build Intel® Extension for PyTorch\*.
- #### undefined symbol: mkl_lapack_dspevd. Intel MKL FATAL ERROR: cannot load libmkl_vml_avx512.so.2 or libmkl_vml_def.so.2

  This issue may occur when Intel® Extension for PyTorch\* is built with the oneMKL library and PyTorch\* is not built with any MKL library. The oneMKL kernel may be dispatched to the CPU backend incorrectly and trigger this issue. Resolve it by installing the MKL library from conda:

  ```bash
  conda install mkl
  conda install mkl-include
  ```

  Then clean build PyTorch\*.
- #### OSError: libmkl_intel_lp64.so.1: cannot open shared object file: No such file or directory

  The wrong MKL library may be used when multiple MKL libraries exist in the system. Preload the oneMKL library by:

  ```bash
  export LD_PRELOAD=${MKL_DPCPP_ROOT}/lib/intel64/libmkl_intel_lp64.so.1:${MKL_DPCPP_ROOT}/lib/intel64/libmkl_intel_ilp64.so.1:${MKL_DPCPP_ROOT}/lib/intel64/libmkl_sequential.so.1:${MKL_DPCPP_ROOT}/lib/intel64/libmkl_core.so.1:${MKL_DPCPP_ROOT}/lib/intel64/libmkl_sycl.so.1
  ```

  If you continue seeing similar issues for other shared object files, add the corresponding files under ${MKL_DPCPP_ROOT}/lib/intel64/ to `LD_PRELOAD`. Note that the suffix of the libraries may change (e.g., from .1 to .2) if more than one oneMKL library is installed on the system.
