
Intel Neural Compressor Release 3.7

@thuang6 released this 25 Dec 03:50
v3.7
  • Highlights
  • Features
  • Improvements
  • Validated Hardware
  • Validated Configurations

Highlights

  • Introduced experimental NVFP4 quantization support and mixed-bit (MXFP4 & MXFP8) autotuning on LLMs

Features

  • Support NVFP4 Post-Training Quantization (PTQ) on LLMs (experimental); see the PTQ sketch after this list
  • Support mixed-bit (MXFP4 & MXFP8) autotuning on LLMs (experimental)
  • Support MXFP8 PTQ on video generation diffusion models (experimental)
  • Support MXFP4 Quantization-Aware Training (QAT) on LLMs (experimental)
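
The PTQ flow below is a minimal sketch of how a low-bit MX format could be applied with the 3.x PyTorch prepare/convert API. The config class and its fields (MXQuantConfig, w_dtype, act_dtype), the dtype strings, and the model id are illustrative assumptions, not taken from these notes; check the 3.7 documentation for the exact NVFP4/MX configuration names.

```python
# Minimal PTQ sketch with the Neural Compressor 3.x PyTorch API.
# MXQuantConfig and its w_dtype/act_dtype fields, plus the dtype strings,
# are assumptions for illustration -- verify them against the 3.7 docs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from neural_compressor.torch.quantization import MXQuantConfig, prepare, convert

model_id = "meta-llama/Llama-3.1-8B"  # placeholder model id
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assumed configuration: MXFP4 weights with MXFP8 activations.
quant_config = MXQuantConfig(w_dtype="mxfp4", act_dtype="mxfp8")

# prepare() patches the model for calibration; convert() produces the
# quantized model once a short calibration pass has run.
model = prepare(model, quant_config)
for prompt in ["Intel Neural Compressor supports low-bit quantization."]:
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        model(**inputs)
model = convert(model)
```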

Improvements

  • Update of the Llama 3 series example for NVFP4 and auto mixed-bit (MXFP4 & MXFP8) PTQ (see the autotuning sketch after this list)
  • New LLM example (DeepSeek R1) for MXFP8, MXFP4, and NVFP4 PTQ
  • New LLM example (Qwen3-235B) for MXFP8 and MXFP4 PTQ
  • New video generation diffusion example (FramePack) for MXFP8 PTQ
  • Update of the Llama 3 example for MXFP4 QAT
  • Removal of the test-purpose benchmarking feature due to security considerations
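
The auto mixed-bit example referenced above pairs MXFP4 and MXFP8 candidates and lets the tuner pick one against an accuracy target. The sketch below assumes the 3.x autotune/TuningConfig API; the MXQuantConfig fields, the tolerable_loss threshold, and the my_accuracy_eval helper are hypothetical placeholders rather than confirmed names.

```python
# Hedged sketch of mixed-bit (MXFP4 & MXFP8) autotuning with the Neural
# Compressor 3.x autotune API. TuningConfig/config_set usage, the
# MXQuantConfig fields, and tolerable_loss are illustrative assumptions;
# eval_fn must return a higher-is-better accuracy score.
from neural_compressor.torch.quantization import MXQuantConfig, TuningConfig, autotune

def eval_fn(candidate_model):
    # Hypothetical user-defined metric (e.g., task accuracy or negative
    # perplexity) for the candidate quantized model.
    return my_accuracy_eval(candidate_model)

tune_config = TuningConfig(
    config_set=[
        MXQuantConfig(w_dtype="mxfp4", act_dtype="mxfp8"),  # lower-bit candidate
        MXQuantConfig(w_dtype="mxfp8", act_dtype="mxfp8"),  # higher-bit fallback
    ],
    tolerable_loss=0.01,  # assumed: accept up to 1% relative accuracy drop
)

# `model` is the float model to quantize, loaded as in the PTQ sketch above.
best_model = autotune(model, tune_config=tune_config, eval_fn=eval_fn)
```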

Validated Hardware

  • Intel Gaudi AI Accelerators (Gaudi 2 and Gaudi 3)
  • Intel Xeon Scalable Processors (4th, 5th, and 6th Gen)
  • Intel Core Ultra Processors (Series 1 and 2)
  • Intel Data Center GPU Max Series (1550)
  • Intel® Arc™ B-Series Graphics GPU (B580 and B60)

Validated Configurations

  • Ubuntu 24.04 & Win 11
  • Python 3.10, 3.11, 3.12, 3.13
  • PyTorch/IPEX 2.7, 2.8
  • PyTorch 2.9