Add Intel Gaudi HPU Support #1023

BartoszBLL · 2025-03-18T19:30:19Z

This PR introduces support for Habana Gaudi (HPU) acceleration and improves device handling across the project. Key changes include:

New HPU-Compatible Dockerfile

Added Dockerfile.hpu to support Habana Gaudi with PyTorch.
Uses vault.habana.ai/gaudi-docker as the base image.
Installs necessary dependencies and sets environment variables for HPU.

Benchmarking on HPU

Introduced benchmark_hpu.py to compare execution times between CPU and HPU.
Uses habana_frameworks.torch for HPU acceleration.
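As a rough illustration of the kind of timing comparison described above, here is a minimal sketch. It is not the contents of benchmark_hpu.py: `time_runs`, `speedup`, and the `sync` parameter are hypothetical names introduced for this example. The `sync` hook is there because accelerators typically queue work asynchronously, so the clock should stop only after the device has finished (on CUDA, for instance, `torch.cuda.synchronize` could be passed).

```python
import time
from statistics import mean
from typing import Callable, Optional


def time_runs(
    fn: Callable[[], object],
    runs: int = 3,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the mean wall-clock time, in seconds, of `runs` calls to `fn`.

    `sync` is an optional callable (hypothetical hook) that blocks until the
    device has finished its queued work, so asynchronous execution is not
    under-measured.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for pending device work before stopping the clock
        durations.append(time.perf_counter() - start)
    return mean(durations)


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Ratio of baseline time to accelerated time (higher means faster)."""
    return baseline_seconds / accelerated_seconds


if __name__ == "__main__":
    # Stand-in workload; a real benchmark would call model inference here.
    avg = time_runs(lambda: sum(range(100_000)), runs=3)
    print(f"average over 3 runs: {avg:.6f} s")
```

Averaging several runs, as the PR does, smooths out one-off costs such as warm-up on the first call.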

Refactored Device Selection

Unified device selection via get_device() in utils/util.py.
Replaced multiple scattered torch.device("cuda" if torch.cuda.is_available() else "cpu") checks.
Now prioritizes HPU if available, then CUDA, falling back to CPU.
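The priority order above (HPU, then CUDA, then CPU) can be sketched as follows. This is an assumption about how such a helper might look, not the actual code in utils/util.py: `pick_device_name` is a hypothetical helper added so the ordering can be checked without any accelerator present, and the HPU probe assumes the `habana_frameworks.torch.hpu` module is importable whenever the Habana stack is installed.

```python
import importlib.util


def pick_device_name(hpu_available: bool, cuda_available: bool) -> str:
    """Apply the priority order: HPU first, then CUDA, then CPU."""
    if hpu_available:
        return "hpu"
    if cuda_available:
        return "cuda"
    return "cpu"


def get_device():
    """Return a torch.device chosen by the priority order above.

    torch is imported lazily so that pick_device_name stays usable
    (and testable) on machines without PyTorch installed.
    """
    import torch

    hpu_available = False
    # Only probe for HPU when the Habana package is installed at all.
    if importlib.util.find_spec("habana_frameworks") is not None:
        import habana_frameworks.torch.hpu as hthpu

        hpu_available = hthpu.is_available()

    return torch.device(pick_device_name(hpu_available, torch.cuda.is_available()))
```

Call sites that previously built their own `torch.device(...)` would then use `device = get_device()` instead, so the priority order lives in one place.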

Model Loading Adjustments

Ensured models are first loaded on CPU before transferring to the appropriate device to prevent issues with incompatible state dictionaries.
Updated synthesizer, vocoder, encoder, and other model scripts accordingly.
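A minimal sketch of the CPU-first loading pattern, under stated assumptions: `load_model_cpu_first` is a hypothetical helper, and the `"model_state"` checkpoint key is assumed for illustration (a bare state dict is accepted as well). The actual model scripts may structure this differently.

```python
import torch
from torch import nn


def load_model_cpu_first(
    model: nn.Module, checkpoint_path: str, device: torch.device
) -> nn.Module:
    """Load weights onto the CPU, then move the whole model to `device`."""
    # map_location="cpu" deserializes every tensor onto the CPU, regardless
    # of the device the checkpoint was saved from.
    checkpoint = torch.load(checkpoint_path, map_location="cpu")

    # Assumed layout: either {"model_state": state_dict} or a bare state dict.
    state_dict = checkpoint.get("model_state", checkpoint)
    model.load_state_dict(state_dict)

    # Transfer only after the weights are in place, then switch to inference mode.
    model.to(device)
    model.eval()
    return model
```

Loading onto the CPU first means the checkpoint never needs the device it was saved on to be present; the single `model.to(device)` call afterwards handles HPU, CUDA, and CPU uniformly.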

Dependency Updates

Updated requirements.txt to allow NumPy versions >=1.21.0 for better compatibility.

Why

Adds Habana Gaudi acceleration support for improved inference performance (the training code is not touched by this PR).
Standardizes device management for easier maintainability.
Enhances model compatibility across different hardware configurations.

Testing

Verified model loading and inference on HPU, CUDA, and CPU.
Benchmark (3 runs averaged) shows a speedup of roughly 16.2× on HPU vs CPU:
- HPU: ~23.04 seconds
- CPU: ~374.27 seconds

BartoszBLL added 9 commits February 12, 2025 15:47

feat: Add device handling that supports HPU too

81cfef5

HPU is used for inference only, as the training code is not touched for now.

fix: Model loading

22b14e0

Load models on CPU first to avoid errors.

feat: Add dockerfile for HPU

cf4e325

Add benchmark for hpu vs cpu

a69ef80

fix: revert weights path change

c9e4463

fix: revert weights path change in Dockerfile

37731e6

fix: Remove redundant habana imports

3770652

fix: Update Dockerfile.hpu

ef3e42c

Merge pull request #1 from BlueLabelLabs/feat/hpu-support

fc3f556

Feat/hpu support
