Unexpected Error: AssertHandler::printMessage #843

@SkyIsland-CN

Description

Describe the bug

Unexpected Error: AssertHandler::printMessage

Dear respected developers,

Sorry to bother you! I’m just a tech enthusiast without any formal education in AI. After trying many methods and reading a lot of documentation, I still haven’t been able to solve this issue. My description might not be very professional or standardized—please forgive me! I’ll try to describe the problem as accurately as I can.

I’m attempting to port a project originally based on CUDA to Intel GPU (using IPEX), but I encountered an error that I couldn’t find any information about on Google.

I suspect this might be an internal error rather than a project-specific issue (just my guess—please correct me if I’m wrong, and I apologize if so):

  1. The project is written in Python, but based on my research, this error string seems to originate from native (C/C++) code.
  2. I only made simple changes, such as replacing model.to("cuda") with model.to("xpu") and torch.cuda with torch.xpu (a minimal sketch follows this list). I didn’t modify anything else.
  3. When I switched the device to "cpu" and ran inference on the CPU, everything worked fine without any issues.
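
To make item 2 concrete, here is a minimal sketch of the kind of substitution I made (illustration only; a tiny nn.Linear stands in for the real model, and my full modified files are in the gists linked below):

```python
# Minimal sketch of the CUDA -> XPU substitution I applied throughout.
# A tiny nn.Linear stands in for the real YourMT3 model.
import torch
import torch.nn as nn

device = "xpu" if torch.xpu.is_available() else "cpu"  # was: "cuda" / torch.cuda.is_available()

model = nn.Linear(8, 8).to(device)    # was: model.to("cuda")
x = torch.randn(1, 8, device=device)
with torch.no_grad():
    y = model(x)
if device == "xpu":
    torch.xpu.synchronize()           # was: torch.cuda.synchronize()
print(device, tuple(y.shape))
```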

I’m really sorry that I don’t have the ability to pinpoint the problem or construct a minimal reproducible example. I also couldn’t find any similar cases on Google. All I can do is describe what I did—apologies again.

My platform: i5-12490F + Arc A750 8GB
System: Windows 11 22H2 (22621.2283)
Environment: Python 3.10.5 and 3.11.9 (I tried both and hit the same issue; I didn’t use Anaconda and set up a clean development environment. I even reinstalled the OS for this.)
Torch version: v2.7.0+xpu
IPEX version: v2.7.10
Driver version: 101.6913
oneAPI version: 2025.02
Project: YourMT3, an audio-to-MIDI project built with PyTorch: https://huggingface.co/spaces/mimbres/YourMT3

After cloning the project locally and opening it in VSCode, I modified app.py and module_helper.py to replace the original CUDA-related calls with their XPU equivalents. I didn’t change anything else. When I ran the script, the error above occurred. I’ve been trying to fix it for three days but haven’t found the cause.

Here are the modified app.py and module_helper.py:
app.py :
https://gist.github.com/SkyIsland-CN/af26e877f4c26e7b65e04014e4acd1f1
module_helper.py :
https://gist.github.com/SkyIsland-CN/78900713d4d420bb7fff6f8e9a4216e0

Below is the terminal output after running:

PS D:\Projects\004> d:; cd 'd:\Projects\004'; & 'c:\Users\wyh\AppData\Local\Programs\Python\Python311\python.exe' 'c:\Users\wyh\.trae-cn\extensions\ms-python.debugpy-2025.6.0-win32-x64\bundled\libs\debugpy\launcher' '64895' '--' 'D:\Projects\004\app.py'
[W707 21:23:12.000000000 OperatorEntry.cpp:161] Warning: Warning only once for all operators, other operators may also be overridden.
Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::geometric_(Tensor(a!) self, float p, *, Generator? generator=None) -> Tensor(a!)
registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\build\aten\src\ATen\RegisterSchema.cpp:6
dispatch key: XPU
previous kernel: registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:37
new kernel: registered at H:\frameworks.ai.pytorch.ipex-gpu\build\Release\csrc\gpu\csrc\gpu\xpu\ATen\RegisterXPU_0.cpp:186 (function operator ())
D:\Projects\004\amt\src\model\RoPE\RoPE.py:35: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled=False)
D:\Projects\004\amt\src\model\RoPE\RoPE.py:242: FutureWarning: torch.cuda.amp.autocast(args...) is deprecated. Please use torch.amp.autocast('cuda', args...) instead.
@autocast(enabled=False)
IPEX available: True
Number of Intel GPUs: 1
Current device name: Intel(R) Arc(TM) A750 Graphics
Resuming from amt/logs\2024\mc13_256_g4_all_v7_mt3f_sqr_rms_moe_wf4_n8k2_silu_rope_rp_b36_nops\checkpoints\last.ckpt
c:\Users\wyh\AppData\Local\Programs\Python\Python311\Lib\site-packages\lightning_fabric\connector.py:571: precision=16 is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!
c:\Users\wyh\AppData\Local\Programs\Python\Python311\Lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py:513: You passed Trainer(accelerator='cpu', precision='16-mixed') but AMP with fp16 is not supported on CPU. Using precision='bf16-mixed' instead.
Using bfloat16 Automatic Mixed Precision (AMP)
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Trainer(limit_train_batches=1.0) was configured so 100% of the batches per epoch will be used..
Trainer(limit_val_batches=1.0) was configured so 100% of the batches will be used..
Trainer(limit_test_batches=1.0) was configured so 100% of the batches will be used..
Task: mc13_full_plus_256, Max Shift Steps: 206
"add_melody_metric_to_singing": True
"add_pitch_class_metric": None
"audio_cfg": {'codec': 'spec', 'hop_length': 300, 'audio_backend': 'torchaudio', 'sample_rate': 16000, 'input_frames': 32767, 'n_fft': 2048, 'n_mels': 512, 'f_min': 50.0, 'f_max': 8000.0}
"base_lr": None
"eval_drum_vocab": None
"eval_subtask_key": default
"eval_vocab": None
"init_factor": None
"max_steps": None
"model_cfg": {'encoder_type': 'perceiver-tf', 'decoder_type': 'multi-t5', 'pre_encoder_type': 'conv', 'pre_encoder_type_default': {'t5': None, 'perceiver-tf': 'conv', 'conformer': None}, 'pre_decoder_type': 'mc_shared_linear', 'pre_decoder_type_default': {'t5': {'t5': None}, 'perceiver-tf': {'t5': 'linear', 'multi-t5': 'mc_shared_linear'}, 'conformer': {'t5': None}}, 'conv_out_channels': 128, 't5_basename': 'google/t5-v1_1-small', 'pretrained': False, 'use_task_conditional_encoder': True, 'use_task_conditional_decoder': True, 'd_feat': 128, 'tie_word_embeddings': True, 'vocab_size': 596, 'num_max_positions': 1034, 'encoder': {'t5': {'d_model': 512, 'num_heads': 6, 'num_layers': 8, 'dropout_rate': 0.05, 'position_encoding_type': 'sinusoidal', 'ff_widening_factor': 2, 'ff_layer_type': 't5_gmlp'}, 'perceiver-tf': {'num_latents': 26, 'd_latent': 128, 'd_model': 128, 'num_blocks': 3, 'num_local_transformers_per_block': 2, 'num_temporal_transformers_per_block': 2, 'sca_use_query_residual': True, 'dropout_rate': 0.1, 'position_encoding_type': 'rope', 'attention_to_channel': True, 'layer_norm_type': 'layer_norm', 'ff_layer_type': 'moe', 'ff_widening_factor': 4, 'moe_num_experts': 8, 'moe_topk': 2, 'hidden_act': 'silu', 'rotary_type_sca': 'pixel', 'rotary_type_latent': 'pixel', 'rotary_type_temporal': 'lang', 'rotary_apply_to_keys': False, 'rotary_partial_pe': False, 'rope_partial_pe': True, 'num_max_positions': 110, 'vocab_size': 596}, 'conformer': {'d_model': 512, 'intermediate_size': 512, 'num_heads': 8, 'num_layers': 8, 'dropout_rate': 0.1, 'layerdrop': 0.1, 'position_encoding_type': 'rotary', 'conv_dim': (512, 512, 512, 512, 512, 512, 512), 'conv_stride': (5, 2, 2, 2, 2, 2, 2), 'conv_kernel': (10, 3, 3, 3, 3, 3, 3), 'conv_depthwise_kernel_size': 31}}, 'decoder': {'t5': {'d_model': 512, 'num_heads': 6, 'num_layers': 8, 'dropout_rate': 0.05, 'position_encoding_type': 'sinusoidal', 'ff_widening_factor': 2, 'ff_layer_type': 't5_gmlp'}, 'multi-t5': {'d_model': 512, 'num_heads': 6, 'num_layers': 8, 'dropout_rate': 0.05, 'position_encoding_type': 'sinusoidal', 'ff_widening_factor': 2, 'ff_layer_type': 't5_gmlp', 'num_channels': 13, 'num_max_positions': 1034, 'vocab_size': 596}}, 'feat_length': 110, 'event_length': 1024, 'init_factor': 1.0}
"onset_tolerance": 0.05
"optimizer": None
"optimizer_name": adamwscale
"pretrained": False
"scheduler_name": cosine
"shared_cfg": {'PATH': {'data_home': '../../data'}, 'BSZ': {'train_sub': 12, 'train_local': 24, 'validation': 64, 'test': 16}, 'AUGMENTATION': {'train_random_amp_range': [0.8, 1.1], 'train_stem_iaug_prob': 0.7, 'train_stem_xaug_policy': {'max_k': 3, 'tau': 0.3, 'alpha': 1.0, 'max_subunit_stems': 12, 'p_include_singing': None, 'no_instr_overlap': True, 'no_drum_overlap': True, 'uhat_intra_stem_augment': True}, 'train_pitch_shift_range': [-2, 2]}, 'DATAIO': {'num_workers': 4, 'prefetch_factor': 2, 'pin_memory': True, 'persistent_workers': False}, 'CHECKPOINT': {'save_top_k': 4, 'monitor': 'validation/macro_onset_f', 'mode': 'max', 'save_last': True, 'filename': '{epoch}-{step}'}, 'TRAINER': {'limit_train_batches': 1.0, 'limit_val_batches': 1.0, 'limit_test_batches': 1.0, 'gradient_clip_val': 1.0, 'accumulate_grad_batches': 1, 'check_val_every_n_epoch': 1, 'num_sanity_val_steps': 0}, 'WANDB': {'save_dir': 'amt/logs', 'resume': 'allow', 'anonymous': 'allow', 'mode': 'disabled'}, 'LR_SCHEDULE': {'warmup_steps': 1000, 'total_steps': 100000, 'final_cosine': 1e-05}, 'TOKENIZER': {'max_shift_steps': 206, 'shift_step_ms': 10}}
"task_manager": <utils.task_manager.TaskManager object at 0x0000017107319B50>
"test_optimal_octave_shift": False
"test_pitch_shift_layer": None
"weight_decay": 0.0
"write_output_dir": amt/logs\2024\mc13_256_g4_all_v7_mt3f_sqr_rms_moe_wf4_n8k2_silu_rope_rp_b36_nops
"write_output_vocab": None
Running on local URL: http://127.0.0.1:7861
To create a public link, set share=True in launch().
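
(A side note on the two FutureWarnings above: RoPE.py still uses the deprecated torch.cuda.amp.autocast decorator, and I left it untouched. If I read the warning correctly, the replacement would look like the sketch below; whether "xpu" is the right device_type to pass there is an assumption on my part.)

```python
# Sketch of the non-deprecated decorator form the FutureWarning suggests.
# Assumptions: device_type="xpu" is accepted the same way "cuda" is, and
# "some_rope_fn" is just a placeholder; I have not modified RoPE.py this way.
from torch.amp import autocast

@autocast("xpu", enabled=False)   # was: @autocast(enabled=False) from torch.cuda.amp
def some_rope_fn(x):
    return x
```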

Everything seems normal up to this point: the Gradio web UI loads and opens correctly. The current call stack in the debugger:

[Screenshot: VSCode debugger call stack; 正在运行 means Running]

But once I upload an audio file and the GPU starts inference, I observe a brief spike in GPU memory usage, and then within about 2 seconds the script stops running (the call stack remains unchanged during this period). The terminal output is as follows:

c:\Users\wyh\AppData\Local\Programs\Python\Python311\Lib\site-packages\torchaudio\_backend\soundfile_backend.py:71: UserWarning: The MPEG_LAYER_III subtype is unknown to TorchAudio. As a result, the bits_per_sample attribute will be set to 0. If you are seeing this warning, please report by opening an issue on github (after checking for existing/closed ones). You may otherwise ignore this warning.
warnings.warn(
⏰ converting audio: 0m 0s 35.23ms
AssertHandler::printMessage

*Note: it’s not an out-of-memory issue; I’ve seen that kind of error before and its output looks different.
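
In case it helps localize the failure, below is the kind of wrapper I could add around the suspect calls. It assumes torch.xpu.synchronize behaves like torch.cuda.synchronize (blocking until queued kernels finish, so an asynchronous failure surfaces at the step that launched it); the checked helper and its usage are hypothetical, not part of YourMT3:

```python
# Hypothetical debugging helper: synchronize after each step so an
# asynchronous XPU kernel failure is reported at the step that caused it.
import torch

def checked(label, fn, *args, **kwargs):
    result = fn(*args, **kwargs)
    torch.xpu.synchronize()   # block until all queued XPU work completes
    print(f"ok: {label}")
    return result

# usage sketch: wrap each stage of inference one at a time, e.g.
#   tokens = checked("model forward", model, spectrogram)
```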

Apologies: English is not my native language, and I used AI to help translate this issue. If there are any mistakes or mistranslations, please let me know. Also, I’m not very familiar with GitHub, so if anything I’ve done here goes against the rules or etiquette, please feel free to point it out.

Thank you for reading. I’ve also reported this issue to the PyTorch team, since I’m not sure where the problem lies.

Thank you for your hard work, and please don’t mind my lack of professionalism.

Versions

The information printed to the terminal by running collect_env.py:

PyTorch version: 2.7.0+xpu
PyTorch CXX11 ABI: No
IPEX version: 2.7.10+xpu
IPEX commit: 0e47515
Build type: Release

OS: Microsoft Windows 11 Pro (10.0.22621 64-bit)
GCC version: N/A
Clang version: N/A
IGC version: N/A
CMake version: N/A
Libc version: N/A

Python version: 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22621-SP0
Is XPU available: True
DPCPP runtime: N/A
MKL version: N/A

GPU models and configuration onboard:

  • Intel(R) Arc(TM) A750 Graphics

GPU models and configuration detected:

  • [0] _XpuDeviceProperties(name='Intel(R) Arc(TM) A750 Graphics', platform_name='Intel(R) oneAPI Unified Runtime over Level-Zero', type='gpu', driver_version='1.6.33890', total_memory=7934MB, max_compute_units=448, gpu_eu_count=448, gpu_subslice_count=56, max_work_group_size=1024, max_num_sub_groups=128, sub_group_sizes=[8 16 32], has_fp16=1, has_fp64=0, has_atomic64=1)

Driver version:

  • 32.0.101.6913 (20250621000000.***+)

CPU:
Description: Intel64 Family 6 Model 151 Stepping 2
Manufacturer: GenuineIntel
Name: 12th Gen Intel(R) Core(TM) i5-12490F
NumberOfCores: 6
NumberOfEnabledCore: 6
NumberOfLogicalProcessors: 12
ThreadCount: 12

Versions of relevant libraries:
[pip] dpcpp-cpp-rt==2025.0.5
[pip] intel-cmplr-lib-rt==2025.0.5
[pip] intel-cmplr-lib-ur==2025.0.5
[pip] intel-cmplr-lic-rt==2025.0.5
[pip] intel_extension_for_pytorch==2.7.10+xpu
[pip] intel-opencl-rt==2025.0.5
[pip] intel-openmp==2025.0.5
[pip] intel-pti==0.10.1
[pip] intel-sycl-rt==2025.0.5
[pip] mkl==2025.0.1
[pip] mkl-dpcpp==2025.0.1
[pip] numpy==1.26.4
[pip] onemkl-sycl-blas==2025.0.1
[pip] onemkl-sycl-datafitting==2025.0.1
[pip] onemkl-sycl-dft==2025.0.1
[pip] onemkl-sycl-lapack==2025.0.1
[pip] onemkl-sycl-rng==2025.0.1
[pip] onemkl-sycl-sparse==2025.0.1
[pip] onemkl-sycl-stats==2025.0.1
[pip] onemkl-sycl-vm==2025.0.1
[pip] pytorch-lightning==2.5.2
[pip] pytorch-triton-xpu==3.3.0
[pip] torch==2.7.0+xpu
[pip] torchaudio==2.7.0+xpu
[pip] torchmetrics==1.7.4
[pip] torchvision==0.22.0+xpu
[pip] transformers==4.45.1
