
Intel Arc B580 GPU Not Detected Inside Official Docker Container on Ubuntu 24.04 Host #88


Description


Hello Intel Extension for TensorFlow Team,

I am unable to get TensorFlow to detect my Intel Arc B580 GPU when running inside the official Docker container. TensorFlow and the Intel Extension for TensorFlow plugin appear to load correctly inside the container, but the SYCL runtime fails to find any physical XPU devices.

This seems to be a low-level incompatibility issue with the Ubuntu 24.04 host environment.

Environment Details

Host OS: Ubuntu 24.04 LTS "Noble Numbat"

GPU: Intel Arc B580

Host Graphics Driver: xe DRM driver (verified via dmesg; graphical acceleration works on the host under X.org)

Docker Version: (paste here the version obtained in step 1)

Docker Image: intel/intel-optimized-tensorflow:2.15.0.1-xpu-pip-jupyter
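For completeness, the xe binding can also be confirmed from sysfs with a small Python sketch (it assumes the GPU shows up as one of the /sys/class/drm/card* entries, which can vary between machines):
Python

# Report which DRM driver each card under /sys/class/drm is bound to.
# Assumes cards are named card0..card9; adjust the glob if there are more.
import glob
import os

for card in sorted(glob.glob('/sys/class/drm/card[0-9]')):
    driver_link = os.path.join(card, 'device', 'driver')
    if os.path.islink(driver_link):
        print(card, '->', os.path.basename(os.readlink(driver_link)))  # expect 'xe' for the B580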

Steps to Reproduce

On a fresh Ubuntu 24.04 host system, install Docker via sudo apt install docker.io.

Add the current user to the docker group with sudo usermod -aG docker $USER and then log out and log back in.

Run the official Docker container with flags to pass through the GPU device and render group permissions:
Bash

docker run -it --rm -p 8888:8888 --device=/dev/dri --group-add=$(getent group render | cut -d: -f3) intel/intel-optimized-tensorflow:2.15.0.1-xpu-pip-jupyter

Inside the container's terminal (e.g., via Jupyter Lab), run the following Python command to check for XPU devices:
Python

import tensorflow as tf
import intel_extension_for_tensorflow as itex  # importing ITEX registers the XPU plugin
print('Devices found:', tf.config.list_physical_devices('XPU'))
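
To rule out a missing device node or group permission, the following sketch can be run in the same container session (the renderD128 node name is an assumption; the actual name may differ on other systems):
Python

# Sanity-check that the GPU device nodes and the render group made it into the container.
import os

print('Nodes under /dev/dri:', os.listdir('/dev/dri'))   # expect card* and renderD* entries
print('Supplementary groups:', os.getgroups())           # should include the host render GID passed via --group-add
print('renderD128 accessible:', os.access('/dev/dri/renderD128', os.R_OK | os.W_OK))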

Expected Behavior

The Python script should output a success message, listing one XPU device:
Devices found: [PhysicalDevice(name='/physical_device:XPU:0', device_type='XPU')]

Actual Behavior

The Python script outputs an empty list: Devices found: [].

The full log from the Python execution inside the container is as follows:

2025-09-07 00:55:39.394092: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2025-09-07 00:55:39.439004: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-09-07 00:55:39.439039: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-09-07 00:55:39.440414: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-09-07 00:55:39.447415: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2025-09-07 00:55:39.447651: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-09-07 00:55:40.280102: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2025-09-07 00:55:41.486461: I itex/core/wrapper/itex_gpu_wrapper.cc:38] Intel Extension for Tensorflow* GPU backend is loaded.
2025-09-07 00:55:41.486887: I external/local_xla/xla/pjrt/pjrt_api.cc:67] PJRT_Api is set for device type xpu
2025-09-07 00:55:41.486911: I external/local_xla/xla/pjrt/pjrt_api.cc:72] PJRT plugin for XPU has PJRT API version 0.33. The framework PJRT API version is 0.34.
2025-09-07 00:55:41.525929: E external/intel_xla/xla/stream_executor/sycl/sycl_gpu_runtime.cc:178] Can not found any devices.
2025-09-07 00:55:41.525996: E itex/core/kernels/xpu_kernel.cc:60] Failed precondition: No visible XPU devices. To check runtime environment on your host, please run itex/tools/python/env_check.py.
If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
2025-09-07 00:55:41.613297: E itex/core/devices/gpu/itex_gpu_runtime.cc:174] Can not found any devices. To check runtime environment on your host, please run itex/tools/python/env_check.py.
If you need help, create an issue at https://github.com/intel/intel-extension-for-tensorflow/issues
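
If it helps, I can also run the env_check.py script that the log suggests, roughly along these lines (the search root below is a guess, and the script may instead need to be downloaded from the ITEX repository):
Python

# Try to locate the env_check.py mentioned in the error message and run it.
# The /usr search root is an assumption; the script may be shipped elsewhere in the image.
import pathlib
import subprocess

candidates = list(pathlib.Path('/usr').rglob('env_check.py'))
if candidates:
    subprocess.run(['python', str(candidates[0])], check=False)
else:
    print('env_check.py not found under /usr; it can be fetched from the intel-extension-for-tensorflow repository')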

Thank you for looking into this. It seems there is an incompatibility between the latest Ubuntu LTS release and the device passthrough for Intel Arc GPUs.
