Description
❓ Questions and Help
On a fresh Google Cloud TPU VM (created today), I cannot get torch-xla to recognize the TPU.
I deleted and recreated the TPU VM many times, and the same errors occurred on every VM:
```
scd@t1v-n-1219602b-w-0:~$ PJRT_DEVICE=TPU python3
Python 3.10.12 (main, Aug 15 2025, 14:32:43) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch_xla.core.xla_model as xm
>>> print(xm.get_xla_supported_devices(\"TPU\"))
  File "<stdin>", line 1
    print(xm.get_xla_supported_devices(\"TPU\"))
                                                ^
SyntaxError: unexpected character after line continuation character
>>> print(xm.get_xla_supported_devices("TPU\"))
  File "<stdin>", line 1
    print(xm.get_xla_supported_devices("TPU\"))
                                       ^
SyntaxError: unterminated string literal (detected at line 1)
>>> print(xm.get_xla_supported_devices())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/scd/.local/lib/python3.10/site-packages/torch_xla/core/xla_model.py", line 79, in get_xla_supported_devices
    devices = torch_xla._XLAC._xla_get_devices()
RuntimeError: Bad StatusOr access: UNKNOWN: TPU initialization failed: open(/dev/accel2): Operation not permitted: Operation not permitted; Couldn't open device: /dev/accel2; [/dev/accel2]
>>> import torch
>>> import torch_xla.core.xla_model as xm
>>>
>>> dev = xm.xla_device()
<stdin>:1: DeprecationWarning: Use torch_xla.device instead
F0828 01:09:54.071199 532138 runtime.cpp:21] Check failed: !g_computation_client_initialized InitializeComputationClient() can only be called once.
*** Check fail
```
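Side note: the first two SyntaxErrors in the transcript are unrelated to the TPU. The backslash-escaped quotes (`\"`) belong in a shell command string, not at the Python prompt, where the backslash is treated as a line-continuation character. A minimal demonstration (plain Python, no TPU needed):

```python
# In a shell-quoted command, \" keeps the quote inside the string; typed
# directly into the Python REPL, the backslash before the quote makes the
# parser reject the line.
snippet = r'print(xm.get_xla_supported_devices(\"TPU\"))'
try:
    compile(snippet, "<stdin>", "exec")
    print("compiled")
except SyntaxError:
    print("SyntaxError, as in the transcript above")
```

With plain quotes, `xm.get_xla_supported_devices("TPU")` parses fine; the remaining `RuntimeError` about `/dev/accel2` is the real TPU problem.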
Steps tried:
- Checked /dev/accel* → the device nodes exist but are owned by root:root.
- Tried adding my user to a tpu group, but the group does not exist.
- Attempted to install libtpu-nightly, but the package was not found.
- TPU training script fails with RuntimeError: tensorflow/compiler/xla/pjrt/pjrt_client.cc: Could not initialize PJRT client: Not found: Unable to access /dev/accel0.
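For whoever picks this up, here is a sketch of the permission check I ran, as a reproducible script rather than shell history. `check_device_access` is a hypothetical helper name; the `/dev/accel*` pattern comes from the error messages above:

```python
import glob
import os


def check_device_access(pattern="/dev/accel*"):
    """Return {path: (octal_mode, can_open)} for nodes matching `pattern`.

    Diagnostic sketch for the 'open(/dev/accel2): Operation not permitted'
    error: reports each matching node's permission bits and whether the
    current user can open it for read/write.
    """
    report = {}
    for path in sorted(glob.glob(pattern)):
        mode = oct(os.stat(path).st_mode & 0o777)
        can_open = os.access(path, os.R_OK | os.W_OK)
        report[path] = (mode, can_open)
    return report


if __name__ == "__main__":
    report = check_device_access()
    if not report:
        print("no /dev/accel* nodes found")
    for path, (mode, can_open) in report.items():
        print(path, mode, "accessible" if can_open else "NOT accessible")
```

On my VM every node comes back NOT accessible for my user, which matches the root:root ownership above.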
Environment
- TPU VM: v4-8 (freshly created Aug 21, 2025)
- OS: Ubuntu 22.04 (default TPU VM image)
- torch & torch-xla: installed using the commands from this reference (https://cloud.google.com/tpu/docs/run-calculation-pytorch)
What I’ve tried
- Updating apt sources
- Installing libtpu-nightly (fails)
- Checking device permissions
- Following official setup docs
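One more observation from the transcript: after the first failed initialization, retrying `xm.xla_device()` in the same process hard-aborts (`InitializeComputationClient() can only be called once`), so each attempt needs a fresh interpreter. I have been probing via subprocesses instead; a minimal sketch (the probe body here is a placeholder `print`, not an actual TPU call, and `probe_in_fresh_interpreter` is my own helper name):

```python
import os
import subprocess
import sys


def probe_in_fresh_interpreter(code, env_overrides):
    """Run `code` in a brand-new Python process with extra env vars set.

    A new interpreter per attempt avoids the in-process
    'InitializeComputationClient() can only be called once' abort seen
    after a failed TPU initialization.
    """
    env = dict(os.environ, **env_overrides)
    return subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True, env=env)


if __name__ == "__main__":
    # Placeholder probe; on a working TPU VM this would import torch_xla
    # and list devices instead.
    result = probe_in_fresh_interpreter(
        "import os; print(os.environ.get('PJRT_DEVICE'))",
        {"PJRT_DEVICE": "TPU"},
    )
    print(result.stdout.strip())
```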
Request
Please help me get torch-xla to recognize the TPU on my Google Cloud TPU VM.