ROCclr segfault when running Julia with threads #770

@leios

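Even a trivial AMDGPU call aborts when Julia is started with more than one thread. Without -t, AMDGPU.zeros(10) works; with julia -t 2, the same call fails an assertion in ROCclr (os_posix.cpp:321, amd::Os::currentStackInfo) and the process is killed with signal 6: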

[leios@noema Fable.jl]$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.3 (2025-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU

julia> AMDGPU.zeros(10)
10-element ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

julia> 
[leios@noema Fable.jl]$ julia -t 2
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.3 (2025-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU

julia> AMDGPU.zeros(10)
10-element ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}:
julia: /usr/src/debug/hip-runtime/clr-rocm-6.2.4/rocclr/os/os_posix.cpp:321: static void amd::Os::currentStackInfo(unsigned char**, size_t*): Assertion `Os::currentStackPtr() >= *base - *size && Os::currentStackPtr() < *base && "just checking"' failed.

[7772] signal 6 (-6): Aborted
in expression starting at none:0
unknown function (ip: 0x7087c8970624)
gsignal at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x7087c88fe4ea)
unknown function (ip: 0x70874990c108)
unknown function (ip: 0x708749918dc7)
unknown function (ip: 0x708749706285)
macro expansion at /home/leios/.julia/packages/GPUToolbox/cZlg7/src/ccalls.jl:143 [inlined]
macro expansion at /home/leios/.julia/packages/AMDGPU/STpZC/src/utils.jl:122 [inlined]
hipGetDeviceCount at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/libhip.jl:42 [inlined]
ndevices at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/device.jl:103
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11 [inlined]
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11
TaskLocalState at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:11 [inlined]
#25 at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:27 [inlined]
get! at ./iddict.jl:171
task_local_state! at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:26
prepare_state at /home/leios/.julia/packages/AMDGPU/STpZC/src/tls.jl:193 [inlined]
hipStreamQuery at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/libhip.jl:113 [inlined]
#11 at /home/leios/.julia/packages/AMDGPU/STpZC/src/hip/stream.jl:114
unknown function (ip: 0x7087bc5f29ff)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
start_task at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/task.c:1202
Allocations: 24176043 (Pool: 24175407; Big: 636); GC: 17
Aborted (core dumped)
