TPU ifrt nondeterministically segfaults on initialization #1497

@wsmoses

@avik-pal

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1753671964.127475   48996 pjrt_client.cc:531] PjRt-IFRT device count: total=12, addressable=12
I0000 00:00:1753671964.127548   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=0)
I0000 00:00:1753671964.127562   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=1)
I0000 00:00:1753671964.127565   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=2)
I0000 00:00:1753671964.127568   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=3)
I0000 00:00:1753671964.127571   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=4)
I0000 00:00:1753671964.127574   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=5)
I0000 00:00:1753671964.127576   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=6)
I0000 00:00:1753671964.127579   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=7)
I0000 00:00:1753671964.127582   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=8)
I0000 00:00:1753671964.127585   48996 pjrt_client.cc:535] Addressable PjRt-IFRT device: CpuDevice(id=9)
I0000 00:00:1753671964.127588   48996 pjrt_client.cc:538] ... (omitted) ...
2025-07-28 03:06:04.191607: I external/xla/xla/pjrt/pjrt_api.cc:115] GetPjrtApi was found for tpu at /root/.julia/scratchspaces/3c362404-f566-11ee-1572-e11a4b42c853/libtpu/libtpu.so
2025-07-28 03:06:04.191641: I external/xla/xla/pjrt/pjrt_api.cc:93] PJRT_Api is set for device type tpu
2025-07-28 03:06:04.191651: I external/xla/xla/pjrt/pjrt_api.cc:161] The PJRT plugin has PJRT API version 0.73. The framework PJRT API version is 0.73.
2025-07-28 03:06:10.024656: I external/xla/xla/pjrt/pjrt_c_api_client.cc:131] PjRtCApiClient created.
[48996] signal 11 (1): Segmentation fault
in expression starting at /__w/Reactant.jl/Reactant.jl/test/layout.jl:4
unknown function (ip: 0x7e378bdc32cd)
_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPcEEvT_S7_St20forward_iterator_tag.isra.0 at /root/.julia/artifacts/9f4b9d13eebd000fb6bd684723fa83f9bb0252e9/lib/libReactantExtra.so (unknown line)
ifrt_pjrt_make_client_with_default_kv_store at /root/.julia/artifacts/9f4b9d13eebd000fb6bd684723fa83f9bb0252e9/lib/libReactantExtra.so (unknown line)
#MakeIFRTPJRTClientViaPluginAPI#15 at /__w/Reactant.jl/Reactant.jl/src/xla/IFRT/Client.jl:237
MakeIFRTPJRTClientViaPluginAPI at /__w/Reactant.jl/Reactant.jl/src/xla/IFRT/Client.jl:222 [inlined]
#MakeIFRTPJRTTPUClient#13 at /__w/Reactant.jl/Reactant.jl/src/xla/IFRT/Client.jl:201 [inlined]
MakeIFRTPJRTTPUClient at /__w/Reactant.jl/Reactant.jl/src/xla/IFRT/Client.jl:195 [inlined]
#TPUClient#9 at /__w/Reactant.jl/Reactant.jl/src/xla/IFRT/Client.jl:130
TPUClient at /__w/Reactant.jl/Reactant.jl/src/xla/IFRT/Client.jl:126
unknown function (ip: 0x7e347ada4ed2)
initialize_default_clients! at /__w/Reactant.jl/Reactant.jl/src/xla/XLA.jl:213
getproperty at /__w/Reactant.jl/Reactant.jl/src/xla/XLA.jl:55 [inlined]
default_backend at /__w/Reactant.jl/Reactant.jl/src/xla/XLA.jl:87 [inlined]
#ConcreteIFRTArray#41 at /__w/Reactant.jl/Reactant.jl/src/Types.jl:338
ConcreteIFRTArray at /__w/Reactant.jl/Reactant.jl/src/Types.jl:331
to_rarray_internal at /__w/Reactant.jl/Reactant.jl/src/Tracing.jl:1904
#to_rarray#150 at /__w/Reactant.jl/Reactant.jl/src/Tracing.jl:1837 [inlined]
to_rarray at /__w/Reactant.jl/Reactant.jl/src/Tracing.jl:1827
unknown function (ip: 0x7e347ad9e952)
jl_apply at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:562
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:539
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:539
jl_interpret_toplevel_thunk at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:824
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:943
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:886
ijl_toplevel_eval_in at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430 [inlined]
include_string at ./loading.jl:2734
_include at ./loading.jl:2794
include at ./Base.jl:562
jfptr_include_46943.1 at /__w/_tool/julia/1.11.6/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
jl_f__call_latest at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/builtins.c:875
include at /root/.julia/packages/SafeTestsets/raUNr/src/SafeTestsets.jl:28
unknown function (ip: 0x7e347ad97e92)
jl_apply at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_stmt_value at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:174 [inlined]
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:666
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:539
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:539
jl_interpret_toplevel_thunk at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:824
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:943
jl_eval_module_expr at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:215 [inlined]
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:743
ijl_toplevel_eval_in at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430
jfptr_eval_28337.1 at /__w/_tool/julia/1.11.6/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_stmt_value at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:174 [inlined]
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:666
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:539
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:539
jl_interpret_toplevel_thunk at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:824
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:943
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:886
ijl_toplevel_eval_in at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430 [inlined]
include_string at ./loading.jl:2734
_include at ./loading.jl:2794
include at ./sysimg.jl:38
unknown function (ip: 0x7e3774f00092)
jl_apply at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
do_call at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:126
eval_value at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:223
eval_stmt_value at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:174 [inlined]
eval_body at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:666
jl_interpret_toplevel_thunk at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/interpreter.c:824
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:943
jl_toplevel_eval_flex at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:886
ijl_toplevel_eval_in at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/toplevel.c:994
eval at ./boot.jl:430 [inlined]
exec_options at ./client.jl:296
_start at ./client.jl:531
jfptr__start_73597.1 at /__w/_tool/julia/1.11.6/x64/lib/julia/sys.so (unknown line)
jl_apply at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
true_main at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/jlapi.c:900
jl_repl_entrypoint at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/src/jlapi.c:1059
main at /cache/build/tester-amdci4-12/julialang/julia-release-1-dot-11/cli/loader_exe.c:58
unknown function (ip: 0x7e378bc4c1c9)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 27192764 (Pool: 27192456; Big: 308); GC: 37
ERROR: LoadError: Package Reactant errored during testing (received signal: 11)
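
Because the segfault is nondeterministic, it can help to measure how often it occurs by running the initialization many times in fresh processes. The following is a minimal sketch, not part of the original report. The `count_crashes` helper and the `REPRO_CMD` command line are hypothetical; the only thing taken from the trace above is that `to_rarray` triggers `initialize_default_clients!`. It assumes a POSIX host, where a child killed by a signal is reported with a negative return code.

```python
import signal
import subprocess

# Hypothetical reproducer command (an assumption, adjust to your setup):
# any Julia snippet that forces client initialization should do.
REPRO_CMD = [
    "julia", "--project", "-e",
    "using Reactant; Reactant.to_rarray(ones(Float32, 4))",
]


def count_crashes(cmd, runs=20, sig=signal.SIGSEGV):
    """Run `cmd` in `runs` fresh processes and count how many died from `sig`."""
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, returncode == -N means the child was terminated by signal N.
        if proc.returncode == -sig:
            crashes += 1
    return crashes
```

On a TPU host, `count_crashes(REPRO_CMD, runs=50)` would give a rough crash rate that can be compared before and after a candidate fix.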
