cudaError_t 209, unable to build for newer versions of cuda #2012

@akalinux

I want to thank you for your work so far. 👍 The Vosk project is amazing and the CPU containers work very well!
I can get by with the CPU-based model, but the GPU model fails when running on current hardware.

My host OS is running CUDA 13.1 (the minimum version for my GPU is 12.8).
The container is currently built from nvidia/cuda:12.1.0-devel-ubuntu22.04.

I have tried running it with the nvidia runtime:

docker run --gpus all --runtime=nvidia -p 2700:2700 alphacep/kaldi-en-gpu

I have tried running it without the nvidia runtime:

docker run --gpus all -p 2700:2700 alphacep/kaldi-en-gpu

I have also tried to rebuild the images by upgrading the base image to CUDA 12.8.0, 12.8.1, and the newer 13.x images, with no luck.
The build-gpu.sh build always fails when constructing the first container with the updated versions of CUDA.

The log from Vosk starting up on my current machine:

WARNING ([5.5.10891-a25f2]:SelectGpuId():cu-device.cc:243) Not in compute-exclusive mode. Suggestion: use 'nvidia-smi -c 3' to set compute exclusive mode
LOG ([5.5.10891-a25f2]:SelectGpuIdAuto():cu-device.cc:438) Selecting from 1 GPUs
LOG ([5.5.10891-a25f2]:SelectGpuIdAuto():cu-device.cc:453) cudaSetDevice(0): NVIDIA GeForce RTX 5090 free:30927M, used:1679M, total:32606M, free/total:0.94849
LOG ([5.5.10891-a25f2]:SelectGpuIdAuto():cu-device.cc:501) Device: 0, mem_ratio: 0.94849
LOG ([5.5.10891-a25f2]:SelectGpuId():cu-device.cc:382) Trying to select device: 0
LOG ([5.5.10891-a25f2]:SelectGpuIdAuto():cu-device.cc:511) Success selecting device 0 free mem ratio: 0.94849
LOG ([5.5.10891-a25f2]:FinalizeActiveGpu():cu-device.cc:338) The active GPU is [0]: NVIDIA GeForce RTX 5090 free:29643M, used:2963M, total:32606M, free/total:0.909111 version 12.0
ERROR ([5.5.10891-a25f2]:AddVecVec():cu-vector.cc:572) cudaError_t 209 : "no kernel image is available for execution on the device" returned from 'cudaGetLastError()'

[ Stack-Trace: ]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(kaldi::MessageLogger::LogMessage() const+0x80e) [0x73dae501a38e]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(+0x742363) [0x73dae4f0a363]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(kaldi::CuVectorBase::AddVecVec(double, kaldi::CuVectorBase const&, kaldi::CuVectorBase const&, double)+0x2ca) [0x73dae4f1903a]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(kaldi::nnet3::BatchNormComponent::Read(std::istream&, bool)+0x1f7) [0x73dae4e73da7]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(kaldi::nnet3::Component::ReadNew(std::istream&, bool)+0xab) [0x73dae4e3b06b]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(kaldi::nnet3::Nnet::Read(std::istream&, bool)+0x50e) [0x73dae4dcfb3e]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(kaldi::nnet3::AmNnetSimple::Read(std::istream&, bool)+0x1a) [0x73dae4dc4c1a]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(BatchModel::BatchModel(char const*)+0x836) [0x73dae4b9bc36]
/usr/local/lib/python3.10/dist-packages/vosk-0.3.45-py3.10.egg/vosk/libvosk.so(vosk_batch_model_new+0x26) [0x73dae4b964f6]
/usr/local/lib/python3.10/dist-packages/_cffi_backend.cpython-310-x86_64-linux-gnu.so(+0x25052) [0x73dae8616052]
/usr/local/lib/python3.10/dist-packages/_cffi_backend.cpython-310-x86_64-linux-gnu.so(+0x2320c) [0x73dae861420c]
/usr/local/lib/python3.10/dist-packages/_cffi_backend.cpython-310-x86_64-linux-gnu.so(+0x210ff) [0x73dae86120ff]
python3(_PyObject_MakeTpCall+0x25b) [0x5d1cb8f814ab]
python3(_PyEval_EvalFrameDefault+0x6db6) [0x5d1cb8f79e66]
python3(_PyObject_FastCallDictTstate+0xc4) [0x5d1cb8f80634]
python3(+0x166e94) [0x5d1cb8f94e94]
python3(_PyObject_MakeTpCall+0x1fc) [0x5d1cb8f8144c]
python3(_PyEval_EvalFrameDefault+0x6765) [0x5d1cb8f79815]
python3(+0x17a150) [0x5d1cb8fa8150]
/usr/lib/python3.10/lib-dynload/_asyncio.cpython-310-x86_64-linux-gnu.so(+0x962e) [0x73dae94b162e]
/usr/lib/python3.10/lib-dynload/_asyncio.cpython-310-x86_64-linux-gnu.so(+0x9444) [0x73dae94b1444]
python3(_PyObject_MakeTpCall+0x25b) [0x5d1cb8f814ab]
python3(+0x2eaab2) [0x5d1cb9118ab2]
python3(+0x1503bb) [0x5d1cb8f7e3bb]
python3(_PyEval_EvalFrameDefault+0x2a40) [0x5d1cb8f75af0]
python3(_PyFunction_Vectorcall+0x7c) [0x5d1cb8f8b1ec]
python3(_PyEval_EvalFrameDefault+0x81b) [0x5d1cb8f738cb]
python3(_PyFunction_Vectorcall+0x7c) [0x5d1cb8f8b1ec]
python3(_PyEval_EvalFrameDefault+0x81b) [0x5d1cb8f738cb]
python3(_PyFunction_Vectorcall+0x7c) [0x5d1cb8f8b1ec]
python3(_PyEval_EvalFrameDefault+0x81b) [0x5d1cb8f738cb]
python3(_PyFunction_Vectorcall+0x7c) [0x5d1cb8f8b1ec]
python3(_PyEval_EvalFrameDefault+0x81b) [0x5d1cb8f738cb]
python3(_PyFunction_Vectorcall+0x7c) [0x5d1cb8f8b1ec]
python3(_PyEval_EvalFrameDefault+0x63b2) [0x5d1cb8f79462]
python3(+0x141ed6) [0x5d1cb8f6fed6]
python3(PyEval_EvalCode+0x86) [0x5d1cb9066366]
python3(+0x265108) [0x5d1cb9093108]
python3(+0x25df5b) [0x5d1cb908bf5b]
python3(+0x264e55) [0x5d1cb9092e55]
python3(_PyRun_SimpleFileObject+0x1a8) [0x5d1cb9092338]
python3(_PyRun_AnyFileObject+0x43) [0x5d1cb9092033]
python3(Py_RunMain+0x2be) [0x5d1cb90832de]
python3(Py_BytesMain+0x2d) [0x5d1cb905932d]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x73dae97b1d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x73dae97b1e40]
python3(_start+0x25) [0x5d1cb9059225]

terminate called after throwing an instance of 'kaldi::KaldiFatalError'
what(): kaldi::KaldiFatalError
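
The mismatch suggested by this log can be checked mechanically. The sketch below is only an illustration and is not part of Vosk or Kaldi. It assumes that the trailing "version 12.0" on the FinalizeActiveGpu line is the GPU's compute capability, and it takes the two toolkit versions from this report (12.1 for the container base image, 12.8 as the minimum for this GPU). The helper names compute_capability, sm_arch, and toolkit_can_target are hypothetical.

```python
import re

# Log line copied from the startup output above.
LOG_LINE = (
    "LOG ([5.5.10891-a25f2]:FinalizeActiveGpu():cu-device.cc:338) "
    "The active GPU is [0]: NVIDIA GeForce RTX 5090 free:29643M, used:2963M, "
    "total:32606M, free/total:0.909111 version 12.0"
)

# Assumption: the trailing "version X.Y" is the device compute capability.
_VERSION_RE = re.compile(r"version (\d+)\.(\d+)\s*$")


def compute_capability(log_line):
    """Return (major, minor) parsed from the log line, or None if absent."""
    match = _VERSION_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def sm_arch(capability):
    """Format a (major, minor) capability as an nvcc-style arch name."""
    major, minor = capability
    return f"sm_{major}{minor}"


def toolkit_can_target(toolkit, minimum_toolkit):
    """True if the build toolkit is at least the minimum the GPU needs."""
    return toolkit >= minimum_toolkit


capability = compute_capability(LOG_LINE)
container_toolkit = (12, 1)   # nvidia/cuda:12.1.0-devel-ubuntu22.04
minimum_for_gpu = (12, 8)     # minimum stated in this report

print("GPU architecture:", sm_arch(capability))
print("Container toolkit can target it:",
      toolkit_can_target(container_toolkit, minimum_for_gpu))
```

With the values from this report the toolkit check comes out false. That would be consistent with error 209 if libvosk.so in the image contains no code compiled for this architecture, though the sketch does not inspect the library itself.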
