Labels: bug (Something isn't working), onnx (Related to ONNX or ONNXRuntime)
Description
Hi Triton team,
I am deploying the NVIDIA NeMo Titanet encoder model (speaker diarization) using Triton Inference Server with the ONNX Runtime backend. My goal is to support multiple concurrent clients, so I enabled dynamic batching.
However, when dynamic batching is enabled, inference fails with a cuDNN error. The same model works correctly when dynamic batching is disabled (single request per instance).
Environment:
- Triton Inference Server: 2.42.0
- Backend: onnxruntime_onnx (CUDA)
- GPU: NVIDIA GPU (single GPU setup)
- Model: NeMo Titanet encoder (ONNX)
- CUDA / cuDNN: Default versions from Triton 2.42.0 container
Triton Model Configuration:
name: "titanet_encoder"
platform: "onnxruntime_onnx"
max_batch_size: 32
input [
  {
    name: "features"
    data_type: TYPE_FP32
    dims: [80, -1]
  },
  {
    name: "length"
    data_type: TYPE_INT64
    dims: [1]
  }
]
output [
  {
    name: "embeddings"
    data_type: TYPE_FP32
    dims: [6144]
  }
]
dynamic_batching {
  preferred_batch_size: [4, 8, 16]
  max_queue_delay_microseconds: 2000
}
instance_group [
  {
    kind: KIND_GPU
    count: 1
  }
]
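To illustrate why batching fails, here is a NumPy-only sketch of what two requests against this config look like. The `dims: [80, -1]` entry means each request's "features" tensor is `[batch, 80, T]` with `T` varying per utterance, and the dynamic batcher concatenates requests along the batch axis, which requires every non-batch dimension to match (the shapes here are per the config above; the 120/95 frame counts are made-up examples):

```python
import numpy as np

# Two hypothetical requests with different utterance lengths T.
# Per the model config, "features" is [batch, 80, T] FP32 and
# "length" is [batch, 1] INT64.
req_a_features = np.random.randn(1, 80, 120).astype(np.float32)
req_a_length = np.array([[120]], dtype=np.int64)

req_b_features = np.random.randn(1, 80, 95).astype(np.float32)
req_b_length = np.array([[95]], dtype=np.int64)

# Concatenating along the batch axis (what the dynamic batcher
# effectively does) fails because T differs (120 vs 95).
try:
    np.concatenate([req_a_features, req_b_features], axis=0)
except ValueError as e:
    print("cannot batch:", e)
```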
Error Observed
When multiple requests with different audio lengths are dynamically batched, inference fails with:
tritonclient.utils.InferenceServerException: [500] onnx runtime error 1:
Non-zero status code returned while running FusedConv node.
Name:'/encoder/encoder/encoder.1/res.0.0/conv/Conv'
Status Message: CUDNN failure 3: CUDNN_STATUS_BAD_PARAM
file=onnxruntime/contrib_ops/cuda/fused_conv.cc
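One common client-side workaround (my assumption about a fix, not something Triton does automatically) is to pad every request's features to a fixed maximum length before sending, so all batched requests share the same shape and the real frame count travels in the "length" input. A rough sketch, where `MAX_T` is a hypothetical cap you would tune to your audio lengths:

```python
import numpy as np

MAX_T = 1000  # hypothetical maximum frame count, tune for your audio


def pad_features(features: np.ndarray, max_t: int = MAX_T):
    """Right-pad [80, T] FP32 features with zeros to [80, max_t].

    Returns the padded features plus the original length, which the
    model can use to mask out the padding.
    """
    t = features.shape[1]
    if t > max_t:
        raise ValueError(f"utterance has {t} frames, exceeds max_t={max_t}")
    padded = np.zeros((features.shape[0], max_t), dtype=np.float32)
    padded[:, :t] = features
    length = np.array([t], dtype=np.int64)
    return padded, length


# Example: a 120-frame utterance padded to MAX_T frames.
feats = np.random.randn(80, 120).astype(np.float32)
padded, length = pad_features(feats)
print(padded.shape, length)  # (80, 1000) [120]
```

This assumes the Titanet encoder actually honors the "length" input when pooling over frames; if it does not, padding would change the embeddings, and a fixed-shape export or disabling dynamic batching would be safer.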