[Bug] CUDA kernel not found in registries for Op type: MMCVModulatedDeformConv2d #2892

Open
@chetir

Description

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. I have read the FAQ documentation but cannot get the expected help.
  • 3. The bug has not been fixed in the latest version.

Describe the bug

I use mmcv.ops in my model. When run in PyTorch, everything works fine. But after exporting to ONNX, inference is extremely slow, and I found that the DCN module may be running on the CPU. How can I run the DCN module with the ONNX Runtime CUDA execution provider?

  • torch exec time: 0.5s
  • onnx exec time: 16s

Reproduction

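Export script example:

import os

import onnxruntime as ort
import torch

# `device` was not defined in the original snippet; assume the GPU when available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Register the MMDeploy custom-op library with ONNX Runtime.
ort_custom_op_path = (
    "mmdeploy/mmdeploy/lib/libmmdeploy_onnxruntime_ops.so"
)
assert os.path.exists(ort_custom_op_path)
session_options = ort.SessionOptions()
session_options.register_custom_ops_library(ort_custom_op_path)


def export_onnx(model, model_path):
    # Dummy input used to trace the model during export.
    input_0 = torch.zeros(1, 3, 1024, 1024).to(device)

    filename = model_path.replace("pth", "onnx")
    torch.onnx.export(
        model=model,
        args=(input_0,),
        f=filename,
        input_names=["input0"],
        output_names=["output0"],
        opset_version=11,
    )
    print("Finished onnx export")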
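
The script above registers the custom-op library on `session_options` but does not show the inference session being created. A minimal sketch of that step might look like the following. The helper names `pick_providers` and `create_session` are hypothetical and not part of the original script, and the provider order is an assumption about the intended setup.

```python
def pick_providers(available):
    """Order execution providers so CUDA is tried first and CPU last."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


def create_session(onnx_path, custom_op_lib):
    """Create an ONNX Runtime session with the custom-op library registered."""
    # Imported lazily so pick_providers() can be used without onnxruntime installed.
    import onnxruntime as ort

    options = ort.SessionOptions()
    options.register_custom_ops_library(custom_op_lib)
    providers = pick_providers(ort.get_available_providers())
    session = ort.InferenceSession(
        onnx_path, sess_options=options, providers=providers
    )
    # get_providers() lists the providers the session was created with.
    print("Session providers:", session.get_providers())
    return session
```

Listing `CUDAExecutionProvider` first only sets the priority. Individual nodes can still be assigned to the CPU provider when no CUDA kernel is registered for their op type.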

Environment

04/30 18:11:42 - mmengine - INFO -

04/30 18:11:42 - mmengine - INFO - **********Environmental information**********
04/30 18:11:44 - mmengine - INFO - sys.platform: linux
04/30 18:11:44 - mmengine - INFO - Python: 3.7.7 (default, Mar 26 2020, 15:48:22) [GCC 7.3.0]
04/30 18:11:44 - mmengine - INFO - CUDA available: True
04/30 18:11:44 - mmengine - INFO - MUSA available: False
04/30 18:11:44 - mmengine - INFO - numpy_random_seed: 2147483648
04/30 18:11:44 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 3090
04/30 18:11:44 - mmengine - INFO - CUDA_HOME: /data/ao.xu/miniconda3
04/30 18:11:44 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.122
04/30 18:11:44 - mmengine - INFO - GCC: x86_64-conda_cos6-linux-gnu-gcc (crosstool-NG 1.23.0.449-a04d0) 7.3.0
04/30 18:11:44 - mmengine - INFO - PyTorch: 1.13.1+cu117
04/30 18:11:44 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.5
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

04/30 18:11:44 - mmengine - INFO - TorchVision: 0.11.0
04/30 18:11:44 - mmengine - INFO - OpenCV: 4.11.0
04/30 18:11:44 - mmengine - INFO - MMEngine: 0.10.7
04/30 18:11:44 - mmengine - INFO - MMCV: 2.2.0
04/30 18:11:44 - mmengine - INFO - MMCV Compiler: GCC 9.3
04/30 18:11:44 - mmengine - INFO - MMCV CUDA Compiler: 11.7
04/30 18:11:44 - mmengine - INFO - MMDeploy: 1.3.1+3f8604b
04/30 18:11:44 - mmengine - INFO -

04/30 18:11:44 - mmengine - INFO - **********Backend information**********
04/30 18:11:44 - mmengine - INFO - tensorrt:    None
04/30 18:11:44 - mmengine - INFO - ONNXRuntime: None
04/30 18:11:44 - mmengine - INFO - ONNXRuntime-gpu:     1.14.1
04/30 18:11:44 - mmengine - INFO - ONNXRuntime custom ops:      Available
04/30 18:11:44 - mmengine - INFO - pplnn:       None
04/30 18:11:44 - mmengine - INFO - ncnn:        None
04/30 18:11:44 - mmengine - INFO - snpe:        None
04/30 18:11:44 - mmengine - INFO - openvino:    None
04/30 18:11:44 - mmengine - INFO - torchscript: 1.13.1
04/30 18:11:44 - mmengine - INFO - torchscript custom ops:      NotAvailable
04/30 18:11:44 - mmengine - INFO - rknn-toolkit:        None
04/30 18:11:44 - mmengine - INFO - rknn-toolkit2:       None
04/30 18:11:44 - mmengine - INFO - ascend:      None
04/30 18:11:44 - mmengine - INFO - coreml:      None
04/30 18:11:44 - mmengine - INFO - tvm: None
04/30 18:11:44 - mmengine - INFO - vacc:        None
04/30 18:11:44 - mmengine - INFO -

04/30 18:11:44 - mmengine - INFO - **********Codebase information**********
04/30 18:11:44 - mmengine - INFO - mmdet:       None
04/30 18:11:44 - mmengine - INFO - mmseg:       None
04/30 18:11:44 - mmengine - INFO - mmpretrain:  None
04/30 18:11:44 - mmengine - INFO - mmocr:       None
04/30 18:11:44 - mmengine - INFO - mmagic:      None
04/30 18:11:44 - mmengine - INFO - mmdet3d:     None
04/30 18:11:44 - mmengine - INFO - mmpose:      None
04/30 18:11:44 - mmengine - INFO - mmrotate:    None
04/30 18:11:44 - mmengine - INFO - mmaction:    None
04/30 18:11:44 - mmengine - INFO - mmrazor:     None
04/30 18:11:44 - mmengine - INFO - mmyolo:      None

Error traceback

2025-04-30 18:01:45.270207582 [I:onnxruntime:Default, cuda_execution_provider.cc:2382 GetCapability] CUDA kernel not found in registries for Op type: MMCVModulatedDeformConv2d node name: /dla_up/ida_0/proj_1/conv/MMCVModulatedDeformConv2d
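
To see how many nodes are affected, messages like the one above can be collected and grouped by op type. The following is a rough sketch that assumes the log text has already been captured into a string and that every message follows the same wording as the line above. The function name `ops_without_cuda_kernel` is hypothetical.

```python
import re
from collections import Counter

# Matches the message wording seen in the traceback above.
_PATTERN = re.compile(
    r"CUDA kernel not found in registries for Op type: (?P<op>\S+)"
    r" node name: (?P<node>\S+)"
)


def ops_without_cuda_kernel(log_text):
    """Count, per op type, the nodes the CUDA provider reported no kernel for."""
    counts = Counter()
    for match in _PATTERN.finditer(log_text):
        counts[match.group("op")] += 1
    return dict(counts)


sample_log = (
    "2025-04-30 18:01:45.270207582 [I:onnxruntime:Default, "
    "cuda_execution_provider.cc:2382 GetCapability] CUDA kernel not found in "
    "registries for Op type: MMCVModulatedDeformConv2d node name: "
    "/dla_up/ida_0/proj_1/conv/MMCVModulatedDeformConv2d"
)
print(ops_without_cuda_kernel(sample_log))
```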
