@baijumeswani
Collaborator

@baijumeswani baijumeswani commented Oct 30, 2025

Until now, for the CUDA execution provider (EP), onnxruntime-genai used the built-in CUDA EP through the legacy OrtCUDAProviderOptionsV2 struct and the AppendExecutionProvider_CUDA_V2 API. These are designed for the built-in CUDA execution provider and are not compatible with plug-in EPs.

This pull request extends support to running the model with a pre-registered (plug-in) CUDAExecutionProvider.

To use the plug-in capabilities, the application layer needs to register the provider library before creating the model:

  • C++

    #include "onnxruntime_cxx_api.h"
    #include "ort_genai.h"
    
    auto env = Ort::Env();
    env.RegisterExecutionProviderLibrary("CUDAExecutionProvider", "path\\to\\onnxruntime_providers_cuda.dll");
    
    auto model = OgaModel::Create("path\\to\\model\\directory");
    ...
  • Python

    import onnxruntime_genai as og
    
    og.register_execution_provider_library("CUDAExecutionProvider", r"path\to\onnxruntime_providers_cuda.dll")
    
    model = og.Model(r"path\to\model\directory")
    ...
  • C#

    using Microsoft.ML.OnnxRuntime;
    using Microsoft.ML.OnnxRuntimeGenAI;
    
    var ortEnv = OrtEnv.Instance();
    ortEnv.RegisterExecutionProviderLibrary("CUDAExecutionProvider", @"path\to\onnxruntime_providers_cuda.dll");
    
    using Model model = new Model(@"path\to\model\directory");
    ...
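Backslashes in ordinary string literals start escape sequences, and the provider library's file name differs between platforms, so an application may prefer to build the path programmatically. The following is a minimal sketch, not the project's own code: the helper names `cuda_provider_library_path` and `load_model_with_plugin_cuda` and their directory arguments are hypothetical, and the non-Windows file name `libonnxruntime_providers_cuda.so` is an assumption based on the usual Linux naming. Only `og.register_execution_provider_library` and `og.Model` are taken from the description above.

```python
import sys
from pathlib import Path


def cuda_provider_library_path(ort_lib_dir, platform=sys.platform):
    """Return the expected path of the CUDA provider library inside ort_lib_dir.

    Hypothetical helper: it only builds a path and does not check that the file exists.
    """
    if platform.startswith("win"):
        name = "onnxruntime_providers_cuda.dll"
    else:
        # Assumption: non-Windows builds use the Linux-style shared library name.
        name = "libonnxruntime_providers_cuda.so"
    return Path(ort_lib_dir) / name


def load_model_with_plugin_cuda(ort_lib_dir, model_dir):
    """Register the CUDA provider library first, then create the model."""
    # Imported lazily so the path helper above can be used without the package.
    import onnxruntime_genai as og

    og.register_execution_provider_library(
        "CUDAExecutionProvider", str(cuda_provider_library_path(ort_lib_dir))
    )
    return og.Model(str(model_dir))
```

Using `pathlib` avoids hand-written separators entirely, and passing `platform` explicitly makes the name selection easy to check on any machine.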

@baijumeswani baijumeswani marked this pull request as ready for review October 30, 2025 17:36
@kunal-vaishnavi
Contributor

Can we provide a direct API from ORT GenAI for registering a provider library for non-Python language bindings?

@baijumeswani
Collaborator Author

> Can we provide a direct API from ORT GenAI for registering a provider library for non-Python language bindings?

We do have a C API:

/**
* \brief Registers an execution provider library with ONNX Runtime.
* \param registration_name name under which the library is registered.
* \param library_path path to the execution provider library.
*
*/
OGA_EXPORT void OGA_API_CALL OgaRegisterExecutionProviderLibrary(const char* registration_name, const char* library_path);
/**
* \brief Unregisters an execution provider library from ONNX Runtime.
* \param registration_name name under which the library was registered.
*
*/
OGA_EXPORT void OGA_API_CALL OgaUnregisterExecutionProviderLibrary(const char* registration_name);

But people are probably more used to registering through the onnxruntime env, since those APIs offer more control. An onnxruntime-genai API for library registration is needed for Python in particular, because the onnxruntime Python library is not the one loaded by onnxruntime-genai's Python package.
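The register/unregister pair in the C API suggests scoping a registration to a block of code. A minimal sketch of that pattern as a context manager follows; the name `registered_provider_library` is hypothetical, and the two callables are passed in rather than hard-coded because this thread only shows `og.register_execution_provider_library` on the Python side and does not say which binding exposes the unregister counterpart.

```python
from contextlib import contextmanager


@contextmanager
def registered_provider_library(register, unregister, registration_name, library_path):
    """Register a provider library on entry and unregister it on exit.

    `register(name, path)` and `unregister(name)` are supplied by the caller,
    for example thin wrappers around whichever binding is in use.
    """
    register(registration_name, library_path)
    try:
        yield registration_name
    finally:
        # Runs even if the body raises, so the registration is not leaked.
        unregister(registration_name)
```

Presumably any model that uses the provider should be released before the block exits, since the library is unregistered at that point.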

@baijumeswani baijumeswani merged commit 6903a36 into main Oct 31, 2025
15 checks passed
@baijumeswani baijumeswani deleted the baijumeswani/use-session-options-v2 branch October 31, 2025 16:15