Move Cuda usage into a shared library similar to onnxruntime (#973)
This separates out the CUDA code into a /src/cuda folder and builds
onnxruntime_genai_cuda.dll/.so.
Only this shared library depends on the CUDA runtimes.
An interface inside the core genai code routes functions to this CUDA
library. My next step is to refactor the code to separate out the
provider-specific parts, so this interface will shrink.
There are new types to manage provider-specific memory, DeviceMemory<T>
and DeviceMemorySpan<T>. They will replace RoamingArray once the
refactoring is complete.
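
The routing described above can be pictured with a minimal sketch. Every name in it (DeviceInterface, FakeCudaInterface, GetDeviceInterface, MakeFilled) is hypothetical and chosen only for illustration; none of them is taken from the actual code in this commit. The sketch assumes the core code talks to the provider through an abstract interface, and it uses an in-process stand-in where a real build would load the shared library at runtime.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface that the core code would see. It has no CUDA
// dependencies, so the core library does not link against the CUDA runtimes.
struct DeviceInterface {
  virtual ~DeviceInterface() = default;
  virtual std::string Name() const = 0;
  // Fill `count` floats starting at `data` with `value`.
  virtual void Fill(float* data, std::size_t count, float value) = 0;
};

// Stand-in for the implementation that would live in the separate shared
// library. A real provider would launch a kernel; this one loops on the host.
struct FakeCudaInterface final : DeviceInterface {
  std::string Name() const override { return "cuda"; }
  void Fill(float* data, std::size_t count, float value) override {
    for (std::size_t i = 0; i < count; ++i) data[i] = value;
  }
};

// Assumption: a real build would load the shared library here and resolve an
// exported factory function. This sketch returns the stand-in directly.
inline DeviceInterface& GetDeviceInterface() {
  static FakeCudaInterface instance;
  return instance;
}

// Core-side helper: all provider work is routed through the interface.
inline std::vector<float> MakeFilled(std::size_t count, float value) {
  std::vector<float> buffer(count);
  GetDeviceInterface().Fill(buffer.data(), count, value);
  return buffer;
}
```

With this shape, shrinking the interface means removing virtual functions from the core-facing struct as provider-specific code moves behind it.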