Can we use a custom kernel with atomics for ∇getindex!(dx::AbstractGPUArray, dy, inds...)
instead of copying everything to the CPU?
This way we'd be able to avoid synchronizations, and we could add such a kernel via an extension.
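
To make the idea concrete, here is a rough sketch of what such a kernel might look like. It assumes KernelAbstractions.jl and Atomix.jl are acceptable extension dependencies, and it only covers the simplest case of a single vector of integer indices. The names `scatter_add_kernel!` and `scatter_add!` are hypothetical and not part of any existing API.

```julia
using KernelAbstractions
using Atomix

# Hypothetical kernel: each work-item adds one element of dy into dx.
# The atomic update keeps repeated indices in idx from racing.
@kernel function scatter_add_kernel!(dx, @Const(dy), @Const(idx))
    i = @index(Global, Linear)
    Atomix.@atomic dx[idx[i]] += dy[i]
end

# Hypothetical wrapper: launches on whatever backend dx lives on,
# so nothing is copied to the CPU.
function scatter_add!(dx, dy, idx)
    backend = get_backend(dx)
    scatter_add_kernel!(backend)(dx, dy, idx; ndrange = length(dy))
    return dx
end
```

The launch is left asynchronous on purpose, so the caller would only call `KernelAbstractions.synchronize(get_backend(dx))` when the result is actually needed. General `inds...` (ranges, colons, multiple index arrays) would need more work than this sketch shows.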