Description
The scenario:
Five threads concurrently call clfftBakePlan with identically configured FFT handles.
Immediate symptoms:
The assert (NULL == p); in repo.cpp, line 218 (commit c59712e), triggers.
Then occasionally a crash with a nullptr later on.
The cause:
The function FFTAction::compileKernels will compile kernels, but only if they are not cached already. The problem is that the query of the cache is not protected with a mutex (line 713 in c59712e):
if( fftRepo.getclProgram( this->getGenerator(), this->getSignatureData(), program, q_device, fftPlan->context ) == CLFFT_INVALID_PROGRAM )
- Five threads concurrently try to compileKernels for the first time.
- All threads query the fftRepo at the same time.
- All threads get a CLFFT_INVALID_PROGRAM return code.
- Consequently, all five threads assume that the kernel has not been cached, and each compiles the kernel.
- All threads call fftRepo.setclProgram with the same parameters.
The first call sets the program; each subsequent call triggers the assert.
The fix:
Any query to the cache followed by a set to a cache must be an atomic operation. Here a scopedLock would do the trick.
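To illustrate what I mean by making the query and the set one atomic operation, here is a minimal, self-contained sketch. It is not clFFT code: ProgramCache, getOrCompile and countCompiles are hypothetical names, an int stands in for the cl_program, and std::lock_guard stands in for the scopedLock. The actual fix would have to be expressed in terms of FFTRepo and the existing lock classes.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the program cache; NOT clFFT's FFTRepo.
class ProgramCache {
public:
    // Looks up `key`; if it is absent, runs `compile` and stores the result.
    // The lookup and the store happen under a single lock, so at most one
    // thread ever compiles and inserts a given key.
    int getOrCompile(const std::string& key, const std::function<int()>& compile) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = programs_.find(key);
        if (it != programs_.end()) {
            return it->second;               // cache hit
        }
        int program = compile();             // cache miss: compile while holding the lock
        bool inserted = programs_.emplace(key, program).second;
        assert(inserted);                    // analogous to the assert that fires in repo.cpp
        (void)inserted;                      // avoid an unused-variable warning with NDEBUG
        return program;
    }

private:
    std::mutex mutex_;
    std::map<std::string, int> programs_;
};

// Starts `numThreads` threads that all request the same key and returns
// how many times the compile callback actually ran.
int countCompiles(int numThreads) {
    ProgramCache cache;
    std::atomic<int> compiles{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back([&] {
            cache.getOrCompile("same-signature", [&] { return ++compiles; });
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return compiles.load();
}
```

With five threads, countCompiles(5) runs the compile callback once, because the four later threads find the entry already cached. The trade-off in this sketch is that compilation happens while the lock is held, so compiles for different signatures are serialized as well; whether that is acceptable for clFFT is an open question.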
I could prepare a PR, but can only take the time to do so if the PR has a chance of being merged into the code. Is this repository still being maintained? Also, I'd like the fix to be integrated with vcpkg.