clfftBakePlan is not thread-safe #250

@hajokirchhoff

The scenario:
Five threads concurrently call clfftBakePlan with identically configured FFT plan handles.

Immediate symptoms:
The assert(NULL == p) at line 218 of repo.cpp fires:

assert (NULL == p);

Occasionally, a crash involving a nullptr follows later on.

The cause:
The function FFTAction::compileKernels compiles kernels, but only if they are not already cached. The problem is that the cache query and the subsequent store are not protected by a common mutex, so the check-then-set sequence is not atomic.

if( fftRepo.getclProgram( this->getGenerator(), this->getSignatureData(), program, q_device, fftPlan->context ) == CLFFT_INVALID_PROGRAM )

  • Five threads concurrently call compileKernels for the first time.
  • All threads query the fftRepo at the same time.
  • All threads get a CLFFT_INVALID_PROGRAM return code.
  • Consequently, all five threads assume that the kernel has not been cached, and each compiles the kernel.
  • All threads then call fftRepo.setclProgram with the same parameters.

The first call sets the program; each subsequent call triggers the assert.
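The interleaving described above can be sketched in isolation. This is a minimal sketch and not clFFT code: ToyRepo, Rendezvous and countDuplicateSets are hypothetical names, an int stands in for the compiled program, and a counter replaces the assert so the effect can be observed. Note that locking each call individually does not help, because the race sits between the query and the store.

```cpp
#include <cassert>
#include <condition_variable>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the program repository (NOT the real fftRepo).
// Each call takes the lock, but get-then-set as a whole is not atomic.
class ToyRepo {
public:
    bool get(const std::string& key, int& program) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = cache_.find(key);
        if (it == cache_.end()) return false;  // plays the role of CLFFT_INVALID_PROGRAM
        program = it->second;
        return true;
    }
    void set(const std::string& key, int program) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (cache_.count(key) != 0) {          // the real code asserts at this point
            ++duplicateSets_;
            return;
        }
        cache_[key] = program;
    }
    int duplicateSets() {
        std::lock_guard<std::mutex> lock(mutex_);
        return duplicateSets_;
    }
private:
    std::mutex mutex_;
    std::map<std::string, int> cache_;
    int duplicateSets_ = 0;
};

// One-shot rendezvous: blocks every caller until `count` threads have arrived.
class Rendezvous {
public:
    explicit Rendezvous(int count) : remaining_(count) {}
    void arriveAndWait() {
        std::unique_lock<std::mutex> lock(mutex_);
        if (--remaining_ == 0) cv_.notify_all();
        else cv_.wait(lock, [this] { return remaining_ == 0; });
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int remaining_;
};

// Runs the unsynchronized check-then-set pattern on `threadCount` threads and
// returns how many stores hit an already populated slot.
int countDuplicateSets(int threadCount) {
    assert(threadCount > 0);
    ToyRepo repo;
    Rendezvous allQueried(threadCount);
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([&repo, &allQueried] {
            int program = 0;
            bool cached = repo.get("same-signature", program);
            allQueried.arriveAndWait();        // force the worst-case interleaving
            if (!cached) repo.set("same-signature", 42);  // "compile", then store
        });
    }
    for (auto& t : threads) t.join();
    return repo.duplicateSets();
}
```

Because the rendezvous forces every thread to finish its query before any thread stores, countDuplicateSets(5) reports four redundant stores, matching the one successful setclProgram call and four assert hits described above. In the real library the interleaving depends on timing, which would explain why the failure is intermittent.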

The fix:
Any cache query followed by a cache set must be a single atomic operation. Holding a scopedLock across both calls would do the trick.
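A minimal sketch of what such an atomic query-and-set could look like, under stated assumptions: ToyProgramCache, getOrCompile and countCompiles are hypothetical names, std::lock_guard stands in for clFFT's scopedLock, and an int stands in for the compiled program. It is not a patch against the actual repository code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical cache whose lookup and insert happen under ONE lock.
class ToyProgramCache {
public:
    // Returns the cached program for `key`, compiling it first if it is missing.
    // The lock is held across lookup, compile and insert, so only one caller
    // ever compiles a given key.
    int getOrCompile(const std::string& key, const std::function<int()>& compile) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        int program = compile();
        cache_.emplace(key, program);
        return program;
    }
private:
    std::mutex mutex_;
    std::map<std::string, int> cache_;
};

// Runs getOrCompile on `threadCount` threads with the same key and returns
// how many times the compile callback actually ran.
int countCompiles(int threadCount) {
    assert(threadCount > 0);
    ToyProgramCache cache;
    int compiles = 0;  // only modified inside compile(), i.e. under the cache lock
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) {
        threads.emplace_back([&cache, &compiles] {
            cache.getOrCompile("same-signature", [&compiles] {
                ++compiles;
                return 42;
            });
        });
    }
    for (auto& t : threads) t.join();
    return compiles;
}
```

With five threads, countCompiles(5) returns 1: the first thread compiles and stores, the other four find the cached entry. The trade-off in this sketch is that the lock is held during compilation, so compiles for unrelated keys are serialized too; whether that is acceptable for clFFT would need to be checked against the real code.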

I could prepare a PR, but can only take the time to do so if the PR has a chance of being merged into the code. Is this repository still being maintained? Also, I'd like the fix to be integrated with vcpkg.
