Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clfftBakePlan is not thread-safe #250

Open
hajokirchhoff opened this issue Jul 3, 2024 · 5 comments
Open

clfftBakePlan is not thread-safe #250

hajokirchhoff opened this issue Jul 3, 2024 · 5 comments

Comments

@hajokirchhoff
Copy link

The scenario:
5 Threads concurrently calling clfftBakePlan with identically configured fft handles.

Immediate symptoms:
The assert(NULL == p) in repo.cpp, line 218 triggers.

assert (NULL == p);

Then occasionally a crash with a nullptr later on.

The cause:
The function FFTAction::compileKernels will compile kernels, but only if they are not cached already. The problem is that the query of the cache is not protected with a mutex.

if( fftRepo.getclProgram( this->getGenerator(), this->getSignatureData(), program, q_device, fftPlan->context ) == CLFFT_INVALID_PROGRAM )

  • five threads concurrently try to compileKernels for the first time
  • all threads will query the fftRepo at the same time
  • all threads will get a CLFFT_INVALID_PROGRAM return code.
  • Consequently, all five threads assume that the kernel has not been cached and will compile the kernel and
  • all threads will call fftRepo.setclProgram with the same parameters.

The first call will set the program, the next calls will trigger the assert.

The fix:
Any query to the cache followed by a set to a cache must be an atomic operation. Here a scopedLock would do the trick.

I could prepare a PR, but can only take the time to do so if the PR has a chance of being merged into the code. Is this repository still being maintained? Also, I'd like the fix to be integrated with vcpkg.

@duohappy
Copy link

if I create different context, command queue , fft plan in every thread,it is thread safety to invoke clfftBakePlan?

@hajokirchhoff
Copy link
Author

If I remember correctly, the problem manifests if you have identically configured handles. context, command queue etc... do not matter. The generated kernel source code is the problem. So if you configure different FFTs, it should be fine, but if you configure the FFT all the same then you will run into this problem.

Unless you have specific performance problems/requirements I would use a global mutex around "clfftBakePlan" to be safe. The chance of a race condition is slim, but non-zero.

@duohappy
Copy link

duohappy commented Sep 19, 2024

If I remember correctly, the problem manifests if you have identically configured handles. context, command queue etc... do not matter. The generated kernel source code is the problem. So if you configure different FFTs, it should be fine, but if you configure the FFT all the same then you will run into this problem.

Unless you have specific performance problems/requirements I would use a global mutex around "clfftBakePlan" to be safe. The chance of a race condition is slim, but non-zero.

clfftBakePlan such as clfftBakePlan( plan_handle, 1, &queue, NULL, NULL );
if I create different plan_handle in every thread, and invoke clfftBakePlan only once in Init() function , is it thread safety?

plan_handle is unique, which guarantee safety without mutex? I dont know

@hajokirchhoff
Copy link
Author

A different handle does not matter, it is still not thread-safe. The question is how you create the handle.
If you use different parameters to create the handle, then you are thread-safe. But if you create different handles with the same parameters, i.e. the same fft size, then you are not thread-safe.

@duohappy
Copy link

A different handle does not matter, it is still not thread-safe. The question is how you create the handle. If you use different parameters to create the handle, then you are thread-safe. But if you create different handles with the same parameters, i.e. the same fft size, then you are not thread-safe.

I review some code, like repo.cpp. It has a lock, which can guarantee safety?

scopedLock sLock( lockRepo(), _T( "setclProgram" ) );

If I create opencl context in every thread, and the map key is different, which create by opencl context.

FFTRepoKey key(gen, data, planContext, device);

I dont review all code, and am not clear about relationship between function. I will write some demo to test it.

I am new to write opencl code. There are not many libraries, and many libraries have not been updated. Is opencl good choice to write GPU code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants