Programmatically Control Thread Count #188
Optimally, the default should be to use all available cores, and users should be able to control it with an argument. We never explicitly control multithreading in tedana. Setting thread-related environment variables in the terminal before calling tedana does limit the thread count.
Unfortunately, setting these values from within the workflow itself does not work. One interesting note, though: on one core, the ICA step takes much less time than it does with all available cores, although the results are somewhat different. We'll need to take a look at the results to make sure they're still valid, but it's promising.
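For context, the variables in question are presumably the standard BLAS threading controls (OMP_NUM_THREADS, MKL_NUM_THREADS, OPENBLAS_NUM_THREADS). They are read once, when the linear-algebra backend is first loaded, which would explain why setting them mid-workflow has no effect. A minimal sketch of the ordering constraint:

```python
import os

# BLAS threading variables must be set before NumPy is imported;
# the backend reads them only once, at load time.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np  # noqa: E402 -- import deliberately deferred

# Assigning the variables *after* this import would leave the
# already-initialized BLAS thread pool untouched.
```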
It's hard to tell from the issue, but I looked at the version used in the code they posted, and it looks like that issue relates to an older version of tedana that relied on a different underlying library. That said, I think you have a great point. Since OS is within the scope of the reliability analysis, it might be worth including parallelization as well. We definitely don't want to make it the focus of the analysis, so running across a range of core counts is unnecessary, but a comparison of "one core" vs. "all available" would be great. After all, it could be that parallelization speeds things up generally, just not in the one case I tried. We could also profile the code to see whether parallelizing speeds up ICA but slows down other steps. This is purely conjectural, but it seemed like the spatial clustering took longer when I limited the thread count.
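As a rough illustration of the profiling idea, the standard-library cProfile could break a run down by step. The file paths and echo times below are placeholders, and tedana_workflow is assumed to be the package's main entry point:

```python
import cProfile
import pstats

from tedana.workflows import tedana_workflow  # assumed entry point

# Placeholder inputs: multi-echo files and their echo times in ms.
data = ["echo-1.nii.gz", "echo-2.nii.gz", "echo-3.nii.gz"]
tes = [14.5, 38.5, 62.5]

cProfile.runctx(
    "tedana_workflow(data, tes)",
    globals(),
    locals(),
    filename="tedana_profile.stats",
)

# Compare per-step cumulative times between one-core and multi-core
# runs (e.g., ICA vs. spatial clustering).
pstats.Stats("tedana_profile.stats").sort_stats("cumulative").print_stats(20)
```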
Well, I don't know about "all available," since on my server that would be 36 cores and would probably violate a ton of the assumptions typically made. But I think 1, 4, and 8 should tell us a lot, or even just 1 and 4.
By "all available" I meant more that it would use whatever resources are allocated to the job. If you use a scheduler, you should be able to set the number of cores you want to reserve to run the job. We just need to make sure that tedana respects that value. I believe that right now it will use any cores it can access, which we definitely do not want. For the analyses, 1 and 4 seem reasonable to me. |
This seems to be a shockingly difficult problem, judging from the open NumPy issue, where a long discussion basically leads me to conclude that we are hopelessly downstream of the problem. In short, the issue rests in the BLAS libraries that NumPy may or may not use, which are system- and even CPU-dependent (yikes). Furthermore, the NumPy commenters appear to believe that this is not a priority for them or their users, reducing the likelihood that it will be solved in the near future. As much as this frustrates me, I have to conclude that we should stick with @tsalo's solution of reducing the default core count to 1 and hope that some day this is fixed upstream of us.
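For what it's worth, runtime control of BLAS thread pools is possible through the third-party threadpoolctl package, which adjusts the loaded libraries directly instead of relying on environment variables. A minimal sketch, not something tedana currently does:

```python
import numpy as np
from threadpoolctl import threadpool_limits

rng = np.random.default_rng(0)
x = rng.standard_normal((500, 500))

# Cap the BLAS thread pool for the duration of this block only,
# regardless of which backend (OpenBLAS, MKL, ...) NumPy links.
with threadpool_limits(limits=1, user_api="blas"):
    u, s, vt = np.linalg.svd(x)
```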
Summary
Currently, tedana uses either all available cores or just one (#215). Users should be able to control the number of threads more precisely.
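To make the goal concrete, the control could look something like a --n-threads command-line option that sets the BLAS environment variables before NumPy is imported. This is a hypothetical sketch, not tedana's actual interface:

```python
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument(
    "--n-threads",
    type=int,
    default=1,
    help="Maximum number of threads BLAS operations may use.",
)
args = parser.parse_args()

# Must happen before NumPy (and hence tedana) is imported.
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
    os.environ[var] = str(args.n_threads)
```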
Goals
Strategy
Additional/Summary
Discovered this problem when I redlined my image-processing server, triggering angry e-mails from IT. PR #215 prevents tedana from blowing up clusters, but it uses a hardcoded one-core limit set via a configuration file. (Thanks @tsalo!)