New Heterogeneous Memory Pool #37952
base: master
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37952/30020
A new Pull Request was created by @VinInn (Vincenzo Innocente) for master. It involves the following packages:
@malbouis, @yuanchao, @makortel, @slava77, @clacaputo, @cmsbuild, @fwyzard, @jpata, @tvami, @francescobrivio can you please review it and eventually sign? Thanks.
cms-bot commands are listed here
@cmsbuild , please test
enable gpu
-1
Failed Tests: UnitTests
Unit Tests
I found errors in the following unit tests:
---> test cpuVertexFinderByDensity_t had ERRORS
---> test cpuVertexFinderIterative_t had ERRORS
GPU Comparison Summary
Comparison Summary
@slava77 comparisons for the following workflows were not done due to missing matrix map:
@cmsbuild , please test
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37952/30028
-reconstruction
Milestone for this pull request has been moved to CMSSW_14_1_X. Please open a backport if it should also go in to CMSSW_14_0_X.
ping
Milestone for this pull request has been moved to CMSSW_14_2_X. Please open a backport if it should also go in to CMSSW_14_1_X.
ping (to make bot change milestone)
This PR replaces the old "notcub" cache allocator with a memory pool featuring:
- lock-free operations
- a backend-agnostic implementation
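To illustrate what lock-free operation can mean for a pool, here is a minimal sketch, not the code in this PR. It assumes a fixed number of slots, each guarded by an atomic flag, and claims a slot with a compare-and-swap instead of a mutex. The names `SlotPool`, `acquire` and `release` are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size pool: each slot is claimed with a compare-and-swap,
// so no thread ever blocks on a mutex.
template <std::size_t N>
class SlotPool {
public:
  // Returns the index of a claimed slot, or -1 if every slot is busy.
  int acquire() {
    for (std::size_t i = 0; i < N; ++i) {
      bool expected = false;
      // Succeeds only for the one thread that flips the flag from false to true.
      if (busy_[i].compare_exchange_strong(
              expected, true, std::memory_order_acquire, std::memory_order_relaxed)) {
        return static_cast<int>(i);
      }
    }
    return -1;
  }

  // Marks a previously claimed slot as free again.
  void release(int i) {
    assert(i >= 0 && static_cast<std::size_t>(i) < N);
    busy_[static_cast<std::size_t>(i)].store(false, std::memory_order_release);
  }

private:
  std::array<std::atomic<bool>, N> busy_{};  // all flags start as false (free)
};
```

A real pool would also track buffer sizes and the device memory behind each slot. The sketch only shows the claim and release step.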
The data interface is based on a simple Buffer that is completely backend-agnostic.
The allocation interface (makeBuffer) currently depends on cudaStream_t, which can easily be hidden behind a void * or a light opaque struct.
A new feature is a "Bundle deleter": buffers can be bundled together and then freed in a single operation, which reduces the number of CUDA calls.
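The bundling idea can be sketched as follows. This is an illustration under stated assumptions, not the interface introduced by this PR: `CountingPool`, `Bundle`, `BundledBuffer` and `makeBundledBuffer` are hypothetical names, and the pool only counts free operations as a stand-in for backend (for example CUDA) calls.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a pool backend: it only counts free operations.
struct CountingPool {
  int freeCalls = 0;     // number of free operations issued to the backend
  int buffersFreed = 0;  // total number of buffers handed back
  void freeMany(std::vector<int> const& slots) {
    ++freeCalls;
    buffersFreed += static_cast<int>(slots.size());
  }
};

// Collects slots and hands them all back in one operation when destroyed.
class Bundle {
public:
  explicit Bundle(CountingPool& pool) : pool_(&pool) {}
  ~Bundle() {
    if (!slots_.empty()) pool_->freeMany(slots_);
  }
  Bundle(Bundle const&) = delete;
  Bundle& operator=(Bundle const&) = delete;

  void add(int slot) {
    assert(slot >= 0);
    slots_.push_back(slot);
  }

private:
  CountingPool* pool_;
  std::vector<int> slots_;
};

// A buffer that shares ownership of its bundle: the single free happens
// only when the last buffer (or handle) referring to the bundle goes away.
struct BundledBuffer {
  int slot;
  std::shared_ptr<Bundle> bundle;
};

inline BundledBuffer makeBundledBuffer(std::shared_ptr<Bundle> const& bundle, int slot) {
  bundle->add(slot);
  return BundledBuffer{slot, bundle};
}
```

With three buffers attached to one bundle, the backend in this sketch sees one free operation instead of three.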
All previous users of the cache allocator (at least for the Pixel workflow) have been migrated.
Tests pass, and it is not slower than the previous implementation. A free machine is needed to make definitive tests.
Some cleanup is still required to remove debug statements.
Purely technical; no regression expected.
Draft slides for a possible presentation are available at https://cernbox.cern.ch/index.php/s/Ax4NHYGLHbG8N1C