Being able to run benchmarks multi-threaded simply by adding `DIVAN_THREADS=XXX` to get a feeling for contention is super nice.
However, it seems like every benchmark run is using scoped threads under the hood.
Running a threaded benchmark through `samply record`, I end up with well beyond 6k "tracks", each of which is extremely short-lived, and it is pretty much impossible to select any of the background threads to do proper profiling.
It also appears that a large portion of the main thread's time is actually spent creating and destroying threads, at least on macOS, where I tested this:
Hey @Swatinem, sorry for taking forever on this. As part of Divan's JSON work, I am refactoring Divan's internals, and part of that refactor was adding a thread pool. I decided to extract the thread pool to address your issue, and this fix is now available in v0.1.16.
The main thread creates a Task, which is a pointer to a TaskShared pinned on the stack. TaskShared stores the function to run, along with other fields for coordinating threads.
New threads are spawned only if fewer than the requested number are already available. Each thread receives tasks over its own associated channel.
The main thread sends the Task over the channels to the requested number of threads. Upon receiving the task, each auxiliary thread executes it and then decrements the task's reference count.
The main thread also executes the Task, just as the auxiliary threads do. It then waits until the reference count reaches 0 before returning.
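
For readers who want to see the shape of this, here is a minimal sketch of the steps above. It is not Divan's actual code: the names `Pool`, `broadcast`, `worker_count`, and `remaining` are made up for illustration, and it uses an `Arc` with a `Mutex`/`Condvar` counter in place of a pointer to a stack-pinned `TaskShared` with a reference count, which keeps the example in safe Rust.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Shared task state: the function to run plus coordination fields.
/// (Assumption: the real `TaskShared` lives on the main thread's stack.)
struct TaskShared {
    func: Box<dyn Fn(usize) + Send + Sync>,
    /// Number of threads that still have to finish the task.
    remaining: Mutex<usize>,
    done: Condvar,
}

/// Stand-in for the pointer-like `Task` handle sent to each thread.
type Task = Arc<TaskShared>;

impl TaskShared {
    /// Runs the function, then decrements the count of pending threads.
    fn run(&self, thread_index: usize) {
        (self.func)(thread_index);
        let mut remaining = self.remaining.lock().unwrap();
        *remaining -= 1;
        if *remaining == 0 {
            self.done.notify_all();
        }
    }
}

/// Hypothetical pool that keeps its threads alive between runs.
pub struct Pool {
    /// One channel per auxiliary thread.
    workers: Vec<Sender<Task>>,
}

impl Pool {
    pub fn new() -> Self {
        Pool { workers: Vec::new() }
    }

    pub fn worker_count(&self) -> usize {
        self.workers.len()
    }

    /// Runs `func` on `aux_threads` auxiliary threads and on the caller.
    pub fn broadcast<F>(&mut self, aux_threads: usize, func: F)
    where
        F: Fn(usize) + Send + Sync + 'static,
    {
        // Spawn threads only if fewer than requested already exist.
        while self.workers.len() < aux_threads {
            let index = self.workers.len() + 1;
            let (tx, rx) = channel::<Task>();
            thread::spawn(move || {
                // The loop ends when the pool (and its sender) is dropped.
                for task in rx {
                    task.run(index);
                }
            });
            self.workers.push(tx);
        }

        let task: Task = Arc::new(TaskShared {
            func: Box::new(func),
            remaining: Mutex::new(aux_threads + 1), // +1 for the caller
            done: Condvar::new(),
        });

        // Send the task to the requested number of threads.
        for tx in &self.workers[..aux_threads] {
            tx.send(Arc::clone(&task)).expect("worker thread exited");
        }

        // The calling thread executes the task too, as thread 0.
        task.run(0);

        // Wait until every participating thread has finished.
        let mut remaining = task.remaining.lock().unwrap();
        while *remaining != 0 {
            remaining = task.done.wait(remaining).unwrap();
        }
    }
}
```

Because the worker threads outlive each `broadcast` call, a profiler would see a small, stable set of threads rather than a new set per benchmark run. The sketch does not handle a panic inside `func`; in that case the caller would wait forever.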