No coordinator work #794
Conversation
Remove the concept of "coordinator work packets". Now ScheduleCollection and StopMutators are both executed on ordinary workers. The only work packet executed by the coordinator is EndOfGC. Simplified the interaction between the coordinator and the workers. The coordinator only responds to the event that "all workers have parked". Removed the workers-to-coordinator channel. WorkerMonitor now has two Condvars, one (the existing one) for notifying workers about more work available, and another for notifying the coordinator that all workers have parked.
Force-pushed from 1059667 to 4515e9b.
Just some minor issues. It simplifies the code a lot, which is great. We may need to be careful about any synchronisation cost this change may bring, and we need to run binding tests to make sure this works.
src/scheduler/controller.rs
Outdated
let event = self.receiver.poll_event();
let finished = self.process_event(event);
self.scheduler.worker_monitor.resume_and_wait();
let finished = self.on_all_workers_parked();
You can just remove `on_all_workers_parked()` -- it is simple enough to be included in the main loop.
Good idea.
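For illustration, the folded-in loop might look roughly like this (a sketch only; `resume_and_wait()` comes from the diff above, while `update_buckets()` and the surrounding structure are assumed names):

```rust
// Hypothetical shape of the coordinator's main loop once
// `on_all_workers_parked()` is inlined. Only `resume_and_wait()` appears
// in the diff above; the bucket handling is an assumption.
loop {
    // Resume the workers, then block until all of them have parked again.
    self.scheduler.worker_monitor.resume_and_wait();

    // All workers are parked: open more buckets if any are ready;
    // otherwise this GC is finished and the coordinator can run EndOfGC.
    if !self.scheduler.update_buckets() {
        break;
    }
}
```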
src/scheduler/worker.rs
Outdated
/// The state of the worker group. The worker group alternates between the `Sleeping` and the
/// `Working` state. Workers execute work packets in the `Working` state, but once workers entered
/// the `Sleeping` state, they can not continue until the coordinator explicitly transitions the
/// state back to `Working` after it found more work for workers to do.
Better make it clear that as long as one worker is executing, the group state is `Working`, and when none of the workers is executing, the group state is `Sleeping`.
The cause and effect is the opposite: the state dictates the behaviour instead of the behaviour determining the state. The `Working` state allows workers to execute and forbids the coordinator from opening new buckets, while the `Sleeping` state forbids workers from executing packets and allows the coordinator to open buckets.
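To illustrate that reading, the state can be the single source of truth from which both sides derive what they may do (a sketch; the method names are hypothetical):

```rust
/// Sketch: behaviour is derived from the state, not the other way around.
enum WorkerGroupState {
    /// Workers may execute packets; the coordinator must not open buckets.
    Working,
    /// Workers must not execute packets; the coordinator may open buckets.
    Sleeping,
}

impl WorkerGroupState {
    fn workers_may_execute(&self) -> bool {
        matches!(self, WorkerGroupState::Working)
    }

    fn coordinator_may_open_buckets(&self) -> bool {
        matches!(self, WorkerGroupState::Sleeping)
    }
}
```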
LGTM
LGTM
Wow, some benchmarks showed a 40% improvement in GC time by removing a mechanism that we barely used.
Very cool! I think this is still with the improper min nursery size, right? We should investigate what a proper min nursery size is soon.
This is still based on …. I am not sure why the improvement is so significant. Anyway, it shows this change does not have a negative impact on the performance. But we should investigate why this is happening using more advanced tools.
It's only 20% for GenImmix, and 10% for Immix. The zero point of the plots generated by Plotty is misleading.
I re-ran it on both bobcat.moma and vole.moma. The times are normalised to build3. Here are the results:

[plot: STW times on bobcat]
[plot: STW times on vole]
[plot: total time on bobcat]
[plot: total time on vole]

I think the results are consistent between bobcat and vole. My last PR (#782) introduced a performance regression, but this PR improves the performance instead, and it has an improvement over the original state of build3.
Why does the number of GCs change for avrora with GenImmix? This PR shouldn't change the GC triggering, right?
I re-ran avrora. Here are the results:

[plot: number of GCs (Immix, not normalised)]
[plot: number of GCs (GenImmix, not normalised)]
[plot: number of major GCs (GenImmix, not normalised)]
[plot: STW time (Immix, not normalised)]
[plot: STW time (GenImmix, not normalised)]
[plot: total time (Immix, not normalised)]
[plot: total time (GenImmix, not normalised)]

In this run, GenImmix triggered about 272 GCs, among which there is at most 1 major GC, and the GC time ranges from 480ms (build2) to 640ms (build1), with build3 in the middle (560ms). 480ms is still about a 14% time reduction from 560ms. I don't know how many of the 272 GCs are "kind 1" or "kind 2" GCs. Assume a "kind 1" GC only takes 1ms and a "kind 2" GC takes 15ms. There would need to be about 20 "kind 2" GCs among them for the STW time to be 560ms. So from this result, I think we didn't really make …
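As a sanity check on that estimate, solving for the number $k$ of "kind 2" GCs under the assumed 1 ms / 15 ms costs:

$$15k + 1 \cdot (272 - k) = 560 \implies 14k = 288 \implies k \approx 20.6$$

so roughly 20 of the 272 GCs would have to be "kind 2" to account for 560 ms.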
I think it is just non-determinism. As we discussed in today's meeting, if one GC is triggered at a very unfortunate moment, it will impact the set of live objects, and transitively the time to trigger the next GC. This influence can go on and on, eventually resulting in one run triggering more GCs than another.
Could you try avrora on a tighter heap (or all benchmarks on a tighter heap for that matter) to confirm this?
I re-ran avrora and a few other benchmarks that have low GC counts again, with a smaller heap. Note: … Plotty link: …
Here is the result of jython after the commit wenyuzhao/mmtk-openjdk@1c384fd is cherry-picked to the … branch. The GC is triggered about 600 times for GenImmix and about 320 times for Immix. After this commit is applied, build3 no longer crashes, and build2 still has improvements over build3 in STW time by a significant amount.
The reason for the performance anomaly with the …: for reasons unknown, when running the …. Since we know the anomaly is not caused by this PR, I will merge this PR and leave the two bugs mentioned above to be fixed later.
This reverts commit 43e8a92.
Remove the concept of "coordinator work packets". Now ScheduleCollection and StopMutators are both executed on ordinary workers. The only work packet executed by the coordinator is EndOfGC.
Simplified the interaction between the coordinator and the workers. The coordinator only responds to the event that "all workers have parked". Removed the workers-to-coordinator channel. WorkerMonitor now has two Condvars, one (the existing one) for notifying workers about more work available, and another for notifying the coordinator that all workers have parked.
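A minimal sketch of that design, assuming field and method names for illustration (`group_sleeping` here plays the role of the `group_sleep` variable mentioned below; the real code will differ in details):

```rust
use std::sync::{Condvar, Mutex};

/// Sketch of the two-condvar monitor described above. All names here are
/// assumptions for illustration, not the actual mmtk-core API.
struct WorkerMonitor {
    sync: Mutex<WorkerMonitorSync>,
    /// Workers wait on this for more work to become available.
    work_available: Condvar,
    /// The coordinator waits on this for all workers to park.
    all_workers_parked: Condvar,
}

struct WorkerMonitorSync {
    worker_count: usize,
    parked_workers: usize,
    /// True while the whole group sleeps and the coordinator is active.
    group_sleeping: bool,
}

impl WorkerMonitor {
    /// Called by a worker that has run out of work packets.
    fn park_and_wait(&self) {
        let mut sync = self.sync.lock().unwrap();
        sync.parked_workers += 1;
        if sync.parked_workers == sync.worker_count {
            // The last worker to park puts the group to sleep and wakes
            // the coordinator.
            sync.group_sleeping = true;
            self.all_workers_parked.notify_one();
        }
        // Sleep until woken. While the group is sleeping, ignore wake-ups
        // (e.g. from newly added packets): only the coordinator ends the
        // group sleep.
        loop {
            sync = self.work_available.wait(sync).unwrap();
            if !sync.group_sleeping {
                break;
            }
        }
        sync.parked_workers -= 1;
    }

    /// Called by the coordinator in its main loop: end the group sleep,
    /// resuming all workers, then block until every worker has parked
    /// again.
    fn resume_and_wait(&self) {
        let mut sync = self.sync.lock().unwrap();
        sync.group_sleeping = false;
        self.work_available.notify_all();
        while !sync.group_sleeping {
            sync = self.all_workers_parked.wait(sync).unwrap();
        }
    }
}
```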
This PR completely removes the concept of "coordinator work". The primary goal is fixing issue #792.
The `WorkerGroupState` is basically equivalent to the `group_sleep` Boolean variable.

Known issues:

- I should probably fix #793 in this PR, too, given that I am making changes to the synchronisation mechanism. That's a bit complicated; I am not fixing that in this PR.
- `const COORDINATOR_ONLY_STW: bool` in `Collection`: it is no longer useful. I'll do a multi-repo refactoring to remove it from `mmtk-core`, `mmtk-openjdk` and `mmtk-ruby`.