Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not do any significant amount of work in Release #1147

Open
wks opened this issue Jun 11, 2024 · 1 comment
Open

Do not do any significant amount of work in Release #1147

wks opened this issue Jun 11, 2024 · 1 comment

Comments

@wks
Copy link
Collaborator

wks commented Jun 11, 2024

In #1146, we observed that when using the native MarkSweep and the eager sweeping is enabled, no ReleaseMutator work packet is executed until Release finishes. It can be seen from the eBPF timeline:

image

That is because the Release work packet calls Plan::release before spawning any ReleaseMutator work packet. See:

impl<C: GCWorkContext + 'static> GCWork<C::VM> for Release<C> {
fn do_work(&mut self, worker: &mut GCWorker<C::VM>, mmtk: &'static MMTK<C::VM>) {
trace!("Release Global");
mmtk.gc_trigger.policy.on_gc_release(mmtk);
// We assume this is the only running work packet that accesses plan at the point of execution
let plan_mut: &mut C::PlanType = unsafe { &mut *(self.plan as *const _ as *mut _) };
plan_mut.release(worker.tls);
for mutator in <C::VM as VMBinding>::VMActivePlan::mutators() {
mmtk.scheduler.work_buckets[WorkBucketStage::Release]
.add(ReleaseMutator::<C::VM>::new(mutator));
}
for w in &mmtk.scheduler.worker_group.workers_shared {
let result = w.designated_work.push(Box::new(ReleaseCollector));
debug_assert!(result.is_ok());
}
#[cfg(feature = "count_live_bytes_in_gc")]
{
let live_bytes = mmtk
.scheduler
.worker_group
.get_and_clear_worker_live_bytes();
mmtk.state.set_live_bytes_in_last_gc(live_bytes);
}
}
}

This is not a bug, because Release is designed to have exclusive access to the Plan when executed. This means we shouldn't do any high-intensity work in Plan::release unless we don't need to release mutators.

This affects all plans.

MarkSweep does a non-trivial amount of work in Plan::release when doing eager sweeping.

Here is a zoomed-in timeline for Immix. ImmixSpace.release generates SweepChunk work packets, so they start executing before ReleaseMutator work packets start.

image

This is SemiSpace:

image

It looks OK. But after I turn on the VO bits:

image

Clearly the Release work packet is spending a significant amount of time clearing the VO bit.

We should revisit all plans and all spaces, and see if they do any significant amount of work in Plan.release and Space.release. If they do, they should be off-loaded to dedicated work packets so that they can be parallelized.

@wks
Copy link
Collaborator Author

wks commented Jun 11, 2024

It raises another question: If Release is supposed to have exclusive access to the Plan, and if it can spawn work packets (such as SweepChunk) that can start executing immediately, then it may not have exclusive access to the plan. because work packets such as SweepChunk may access the space which, according to Rust's concept of ownership, is part of the plan.

github-merge-queue bot pushed a commit that referenced this issue Jun 28, 2024
This PR re-introduce the `SweepChunk` work packet to the native
`MarkSweepSpace` when using eager sweeping. The main intention of this
PS is fixing a pathological case where there is only one mutator and the
single `ReleaseMutator` work packet cannot be parallelized.

The algorithm is unchanged for lazy sweeping.

When doing eager sweeping,

1. We first release all unmarked blocks in `MarkSweepSpace::abandoned`
and each mutator.
2. And then we sweep blocks in parallel using `SweepChunk` work packets.
3. We then go through all "consumed" blocks and move them into
"available" lists if they have any free cells.

In step (1), we release blocks for each mutator in `ReleaseMutator`.
Releasing blocks is very fast, so parallelism is not a bottleneck.
During step (2), all blocks are either unallocated or marked, so we
don't need to move any block out of linked lists, avoiding the data race
we observed in #1103. Step (3)
is done by one thread, but it is fast enough not to be a performance
bottleneck.

We also introduced a work packet `ReleaseMarkSweepSpace` which does what
`MarkSweepSpace::release` did, but is executed concurrently with
`ReleaseMutator` work packets. In this way, we don't do too much work in
the `Release` work packet, which is a problem we discussed in
#1147.

We removed the constant `ABANDON_BLOCKS_IN_RESET` because it is
obviously necessary to do so. Otherwise one mutator will keep too many
blocks locally, preventing other mutators to get available blocks to
use.

We added an USDT tracepoint in `SweepChunk` in both `MarkSweepSpace` and
`ImmixSpace` so that we can see the number of allocated blocks a
`SweepChunk` work packet visited in the timeline generated by eBPF
tracing tools.

We also changed `immixspace::SweepChunk` so that it now *asserts* that
the chunk is allocated rather than skipping unallocated chunks.
`ChunkMap::generate_tasks` already filters out unallocated chunks.

Fixes: #1146
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant