Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stale forwarding bits in nursery GC in StickyImmix #1118

Closed
wks opened this issue Apr 16, 2024 · 4 comments · Fixed by #1128
Closed

Stale forwarding bits in nursery GC in StickyImmix #1118

wks opened this issue Apr 16, 2024 · 4 comments · Fixed by #1128
Labels
P-high Priority: High. A high-priority issue should be fixed as soon as possible.

Comments

@wks
Copy link
Collaborator

wks commented Apr 16, 2024

Currently in StickyImmix, we clear both on-the-side mark bits and on-the-side forwarding bits (the Ruby binding currently uses on-the-side forwarding bits) in the Prepare stage of major GC, and we clear neither of them in nursery GC. For Immix and non-moving StickyImmix, this works fine because (1) every GC is a major GC for Immix, and (2) we never forward anything if StickyImmix never moves object.

However, if we use StickyImmix and enable object movement, the forwarding bits set in a full-heap GC will remain set until the next full-heap GC. Some of them are set for from-space objects that have already been moved. If a nursery GC happens in between, and there is an object happened to be allocated at the place where a stale forwarding bits is set, the nursery GC will erroneously consider the object as moved. When this error manifests, weak reference processors will see a non-moved object have a forwarding pointer, and its value is whatever currently stored in the object at the offset for storing forwarding reference.

@qinsoon
Copy link
Member

qinsoon commented Apr 17, 2024

Pin bit is not cleared as well. We should approach this issue systematically.

@udesou udesou added the P-high Priority: High. A high-priority issue should be fixed as soon as possible. label May 15, 2024
github-merge-queue bot pushed a commit that referenced this issue May 16, 2024
The major changes include:

-   Adding a method `Plan::current_gc_may_move_object`.
-   Immix-based plans now reset the in-defrag state at the end of GC.

The main change of this PR is allowing querying if the current GC may
move any object.

Whether a plan moves objects or not is not a simple "yes or no"
question. It may vary in each GC.

-   Some plans, such as MarkSweep, never moves any object.
- Some plans, such as SemiSpace and MarkCompact, moves objects in every
GC.
- Some plans, such as Immix, moves object in defrag GC, while "fast" GCs
are non-moving.
- Some plans, such as Immix and StickyImmix, depends on Cargo features.
"immix_non_moving" prevents all movement, and
"sticky_immix_non_moving_nursery" prevents movement in nursery GCs.

The information of whether the current GC is useful inside mmtk-core.
For example, we can skip clearing on-the-side forwarding bits in
non-moving GCs because no objects will be forwarded. Because this should
happen during the Release stage, we postponed the time for Immix-based
plans to reset the in-defrag state to the end of GC so that
`Plan::current_gc_may_move_object` remains callable during the Release
stage. *(Note that this PR does not fix
#1118, but this mechanism
introduced in this PR can be used to implement the clearing of side
forwarding bits efficiently.)*

The information is also useful for VM bindings. Therefore,
`Plan::current_gc_may_move_object` is part of the public API.

- For VMs that have "potential pinning parents" (PPP, i.e. non-root
objects that have fields that cannot be updated), the VM binding must
pin the children of those fields before starting tracing. If the VM
binding knows the current GC never moves any object, it can skip pinning
the children of PPPs.
- For VMs that maintain weak tables (tables that contain weak references
to heap objects) in the runtime, the VM binding needs to scan the weak
tables, updating entries for objects that are moved, and deleting
entries for unreachable objects. (Note the cost of updating entries may
be high if the table is hashed by address.) If the VM binding knows the
current GC never moves any object, it will not need to update the
entries for moved live objects. And if the current GC is also a nursery
GC, the VM binding will only need to scan entries added before the last
GC.

This new method complements existing mechanisms for checking if objects
can be moved. They are useful for different purposes and cannot replace
each other.

- The scope of the new `Plan::current_gc_may_move_object` is one GC. It
is useful for the VM to decide whether to pin PPPs objects before a GC
and how to process weak references or weak tables.
- `PlanConstraints::moves_objects` is a per-plan constant. It is useful
for the VM to determine the object layout w.r.t. address-based hashing.
- The scope of `ObjectReference::is_movable` is a single object. It is
useful for the VM to skip pinning certain objects when interacting with
native code.
- The scope of `PlanTraceObject::may_move_object` is a trace. It is an
internal mechanism for MMTk core to specialize work packets for
different kinds of traces.
@wks wks closed this as completed in #1128 May 16, 2024
@qinsoon
Copy link
Member

qinsoon commented May 16, 2024

#1128 stated that it did not fix this. Github probably captured 'fix it' and closed this issue.

@qinsoon qinsoon reopened this May 16, 2024
@qinsoon
Copy link
Member

qinsoon commented Jul 18, 2024

#1169 and the discussion in https://mmtk.zulipchat.com/#narrow/stream/262673-mmtk-core/topic/Nursery.20GC.20recycling.20old.20objects is another example of stale side log bits.

@wks
Copy link
Collaborator Author

wks commented Jul 19, 2024

Fixed in #1138

@wks wks closed this as completed Jul 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P-high Priority: High. A high-priority issue should be fixed as soon as possible.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants