Supporting disjoint objects #656
Comments
A few comments about the abstractions...
Summarizing the terminology:
It is not our goal to implement Ruby's disjoint objects; one of the primary reasons for moving to MMTk is to avoid the limitation in Ruby that requires disjoint objects. Languages will likely use disjoint objects to implement arrays that may be resized.
@steveblackburn What name should we give to the thing returned by |
I suggested using a special policy for buffers. I think it does not conflict with Steve's description of disjoint objects; it is one way to implement them. Instead of having each policy deal with the special buffer object, we could make it known only to the buffer policy. The following is a comparison of implementing buffers in a special policy and allowing buffers in any policy.
With disjoint objects, in some places where we refer to objects, we need to differentiate 'object' (parent + buffer) from just 'parent'. For example, when we ask a binding for object sizes:
Some conclusions from our discussion (based on the table above):
A few other things that we discussed:
Just to record my notes for the above discussion:
Updates
Recent discussions changed my initial thoughts. I list the current status in this section.
Concepts
`alloc` call creates an allocation.
An implication of the retain and reclaim semantics is that
To summarise in simpler words:
Implementation
Allocation
Both objects and buffers are allocations. Buffers are allocated using the same `Mutator::alloc` function. But we either do not need `post_alloc`, or we use a different `post_alloc`.
Retaining and copying buffers
When scanning an object, the VM identifies pointers in the header object that point to buffers. mmtk-core provides an API for the VM. The signature of the function looks like this:
The VM needs to provide the size and the alignment to MMTk core because MMTk core cannot get that information from the buffer alone (the buffer itself carries no such information). In a copying GC, MMTk core uses that information to copy the buffer. The buffer is copied like an ordinary object, via `ObjectModel::copy` provided by the VM. The new address of the buffer is returned.
Question: Should we provide another `ObjectModel::copy_buffer`? `ObjectModel::copy` takes an `ObjectReference`, which assumes it is a reference to an object, not a buffer.
However, if the VM doesn't want to retain the buffer, it simply ignores the buffer, and MMTk will treat the buffer as dead.
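The retain-or-ignore behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyHeap`, `install`, `retain_buffer`, and `release` are hypothetical names invented for illustration and are not part of mmtk-core, and the heap is modelled as two hash maps rather than real memory. It shows the VM passing size and alignment when retaining a buffer, getting back the new address, and an ignored buffer being treated as dead.

```rust
use std::collections::HashMap;

/// Toy semispace heap: maps a start address to the bytes of one allocation.
/// Every name here is hypothetical; this is not the mmtk-core API.
pub struct ToyHeap {
    from_space: HashMap<usize, Vec<u8>>,
    to_space: HashMap<usize, Vec<u8>>,
    cursor: usize, // bump pointer into to-space
}

impl ToyHeap {
    pub fn new(to_space_start: usize) -> Self {
        ToyHeap {
            from_space: HashMap::new(),
            to_space: HashMap::new(),
            cursor: to_space_start,
        }
    }

    /// Stand-in for allocation: place an allocation in from-space.
    pub fn install(&mut self, addr: usize, bytes: Vec<u8>) {
        self.from_space.insert(addr, bytes);
    }

    /// Hypothetical retain call made by the VM while scanning a parent object.
    /// The VM supplies `size` and `align` because the buffer has no header.
    /// `align` is assumed to be a power of two.
    /// Returns the new address, or `None` if `addr` is not a from-space
    /// allocation of at least `size` bytes.
    pub fn retain_buffer(&mut self, addr: usize, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        let copy = self.from_space.get(&addr)?.get(..size)?.to_vec();
        self.from_space.remove(&addr);
        // Round the bump pointer up to the requested alignment.
        let new_addr = (self.cursor + align - 1) & !(align - 1);
        self.to_space.insert(new_addr, copy);
        self.cursor = new_addr + size;
        Some(new_addr)
    }

    /// End of GC: anything still in from-space was never retained, so it is dead.
    /// Returns how many allocations were reclaimed.
    pub fn release(&mut self) -> usize {
        let dead = self.from_space.len();
        self.from_space.clear();
        dead
    }
}

/// Usage example: one buffer is retained, the other is ignored by the VM.
pub fn example() -> (Option<usize>, usize) {
    let mut heap = ToyHeap::new(0x1000);
    heap.install(0x100, vec![1, 2, 3, 4]); // buffer the VM retains
    heap.install(0x200, vec![9; 8]); // buffer the VM ignores
    let moved = heap.retain_buffer(0x100, 4, 8);
    let dead = heap.release();
    (moved, dead)
}
```

In this model a buffer can be retained only once, which matches the assumption that each buffer is owned by exactly one parent; a real implementation would also have to update the pointer in the parent with the returned address.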
Original thoughts
The rest of this post is my initial thoughts.
"Naked" objects
Not all objects are created equal. In some virtual machines, some objects are wholly owned by other objects. For example:
Given that more than one VM has such objects, it may be worth adding support for such "naked" objects in mmtk-core.
Primary and subsidiary ("naked") objects
An object can be either primary or subsidiary. Primary objects are the ordinary objects we already know. Subsidiary objects are what we called "naked" objects.
Both kinds of objects are allocated in the GC heap.
`alloc`).
Their differences are,
`klass` field, etc.)
Difference from vanilla Ruby's buffers
A difference between the "subsidiary" objects defined here and the buffers of Array and String in vanilla Ruby is that both the primary and the subsidiary objects defined here are managed by the MMTk GC, while Ruby's buffers are allocated by `malloc` and are `free`d by finalizers (`obj_free`).
Object graph and subsidiary objects
An object graph contains nodes and edges.
Without subsidiary objects, all objects are primary. A node is a (primary) object. An object contains many reference fields, each of which represents an edge to another object.
With primary and subsidiary objects, a node is an object group, i.e. one primary object plus all subsidiary objects it owns. Reference fields in both the primary and its subsidiary objects are the edges of the node. Edges only point to primary objects. The pointer from a parent to a subsidiary is not considered an edge in the object graph -- it's internal to a node.
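The node/edge definition above can be sketched as a small data structure. This is a minimal illustration, assuming a toy heap in which each primary object is identified by its index; `PrimaryId`, `Subsidiary`, `ObjectGroup`, `edges`, and `reachable` are hypothetical names, not anything defined by mmtk-core. The parent-to-subsidiary link is modelled as ownership inside the group, so it never shows up as an edge.

```rust
/// Index of a primary object in the toy heap (hypothetical representation).
pub type PrimaryId = usize;

/// A subsidiary object: only the reference fields it holds matter here.
pub struct Subsidiary {
    pub ref_fields: Vec<PrimaryId>,
}

/// One node of the object graph: a primary object plus the subsidiaries it owns.
pub struct ObjectGroup {
    pub ref_fields: Vec<PrimaryId>,
    pub subsidiaries: Vec<Subsidiary>,
}

impl ObjectGroup {
    /// All outgoing edges of this node: reference fields of the primary
    /// followed by those of each owned subsidiary. Every edge targets a primary.
    pub fn edges(&self) -> Vec<PrimaryId> {
        let mut out = self.ref_fields.clone();
        for s in &self.subsidiaries {
            out.extend_from_slice(&s.ref_fields);
        }
        out
    }
}

/// Trace from the roots and report which groups are reachable.
/// Assumes every id stored in the heap is a valid index.
pub fn reachable(heap: &[ObjectGroup], roots: &[PrimaryId]) -> Vec<bool> {
    let mut marked = vec![false; heap.len()];
    let mut stack = roots.to_vec();
    while let Some(id) = stack.pop() {
        if marked[id] {
            continue;
        }
        marked[id] = true;
        stack.extend(heap[id].edges());
    }
    marked
}

/// Usage example: group 0 reaches group 2 only through a subsidiary's field.
pub fn example() -> Vec<bool> {
    let heap = vec![
        ObjectGroup {
            ref_fields: vec![],
            subsidiaries: vec![Subsidiary { ref_fields: vec![2] }],
        },
        ObjectGroup { ref_fields: vec![0], subsidiaries: vec![] },
        ObjectGroup { ref_fields: vec![], subsidiaries: vec![] },
    ];
    reachable(&heap, &[0])
}
```

Treating the group as the unit of tracing means a subsidiary is live exactly when its owner is live, which is the property the definition above relies on.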
Opportunity of object merging/splitting during copying
During copying, the GC has the opportunity to resize objects, and the opportunity to merge or split objects in a group.
For example,
As in the current MMTk interface, the VM is responsible for copying objects during copying GC. Some VMs are already using this opportunity to implement address-based hashing. We can extend this mechanism and let the VM decide whether to resize, split or merge objects.
However, in concurrent copying GC, it is the VM's responsibility to handle the synchronization between the mutator and the GC.
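As one illustration of merging during copying, the sketch below models a resizable array whose header and buffer are separate in from-space and are written out as a single contiguous allocation in to-space, dropping unused capacity along the way. The types `ArrayGroup` and `MergedArray` and the function `copy_merged` are assumptions made for this example only; the layout (length word followed by elements) is invented, and the sketch assumes a stop-the-world collector, so it ignores the synchronization issue mentioned above.

```rust
/// Hypothetical from-space layout of a resizable array:
/// a header holding the length, plus a separately allocated buffer.
pub struct ArrayGroup {
    pub len: usize,       // number of elements in use
    pub buffer: Vec<u64>, // subsidiary buffer; may be longer than `len`
}

/// Hypothetical to-space layout after copying: one contiguous allocation,
/// with the length in word 0 followed by the elements.
pub struct MergedArray {
    pub words: Vec<u64>,
}

/// VM-side copy routine that merges header and buffer and trims spare capacity.
pub fn copy_merged(src: &ArrayGroup) -> MergedArray {
    // Guard against a header whose length exceeds the buffer.
    let used = src.len.min(src.buffer.len());
    let mut words = Vec::with_capacity(1 + used);
    words.push(used as u64);
    words.extend_from_slice(&src.buffer[..used]);
    MergedArray { words }
}

/// Usage example: two live elements in a four-element buffer.
pub fn example() -> Vec<u64> {
    let src = ArrayGroup { len: 2, buffer: vec![10, 20, 0, 0] };
    copy_merged(&src).words
}
```

Splitting would be the reverse operation, for instance when the array needs to grow again after the copy.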