It's probably time to implement dispatch.bringOutYourDead, as explored in #1872:
Another approach would be to add an explicit delivery type whose only job is to check the finalizer flags and report any dead references. We might call this immediately after every crank finishes. I don't like the overhead this would represent, but it's hard for me to resist the idea of adding a perfectly-named dispatch.bringOutYourDead method. It might be sufficient to call it once every N cranks or something. Originally posted by @warner in #1872 (comment)
Currently, we confine the consequences of GC to a function named finish() that runs inside liveslots at the end of every crank. This function does the following (a simplified sketch appears after the list):
calls gcAndFinalize() to force a complete GC cycle, ensuring all UNREACHABLE objects move at least to the COLLECTED state and allowing finalizers to run, which moves them to FINALIZED
examines the deadSet (populated by the finalizers), which contains the vrefs of objects that have just lost their in-memory references (i.e. a Presence, Remotable, or Representative has gone away)
deletes the subset of these vrefs that are not kept alive by other references (exports, virtualized data)
this may emit syscalls like dropImports, retireImports, and retireExports
it may also delete virtual objects, which can free up more data
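For orientation, here is a minimal sketch of that end-of-crank cleanup. The helper names (stillReferenced, classify) and the practice of passing dependencies in explicitly are assumptions for illustration; the real liveslots code differs in detail.

```js
// Hypothetical sketch of the end-of-crank finish() logic; helper names
// and data shapes are illustrative, not the real liveslots internals.
async function finish({ gcAndFinalize, deadSet, stillReferenced, classify, syscall }) {
  // Force a complete GC cycle: UNREACHABLE objects reach at least
  // COLLECTED, and finalizers run, moving them to FINALIZED.
  await gcAndFinalize();

  const dropImports = [];
  const retireImports = [];
  const retireExports = [];
  for (const vref of deadSet) {
    if (stillReferenced(vref)) {
      continue; // kept alive by exports or virtualized data
    }
    if (classify(vref) === 'import') {
      dropImports.push(vref);
      retireImports.push(vref);
    } else {
      retireExports.push(vref);
      // deleting the backing virtual object here can free more data
    }
  }
  deadSet.clear();
  if (dropImports.length) syscall.dropImports(dropImports);
  if (retireImports.length) syscall.retireImports(retireImports);
  if (retireExports.length) syscall.retireExports(retireExports);
}
```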
For the sake of consensus, it is important to conceal the consequences of GC (e.g. finalizers) until the same point among all validators. We can tolerate "organic" (non-forced) GC happening at any time, but the results must be hidden (from userspace, syscalls, and metering) until the consensus-accepted moment. Likewise, when we do reveal GC, we need to make it complete: no UNREACHABLE object left behind, which requires us to force GC at that point.
Forcing GC on every delivery is expensive. In my recent study of the phase4.5 testnet data, replaying vat-mints locally runs about 7x faster if we never force GC and simply let organic GC happen. That would not maintain consensus, so we need a middle ground.
The proposal is to add a new kind of dispatch (next to dispatch.deliver, dispatch.notify, dispatch.dropExports, etc.) named dispatch.bringOutYourDead (evoking a scene from Monty Python's Holy Grail). When this method is invoked, the vat will force GC and reveal the consequences; until then, it conceals them. From the kernel's point of view, no GC happens at all until the vat is asked to do so. From the vat's point of view, organic GC happens normally, but the results are concealed in deadSet until it is asked (and the vat does an extra forced GC just beforehand).
This will allow the memory usage of the vat to drop normally, but whatever imports it is holding will be retained until the bringOutYourDead point, which means memory usage in other vats will stay elevated longer than it would if we still did GC at the end of every crank. In exchange, the CPU time spent on GC should be significantly less.
We'll need to build some sort of kernel policy for deciding exactly when to call dispatch.bringOutYourDead for each vat. As a starting point, we should simply call it once every 10 deliveries to each vat. A likely second step is to keep track of how many kernel-wide cranks have taken place since the last bringOutYourDead, and schedule calls for the least-recently cleaned vats. If we imagine that a vat is "clean" immediately after bringOutYourDead (no garbage), then it becomes progressively "dirtier" as we make more userspace-visible calls (so dispatch.deliver and dispatch.notify, but not dispatch.dropExports). If a vat reaches a particular level of "dirt", or if a dirtied vat has gone long enough (as measured in cranks, which might involve any vat) without cleaning, then we perform a bringOutYourDead. The goal is for inactive vats to get cleaned up eventually, and not linger in a dirty state forever.
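One possible shape for such a policy, sketched here with made-up thresholds and bookkeeping; none of this is the actual kernel code.

```js
// Hypothetical kernel-side scheduling policy: each userspace-visible
// delivery adds "dirt", and a vat is due for bringOutYourDead once it is
// dirty enough or has waited too many kernel-wide cranks.
const DIRT_THRESHOLD = 10;    // deliveries since last cleanup (assumption)
const CRANK_THRESHOLD = 1000; // kernel-wide cranks since last cleanup (assumption)

function makeBOYDPolicy() {
  const state = new Map(); // vatID -> { dirt, lastCleanupCrank }

  function noteDelivery(vatID, currentCrank) {
    // called only for dispatch.deliver / dispatch.notify, not dropExports
    const s = state.get(vatID) || { dirt: 0, lastCleanupCrank: currentCrank };
    s.dirt += 1;
    state.set(vatID, s);
  }

  function vatsNeedingCleanup(currentCrank) {
    const due = [];
    for (const [vatID, s] of state) {
      const waited = currentCrank - s.lastCleanupCrank;
      if (s.dirt >= DIRT_THRESHOLD || (s.dirt > 0 && waited >= CRANK_THRESHOLD)) {
        due.push(vatID);
      }
    }
    return due;
  }

  function noteCleanup(vatID, currentCrank) {
    state.set(vatID, { dirt: 0, lastCleanupCrank: currentCrank });
  }

  return { noteDelivery, vatsNeedingCleanup, noteCleanup };
}
```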
We can even imagine a scheme in which the contract owner (or some delegated subparty), or the contract itself, explicitly requests a bringOutYourDead call. The runtime costs of the vat might be tied to the "dirty" status, and bringOutYourDead is the way to bring them back down. This would let vat owners decide for themselves what the appropriate frequency of GC should be. We've also explored how the memory-usage meter could interact with this, which will be covered in a separate ticket.
Starting Steps
change liveslots.js to create dispatch.bringOutYourDead, moving the call to finish() out of dispatch()
add a pair of kvstore keys for each vat: one to track the number of deliveries remaining before the next bringOutYourDead (decremented on every delivery), and a second to record the frequency of cleanup
change either vat-warehouse or deliverAndLogToVat to decrement the first; if it reaches zero, perform a bringOutYourDead and reset the counter to the second, with suitable commits and slog events (a sketch of this countdown check appears after this list)
the bringOutYourDead should be performed in its own crankBuffer commit transaction: it is its own crank
however the cleanup should not be performed if the target vat has been deleted
the best approach might be to introduce a new gcAction, and have the countdown-checker simply push a bringOutYourDead(vatID) onto the gcActions queue
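As a rough illustration of the countdown check and the gcActions approach described above; the key names, helper signature, and gcActions shape are all assumptions for the sketch.

```js
// Hypothetical countdown check run after each delivery to a vat; the real
// kvStore key names and gcActions representation will differ.
function noteDeliveryAndMaybeScheduleBOYD(kvStore, gcActions, vatID) {
  const countdownKey = `${vatID}.boyd.countdown`;
  const frequencyKey = `${vatID}.boyd.frequency`;
  const frequency = Number(kvStore.get(frequencyKey) || '10');
  let remaining = Number(kvStore.get(countdownKey) || String(frequency));

  remaining -= 1; // one more delivery has happened
  if (remaining <= 0) {
    // schedule a bringOutYourDead crank; it gets its own crankBuffer
    // commit, and the kernel should skip it if the vat has been deleted
    gcActions.push({ type: 'bringOutYourDead', vatID });
    remaining = frequency;
  }
  kvStore.set(countdownKey, String(remaining));
}
```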
A handful of unit tests believe they know how many cranks will happen, given a set of inputs (usually queueToVatRoot). This may change with the introduction of bringOutYourDead deliveries, as it did with the addition of gcActions, so those tests will need to be updated.