It's probably time to implement dispatch.bringOutYourDead, as explored in #1872:
Another approach would be to add an explicit delivery type whose only job is to check the finalizer flags and report any dead references. We might call this immediately after every crank finishes. I don't like the overhead this would represent, but it's hard for me to resist the idea of adding a perfectly-named dispatch.bringOutYourDead method. It might be sufficient to call it once every N cranks or something. Originally posted by @warner in #1872 (comment)
Currently, we confine the consequences of GC to a function named finish() that runs inside liveslots at the end of every crank. This function does the following (a simplified sketch appears after the list):
calls gcAndFinalize() to force a complete GC cycle, ensuring all UNREACHABLE objects move at least to the COLLECTED state and allowing finalizers to run, which moves them to FINALIZED
examines the deadSet (populated by the finalizers), which contains the vrefs of objects that have just lost their in-memory references (i.e. a Presence, Remotable, or Representative has gone away)
deletes the subset of these vrefs that are not kept alive by other references (exports, virtualized data)
this may emit syscalls like dropImports, retireImports, and retireExports
it may also delete virtual objects, which can free up more data
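For orientation, here is a minimal sketch of that end-of-crank cleanup. The helper names (stillReferenced, classify) and the practice of passing dependencies in explicitly are assumptions for illustration; the real liveslots code differs in detail.

```js
// Hypothetical sketch of the end-of-crank finish() logic; helper names
// and data shapes are illustrative, not the real liveslots internals.
async function finish({ gcAndFinalize, deadSet, stillReferenced, classify, syscall }) {
  // Force a complete GC cycle: UNREACHABLE objects reach at least
  // COLLECTED, and finalizers run, moving them to FINALIZED.
  await gcAndFinalize();

  const dropImports = [];
  const retireImports = [];
  const retireExports = [];
  for (const vref of deadSet) {
    if (stillReferenced(vref)) {
      continue; // kept alive by exports or virtualized data
    }
    if (classify(vref) === 'import') {
      dropImports.push(vref);
      retireImports.push(vref);
    } else {
      retireExports.push(vref);
      // deleting the backing virtual object here can free more data
    }
  }
  deadSet.clear();
  if (dropImports.length) syscall.dropImports(dropImports);
  if (retireImports.length) syscall.retireImports(retireImports);
  if (retireExports.length) syscall.retireExports(retireExports);
}
```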
For the sake of consensus, it is important to conceal the consequences of GC (e.g. finalizers) until the same point among all validators. We can tolerate "organic" (non-forced) GC happening at any time, but the results must be hidden (from userspace, syscalls, and metering) until the consensus-accepted moment. Likewise, when we do reveal GC, we need to make it complete: no UNREACHABLE object left behind, which requires us to force GC at that point.
Forcing GC on every delivery is expensive. In my recent study of the phase4.5 testnet data, replaying vat-mints locally runs about 7x faster if we never force GC and simply let organic GC happen. That would not maintain consensus, so we need a middle ground.
The proposal is to add a new kind of dispatch (next to dispatch.deliver, dispatch.notify, dispatch.dropExports, etc.) named dispatch.bringOutYourDead (evoking a scene from Monty Python's Holy Grail). When this method is invoked, the vat will force GC and reveal the consequences; until then, it conceals them. From the kernel's point of view, no GC happens at all until the vat is asked to do so. From the vat's point of view, organic GC happens normally, but the results are concealed in deadSet until it is asked (and the vat does an extra forced GC just beforehand).
This will allow the memory usage of the vat to drop normally, but whatever imports it is holding will be retained until the bringOutYourDead point, which means memory usage in other vats will stay elevated longer than it would if we still did GC at the end of every crank. In exchange, the CPU time spent on GC should be significantly less.
We'll need to build some sort of kernel policy for deciding exactly when to call dispatch.bringOutYourDead for each vat. As a starting point, we should simply call it once every 10 deliveries to each vat. A likely second step is to keep track of how many kernel-wide cranks have taken place since the last bringOutYourDead, and schedule calls for the least-recently cleaned vats. If we imagine that a vat is "clean" immediately after bringOutYourDead (no garbage), then it becomes progressively "dirtier" as we make more userspace-visible calls (so dispatch.deliver and dispatch.notify, but not dispatch.dropExports). If a vat reaches a particular level of "dirt", or if a dirtied vat has gone long enough (as measured in cranks, which might involve any vat) without cleaning, then we perform a bringOutYourDead. The goal is for inactive vats to get cleaned up eventually, and not linger in a dirty state forever.
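One possible shape for such a policy, sketched here with made-up thresholds and bookkeeping; none of this is the actual kernel code.

```js
// Hypothetical kernel-side scheduling policy: each userspace-visible
// delivery adds "dirt", and a vat is due for bringOutYourDead once it is
// dirty enough or has waited too many kernel-wide cranks.
const DIRT_THRESHOLD = 10;    // deliveries since last cleanup (assumption)
const CRANK_THRESHOLD = 1000; // kernel-wide cranks since last cleanup (assumption)

function makeBOYDPolicy() {
  const state = new Map(); // vatID -> { dirt, lastCleanupCrank }

  function noteDelivery(vatID, currentCrank) {
    // called only for dispatch.deliver / dispatch.notify, not dropExports
    const s = state.get(vatID) || { dirt: 0, lastCleanupCrank: currentCrank };
    s.dirt += 1;
    state.set(vatID, s);
  }

  function vatsNeedingCleanup(currentCrank) {
    const due = [];
    for (const [vatID, s] of state) {
      const waited = currentCrank - s.lastCleanupCrank;
      if (s.dirt >= DIRT_THRESHOLD || (s.dirt > 0 && waited >= CRANK_THRESHOLD)) {
        due.push(vatID);
      }
    }
    return due;
  }

  function noteCleanup(vatID, currentCrank) {
    state.set(vatID, { dirt: 0, lastCleanupCrank: currentCrank });
  }

  return { noteDelivery, vatsNeedingCleanup, noteCleanup };
}
```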
We can even imagine a scheme in which the contract owner (or some delegated subparty), or the contract itself, explicitly requests a bringOutYourDead call. The runtime costs of the vat might be tied to the "dirty" status, and bringOutYourDead is the way to bring them back down. This would let vat owners decide for themselves what the appropriate frequency of GC should be. We've also explored how the memory-usage meter could interact with this, which will be covered in a separate ticket.
Starting Steps
change liveslots.js to create dispatch.bringOutYourDead, moving the call to finish() out of dispatch()
add a pair of kvstore keys for each vat: one to track the number of deliveries remaining before the next bringOutYourDead (decremented on every delivery), and a second to record the frequency of cleanup
change either vat-warehouse or deliverAndLogToVat to decrement the first; if it reaches zero, perform a bringOutYourDead and reset the counter to the second, with suitable commits and slog events (a sketch of this countdown check appears after this list)
the bringOutYourDead should be performed in its own crankBuffer commit transaction: it is its own crank
however the cleanup should not be performed if the target vat has been deleted
the best approach might be to introduce a new gcAction, and have the countdown-checker simply push a bringOutYourDead(vatID) onto the gcActions queue
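As a rough illustration of the countdown check and the gcActions approach described above; the key names, helper signature, and gcActions shape are all assumptions for the sketch.

```js
// Hypothetical countdown check run after each delivery to a vat; the real
// kvStore key names and gcActions representation will differ.
function noteDeliveryAndMaybeScheduleBOYD(kvStore, gcActions, vatID) {
  const countdownKey = `${vatID}.boyd.countdown`;
  const frequencyKey = `${vatID}.boyd.frequency`;
  const frequency = Number(kvStore.get(frequencyKey) || '10');
  let remaining = Number(kvStore.get(countdownKey) || String(frequency));

  remaining -= 1; // one more delivery has happened
  if (remaining <= 0) {
    // schedule a bringOutYourDead crank; it gets its own crankBuffer
    // commit, and the kernel should skip it if the vat has been deleted
    gcActions.push({ type: 'bringOutYourDead', vatID });
    remaining = frequency;
  }
  kvStore.set(countdownKey, String(remaining));
}
```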
A handful of unit tests believe they know how many cranks will happen, given a set of inputs (usually queueToVatRoot). This may change with the introduction of bringOutYourDead deliveries, as it did with the addition of gcActions, so those tests will need to be updated.