API to fork a vat (create "zygote vat") #2268

warner · 2021-01-27T00:02:22Z

What is the Problem Being Solved?

Once we have XS snapshots working (#511, #2138), we should be able to reload a saved vat from snapshot faster than replaying the vat's entire transcript.

Most of the vats on our platform will be built in the same way:

create the XS engine (resulting in a blank environment)
evaluate the SES shim (resulting in a blank SES environment)
eval liveslots (ready to load a vat)
eval the ZCF wrapper (ready to load a contract)
eval the contract code (ready to instantiate a contract)
instantiate the contract code (resulting in a new instance of some contract)

With snapshots, we ought to be able to interrupt that sequence at any point, save the resulting XS state, and re-use it multiple times. This won't save any memory (each vat still gets its own XS engine, and its own heap), but it should save a lot of startup time.
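
A toy model of that idea, not swingset code: each startup stage is an entry with a made-up cost, and a "snapshot" is just a saved copy of the state after some number of stages. The names `stages`, `runStages`, `takeSnapshot`, and `resumeFrom` are hypothetical, and the cost of restoring the snapshot itself is ignored here.

```javascript
// Toy model: the startup stages listed above, each with a made-up cost.
const stages = [
  { name: 'create XS engine', cost: 5 },
  { name: 'eval SES shim', cost: 50 },
  { name: 'eval liveslots', cost: 20 },
  { name: 'eval ZCF wrapper', cost: 20 },
  { name: 'eval contract code', cost: 10 },
  { name: 'instantiate contract', cost: 5 },
];

// Run stages [from, to) on top of an existing state, returning a new state.
function runStages(state, from, to) {
  const done = [...state.done];
  let spent = state.spent;
  for (let i = from; i < to; i += 1) {
    done.push(stages[i].name);
    spent += stages[i].cost;
  }
  return { done, spent };
}

const buildFromScratch = () =>
  runStages({ done: [], spent: 0 }, 0, stages.length);

// "Snapshot" after `count` stages: a frozen copy of the state so far.
function takeSnapshot(count) {
  const state = runStages({ done: [], spent: 0 }, 0, count);
  return { count, done: Object.freeze([...state.done]) };
}

// Resume from a snapshot: only the remaining stages are paid for.
// Each resume gets its own copy of the state, so no memory is shared.
function resumeFrom(snapshot) {
  const start = { done: snapshot.done, spent: 0 };
  return runStages(start, snapshot.count, stages.length);
}

const zcfSnapshot = takeSnapshot(4); // stop after "eval the ZCF wrapper"
const vatA = resumeFrom(zcfSnapshot);
const vatB = resumeFrom(zcfSnapshot);
console.log(buildFromScratch().spent, vatA.spent, vatB.spent);
```

Both resumed vats end up with all six stages done, but each pays only for the two stages after the snapshot point.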

Description of the Design

We'll need an API for this. What I'm thinking is that the vat-admin facet for a dynamic vat should acquire a fork() method. If you hold this facet, you can call [newrefs] = await admin~.fork([oldrefs]). For each oldref that was a Presence whose Remotable lived on the old vat, you'll get a new Presence that points to the corresponding object in the new vat. Any other exports will be unreachable (unless the new vat chooses to re-export them at some point). Any import that the old vat had access to will also be available to the new vat (for example, it might hold a reference to a timer service).

This method is named after the Unix fork() syscall, which treats file descriptors in a similar way. When a process forks, the FD table is duplicated, so both parent and child have access to the same descriptors at the same offsets. The Unix fork() is executed from the inside, however, whereas the swingset one would be applied from the outside.

We might also name this diverge() after the E method on Arrays and Maps(?) which makes a copy of the data into a new, separately-mutable object.

Zoe would create one new dynamic vat with the ZCF bundle, giving it access to the root object, but would refrain from ever sending any messages to it, preserving its independence from any specific contract. Then, when a new contract is installed (zoe~.install(contractBundle)), Zoe would fork() that undifferentiated "ZCF zygote" vat, creating a vat for a specific contract. Then Zoe would send a command to the root object to evaluate the contract bundle, returning an object that represents the per-contract but not per-instance state (which would include the ZCF root object).

Zoe would keep that "contract zygote" vat pristine too, keeping a table that maps from contract identity to a tuple of (admin facet, ZCF root object, per-contract object). Then, when Zoe is asked to finally instantiate that contract (zoe~.startInstance(installation, ...)), Zoe would do something like:

const [ admin, zcfRoot, perContractObjectRoot ] = contracts.get(installation);
const [ zcf, perContractObject ] = await admin~.fork([zcfRoot, perContractObjectRoot]);

and then figure out the per-instance objects by asking perContractObject to do per-instance things, finally getting objects that are indistinguishable from what it would have gotten if it had built everything from scratch each time.
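
The two-level zygote flow can be mocked end to end. This is only a sketch under several assumptions: `makeMockVat` and `makeMockZoe` are invented for illustration, all calls are synchronous where the real ones would be eventual sends, and fork() is assumed to hand back the child's admin facet alongside the newrefs (the proposal above does not say how the forker obtains that facet).

```javascript
// Hypothetical stand-in for a dynamic vat; not the swingset vat-admin API.
function makeMockVat(state = { contract: null, instances: 0 }) {
  // Exports are kept by name so fork() can find the corresponding
  // object in the new vat.
  const exports = {
    zcfRoot: {
      evalContract(bundle) {
        state.contract = bundle.name;
        return exports.perContract;
      },
    },
    perContract: {
      startInstance(terms) {
        state.instances += 1;
        return { contract: state.contract, terms, instances: state.instances };
      },
    },
  };
  const nameOf = obj => Object.keys(exports).find(k => exports[k] === obj);
  const admin = {
    fork(oldrefs) {
      const child = makeMockVat({ ...state }); // copy of the "heap"
      const newrefs = oldrefs.map(ref => child.exports[nameOf(ref)]);
      // Assumption: the child's admin facet is returned with the newrefs.
      return { admin: child.admin, newrefs };
    },
  };
  return { admin, exports };
}

function makeMockZoe() {
  const zcfZygote = makeMockVat(); // never sent any messages
  const contracts = new Map(); // installation -> [admin, zcfRoot, perContractObjectRoot]
  return {
    install(bundle) {
      const forked = zcfZygote.admin.fork([zcfZygote.exports.zcfRoot]);
      const [zcfRoot] = forked.newrefs;
      const perContractObjectRoot = zcfRoot.evalContract(bundle);
      const installation = { bundleName: bundle.name };
      contracts.set(installation, [forked.admin, zcfRoot, perContractObjectRoot]);
      return installation;
    },
    startInstance(installation, terms) {
      const [admin, zcfRoot, perContractObjectRoot] = contracts.get(installation);
      const { newrefs } = admin.fork([zcfRoot, perContractObjectRoot]);
      const [_zcf, perContractObject] = newrefs;
      return perContractObject.startInstance(terms);
    },
  };
}

const zoe = makeMockZoe();
const installation = zoe.install({ name: 'atomicSwap' });
const first = zoe.startInstance(installation, { n: 1 });
const second = zoe.startInstance(installation, { n: 2 });
console.log(first, second);
```

Each instance reports `instances: 1`, because every startInstance forks the untouched contract zygote rather than reusing a vat that has already run.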

It might be useful to couple this with some sort of explicit "offline vat" control. When Zoe creates the ZCF zygote vat, it's not going to interact with it, so it might tell the admin facet to save RAM by taking a snapshot and unloading the vat from memory, leaving only disk space in use (perhaps admin~.takeOffline()). Or, the worker scheduler might do this automatically when it sees the fork() command, or when it notices the zygote vat hasn't had any messages sent to it for a while.

Internally, the kernel needs to populate the new vat's c-list from the old one. Each import of the old vat is replicated (same kref, same vref, to match the state of the liveslots table that lives in the snapshot's heap state). The oldrefs are mapped to krefs, and then each export is compared against the set of krefs: for each match, a new c-list entry is created with the old vref and a newly-allocated kref. Then the krefs are put into resolution data for the fork result promise, where they'll be translated into new import vrefs for the vat doing the fork.
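
A minimal sketch of that c-list population step, assuming a c-list can be modelled as two Maps from vref to kref. `forkCList`, `allocateKref`, and the example vref/kref strings are illustrative only and are not taken from the kernel source.

```javascript
// Hypothetical data model: imports and exports are Maps from vref to kref.
function forkCList(parent, oldrefKrefs, allocateKref) {
  const child = { imports: new Map(), exports: new Map() };

  // Imports are replicated exactly (same vref, same kref) so they match
  // the liveslots table captured in the snapshot's heap.
  for (const [vref, kref] of parent.imports) {
    child.imports.set(vref, kref);
  }

  // Only exports named in oldrefs survive: old vref, newly allocated kref.
  const wanted = new Set(oldrefKrefs);
  const oldToNew = new Map();
  for (const [vref, kref] of parent.exports) {
    if (wanted.has(kref)) {
      const newKref = allocateKref();
      child.exports.set(vref, newKref);
      oldToNew.set(kref, newKref);
    }
  }

  // Resolution data for the fork result promise, in the order of oldrefs.
  // An oldref that is not an export of the parent maps to undefined here.
  const newrefKrefs = oldrefKrefs.map(kref => oldToNew.get(kref));
  return { child, newrefKrefs };
}

let nextKref = 100;
const allocateKref = () => `ko${nextKref++}`;
const parent = {
  imports: new Map([['o-50', 'ko20']]), // e.g. a timer service
  exports: new Map([['o+0', 'ko30'], ['o+1', 'ko31'], ['o+2', 'ko32']]),
};
const { child, newrefKrefs } = forkCList(parent, ['ko32', 'ko30'], allocateKref);
console.log(newrefKrefs, [...child.exports]);
```

The export `o+1` was not listed in oldrefs, so it has no c-list entry in the child and stays unreachable unless the child re-exports it.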

All per-vat secondary storage (e.g. virtual object tables) needs to be duplicated as well. Ideally this storage will be pretty empty, because the zygote vat should not have had any interaction with the outside world yet.

Security Considerations

Assuming the implementation is correct, I don't think the new vat will have any more authority than the one it was copied from, nor should the vat directing the fork be able to amplify its authority by forking the vat it created. There is a question of metering and resource allocation, of course, to avoid allowing a DoS attack by virtue of a forkbomb or something similar.

cc @erights @dtribble @FUDCo @Chris-Hibbert @katelynsills for your consideration


warner · 2021-01-27T00:54:55Z

@katelynsills points out that currently Zoe's "install" merely stores the contract code in a table, while the "instantiate" step does all of (create a new vat, load ZCF into it, send the contract code to ZCF, ZCF evaluates the contract code, ZCF instantiates the contract). We'd need to split up that process just after the evalContractBundle() and just before the start() here:

agoric-sdk/packages/zoe/src/contractFacet/contractFacet.js

Lines 460 to 476 in 60a4adf

    
    const contractCode = evalContractBundle(bundle);
    // Don't trigger Node.js's UnhandledPromiseRejectionWarning.
    // This does not suppress any error messages.
    contractCode.catch(() => {});
    // Next, execute the contract code, passing in zcf
    /** @type {Promise<ExecuteContractResult>} */
    const result = E(contractCode)
      .start(zcf)
      .then(({ creatorFacet, publicFacet, creatorInvitation }) => {
        return harden({
          creatorFacet,
          publicFacet,
          creatorInvitation,
          addSeatObj,
        });
      });

FUDCo · 2021-01-27T00:56:27Z

My instinct would be to follow the Unix fork(2) model much more directly, and actually work from the inside. The Unix fork() call returns 0 if the return is in the child process or the pid of the child process if the return is in the parent process. The analogous admin~.fork() would return nullish in the child vat and a reference to a snapshot object in the parent vat. The snapshot object would have a resume method that could be invoked to create a new vat, load it from the snapshot, and set it going (probably there's a better name than resume since the snapshot could be started from more than once). The new vat would experience the nullish return from fork and go about its business accordingly. The child vat would inherit all of the parent vat's imports as of the moment of snapshotting. Everything else you need could be bootstrapped from this.

erights · 2021-01-27T00:59:31Z

> after the evalContractBundle() and just before the start() here:

Yes

warner · 2021-01-27T01:01:19Z

One open question: how should Promises be handled, if at all? Surely we cannot allow both parent and child vats to have Decider authority over the same promise. We might need to declare that any unresolved promises owned by the parent vat will be rejected during the fork. Maybe we can have the forker list those promises in oldrefs and create new copies of them, but then we'd need an answer about what should happen to any messages pipelined to the old one: should we duplicate the pending messages?

@FUDCo hm, the advantage of forking from the outside is that we know the vat is idle. If we let vats fork themselves from the inside, we'd need to... hm, replicate the crank that did the fork? And have it return different values in the replica? So the syscalls could diverge starting from that point (we'd replicate the delivery, but not expect the transcript to match). Interesting. I'll have to noodle on that.

FUDCo · 2021-01-27T01:18:18Z

@warner You make a good point about wanting to snapshot between cranks. Perhaps (now getting really speculative here) fork always returns in the parent vat with the snapshot understood to happen at the end of whatever crank it's called in (in which case it might be better named something other than fork, maybe actually snapshot). Launching the snapshot would send the child vat a message to start it up, similar in spirit to bootstrap but obviously with a different name. Indeed, perhaps the "run after load from snapshot" message could be an arg to the resume message (or whatever it's called) that gets sent to the snapshot object, possibly also with args. Having the name of the launching message be parameterizable might be helpful in the case of vats that get snapshotted multiple times at different stages in their life cycle (e.g., a zcf vs. a zcf parameterized with a particular contract bundle vs. a zcf with a particular contract bundle parameterized with a particular set of contract instance parameters -- levels of particularity strike again!)
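
A rough sketch of this "snapshot at end of crank, then resume with a named message" idea. Everything here is hypothetical: `makeVat`, `admin.snapshot()`, and `handle.resume(method, ...args)` are made-up names, and vat state is plain JSON-able data standing in for an XS heap.

```javascript
const copy = x => JSON.parse(JSON.stringify(x));

// Hypothetical model: a crank is one call into `behavior`, and snapshot
// requests made during a crank are fulfilled when that crank ends.
function makeVat(behavior, initialState) {
  const state = copy(initialState);
  let pending = []; // snapshot handles requested during the current crank

  const admin = {
    // Always returns in the parent; the handle is filled in at end of crank.
    snapshot() {
      const handle = {
        saved: undefined,
        // Launch a new vat from the snapshot and send it a startup message
        // whose name and args are chosen by the caller.
        resume(method, ...args) {
          if (handle.saved === undefined) {
            throw new Error('snapshot not taken yet: crank still running');
          }
          const childVat = makeVat(behavior, handle.saved);
          childVat.deliver(method, ...args);
          return childVat;
        },
      };
      pending.push(handle);
      return handle;
    },
  };

  function deliver(method, ...args) {
    const result = behavior[method](state, admin, ...args);
    // End of crank: the vat is idle, so capture state for pending handles.
    for (const handle of pending) {
      handle.saved = copy(state);
    }
    pending = [];
    return result;
  }

  return { deliver, getState: () => copy(state) };
}

const behavior = {
  loadContract(state, admin, name) {
    state.contract = name;
    const handle = admin.snapshot();
    state.loaded = true; // still part of this crank, so it is in the snapshot
    return handle;
  },
  startInstance(state, _admin, terms) {
    state.terms = terms;
  },
};

const zygote = makeVat(behavior, {});
const handle = zygote.deliver('loadContract', 'atomicSwap');
const child1 = handle.resume('startInstance', { n: 1 });
const child2 = handle.resume('startInstance', { n: 2 });
console.log(zygote.getState(), child1.getState(), child2.getState());
```

The same handle is resumed twice with different arguments, which is the case where a name other than "resume" might read better.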

erights · 2021-01-27T01:25:39Z

I think the promise issue is a symptom of a deeper issue that disqualifies a unix-like fork model. I think Moddable's build-time vs run-time split is the better fit. This is not part of xsnap btw, but is a good model we can follow with snapshots using xsnap.

The deeper issue is that the execution prior to snapshot cannot have general entanglement with the outside world. As bad as the decider problem is, the problem of incoming references to objects hosted by the vat is at least as bad. Which of the two descendants of the fork is designated by that exported capability? During XS build-time, the capabilities to I/O devices are in scope but inactive. They don't lead to anything. After build-time is when they cut a ROM. Then each device gets a separate copy of the ROM. Only in the device do these I/O caps control actual devices. These devices didn't even exist in the build-time environment.

Translating this into the running of zcf initialization up through evalContractBundle but not beyond it will be interesting. Examining that will give us a concrete idea of how little connectivity we can allow the zygote before snapshot. I think the answer will be very little.

Chris-Hibbert · 2021-01-27T01:42:31Z

@erights, does this imply that the vat needs to be marked as snapshottable when it is started, so the kernel knows not to give it access to capabilities that should be reserved for the children?

erights · 2021-01-27T01:44:40Z

Yes, I think so. Slogan: "Zygotes are not yet in the world"

FUDCo · 2021-01-27T01:49:58Z

@erights My concept would be that when the vat launched from a snapshot starts up, there are no references anywhere else that point into it (with the possible exception of a root reference that is a product of the launch-from-snapshot operation itself), nor is it at that point the decider for any promise. It could, however, hold references in its snapshotted state to outside objects to which it could send messages to establish whatever connectivity it needed to do its job. I'm not sure how that compares to Moddable's model since I'm not yet familiar with that.

warner · 2021-01-27T02:03:32Z

> the problem of incoming references to objects hosted by the vat is at least as bad. Which of the two descendants of the fork is designated by that exported capability?

In the API I wrote up above, the original exports remain firmly attached to the original vat. The act of calling fork() gives the caller back a new set of references to things in the new vat, and only for exports which correspond to things the caller already had access to; they must pass the parent's exports in as oldrefs, and get back the corresponding (comity-ing?) new exports as newrefs.

I concur that the amount of connectivity before snapshot should be minimal. Certainly whatever the vat-to-be-forked has access to needs to be non-specific to the variable descendants. If each contract instance has a distinct ZCF agent, and Zoe offers a different internal facet to each such ZCF agent, then the instantiation process needs to give Zoe a moment where it can create a new such facet. That might suggest changing the zoe/zcf relationship to avoid early backpointers.

My concern with the Moddable ROM model is it might limit us to "linear lineages" of vats, in which you can't re-use snapshots for multiple children: you could evaluate the code early, perhaps while you're idle, but you couldn't amortize the evaluation/startup costs among multiple vats which share an early upbringing. I think you need something like fork() to get such a "tree-shaped lineage".

The "IO devices which are new on each copy" are the moral equivalent of a vat import which needs to be replaced by some instance-specific version (the comity question). Rather than answering that question, I'd make the imports shared, and duplicate/replace the exports, which does require some careful planning to make sure the pre-fork vat doesn't have anything you want to withhold from the post-fork vat. During the fork, whoever drives the fork should create the new authorities that should only be held by the child, and deliver them after fork(). The handle with which you deliver them (to the child, and not to the zygote parent) is the newref.

erights · 2021-01-27T02:52:21Z

This is all plausible if necessary. This is a large design space all by itself. I suggest that we first figure out how little connectivity we can get away with specifically for Zoe installations. That will let us know which subproblem we need to cover first. I am all in favor of more generality, but this will give us a more concrete notion of what we're generalizing from.

I am hopeful that the answer is very little connectivity but we don't know yet. We also don't yet know what the shape of the very-little-connectivity is. I think we can figure this out quickly.

dckc · 2021-01-27T21:03:37Z

snapshot after SES shim + lockdown(): some performance numbers

I made a few performance measurements: restoring a snapshot should make SES startup roughly 4x faster (88 ms vs. 23 ms):

| step | ms |
| --- | --- |
| eval SES shim bundle; lockdown() | 88 |
| restore snapshot with SES | 23 |

When considering how often to snapshot vs. how much of a transcript to replay, the time cost of writing a snapshot is relevant:

| step | ms |
| --- | --- |
| eval SES shim bundle; lockdown() | 88 |
| write snapshot | 46 |
| restore snapshot with SES | 23 |
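
As a worked example using only the three timings above, here is a break-even calculation for SES startup. It assumes the worker that writes the snapshot is not itself reused, and it says nothing about transcript replay, which has not been measured yet.

```javascript
// Timings from the measurements above, in milliseconds.
const evalAndLockdown = 88;
const writeSnapshot = 46;
const restoreSnapshot = 23;

// Total SES startup cost for n workers, booting each one from scratch.
const fromScratch = n => n * evalAndLockdown;

// Boot once, write one snapshot, then restore it for every worker.
// Assumption: the worker that wrote the snapshot is discarded afterwards.
const viaSnapshot = n => evalAndLockdown + writeSnapshot + n * restoreSnapshot;

// Smallest n for which the snapshot path is cheaper overall.
let breakEven = 1;
while (viaSnapshot(breakEven) >= fromScratch(breakEven)) {
  breakEven += 1;
}

const speedup = (evalAndLockdown / restoreSnapshot).toFixed(1);
console.log({ breakEven, speedup });
```

With these numbers the snapshot path wins from the third worker onward (203 ms vs. 264 ms in total), and the per-worker speedup is about 3.8x.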

TODO: measure duration of some cranks.

Code to do the measuring is on a branch:

https://github.com/Agoric/agoric-sdk/tree/2268-xs-ses-perf 9075673

It writes something like...

$ yarn test test/test-xs-ses-perf.js
yarn run v1.22.5
$ ava test/test-xs-ses-perf.js

{e: "issueCommand...", dur_us: 1207, byteLength: 0}
{e: "(function (f...", dur_us: 55881, byteLength: 0}
{e: "lockdown()...", dur_us: 1161, byteLength: 0}
{
  'boot SES on XS': 78,
  'spawn + trivial eval': 20,
  'eval bundle': 56,
  'lockdown()': 2
}
{e: "1+1...", dur_us: 1335, byteLength: 0}
{e: "globalThis.c...", dur_us: 2241, byteLength: 0}
{e: "c1.evaluate(...", dur_us: 1668, byteLength: 0}
{e: "new Compartm...", dur_us: 2352, byteLength: 0}
{ 'write snapshot post lockdown()': 47 }
{
  'start SES from snapshot': 22,
  'eval in new Compartment': 2,
  'eval in same Compartment': 3,
  'eval in another new Compartment': 2
}

I made a spreadsheet with a goofy timeline chart.

See also #1318 (comment) for some notes on timing diagrams from @warner .

One possible approach to drawing these diagrams: Slope Chart / D3 / Observable

dckc · 2021-11-29T06:59:55Z

I think I see a way to approach this at the vat manager level: split setBundle into... let's call them loadBundle and start...

It seemed like we should be able to load and evaluate all the code as pure modules and take a snapshot and then thread the parts together after resuming. But that's awkward: the communications port gets curried into a syscall object which gets passed to makeLiveSlots whose output is put into the loaded compartment's globals.

The code in vatManager/supervisor-subprocess-xsnap.js that builds the syscall object could make it so the issueCommand (the C host function that writes to the pipe) can be replaced by start().

The C code can replace globalThis.issueCommand after reloading the snapshot:

	xsResult = xsNewHostFunction(xs_issueCommand, 1);
	xsDefine(xsGlobal, xsID("issueCommand"), xsResult, xsDontEnum);

And start() should replace the globalThis.handleCommand that is called by the C code.
It's already late-bound in the C code: the C code reads it off globalThis before each call.

Once setBundle in the worker is split between loadBundle() and start(), the code on the kernel side that does setBundle can check whether it has already done loadBundle() on that bundle, and if so, use the snapshot from last time.
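
A sketch of what that kernel-side check might look like, assuming the kernel can key snapshots by some bundle identifier. `makeBundleSnapshotCache`, `startVat`, and the mock worker with its `spawn`, `restore`, `loadBundle`, `snapshot`, and `start` methods are invented for illustration and are not the xsnap or vat-manager API.

```javascript
// Hypothetical kernel-side cache: bundle ID -> snapshot taken after loadBundle().
function makeBundleSnapshotCache(worker) {
  const snapshots = new Map();
  return function startVat(bundleID, bundle, vatParameters) {
    let snapshot = snapshots.get(bundleID);
    if (snapshot === undefined) {
      // First time we see this bundle: evaluate it once and keep the result.
      const loader = worker.spawn();
      loader.loadBundle(bundle);
      snapshot = loader.snapshot();
      snapshots.set(bundleID, snapshot);
    }
    // Every vat, including the first, starts from the shared snapshot.
    const vat = worker.restore(snapshot);
    vat.start(vatParameters);
    return vat;
  };
}

// Mock worker that only counts how often the expensive step runs.
function makeMockWorker() {
  const counts = { loadBundle: 0, restore: 0 };
  function makeInstance(code) {
    const instance = {
      code,
      started: false,
      loadBundle(bundle) {
        counts.loadBundle += 1;
        instance.code = bundle;
      },
      snapshot: () => ({ code: instance.code }),
      start(vatParameters) {
        instance.started = true;
        instance.vatParameters = vatParameters;
      },
    };
    return instance;
  }
  return {
    counts,
    spawn: () => makeInstance(undefined),
    restore(snapshot) {
      counts.restore += 1;
      return makeInstance(snapshot.code);
    },
  };
}

const worker = makeMockWorker();
const startVat = makeBundleSnapshotCache(worker);
const vat1 = startVat('b1-abc', 'contract source', { name: 'first' });
const vat2 = startVat('b1-abc', 'contract source', { name: 'second' });
console.log(worker.counts, vat1.vatParameters, vat2.vatParameters);
```

The bundle is evaluated once and restored twice, so the second vat skips loadBundle() entirely.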

warner · 2023-04-13T17:18:28Z

note to self: this will require removing forVatID from the makeLiveSlots() arguments, because the XS heap snapshot includes a copy of that, and the proposal is to spawn multiple vats (with distinct identities) from the same shared heap snapshot.

We use forVatID to add details to log messages produced by the vat worker, which is especially useful during tests when multiple workers are emitting messages to a shared stdout. I've also used it to selectively enable new log messages while debugging, although I never commit such things.

warner added the enhancement, SwingSet, and needs-design labels Jan 27, 2021

dckc self-assigned this Jan 27, 2021

warner mentioned this issue Feb 11, 2021: share Zoe/ERTP libraries among contracts #2391 (Open)

dckc added the xsnap label Apr 28, 2021

warner mentioned this issue May 10, 2021: vat upgrade by replacing virtual-object behavior record #3062 (Closed)

This was referenced Sep 22, 2021: Epic: Evolution of Zoe Installations #3871 (Closed); verifiability of vote counters requires one vat per question? #3907 (Closed)

warner mentioned this issue Dec 22, 2021: kernel API for upgrading vats #1848 (Closed)

warner mentioned this issue Jan 25, 2022: allow bundlecaps in vatParameters #4381 (Closed)

warner mentioned this issue Apr 13, 2022: design contract behavior upgrade semantics / API / UX #3272 (Closed)

dckc mentioned this issue Apr 27, 2022: swingset-core-eval should not be subject to crank limit #5047 (Open)

warner mentioned this issue Oct 19, 2022: idea for giving each worker its own (local) SQLite vatStore DB #6254 (Open)

Tartuffo added and removed the migrate-icebox label Nov 17, 2022

mhofman mentioned this issue Sep 30, 2024: ZCF does not enforce start completion in first incarnation #10174 (Open)
