
design vatWarehouse API for demand-paged vats #2277

Closed
16 of 19 tasks
Tracked by #2138
warner opened this issue Jan 28, 2021 · 13 comments · Fixed by #2784

warner commented Jan 28, 2021

What is the Problem Being Solved?

We currently maintain a VatManager in RAM for all non-terminated vats, both static and dynamic. These VatManager objects are tracked in ephemeral.vats, a Map from the vat ID string to a record that includes the manager:

    const ephemeral = {
      vats: new Map(), // vatID -> { manager, enablePipelining }
      devices: new Map(), // deviceID -> { manager }
      log: [],
    };

A new manager is created at startup (for all static+dynamic vats in the DB), and also when a new dynamic vat is created. The manager is destroyed for dynamic vats that terminate. Each time we need to deliver a message or promise resolution notification into a vat, we pull the manager off ephemeral.vats and use it for the delivery, with something like:

    const vat = ephemeral.vats.get(vatID);
    const deliveryResult = await vat.manager.deliver(vatDelivery);

(see the implementations of deliverAndLogToVat and deliverToVat, among others).

We can reduce kernel memory usage by evicting idle managers from memory. We can reduce overall system memory usage by terminating the corresponding workers (e.g. telling the XS worker to exit).

At startup, we can reduce memory and CPU usage by not creating managers for vats that are not yet in use (lazy-loading vats on-demand). There are policy / performance heuristic questions to answer: there's a tradeoff between latency and overall performance. If we correctly predict that certain vats are likely to be used right away (e.g. the fundamental economic vats, comms/vattp infrastructure), we might want to load them at startup instead of waiting for someone to send them a message. Likewise, if we can predict that a vat is not likely to be used for a while, we can evict it.

Description of the Design

We need something to coordinate the creation/destruction of these VatManagers. The first issue is

  • what it should be called.

It needs

  • an API call which, at least, returns (or delivers to) a VatManager for a given vatID.

This call probably needs to be async, which means the rest of the kernel (deliverToVat, etc) must be prepared to accept a Promise back from this call.

  • The call should use an existing manager, if there is one, and
  • if not, it should create one, which entails
    • creating a worker, and
    • loading the vat state into it, by loading a snapshot and/or replaying a transcript.

Our name for this create-or-return-cached-object pattern is "provide", so this API should probably be spelled manager = await provideVatManager(vatID).
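
For illustration only, here is a minimal sketch of that pattern; makeVatManager is a hypothetical helper standing in for "create a worker, then load a snapshot and/or replay a transcript":

    // Sketch only: cache managers by vatID, creating them on demand.
    const managers = new Map(); // vatID -> Promise<VatManager>

    async function provideVatManager(vatID) {
      // Cache the promise rather than the manager, so concurrent callers
      // asking for the same vatID share a single in-flight creation.
      if (!managers.has(vatID)) {
        managers.set(vatID, makeVatManager(vatID)); // hypothetical helper
      }
      return managers.get(vatID);
    }

Since the call is async, callers like deliverToVat would await it before invoking manager.deliver(), as noted above.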

It also needs

  • an API call to create a vat in the first place (initialize whatever offline state will be necessary for a new vat).

This does not need to return a VatManager, but

  • it should enable a future provideVatManager call.
  • It will be called during initializeSwingset for all static vats, and
  • during the dynamic vat creation call for dynamic vats.
  • It should be matched with a similar destroy() call for when a dynamic vat is terminated (the overall shape is sketched below).
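
Taken together, the calls above suggest a surface like the following sketch (the names and signatures are illustrative, not a settled API):

    /**
     * Illustrative shape only; none of these names are final.
     *
     * @typedef {Object} VatWarehouse
     * @property {(vatID: string, source: Object, options: Object) => Promise<void>} createVat
     *   initialize the offline state that a later provideVatManager() will load
     * @property {(vatID: string) => Promise<VatManager>} provideVatManager
     *   return the cached manager, or create a worker and restore its state
     * @property {(vatID: string) => Promise<void>} destroyVat
     *   matched with createVat; terminates a dynamic vat and removes its state
     */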

The vat manager manager should have

  • the ability to evict VatManagers that are idle.

It needs a way to know that the caller of provideVatManager is no longer using the object that method returned.

  • Once that is the case, it should be allowed to destroy the VatManager at any moment (and terminate/cleanup the associated worker), as long as it can
  • restore the worker's state later (see the eviction sketch below).
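
To make the eviction idea concrete, here is one purely illustrative policy: keep a bounded set of "warm" vats in least-recently-used order, and evict only at delivery boundaries, when the caller is known to be done with the manager. maxWarmVats and evict() are assumptions, not decided behavior:

    // LRU sketch; a Map preserves insertion order, so the first key
    // is always the least recently used vat.
    const maxWarmVats = 50;
    const warm = new Map(); // vatID -> manager

    async function deliverTo(vatID, vatDelivery) {
      const manager = await provideVatManager(vatID);
      warm.delete(vatID); // delete+set moves this vatID to the end
      warm.set(vatID, manager);
      if (warm.size > maxWarmVats) {
        const [oldestVatID] = warm.keys();
        warm.delete(oldestVatID);
        await evict(oldestVatID); // snapshot if needed, then stop the worker
      }
      return manager.deliver(vatDelivery);
    }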

possible future issues:

  • "what vats are online?" port
  • terminate stuff in its own crank like creation?

We need to coordinate changes to the transcript with changes to any snapshots that were stored. We might consider having a special crank type to record a snapshot and truncate the transcript: no user code would run during this crank, but the transcripts (in the kernel DB) would be atomically truncated in the same transaction that changes the snapshot ID.
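
A sketch of what that special crank might record, assuming a transactional kernelDB handle and a hypothetical takeSnapshot() helper (the real kvStore API may differ):

    // Sketch only: db, takeSnapshot, and the key names are assumptions.
    async function snapshotCrank(vatID, manager, db) {
      const snapshotID = await takeSnapshot(manager); // worker writes a snapshot
      const covered = manager.transcriptLength(); // deliveries the snapshot covers
      db.begin();
      // Both writes land in one transaction: the snapshot ID and the
      // truncation point become visible together, or not at all.
      db.set(`${vatID}.snapshotID`, snapshotID);
      db.set(`${vatID}.transcriptStart`, `${covered}`);
      db.commit();
    }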

Depending upon what sort of heuristics and eviction policy we use, we might also want an API to communicate hints about usage frequency to this new manager thing. We might record a property with each vat to say whether it is likely to be frequently used or not, which this manager thing could use to make its eviction decisions. Alternatively, the manager could rely exclusively upon its own observations of provideVatManager to decide which vats are deserving of RAM and which should be kept offline as much as possible.

Consensus Notes

The presence or absence of a VatManager in RAM should not be part of the consensus state. Some members of a chain may choose to allocate more memory than others, and this does not affect the equivalence of their vats' behavior.

Snapshots are also not part of the consensus state, because we don't want to rely upon the internal serialization details of the JS engine. As a result, the truncated transcript is also not part of consensus state (it gets truncated only when a snapshot is taken). The consensus state does contain the full un-truncated transcript, which is necessary for vat migration as well as off-chain debugging replay of a vat. Snapshots and truncated transcripts are for the benefit of the local node, to make reloading a single vat (or the entire process) faster and cheaper.

cc @FUDCo @erights @dtribble @dckc

Notes on PR #2784

  • make VatWarehouse responsible for the kernelSlog.delivery() call src
  • move provideVatSlogger closer to clist translation src
  • move vat-warehouse.js from kernel/vatManager/ to kernel/ src
  • replace kk.getVatKeeper with a proper kernelKeeper.provideVatKeeper(vatID) src
  • split vatKeeper.getSourceAndOptions into smaller pieces (#3280: "split getSourceAndOptions to avoid accessing large source bundle to get enablePipelining option") src
warner added the enhancement, SwingSet, and needs-design labels Jan 28, 2021
dckc commented Jan 29, 2021

> What is the Problem Being Solved?
> ... We can reduce kernel memory usage by evicting idle managers from memory. ...

That looks like an opportunity for optimization, but it's not clearly a problem. Would you please state the problem as a problem?

Or perhaps change the heading? Clearly a bug issue should start with what the problem is, but for an enhancement, it seems a little awkward. Oh... but this is also a design issue... That hurts my head. In my way of looking at things, the two are exclusive.

> Description of the Design
> We need something to coordinate the creation/destruction of these VatManagers.

We do? Why? It's not entirely clear.

I'm used to enhancements being sketched in terms of user stories, and in considering designs, often implicit requirements turn up that make it clear which designs are better or worse. I'm struggling to get oriented here. I can probably muddle through, but starting from user stories would help.

warner commented Feb 8, 2021

Some ideas from our meeting today that might get folded into this, or into some related ticket:

  • in one sense, all swingsets must retain the full transcript of all (surviving) vats, in case they're asked to provide a copy for someone that wants to clone their state, or if they're asked to rebuild the vat with a different kind of vat worker, or if we need to rebuild the vat with a version of XS that isn't snapshot-compatible
  • however, vats which use workers that can snapshot need to not replay the transcript entries that we already included in that snapshot
  • so when the vat manager is told to snapshot, it should record a pair of (snapshot-id, transcript-offset) as an atomic update
    • when the vat manager is asked to reload that vat, it loads the snapshot, and then replays the offset tail of the transcript, instead of the whole thing
  • the consensus state of a vat (for validated export, to help new validators catch up) includes the full transcript, but must explicitly not include a snapshot-id or transcript-offset, the pair of which are local to each swingset
  • so we might need to partition our kernelDB into a keyspace that is included in the consensus Merkle hash, and a keyspace that is not, and put the snapshot-related items in the latter keyspace (a sketch of the resulting reload path follows)
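
A sketch of the reload path this implies, with a hypothetical local. key prefix standing in for the non-consensus keyspace (every name here is illustrative):

    // Sketch only: loadSnapshot, startEmptyWorker, readTranscript, and
    // the `local.` keyspace prefix are all assumptions.
    async function reloadVat(vatID, db) {
      const snapshotID = db.get(`local.${vatID}.snapshotID`);
      const manager = snapshotID
        ? await loadSnapshot(vatID, snapshotID)
        : await startEmptyWorker(vatID);
      const start = Number(db.get(`local.${vatID}.transcriptStart`) || 0);
      // Replay only the tail the snapshot does not already cover; the full
      // transcript, in the consensus keyspace, is never truncated.
      for (const entry of readTranscript(vatID, start)) {
        await manager.replayOneDelivery(entry); // hypothetical method
      }
      return manager;
    }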

And an idea I didn't want to forget: when a new validator is being set up, we know it can't safely rely upon snapshots created by existing validators (because they aren't part of the consensus, so the new validator can't tell whether they match the consensus transcripts). But it could spend an arbitrarily long time converting downloaded+validated transcripts into locally-reliable snapshots, by just replaying each downloaded vat history one at a time. The new validator could refrain from going online (and becoming obligated to keep up with the chain) until it had "converted" all of the important vats into fast-loading snapshots. It wouldn't have to be entirely up-to-date before it goes online, just enough to let it catch up (replay the last few days' worth of transcript entries onto the previously-generated snapshot) fast enough to meet the block-time requirements. The validator could decide to go online after converting the major vats, and then continue the conversion process for the remaining vats in the background as it runs (or farm it out to other machines: that phase is quite parallelizable).

@FUDCo pointed out that, if it isn't feasible or economical to convert full transcripts into locally-reliable snapshots, validators may choose to buy/rely-upon snapshots from other validators, even though they can't be confirmed to match the official transcript. We might be able to facilitate (or at least not impair) this, maybe by giving a way for each validator to sign a statement about what its own snapshot contains. Maybe there's some way to achieve economic incentives for these "snapshot shops" to publish correct snapshots.

dckc commented Feb 9, 2021

I edited the description to tease out testable aspects of the design.

The use cases aren't written in the form I'm used to, but it's reasonably clear in any case that they are, for example...

Vince, a validator operator, is happy to see roughly constant memory usage when 100 or 10,000 short-lived vats come and go, a few at a time.

Vince decides to resize his validator, so he shuts it down, adds RAM / disk / CPU / etc., and starts it up again. The restart time is not so long that he is slashed for truancy.

Hm.... testing those looks fairly involved.

dckc commented Feb 9, 2021

names:

  • vatCoop
  • vatFreezer
  • vatCellar
  • vatShelf

experimenting with GitHub Polls ...

warner commented Feb 9, 2021

I'm leaning towards vatWarehouse at the moment. Although "factory" and "provider" might be close too. Freezer has some of the right implications, but I'd also like the name to suggest that this thing gets to make its own decisions between freeze+thaw and just-keep-it-in-RAM. Like a library that can choose between leaving books on the front shelf vs moving the unloved ones to the basement.

vatCellar sounds too much like vatSeller

FUDCo commented Feb 9, 2021

> names:
>
>   • vatCoop
>   • vatFreezer
>   • vatCellar
>   • vatShelf

vatWrangler? vatSquad? vatMobile? vatican? vatFleet?

FUDCo commented Feb 9, 2021

vatBrigade? vatDirector? vatBoss? invatory? vatory?

FUDCo commented Feb 9, 2021

Another approach would be to call it the vatManager and rename the current vatManager to something else. I kind of like this approach, actually.

dckc commented Feb 11, 2021

warner added a commit that referenced this issue Feb 11, 2021
This removes one of the two noisy log messages from kernel startup. This
mostly gets noticed in tests like test-message-patterns.js which launches
dozens of swingsets in a single test. We included it because sometimes errors
occur (and get logged to the console) during transcript replay, and it wasn't
obvious when/why the vat was being invoked so early. But to be properly
useful for that, we'd need to announce the replay phase of each vat
separately, as well as the *end* of replay for each vat, multiplying the
noise by at least an order of magnitude.

Instead, just remove the message, and if/when someone needs to figure this
out, go ahead and temporarily add it back. Besides, transcript replay is
going to change drastically with the introduction of the
vat-manager-manager (#2277), which would have removed the place where this
message was emitted anyways.
warner added a commit that referenced this issue Feb 12, 2021
erights pushed a commit that referenced this issue Feb 12, 2021
dckc commented Feb 12, 2021

@warner agreed with my request to just go with vatWarehouse.

test-vatwarehouse.js 48a548e

FUDCo commented Feb 13, 2021

"You're going to like the way you compute. I guarantee it."

dckc commented May 18, 2021

Starting kernel integration

I mostly revived #2784, but as I look to integrate it with kernel.js, I wonder what the motivation was for...

    @param { Record<string, VatManagerFactory> } factories

https://github.com/Agoric/agoric-sdk/pull/2784/files#diff-b96e58d958e755f6aeefba6dc4ce7dfd5a5e90e186153d0d69541293a81d1016R8

The status quo API seems to have just one factory for all manager types:

    function vatManagerFactory(vatID, managerOptions, vatSyscallHandler) {
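
For contrast, a Record of per-type factories might be consulted by one dispatching function along these lines (a sketch; the per-type factory names and the exact managerOptions fields are assumptions):

    // Sketch: dispatch on managerOptions.managerType over a record of factories.
    const factories = {
      local: makeLocalVatManager, // hypothetical per-type factories
      'xs-worker': makeXsWorkerVatManager,
    };

    function vatManagerFactory(vatID, managerOptions, vatSyscallHandler) {
      const { managerType = 'local' } = managerOptions;
      const make = factories[managerType];
      if (!make) {
        throw Error(`unknown managerType: ${managerType}`);
      }
      return make(vatID, managerOptions, vatSyscallHandler);
    }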

May 28 update:

cf18e52 only 2 tests fail, and they're outdated vatWarehouse tests.
360 tests passed
12 known failures
2 tests skipped
4 unhandled rejections

warner added a commit that referenced this issue Jun 4, 2021
Previously, the concept of "deliveryNum" (a counter of how many deliveries
have been made to any particular vat) only existed within the slogger, which
used an internal (ephemeral) counter, and attached the count to each slogfile
delivery record. That meant two successive slogfiles, created by two
successive launches of the same kernel (one building upon the saved state of
the other), would get overlapping delivery numbers. The problem would get
worse with the #2277 VatWarehouse, which will create a new `vatSlogger` each
time the vat is paged in (multiple times per kernel process).

This moves the `deliveryNum` counter into the kernelDB's durable `kvStore`,
in a new `$vatID.nextDeliveryNum` key. It is incremented for each delivery by
`deliverAndLogToVat()`. The slogger simply receives and remembers
`deliveryNum` (just like what it's always done with `crankNum`), and no
longer attempts to increment a counter itself.

closes #3254
warner added a commit that referenced this issue Jun 8, 2021
warner added a commit that referenced this issue Jun 8, 2021
dckc closed this as completed in #2784 Jun 9, 2021
dckc commented Jun 9, 2021

@warner thanks for review of PR #2784. I added the outstanding notes to the checklist in the description above, with the exception of the crank commit complications around evict, which seem to fit better in #2422.
