Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: demand-paged vats in solo, chain #2848

Merged
merged 28 commits into from
Jun 26, 2021
Merged

feat: demand-paged vats in solo, chain #2848

merged 28 commits into from
Jun 26, 2021

Conversation

dckc
Copy link
Member

@dckc dckc commented Apr 9, 2021

fixes #2273

Postponed

p.s. on Performance

unloaded chain startup time with this landed is 60x faster than yesterday's non-snapshot build: it went from 5m11s to 4.76s

Historical Note

This has been in the making since Aug 2019: #47. See MadMode: Toward secure distributed computing with Agoric at SFBW '19

See also Chat with Dean Tribble & Dan Connolly in Agoric Community Call #8 - YouTube.

@dckc dckc force-pushed the 2273-snapstore branch from df58fe0 to dea5483 Compare April 9, 2021 18:22
@dckc dckc force-pushed the 2273-snapstore branch 6 times, most recently from c157c43 to 18e7e44 Compare May 19, 2021 23:39
@dckc dckc force-pushed the 2273-snapstore branch 4 times, most recently from f067780 to 6c03776 Compare June 15, 2021 20:42
@dckc dckc force-pushed the 2273-snapstore branch 6 times, most recently from f791bdf to 6be1e96 Compare June 16, 2021 05:18
@dckc dckc changed the title feat: initializeSwingset snapshots XS supervisor (WIP) feat: deman-paged vats in solo, chain Jun 20, 2021
@dckc dckc force-pushed the 2273-snapstore branch from b916585 to 6688127 Compare June 20, 2021 01:31
@dckc dckc requested review from kriskowal, FUDCo and warner June 20, 2021 01:35
@dckc dckc marked this pull request as ready for review June 20, 2021 01:43
@dckc
Copy link
Member Author

dckc commented Jun 20, 2021

@warner and co this is probably not ready to land, but I do want review. I updated the description.

@dckc dckc changed the title feat: deman-paged vats in solo, chain feat: demand-paged vats in solo, chain Jun 20, 2021
dckc and others added 17 commits June 25, 2021 16:19
Not only is this option not implemented now, but CM's analysis shows
that adding it would likely be harmful.
  - refactor: move snapshot decision up from vk.saveSnapshot() up
    to vw.maybeSaveSnapshot
integration testing shows this is closer to ideal
rebase / merge problem?!
With snapshotInitial at 2, there is little reason to snapshot after
loading the supervisor bundles. The code doesn't carry its own weight.

Plus, it seems to introduce a strange bug with marshal or something...

```
  test/test-home.js:37

   36:   const { board } = E.get(home);
   37:   await t.throwsAsync(
   38:     () => E(board).getValue('148'),

  getting a value for a fake id throws

  Returned promise rejected with unexpected exception:

  Error {
    message: 'Remotable (a string) is already frozen',
  }
```
In an attempt to avoid reading the lastSnapshot DB key if the
t.endPosition key was enough information to decide to take a snapshot,
the vatWarehouse was peeking into the vatKeeper's business.  Let's go
with code clarity over (un-measured) performance.
@dckc dckc force-pushed the 2273-snapstore branch from 2d31909 to 7e05a24 Compare June 25, 2021 21:40
@warner warner self-assigned this Jun 25, 2021
@warner warner added xsnap the XS execution tool SwingSet package: SwingSet labels Jun 25, 2021
@warner
Copy link
Member

warner commented Jun 25, 2021

I think I tracked down this CI failure:

vat-warehouse › warehouse › snapshot after deliveries

  Rejected promise returned by test. Reason:

  ExitCode (ErrorCode) @Error {
    code: 2,
    message: 'v3:bootstrap exited: I/O error',
  }

  › new ErrorCode (/home/runner/work/agoric-sdk/agoric-sdk/packages/xsnap/api.js:49:5)
  › ChildProcess.<anonymous> (/home/runner/work/agoric-sdk/agoric-sdk/packages/xsnap/src/xsnap.js:124:22)
  › Process.ChildProcess._handle.onexit (internal/child_process.js:277:12)

  ─

  1 test failed

to the fact that two different tests are using the same "temporary" directory name (fixture-xs-snapshots/) as a snapstorePath, and then using a t.teardown job to delete them at the end. This wasn't much of a problem when the tests were in different directories, but after moving both test-warehouse.js and test-reload-snapshot.js into test/vat-warehouse/, the "temp" directories collide, and one test is deleting it while the other test is still using it.

I'm changing those snapstorePath = ... lines to use different directories as a short-term fix. @dckc next week, please take another pass over this and find a more reliable solution, something to make a unique (random?) path, or just change the two pathnames to match the test file names, something that establishes a uniqueness convention we can rely upon to avoid this collision as new tests are added without needing to search through all existing tests first.

Another small cleanup to add later: the tests do both c.shutdown() and fs.rmdirSync(snapstorePath) during t.teardown. We should probably let the c.shutdown() finish before starting the delete, just to be tidy. But the order in which multiple t.teardown jobs changes between AVA 3 and AVA 4 (we're currently using 3), so either we should write our own handler and only call t.teardown once, or we should set the "hey AVA 3 please behave like AVA 4" flag and use t.teardown in the AVA 4 order, for future-proofing. It probably doesn't matter for now, because c.shutdown() is just killing off processes and not writing any snapshots, but I wouldn't want to assume that'll always be the case.

The "temporary" snapstore directories used by two different tests began to
overlap when the tests were moved into the same parent dir, and one test was
deleting the directory while the other was still using it (as well as
mingling files at runtime), causing an xsnap process to die with an IO error
if the test were run in parallel.

This changes the the two tests to use distinct directories. In the long run,
we should either have them use `mktmp` to build a randomly-named known-unique
directory, or establish a convention where tempdir names match the name of
the test file and case using them, to avoid collisions as we add more tests.
@dckc
Copy link
Member Author

dckc commented Jun 25, 2021

Ooh. Nice catch. Will do move to real random tmp dirs.

I started at some point but it got lost in the zillions of rebases this PR has been through.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance related issues SwingSet package: SwingSet xsnap the XS execution tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

design "snapstore" API: immutable hash-named XS snapshot files
2 participants