Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add index of collections, improve stopVat deletion code #5090

Open
warner opened this issue Apr 12, 2022 · 0 comments
Open

add index of collections, improve stopVat deletion code #5090

warner opened this issue Apr 12, 2022 · 0 comments
Assignees
Labels
enhancement New feature or request liveslots requires vat-upgrade to deploy changes SwingSet package: SwingSet

Comments

@warner
Copy link
Member

warner commented Apr 12, 2022

What is the Problem Being Solved?

In PR #5056 I needed to (temporarily) introduce an in-RAM table of all virtual/durable collections, so I could enumerate the virtual ones for deletion during stopVat. That's a problem for vats that create a large number of collections: memory usage will be proportional to the number of collections, when virtual collections should only cost DB space. To fix this, we need to be able to enumerate all virtual collections using only the information in the DB.

The current collectionManager.js gives each collection a prefix like vc.${collectionID}., followed by either metadata keys (|label, |entryCount, |nextOrdinal), ordinal maps (|${vref} -> ordinal), or entries (${encodedKey} -> valueCapdata). With some funky use of vatstoreGetAfter we could use this to enumerate all collection IDs, but this doesn't help us learn the KindID for each, and we need that to figure out whether the collection is durable or virtual.

@FUDCo and I worked out a DB scheme to avoid this, which should also let us delete virtual data more precisely during upgrade.

Description of the Design

First, we introduce a new section of the DB which holds a compact index of all virtual and durable collections. We're thinking vc.vcv.${collectionID} -> collectionVref (for the virtual collections) and vc.vcd.${collectionID} for the durables.

The stopVat code can then do a dense enumeration of all keys with the vc.vcv. prefix to find all the virtual collection IDs.

Second, we change stopVat's deleteAllVirtualCollections, which deletes all data for all virtual collections (including the metadata and probably the refcounts). That function violates the refcount invariants, and would fail if vc2 referenced vc1 but vc1 was deleted first: when it gets to vc2 and tries to delete its entries, it will try to decref vc1, but the vc1 refcount key will be missing.

Instead, we simulate a .clear() on all virtual collections (including the weak ones, which is not something userspace could do). This deletes all the keys and values, but does not delete the collection's metadata. By pretending userspace is just thoughtfully deleting everything itself during stopVat, we maintain the refcount and DB invariants.

Then, we add a similar function to the virtual-object internals, which enumerates all the state keys and replaces their values with undefined (or 0, which is simpler to serialize). This will clear all the references from VO state without violating any invariants. With this approach, the VO state is still legal (it still has the same property names), it's just empty. However this requires a DB write, even though we know we're just going to delete the data soon anyways.

During these deletions, the existing refcounting code will be invoked for imports and durables, causing them to be freed normally.

Once both are done, we'll be left with a brittle dry husk of the virtual data: all the refcount keys and metadata keys will be present, but none of them will point to anything. At this point, we can safely clobber all the DB entries for the virtuals without concern that we'll be losing a refcount to something.

Performance considerations

This approach is slower than the earlier one, partially because the old one wasn't paying attention to refcounts. For each collection entry, we'll do one DB read of the value, followed by a DB delete. For virtual objects, our new function will do a DB read of the state data, followed by a DB write of the empty data, followed by a DB delete during the subsequent cleanup pass. We're spending that extra DB write of empty data, in exchange for not needing to worry about refcount corruption.

Depending upon the shape of the reference graph, things might be better or worse: a pathological N-length linked list, being deleted either from the head or the tail, is probably informative.

Security Considerations

Test Plan

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Apr 12, 2022
@warner warner self-assigned this Apr 12, 2022
@Tartuffo Tartuffo added this to the Mainnet 1 milestone Apr 15, 2022
@warner warner removed this from the Mainnet 1 milestone May 11, 2022
@warner warner added the liveslots requires vat-upgrade to deploy changes label Jan 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request liveslots requires vat-upgrade to deploy changes SwingSet package: SwingSet
Projects
None yet
Development

No branches or pull requests

2 participants