tolerate vatstore syscalls during vat startup #2910
In addition to #2911, we also need to rearrange the vatManager creation so that it gets the |
In #2911 (comment) @FUDCo said:
I'm not sure we can avoid supporting syscalls during startup. We wrote the comms vat, so we can impose arbitrary restrictions on ourselves. But userspace code can make virtual objects at the top-most module context, outside of any method invocation:
const { makePurse } = makeKind(...);
const allPurses = new SpecialWeakMapThing();
const specialPurse = makePurse();
allPurses.set(specialPurse, { balance: 0 }); // or whatever
export function buildRootObject() {
  ...
}
The |
Just wanted to question these requirements, as I think the requirements for Zoe and ERTP are actually different than this. Zoe contracts don't get run until a function in the ZCF contract instance vat is called, which is after Were you thinking that this would change in the future? Or that we are making issuers in cosmic-swingset during the startup phase? |
Good point, with our current contract-installation model, the only code with the opportunity to build virtual objects that early is that of static vats and the initial code bundle for dynamic vats (including ZCF). User-contributed contract code does not get that opportunity.

When I put my swingset/kernel/vat-platform hat on, all userspace is adversarial: static vats, dynamic vats, ZCF, everything. From that perspective, I can't control what userspace does, so I can either allow it or forbid it. Forbidding it means killing the vat if it tries to use a syscall too early. Allowing it means wiring up syscalls early enough to handle them (and recording the resulting state changes correctly). Forbidding it makes the userspace model slightly harder to explain: it adds a funny footnote to the virtual object docs ("BTW don't use this until after But with a higher-layer-of-abstraction-hat on (the Agoric/contract-platform one), yeah, we don't strictly need this: the people who write the code that gets to run this early (us) can obey the hard-to-justify restrictions imposed by those stubborn kernel people (also us).

I guess I want to minimize the assumptions we build into the lower layers of the system, and avoid tying the hands of developers in a way that will cause problems.

Your point about issuers in cosmic-swingset startup is a good one. The "virtual purses" feature (purses which represent low-level cosmos-sdk Bank module balances) needs a special Issuer with some sort of device access. That will probably live in a static vat, and will probably involve virtual objects (for all the non-magic only-in-JS purses/payments created by that issuer). I don't know that it will have a need to create any of those early, though.

I think I'll reduce the priority of this feature a bit. If it's convenient to support as a side-effect of fixing something else (e.g. #2908) then it might get fixed earlier.
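To make the allow/forbid choice concrete, here is a minimal sketch of a syscall gate that can run under either policy. Everything in it (`makeSyscallGate`, `finishStartup`, the `'allow'`/`'forbid'` strings, the in-memory `Map`) is a hypothetical stand-in for illustration, not SwingSet's actual vat manager code.

```javascript
// Hypothetical sketch: none of these names come from SwingSet itself.
function makeSyscallGate(policy) {
  let started = false; // flips to true once startup has finished
  let alive = true; // false after the vat has been killed
  const store = new Map(); // stand-in for the kernel's vatstore
  const log = []; // syscalls recorded so they could go into a transcript

  function vatstoreSet(key, value) {
    if (!alive) {
      throw new Error('vat is dead');
    }
    if (!started && policy === 'forbid') {
      // Forbid: a syscall made too early kills the vat.
      alive = false;
      throw new Error(`syscall before startup: vatstoreSet(${key})`);
    }
    // Allow: honor the syscall and record it, even during startup.
    store.set(key, value);
    log.push(['vatstoreSet', key, value]);
  }

  return {
    syscall: { vatstoreSet },
    finishStartup: () => {
      started = true;
    },
    isAlive: () => alive,
    getLog: () => log.slice(),
  };
}
```

Under `'allow'`, a top-level `vatstoreSet` is stored and logged like any later one; under `'forbid'`, the same call marks the vat dead, which is the "funny footnote" behavior userspace would have to be warned about.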
The comms vat has a bunch of counters that need to be initialized exactly once in the lifetime of the vat's durable state. There is a DB flag named `initialized` to track whether this has happened already or not. Since vats can only do syscalls during deliveries, not startup (see #2910), the comms vat must read this flag on every delivery, just in case this is the very first one, and the DB needs initialization. The comms vat was using an in-RAM flag (`needToInitializeState`) to cache the result, to avoid a DB read for every single delivery. However this caused the syscall behavior to change for the first delivery after a kernel restart. There were two extra syscalls taking place: a `vatstoreGet('initialized')`, and a `vatstoreSet('meta.o+0', 'true')`. The `vatstoreSet` was causing a DB change, which was picked up by the new kernel activityhash, and caused a consensus failure on the first post-restart comms message. This changes the comms vat to always read the DB flag, on every delivery, making its behavior consistent and independent of kernel restarts. It also moves the `vatstoreSet` to be guarded by the results of the `initialized` check, so it only happens once in the lifetime of the vat. To remove the need for the `vatstoreGet` on each delivery, we'll need to enhance the way `setup()` is called, to provide an external flag that tells the vat whether this is the very first time ever, or if it's merely a kernel restart that required the vat's `dispatch` object to be reconstructed. fixes #3726
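The fix described in that commit message can be sketched roughly as follows. The key names `'initialized'` and `'meta.o+0'` are taken from the message; `makeFakeSyscall`, `makeDispatch`, and the write of `'true'` into the `initialized` key are assumptions made for illustration, not the comms vat's actual code.

```javascript
// Hypothetical in-memory stand-in for the vatstore syscalls.
function makeFakeSyscall() {
  const db = new Map();
  const calls = []; // every syscall, in order, as the kernel would see it
  return {
    calls,
    vatstoreGet(key) {
      calls.push(['vatstoreGet', key]);
      return db.get(key);
    },
    vatstoreSet(key, value) {
      calls.push(['vatstoreSet', key, value]);
      db.set(key, value);
    },
  };
}

function makeDispatch(syscall) {
  // No in-RAM cache: the flag is read from the DB on every delivery, so a
  // rebuilt dispatch object makes the same syscalls as a long-running one.
  return function dispatch(_delivery) {
    if (!syscall.vatstoreGet('initialized')) {
      // One-time setup, guarded by the DB flag.
      syscall.vatstoreSet('meta.o+0', 'true');
      syscall.vatstoreSet('initialized', 'true'); // assumed flag write
    }
    // ... handle the delivery itself here ...
  };
}
```

Building a second `dispatch` over the same fake syscall object models a kernel restart: the first delivery after the "restart" performs only the `vatstoreGet`, with no extra `vatstoreSet` to perturb the activityhash.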
In the most recent upgrade plan, ZCF in all non-initial versions will need to install and instantiate contract code very early, during This will require (at least) |
@FUDCo is working on this now. The goal is to enable syscalls during |
Chip's work on #2910 discovered that the supervisor was not told about failures during liveslot's dispatch(). This could conceal some bugs in liveslots, as well as hiding userspace-caused failures during `buildRootObject()`. The contract between `dispatch()` and the calling supervisor code has changed through various bouts of refactoring, and it was ambiguous as to whether `dispatch()` was supposed to protect against userspace errors or not. This commit clears up the documentation to make this more explicit.
I realized that syscalls during top-level module evaluation need to actually kill the vat, not just throw an error, because the new collection-manager's |
Note #3552, which points out that top-level module code might use |
@FUDCo and I have a plan. We're going with the "allow syscalls during top-level module code just in case" option, which means that the The changes we need to make are:
index 5b8fbed9c..eee7cf875 100644
--- a/packages/SwingSet/src/kernel/initializeKernel.js
+++ b/packages/SwingSet/src/kernel/initializeKernel.js
@@ -85,6 +85,7 @@ export function initializeKernel(config, hostStorage, verbose = false) {
logStartup(`assigned VatID ${vatID} for genesis vat ${name}`);
const vatKeeper = kernelKeeper.provideVatKeeper(vatID);
vatKeeper.setSourceAndOptions({ bundle, bundleName }, creationOptions);
+ vatKeeper.addToTranscript(['startup', vatParameters]);
vatKeeper.initializeReapCountdown(creationOptions.reapInterval);
if (name === 'vatAdmin') {
// Create a kref for the vatAdmin root, so the kernel can tell it
diff --git a/packages/SwingSet/src/kernel/kernel.js b/packages/SwingSet/src/kernel/kernel.js
index e1eceb4b7..c7c312121 100644
--- a/packages/SwingSet/src/kernel/kernel.js
+++ b/packages/SwingSet/src/kernel/kernel.js
@@ -701,6 +701,7 @@ export default function buildKernel(
options.reapInterval = kernelKeeper.getDefaultReapInterval();
}
vatKeeper.setSourceAndOptions(source, options);
+ vatKeeper.addToTranscript(['startup', options.vatParameters]);
vatKeeper.initializeReapCountdown(options.reapInterval);
function makeSuccessResponse() {
This will allow syscalls to be executed (and, more importantly, have their results recorded in the transcript) ... .. results recorded .. .. oh fudge

Ok, so injecting a transcript entry doesn't help: what we need are the results of that delivery, and we can't get that until we have a worker to deliver them to. This plan wouldn't make that happen during Hrm, we could push something onto the kernel run-queue, as @FUDCo's #4358 PR does, but I don't want the arbitrary delay between worker creation and Ok, gotta think about this some more. I'll add this comment even though it's not going to work in the form that we figured out today, sorry @FUDCo. |
So the requirement is to actually make that For a dynamic vat, I think that time is right after the manager is created:
--- a/packages/SwingSet/src/kernel/vatManager/vat-warehouse.js
+++ b/packages/SwingSet/src/kernel/vatManager/vat-warehouse.js
@@ -126,6 +126,9 @@ export function makeVatWarehouse(kernelKeeper, vatLoader, policyOptions) {
}
};
const manager = await chooseLoader()(vatID, source, translators, options);
+ if (!recreate) {
+ await manager.deliver(['startup', options.vatParameters]);
+ }
// TODO(3218): persist this option; avoid spinning up a vat that isn't pipelined
const { enablePipelining = false } = options;
For a static vat.. maybe the best answer is the run-queue event that @FUDCo already implemented, pushed onto the run-queue during An alternative would be for vat-warehouse.js to be aware that it is bringing a static vat online for the first time, and to perform the |
Oh, the "first time" predicate should be made explicit:
if (!vatKeeper.hasBeenInitialized()) {
  await manager.deliver(['startup', options.vatParameters]);
}
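A minimal sketch of how such an explicit predicate might behave across a restart. `hasBeenInitialized()` is only proposed in the comment above; `makeFakeVatKeeper`, `markInitialized`, and `bringOnline` are hypothetical names, and delivery is synchronous here for brevity where the snippet above awaits it.

```javascript
// Hypothetical stand-in for the durable per-vat state kept by the kernel.
function makeFakeVatKeeper() {
  let initialized = false; // would live in the kernel DB, surviving restarts
  return {
    hasBeenInitialized: () => initialized,
    markInitialized: () => {
      initialized = true;
    },
  };
}

// Bring a vat online: deliver 'startup' only the very first time.
function bringOnline(vatKeeper, manager, vatParameters) {
  if (!vatKeeper.hasBeenInitialized()) {
    manager.deliver(['startup', vatParameters]);
    vatKeeper.markInitialized();
  }
  return manager;
}

// Each call to this factory models a freshly created worker.
function makeFakeManager(deliveries) {
  return {
    deliver: delivery => {
      deliveries.push(delivery);
    },
  };
}
```

Calling `bringOnline` twice with the same keeper but a fresh manager models a kernel restart: the `startup` delivery happens once, because the predicate is answered from durable state rather than from anything held by the worker.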
@FUDCo https://github.com/Agoric/agoric-sdk/tree/2910-dispatch-startvat has the work we paired on The current plan is to use Chip's We want to ensure (test) that failures at various points of dynamic vat creation all result in a rejected promise going back to the parent, and all the state getting cleaned up. |
…call and transcript logging Closes #2910
Describe the bug
I realized the other day that, with the introduction of virtual objects, vats might attempt to create some during their startup phase: at the top-level module context, or inside `buildRootObject()` but outside of any remote message handler (methods of that root object). This will cause `vatstoreSet()` syscalls to happen much earlier than before. Our vat managers aren't currently prepared to honor syscalls this early.

We need to fix that, since we want to encourage Issuers to use virtual objects, and it's entirely reasonable for contracts to create some sort of singleton infrastructural Purse during startup.
The state changes that result from these syscalls (the kerneldb vatstore writes) should be lumped into the same atomic transaction as the clist creation and scheduling of the resolution of the vatAdmin vat-creation promise (the one that introduces the caller to the new vat's root object).
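The atomicity requirement can be illustrated with a small write buffer that commits or discards a group of changes together. This is a sketch under assumptions: `makeCrankBuffer`, `createVat`, and all of the key names are hypothetical and do not reflect the real kernel DB layout.

```javascript
// Hypothetical sketch: buffer writes so they land together or not at all.
function makeCrankBuffer(db) {
  const pending = new Map();
  return {
    get: key => (pending.has(key) ? pending.get(key) : db.get(key)),
    set: (key, value) => {
      pending.set(key, value);
    },
    commit: () => {
      for (const [key, value] of pending) {
        db.set(key, value);
      }
      pending.clear();
    },
    abort: () => pending.clear(),
  };
}

// Vat creation groups three kinds of state change into one unit.
function createVat(db, startupSucceeds) {
  const buffer = makeCrankBuffer(db);
  buffer.set('v1.vs.purse', '0'); // vatstore write made during startup
  buffer.set('v1.c.o+0', 'ko1'); // clist entry for the new root object
  buffer.set('runQueue.0', 'notify kp1'); // schedule the vatAdmin promise resolution
  if (startupSucceeds) {
    buffer.commit();
  } else {
    buffer.abort();
  }
}
```

If startup fails, none of the three writes reaches `db`, so the parent never sees a half-created vat; if it succeeds, the early vatstore write is committed alongside the clist entry and the scheduled resolution.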