Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(swingset): comms initialization check must be deterministic #3727

Merged
merged 1 commit into from
Aug 18, 2021

Conversation

warner
Copy link
Member

@warner warner commented Aug 18, 2021

The comms vat has a bunch of counters that need to be initialized exactly
once in the lifetime of the vat's durable state. There is a DB flag named
initialize to track whether this has happened already or not. Since vats
can only do syscalls during deliveries, not startup (see #2910), the comms
vat must read this flag on every delivery, just in case this is the very
first one, and the DB needs initialization.

The comms vat was using an in-RAM flag (needToInitializeState) to cache the
result, to avoid a DB read for every single delivery. However this caused the
syscall behavior to change for the first delivery after a kernel restart.
There were two extra syscalls taking place: a vatstoreGet('initialized'),
and a vatstoreSet('meta.o+0', 'true'). The vatstoreSet was causing a DB
change, which was picked up by the new kernel activityhash, and caused a
consensus failure on the first post-restart comms message.

This changes the comms vat to always read the DB flag, on every delivery,
making its behavior consistent and independent of kernel restarts. It also
moves the vatstoreSet to be guarded by the results of the initialized
check, so it only happens once in the lifetime of the vat.

To remove the need for the vatstoreGet on each delivery, we'll need to
enhance the way setup() is called, to provide an external flag that tells
the vat whether this is the very first time ever, or if it's merely a kernel
restart that required the vat's dispatch object to be reconstructed.

fixes #3726

@warner warner added the SwingSet package: SwingSet label Aug 18, 2021
@warner warner added this to the Testnet: Metering Phase milestone Aug 18, 2021
@warner warner requested a review from FUDCo August 18, 2021 20:14
@warner warner self-assigned this Aug 18, 2021
Base automatically changed from 3720-commit-after-run to master August 18, 2021 20:15
@warner warner force-pushed the 3726-comms-initialize branch from 73d938b to 6c9726e Compare August 18, 2021 20:16
The comms vat has a bunch of counters that need to be initialized exactly
once in the lifetime of the vat's durable state. There is a DB flag named
`initialize` to track whether this has happened already or not. Since vats
can only do syscalls during deliveries, not startup (see #2910), the comms
vat must read this flag on every delivery, just in case this is the very
first one, and the DB needs initialization.

The comms vat was using an in-RAM flag (`needToInitializeState`) to cache the
result, to avoid a DB read for every single delivery. However this caused the
syscall behavior to change for the first delivery after a kernel restart.
There were two extra syscalls taking place: a `vatstoreGet('initialized')`,
and a `vatstoreSet('meta.o+0', 'true')`. The `vatstoreSet` was causing a DB
change, which was picked up by the new kernel activityhash, and caused a
consensus failure on the first post-restart comms message.

This changes the comms vat to always read the DB flag, on every delivery,
making its behavior consistent and independent of kernel restarts. It also
moves the `vatstoreSet` to be guarded by the results of the `initialized`
check, so it only happens once in the lifetime of the vat.

To remove the need for the `vatstoreGet` on each delivery, we'll need to
enhance the way `setup()` is called, to provide an external flag that tells
the vat whether this is the very first time ever, or if it's merely a kernel
restart that required the vat's `dispatch` object to be reconstructed.

fixes #3726
@warner warner force-pushed the 3726-comms-initialize branch from 6c9726e to 8457d9a Compare August 18, 2021 20:41
Copy link
Contributor

@FUDCo FUDCo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly necessary...

@michaelfig michaelfig enabled auto-merge (rebase) August 18, 2021 21:01
@michaelfig michaelfig merged commit 683b771 into master Aug 18, 2021
@michaelfig michaelfig deleted the 3726-comms-initialize branch August 18, 2021 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
SwingSet package: SwingSet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

consensus failure due to comms get(initialized) check
3 participants