Validators DB Upgrades #318
Conversation
pkg/validators/sql.go (outdated)

```sql
) WITHOUT ROWID, STRICT;

CREATE TABLE IF NOT EXISTS schema_version (
    version INTEGER NOT NULL
```
It seems you are defining this table twice (again on line 87).
Also, it needs to be WITHOUT ROWID as well as STRICT to guarantee determinism (since I believe these changes get captured in the app hash).
Hmm, I think we have trouble if this is captured in the app hash, because an operator can update software to trigger these upgrades at an arbitrary block height.
Introducing a consensus change like this without some kind of coordination would seem to be a problem if this were performed on a production network node.
I think perhaps using the SQLite session+changeset approach for validator state hash is not ideal. What goes on in the validator DB should really be implementation detail, while just the current validator set and the active join requests are what compose state that should be digested for the apphash.
Ok, so, even if the schema version were tracked in some dumb VALIDATORS_VERSION text file with an integer in it, an actual upgrade to the actual validator SQL tables would be something like ALTER TABLE join_reqs ADD COLUMN expiresAt INTEGER DEFAULT -1 or similar. That would certainly affect the app hash with the current data store backend.
This would be fine if the schema upgrades were coordinated to happen at a given block height. I was not thinking along these lines because the DB schema shouldn't be state, at least not for the validator module.
Umm, I totally forgot about that. Let me revisit this.
I think perhaps using the SQLite session+changeset approach for validator state hash is not ideal. What goes on in the validator DB should really be implementation detail, while just the current validator set and the active join requests are what compose state that should be digested for the apphash.
I agree with this: the state that should count towards the appHash for ValidatorStore should be [CurValidators + JoinRequests & Approvals], but not the schema. This would change the way we compute the appHash.
A divergence in the schema would mean a clear lack of agreement.
I really think that this is not the state to consider in the app hash. In most chains there are different implementations of the nodes, and they can store data however they want; state is specified at a higher level without regard to implementation. For the public Kwil datasets (the core offering of our product), that's not the case, because the state is the database. But for the auxiliary data stores that are just meant to persist other state, like this one and probably others like the metadata store, we're actually making things quite brittle IMO.
Here's an illustration: f34abae
I may still not be seeing the reasoning for why the underlying data structure is important, but my feeling is that it should be a black box, and the ID that affects the appHash should be based on the state read from that black box.
Data structures are not important, but the behavior, i.e. the way state changes, depends on the schema/version we are on. As in the example Brennan pointed out: if some nodes are on v0 and the rest on v1, and we have a join request, the v0 nodes give it an indefinite expiry duration, whereas the v1 nodes use the configured expiry. So there can be a scenario where the join request has expired on the v1 nodes, which compute the state without this request, whereas on v0 it is included. Which block gets mined then depends on the majority: if a supermajority is on v1, the v0 nodes won't reach consensus; if it's 50-50, there is no consensus anyway; and similarly for the other scenarios.
But agreed, the appHash should be based on the state read, not on the schema.
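The version-split scenario described above can be sketched in Go. This is a minimal illustration with hypothetical, simplified types (`joinReq`, `digest`, and the `enforceExpiry` flag are made up for this sketch, not the actual kwil-db code): two nodes holding the same join request compute different state digests once the request passes its expiry height, because only the v1 node drops it.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// joinReq is a hypothetical, simplified join request.
type joinReq struct {
	joiner    string
	power     int64
	expiresAt int64
}

// digest hashes the join requests a node considers active at the given
// height. enforceExpiry models v1 behavior; v0 nodes pass false and
// treat every request as unexpired.
func digest(reqs []joinReq, height int64, enforceExpiry bool) [32]byte {
	h := sha256.New()
	for _, jr := range reqs {
		if enforceExpiry && jr.expiresAt >= 0 && jr.expiresAt <= height {
			continue // v1 nodes exclude expired requests from state
		}
		fmt.Fprintf(h, "%s:%d:%d", jr.joiner, jr.power, jr.expiresAt)
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	reqs := []joinReq{{joiner: "nodeA", power: 1, expiresAt: 100}}
	v0 := digest(reqs, 150, false) // v0: request still counted at height 150
	v1 := digest(reqs, 150, true)  // v1: request expired, excluded
	fmt.Println(v0 != v1)          // prints "true": the versions disagree on state
}
```

Before the expiry height the two digests agree, so the split only surfaces once a request actually expires, which is what makes an uncoordinated upgrade dangerous.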
That should be reflected by the JoinExpiry value of the running node, not by artifacts of the node's ability to deal with a non-infinite JoinExpiry value.
Yeah, this actually does seem correct. This is also true for the engine's master database registry, as well as the account store.
I'm not quite sure what @jchappelow had in mind for this, but it seems OK to me. Will wait for him to approve though.
Works and all looks alright to me. Just a few additional comments, things I didn't notice or realize earlier.
Validator schema upgrade; update tests; decouple validatorDBHash from sql notions; fix docs
I think this is a good framework for future updates.
Just a note that if you try to run an upgrade from a root_dir db from the current main, where we already have an expiresAt column, it's going to fail with: failed to initialize database at version 1 due to error: failed to add expiresAt column to join_reqs table: failed to execute statement: sqlitex.Exec: SQL logic error: duplicate column name: expiresAt.
So essentially the version0.sqlite is pretend, since it does not have this column, for the purposes of proving the upgrade framework introduced here.
I'm OK with this if everyone else is clear about it. It means that sqlAddJoinExpiry is dead code in practice: because of breaking consensus changes in this release, there will be no upgrades from our last release that would be treated like v0 here, and thus this db upgrade won't ever be used.
```go
jrStr := fmt.Sprintf("%s:%d:%d", joiner, jr.power, jr.expiresAt)
hasher.Write([]byte(jrStr))

var approvers []string
for val, approved := range jr.validators {
	if approved {
		approvers = append(approvers, val)
	}
}
sort.Strings(approvers)
for _, approver := range approvers {
	hasher.Write([]byte(":" + approver))
}
```
No major issue with the string-based approach for digesting all the data, but I'll look at this tonight (can be after this merges). I feel like we might be safer with an all-binary approach, like the genesisConfig appHash generation. The fmt package tends to be oriented toward display and often has subtle changes; encoding/binary tends to be more well defined and stable.
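An all-binary variant of the digest above might look something like the following sketch. This is not the actual kwil-db implementation; `digestJoinReq` is a hypothetical helper mirroring the fields from the snippet (joiner, power, expiresAt, approvers), using fixed-width big-endian encoding plus length prefixes for the variable-length strings so the digest never depends on fmt's display formatting:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// digestJoinReq hashes one join request using encoding/binary rather than
// fmt.Sprintf. Length-prefixing each string avoids ambiguity between,
// e.g., ("ab","c") and ("a","bc") when concatenated.
func digestJoinReq(joiner string, power, expiresAt int64, approvals map[string]bool) [32]byte {
	h := sha256.New()

	writeBytes := func(b []byte) {
		binary.Write(h, binary.BigEndian, uint64(len(b))) // length prefix
		h.Write(b)
	}

	writeBytes([]byte(joiner))
	binary.Write(h, binary.BigEndian, power)
	binary.Write(h, binary.BigEndian, expiresAt)

	// Collect only the approvals, then sort: map iteration order is
	// nondeterministic in Go, and the digest must not depend on it.
	var approvers []string
	for val, approved := range approvals {
		if approved {
			approvers = append(approvers, val)
		}
	}
	sort.Strings(approvers)
	for _, a := range approvers {
		writeBytes([]byte(a))
	}

	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	a := digestJoinReq("node1", 1, 100, map[string]bool{"v1": true, "v2": false})
	b := digestJoinReq("node1", 1, 100, map[string]bool{"v2": false, "v1": true})
	fmt.Println(a == b) // prints "true": map ordering doesn't affect the digest
}
```

Every field has a fixed or explicitly encoded width, so the byte stream fed to the hasher is fully specified independently of any formatting package.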
Legacy branch (v0):
Without schema_version table
No join expiry in join requests
V1:
Introduces a schema_version table that tracks the version the database is at
Adds an expiresAt column to join_reqs
Sets the API version to 1
Process:
We check for the existence of the version and data tables, decide which version the database is at, and run the appropriate upgrades or initializations. Below are the scenarios that can occur:
Fresh DB start, no version or data tables ----> do regular initialization, set the version to currentVersion.
Legacy mode: data tables exist but no version table ----> add the version table, set the version to currentVersion, and run migrations to update the join_reqs table.
Data and version tables exist, and we are on the correct version ----> do regular initialization.
Future purposes: if we are on a lower version ----> run migrations.
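The decision process above can be sketched as a small piece of pure logic. This is an illustrative sketch only (the `action` function and its string results are hypothetical, not the actual kwil-db code); the real store would inspect SQLite for the tables and then execute the corresponding SQL:

```go
package main

import "fmt"

const currentVersion = 1

// action maps the observed DB state to what the store does on open,
// following the four scenarios listed above.
func action(dataTablesExist, versionTableExists bool, version int) string {
	switch {
	case !dataTablesExist && !versionTableExists:
		// Fresh DB start.
		return "initialize fresh DB at currentVersion"
	case dataTablesExist && !versionTableExists:
		// Legacy mode (v0).
		return "legacy v0: add version table, migrate join_reqs to currentVersion"
	case version == currentVersion:
		return "regular initialization"
	case version < currentVersion:
		// Future upgrades.
		return "run migrations up to currentVersion"
	default:
		return "error: DB was written by newer software"
	}
}

func main() {
	fmt.Println(action(false, false, 0))            // fresh start
	fmt.Println(action(true, false, 0))             // legacy v0 upgrade
	fmt.Println(action(true, true, currentVersion)) // already current
}
```

Keeping this branch logic separate from the SQL that performs each step makes the scenarios easy to unit-test without touching a real database file.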