
Andrew7234/validator staking history #703

Merged
merged 1 commit into main from andrew7234/validator-staking-history on Jul 29, 2024

Conversation

@Andrew7234 (Collaborator) commented Jun 3, 2024

Tracks active/debonding escrow validators for each epoch.

Todo: add api endpoint

Alternative considered: we could fetch validator escrow balances and process them in the consensus block analyzer. That ended up being messy, since we only want to track validator balances at the start of each epoch rather than at every block; during fast sync it would also require creating another todo_updates table to track per-block validator escrow balances and then keeping only the first one from each epoch during FinalizeFastSync. It would also require fetching and temporarily storing a lot of extraneous data. If anyone feels strongly about it, let me know.
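
For context, here is a rough sketch of the per-validator, per-epoch record the new analyzer persists. This is illustrative only, not the PR's actual code; the real schema lives in the 16_validator_staking_history.up.sql migration and the field names here are hypothetical.

import "math/big"

// One row of validator balance history, sampled at the start of each epoch.
type ValidatorBalanceHistory struct {
	EntityID        string   // validator entity public key (base64-encoded ed25519)
	Epoch           uint64   // epoch at whose start the balances were sampled
	ActiveEscrow    *big.Int // active escrow balance at that height
	DebondingEscrow *big.Int // debonding escrow balance at that height
}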

@Andrew7234 force-pushed the andrew7234/validator-staking-history branch 4 times, most recently from b6f042b to c18469c, on June 3, 2024 22:04
@Andrew7234 marked this pull request as ready for review on June 4, 2024 14:57
@Andrew7234 force-pushed the andrew7234/validator-staking-history branch from c18469c to fc99ca1 on June 4, 2024 16:10
@ptrus (Member) left a comment

Looks mostly good, a couple of minor comments!

We should double-check whether there's any other validator historic data the frontend needs at this time, or whether this is it.

analyzer/queries/queries.go (outdated; resolved)
CREATE INDEX ix_validator_balance_history_id_epoch ON chain.validator_balance_history (id, epoch);

ALTER TABLE chain.epochs
ADD COLUMN validators base64_ed25519_pubkey[];
Member

Suggested change:
-ADD COLUMN validators base64_ed25519_pubkey[];
+ADD COLUMN active_validators base64_ed25519_pubkey[];

Member

Also, is this missing a code change that inserts this? (Or do I just not see it?)

storage/migrations/16_validator_staking_history.up.sql (outdated; resolved)
storage/migrations/16_validator_staking_history.up.sql (outdated; resolved)
analyzer/validatorbalances/validatorbalances.go (outdated; resolved)
analyzer/queries/queries.go (outdated; resolved)

func (p *processor) ProcessItem(ctx context.Context, batch *storage.QueryBatch, epoch *Epoch) error {
p.logger.Info("downloading validator balances", "epoch", epoch)
validators, err := p.source.GetValidators(ctx, epoch.StartHeight)
@ptrus (Member) commented Jun 17, 2024

This ends up calling oasis-core GetValidators, right? That returns just the active validators for the epoch starting at that height.

I think we maybe don't want just these, but instead go over all known consensus accounts, since the escrow balance (and debonding balance) can change for any account.

So if a validator drops out of the validator set but still has stake being added/removed over time, we will never update its state.

Member

If we update this to track all accounts (not just the epoch's "validators"), then we could also use this historic table for things like "account birth/created_at", which is also a request from the frontend team.

If it is not feasible to go over all accounts for every epoch, then we need to think a bit more about how to approach the "account" stats.

Contributor

I expect that grabbing all accounts at every epoch is untenable. We'd probably need StateToGenesis, or something else that would be similarly slow, since staking data is a huge part of the StateToGenesis output.

To run this in real time, we'd need StateToGenesis to be faster than 60 min (= 1 epoch). Which it is, though only if we are somewhat careful about the node's hardware (disks). But to run this in fast sync, even a 10-minute response would mean we need ~0.5 years to process 3 years of blocks. We could get around that by using multiple nodes, but I'm not sure it's worth it.

Also, downloading so much data would put a strain on (and possibly force us to rethink/redesign) the DB, caches, etc. It's a lot of data. Very conservatively: (1e6 accounts) * (32000 epochs) * (100 bytes per account snapshot at a given epoch) = 3.2 TB, but easily a few times more.

I believe we limited ourselves to just validators because it's more important that we be able to show the history of a validator (as it builds their credibility), compared to the history of a "normal" user.

But point well taken; if somebody stops being an active validator and later becomes one again, their interim history matters. Andy, how many accounts would we need to touch if we queried every account that was ever a validator? Maybe that's a reasonable compromise.

If this is slow to run or complicated to implement, I strongly suggest we go with what we already have in this PR at least for now, and thus potentially report partial histories for validators. It's much better than nothing, it's 0 additional effort, and it's launchable.

Member

"if we queried every account that was ever a validator"

I think this should be feasible. The validator set doesn't fluctuate a lot. I believe there are 226 such validators on mainnet (based on oasisscan; I think they track validators in a similar way to this).

Collaborator Author

Thanks all for the feedback! Agreed that querying all accounts that were ever a validator is a good compromise.

One potential complication: having a batch size > 1 can cause issues if a new validator appears. E.g., if we're processing epochs 1-20 in parallel but epoch 3 introduces a new validator, the worker processing epochs 4-20 won't know about the new validator.

For a given epoch N, we want to fetch info for all the validators fetched for epoch N-1, unioned with the set of active validators for epoch N (sketched below). This sequential dependency would require us to always run this analyzer with a batch size of 1 so we can build up the list of known validators. Compute-wise, this should be feasible given that there are ~35k epochs, but I'll need to do some testing to confirm. Another caveat is that we won't have staked-balance history for an account before it becomes a validator. IMO this is fine, since we won't be displaying any deep history and it won't hinder staking-reward calculations when we add those in the future.

We could instead hardcode the 226 validators and have the analyzer fetch the staked balance for each one up to some recent block, and then have the analyzer just proceed sequentially (batch_size = 1) from then onwards.

Another approach would be to treat each (epoch, validator) as the item instead of just (epoch). I need to do more work to see how complicated the item accounting becomes, but maybe it'll produce a nicer solution.
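
A minimal sketch of the epoch-N set construction described above (not the PR's actual code; names are illustrative): the tracked set carries over every entity that has ever been a validator, and is unioned with the active validators reported for epoch N's start height.

// tracked accumulates every entity ID that has ever appeared in the validator set.
// For epoch N, extend it with the validators active at N's start height and use the
// resulting set, even for entities that are no longer active.
func validatorsForEpoch(tracked map[string]struct{}, activeAtEpochStart []string) map[string]struct{} {
	for _, id := range activeAtEpochStart {
		tracked[id] = struct{}{} // a newly seen validator is tracked from this epoch onwards
	}
	return tracked
}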

Collaborator Author

Updated to have the analyzer process the epochs sequentially from a given start_height, taking the union of {current validators} and {validators from the prior epoch}. The start height is expected to match the consensus analyzer's start height, but could be adjusted for a faster reindex. This lets us track the continuous history of each validator from the epoch it first becomes a validator, even if it later drops out of the validator set.

Processing epochs sequentially does not end up constraining performance, since each epoch already makes >100 node requests for the validator balances.
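
Roughly, the per-epoch processing then looks like the sketch below. This is a sketch only, with hypothetical helper and query names (getEscrowBalances, ValidatorBalanceInsert, p.tracked); the real implementation is in analyzer/validatorbalances/validatorbalances.go and may differ. It assumes p.tracked has already been extended with this epoch's active validators, as in the earlier union sketch.

func (p *processor) processEpoch(ctx context.Context, batch *storage.QueryBatch, epoch *Epoch) error {
	for entityID := range p.tracked {
		// getEscrowBalances would wrap the node's staking account query at the epoch's start height.
		activeEscrow, debondingEscrow, err := p.getEscrowBalances(ctx, epoch.StartHeight, entityID)
		if err != nil {
			return err
		}
		// One history row per (validator, epoch); these per-validator lookups are the
		// >100 node requests per epoch mentioned above.
		batch.Queue(queries.ValidatorBalanceInsert, entityID, epoch.Epoch, activeEscrow, debondingEscrow)
	}
	return nil
}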

@mitjat (Contributor) left a comment

Looks good, thank you Andy! Obviously still some TODOs, which I tried to call out in comments, but at a high level this LGTM. And for what it's worth, it's exactly the design I arrived at when brainstorming this with Jan (at which point I had forgotten that you already have an in-flight PR with this).

storage/migrations/16_validator_staking_history.up.sql (outdated; resolved)
storage/migrations/16_validator_staking_history.up.sql (outdated; resolved)
storage/oasis/nodeapi/api.go (resolved)
analyzer/validatorbalances/validatorbalances.go (outdated; resolved)
analyzer/validatorbalances/validatorbalances.go (outdated; resolved)
analyzer/validatorbalances/validatorbalances.go (outdated; resolved)


return nil
}

func (p *processor) QueueLength(ctx context.Context) (int, error) {
Contributor

TODO: Expose this in staging mainnet Grafana. It needs to be added in several places there because our metrics are named awkwardly 😬. (It would be smoother sailing if the analyzer name were just one of the labels of a single global queue_length metric.)
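
For illustration, a single global queue_length metric with the analyzer name as a label could look like the sketch below (using prometheus/client_golang; this is not existing nexus code):

import "github.com/prometheus/client_golang/prometheus"

// One gauge, labelled per analyzer, instead of a separately named metric per analyzer.
var queueLength = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "queue_length",
		Help: "Number of items an analyzer still has to process.",
	},
	[]string{"analyzer"},
)

func init() {
	prometheus.MustRegister(queueLength)
}

// Each analyzer would then report its backlog as, e.g.:
//   queueLength.WithLabelValues("validator_balances").Set(float64(n))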

storage/migrations/16_validator_staking_history.up.sql (outdated; resolved)
@Andrew7234 force-pushed the andrew7234/validator-staking-history branch from 5129c9f to 565b179 on June 26, 2024 18:59
@Andrew7234 force-pushed the andrew7234/validator-staking-history branch from 565b179 to c89e957 on July 5, 2024 06:46
analyzer/queries/queries.go (outdated; resolved)
config/config.go (outdated; resolved)
@Andrew7234 force-pushed the andrew7234/validator-staking-history branch 3 times, most recently from e678ff6 to f6c69cc, on July 16, 2024 02:39
@Andrew7234 force-pushed the andrew7234/validator-staking-history branch 4 times, most recently from 9512aad to db66469, on July 29, 2024 20:25
add validatorBalances analyzer

changelog

Update analyzer/validatorbalances/validatorbalances.go

Co-authored-by: mitjat <[email protected]>

address comments

query entity address instead of node

add startHeight + validator history tracking

e2e_regression damask updates

in progress eden e2e_cache

e2e_regression eden updates

post-rebase e2e cache changes
@Andrew7234 force-pushed the andrew7234/validator-staking-history branch from db66469 to 8edbb44 on July 29, 2024 20:43
@Andrew7234 merged commit 50d4365 into main on Jul 29, 2024
16 checks passed
@Andrew7234 deleted the andrew7234/validator-staking-history branch on July 29, 2024 21:44