Add ability to output Bank hash details #32632
@ripatel-fd - Here is the base PR for creating these files in the labs client. I'll post a sample file there later. Given some DMs and Discord discussion, I know you had been exploring a binary format for storing these files, and it also seems you were potentially interested in streaming these files for every slot from a labs validator, to create a set to compare against in order to track down incompatibilities in firedancer.
Codecov Report
@@ Coverage Diff @@
## master #32632 +/- ##
=========================================
- Coverage 82.0% 82.0% -0.1%
=========================================
Files 784 785 +1
Lines 211897 212069 +172
=========================================
+ Hits 173851 173971 +120
- Misses 38046 38098 +52 |
Here are some file sizes
GitHub won't let me load up the {
"version": "1.17.0 (src:00000000; feat:461617547, client:SolanaLabs)",
"slot": 203546517,
"hash": "G1ThhNa1e2LqaFaV4MEVssXe32EyCnFedYySrpWt1LHg",
"parent_hash": "FdCQJsKX1cWzKyVyjMqcvJeDmCW46tCA2MSXVNwobcNf",
"accounts_delta_hash": "3AceoCg7mGo3G6Z4e7iWFpFc4Nf2aWUZEqNT65QaMSej",
"signature_count": 1363,
"last_blockhash": "94tm1SDn34iKWjgLsQCGoXkUpvSmMZAJ9eTPkH6RfbAo",
"accounts": [
{
"pubkey": "1M5USfamd1N4i1z6UZeECrWeu2VfrxjYMBSXThu6TqB",
"hash": "Ggea5CLgo53Zxuw6HXAuJUdq4dD64y5uZfXhrYt7SVJ5",
"lamports": 44526227501,
"rent_epoch": 0,
"executable": false,
"data": "AQAAAAQLOJOsu3tFbQhfZiz+y/Z0TLwW7WNn61hQk6BpWzrJBGaqyVrGvP7InUJygJLMO2NqEDF0ZB2f1aiAQ90NcxEKHwAAAAAAAAB23yEMAAAAAB8AAAB33yEMAAAAAB4AAAB43yEMAAAAAB0AAAB53yEMAAAAABwAAAB63yEMAAAAABsAAAB73yEMAAAAABoAAAB83yEMAAAAABkAAAB93yEMAAAAABgAAAB+3yEMAAAAABcAAAB/3yEMAAAAABYAAACA3yEMAAAAABUAAACB3yEMAAAAABQAAACC3yEMAAAAABMAAACD3yEMAAAAABIAAACE3yEMAAAAABEAAACF3yEMAAAAABAAAACG3yEMAAAAAA8AAACH3yEMAAAAAA4AAACI3yEMAAAAAA0AAACJ3yEMAAAAAAwAAACK3yEMAAAAAAsAAACL3yEMAAAAAAoAAACM3yEMAAAAAAkAAACN3yEMAAAAAAgAAACO3yEMAAAAAAcAAACP3yEMAAAAAAYAAACQ3yEMAAAAAAUAAACR3yEMAAAAAAQAAACS3yEMAAAAAAMAAACT3yEMAAAAAAIAAACU3yEMAAAAAAEAAAABdd8hDAAAAAABAAAAAAAAANcBAAAAAAAABAs4k6y7e0VtCF9mLP7L9nRMvBbtY2frWFCToGlbOskfAAAAAAAAAAFAAAAAAAAAAJgBAAAAAAAAHlncAAAAAACymdYAAAAAAJkBAAAAAAAAFK/hAAAAAAAeWdwAAAAAAJoBAAAAAAAAWD7nAAAAAAAUr+EAAAAAAJsBAAAAAAAAAojsAAAAAABYPucAAAAAAJwBAAAAAAAAF7nxAAAAAAACiOwAAAAAAJ0BAAAAAAAAzwr3AAAAAAAXufEAAAAAAJ4BAAAAAAAAm0P8AAAAAADPCvcAAAAAAJ8BAAAAAAAAW2sBAQAAAACbQ/wAAAAAAKABAAAAAAAA6lgHAQAAAABbawEBAAAAAKEBAAAAAAAAQ0MNAQAAAADqWAcBAAAAAKIBAAAAAAAA41oTAQAAAABDQw0BAAAAAKMBAAAAAAAAFmsZAQAAAADjWhMBAAAAAKQBAAAAAAAA/2UfAQAAAAAWaxkBAAAAAKUBAAAAAAAAwIwlAQAAAAD/ZR8BAAAAAKYBAAAAAAAAjaMrAQAAAADAjCUBAAAAAKcBAAAAAAAAd7cxAQAAAACNoysBAAAAAKgBAAAAAAAAawc4AQAAAAB3tzEBAAAAAKkBAAAAAAAA/lo+AQAAAABrBzgBAAAAAKoBAAAAAAAADqhEAQAAAAD+Wj4BAAAAAKsBAAAAAAAALA5LAQAAAAAOqEQBAAAAAKwBAAAAAAAA1HdRAQAAAAAsDksBAAAAAK0BAAAAAAAA3NBXAQAAAADUd1EBAAAAAK4BAAAAAAAACideAQAAAADc0FcBAAAAAK8BAAAAAAAAwIxkAQAAAAAKJ14BAAAAALABAAAAAAAASfVqAQAAAADAjGQBAAAAALEBAAAAAAAAPWBxAQAAAABJ9WoBAAAAALIBAAAAAAAAPsh3AQAAAAA9YHEBAAAAALMBAAAAAAAADyJ+AQAAAAA+yHcBAAAAALQBAAAAAAAAUH2EAQAAAAAPIn4BAAAAALUBAAAAAAAAZWaKAQAAAABQfYQBAAAAALYBAAAAAAAAfWWQAQAAAABlZooBAAAAALcBAAAAAAAAppyWAQAAAAB9ZZABAAAAALgBAAAAAAAAZuScAQAAAACmnJYBAAAAALkBAAAAAAAA2jKjAQAAAABm5JwBAAAAALoBAAAAAAAA1YmpAQAAAADaMqMBAAAAALsBAAAAAAAA+9avAQAAAADViakBAAAAALwBAAAAAAAAN/q1AQAAAAD71q8BAAAAAL0BAAAAAAAABy+8AQAAAAA3+rUBAAAAAL4BAAAAAAAA6m3CAQAAAAAHL7wBAAAAAL8BAAAAAAAANoDIAQAAAADqbcIBAAAAAMABAAAAAAAAErn
OAQAAAAA2gMgBAAAAAMEBAAAAAAAAQfzUAQAAAAASuc4BAAAAAMIBAAAAAAAAM0nbAQAAAABB/NQBAAAAAMMBAAAAAAAA0YThAQAAAAAzSdsBAAAAAMQBAAAAAAAAELrnAQAAAADRhOEBAAAAAMUBAAAAAAAA4+jtAQAAAAAQuucBAAAAAMYBAAAAAAAAbUH0AQAAAADj6O0BAAAAAMcBAAAAAAAAVpn6AQAAAABtQfQBAAAAAMgBAAAAAAAAtuIAAgAAAABWmfoBAAAAAMkBAAAAAAAAizMHAgAAAAC24gACAAAAAMoBAAAAAAAA0nYNAgAAAACLMwcCAAAAAMsBAAAAAAAAYrgTAgAAAADSdg0CAAAAAMwBAAAAAAAAvK4ZAgAAAABiuBMCAAAAAM0BAAAAAAAARPEfAgAAAAC8rhkCAAAAAM4BAAAAAAAAiEcmAgAAAABE8R8CAAAAAM8BAAAAAAAAlKosAgAAAACIRyYCAAAAANABAAAAAAAAOwAzAgAAAACUqiwCAAAAANEBAAAAAAAAOkc5AgAAAAA7ADMCAAAAANIBAAAAAAAAxJs/AgAAAAA6RzkCAAAAANMBAAAAAAAAgu1FAgAAAADEmz8CAAAAANQBAAAAAAAAEuxLAgAAAACC7UUCAAAAANUBAAAAAAAAl0ZSAgAAAAAS7EsCAAAAANYBAAAAAAAAXKNYAgAAAACXRlICAAAAANcBAAAAAAAAtr1ZAgAAAABco1gCAAAAAJTfIQwAAAAAYVSlZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
},
{
"pubkey": "1bB6Z9iaL1TM4VTo54B5w1FgCXFDbbtVkVJu4VbiNNc",
"hash": "EeszVJLmgBwf8JncRjvD7pY6z8dyfNeKXrXhSwAxxK7g",
"lamports": 2808284745,
"rent_epoch": 361,
"executable": false,
"data": "AQAAAKLwte7F6IHPUBvvLVgsXjIZdw9IL574LpFDumh4Ruofrbq6hujHGziQEhBQufWTJTHc/Y1QxdtZIImoMH4/GOIKHwAAAAAAAAB23yEMAAAAAB8AAAB33yEMAAAAAB4AAAB43yEMAAAAAB0AAAB53yEMAAAAABwAAAB63yEMAAAAABsAAAB73yEMAAAAABoAAAB83yEMAAAAABkAAAB93yEMAAAAABgAAAB+3yEMAAAAABcAAAB/3yEMAAAAABYAAACA3yEMAAAAABUAAACB3yEMAAAAABQAAACC3yEMAAAAABMAAACD3yEMAAAAABIAAACE3yEMAAAAABEAAACF3yEMAAAAABAAAACG3yEMAAAAAA8AAACH3yEMAAAAAA4AAACI3yEMAAAAAA0AAACJ3yEMAAAAAAwAAACK3yEMAAAAAAsAAACL3yEMAAAAAAoAAACM3yEMAAAAAAkAAACN3yEMAAAAAAgAAACO3yEMAAAAAAcAAACP3yEMAAAAAAYAAACQ3yEMAAAAAAUAAACR3yEMAAAAAAQAAACS3yEMAAAAAAMAAACT3yEMAAAAAAIAAACU3yEMAAAAAAEAAAABdd8hDAAAAAABAAAAAAAAANcBAAAAAAAAovC17sXogc9QG+8tWCxeMhl3D0gvnvgukUO6aHhG6hfAAAAAAAAAAFAAAAAAAAAAJgBAAAAAAAAat1GAQAAAABdKUEBAAAAAJkBAAAAAAAApClMAQAAAABq3UYBAAAAAJoBAAAAAAAAK7BRAQAAAACkKUwBAAAAAJsBAAAAAAAAxu5WAQAAAAArsFEBAAAAAJwBAAAAAAAAGRBcAQAAAADG7lYBAAAAAJ0BAAAAAAAA3FFhAQAAAAAZEFwBAAAAAJ4BAAAAAAAAAmhmAQAAAADcUWEBAAAAAJ8BAAAAAAAACKxrAQAAAAACaGYBAAAAAKABAAAAAAAAu5dxAQAAAAAIrGsBAAAAAKEBAAAAAAAAfn93AQAAAAC7l3EBAAAAAKIBAAAAAAAATJV9AQAAAAB+f3cBAAAAAKMBAAAAAAAAAaCDAQAAAABMlX0BAAAAAKQBAAAAAAAArJiJAQAAAAABoIMBAAAAAKUBAAAAAAAAar2PAQAAAACsmIkBAAAAAKYBAAAAAAAA1NGVAQAAAABqvY8BAAAAAKcBAAAAAAAAq7WbAQAAAADU0ZUBAAAAAKgBAAAAAAAA2wKiAQAAAACrtZsBAAAAAKkBAAAAAAAAYVaoAQAAAADbAqIBAAAAAKoBAAAAAAAAu5WuAQAAAABhVqgBAAAAAKsBAAAAAAAATfu0AQAAAAC7la4BAAAAAKwBAAAAAAAAf2W7AQAAAABN+7QBAAAAAK0BAAAAAAAA7r3BAQAAAAB/ZbsBAAAAAK4BAAAAAAAApxLIAQAAAADuvcEBAAAAAK8BAAAAAAAAGHbOAQAAAACnEsgBAAAAALABAAAAAAAAxNfUAQAAAAAYds4BAAAAALEBAAAAAAAA0UHbAQAAAADE19QBAAAAALIBAAAAAAAAiKnhAQAAAADRQdsBAAAAALMBAAAAAAAAOwLoAQAAAACIqeEBAAAAALQBAAAAAAAA4lnuAQAAAAA7AugBAAAAALUBAAAAAAAAoZfzAQAAAADiWe4BAAAAALYBAAAAAAAA5n/5AQAAAAChl/MBAAAAALcBAAAAAAAAgIr/AQAAAADmf/kBAAAAALgBAAAAAAAANM8FAgAAAACAiv8BAAAAALkBAAAAAAAA0R0MAgAAAAA0zwUCAAAAALoBAAAAAAAAinUSAgAAAADRHQwCAAAAALsBAAAAAAAAN8EYAgAAAACKdRICAAAAALwBAAAAAAAAoeIeAgAAAAA3wRgCAAAAAL0BAAAAAAAASxUlAgAAAACh4h4CAAAAAL4BAAAAAAAAeFIrAgAAAABLFSUCAAAAAL8BAAAAAAAAp1cxAgAAAAB4UisCAAAAAMABAAAAAAAAVY83
AgAAAACnVzECAAAAAMEBAAAAAAAAX9E9AgAAAABVjzcCAAAAAMIBAAAAAAAAhBtEAgAAAABf0T0CAAAAAMMBAAAAAAAAmTtKAgAAAACEG0QCAAAAAMQBAAAAAAAA0oBQAgAAAACZO0oCAAAAAMUBAAAAAAAA9qtWAgAAAADSgFACAAAAAMYBAAAAAAAARQFdAgAAAAD2q1YCAAAAAMcBAAAAAAAAiVZjAgAAAABFAV0CAAAAAMgBAAAAAAAA5plpAgAAAACJVmMCAAAAAMkBAAAAAAAAS+dvAgAAAADmmWkCAAAAAMoBAAAAAAAAsyV2AgAAAABL528CAAAAAMsBAAAAAAAAsGR8AgAAAACzJXYCAAAAAMwBAAAAAAAAFHOCAgAAAACwZHwCAAAAAM0BAAAAAAAACrKIAgAAAAAUc4ICAAAAAM4BAAAAAAAAiAWPAgAAAAAKsogCAAAAAM8BAAAAAAAAgGWVAgAAAACIBY8CAAAAANABAAAAAAAAObabAgAAAACAZZUCAAAAANEBAAAAAAAAD4ChAgAAAAA5tpsCAAAAANIBAAAAAAAA4NOnAgAAAAAPgKECAAAAANMBAAAAAAAAoyKuAgAAAADg06cCAAAAANQBAAAAAAAAHXK0AgAAAACjIq4CAAAAANUBAAAAAAAAkMi6AgAAAAAdcrQCAAAAANYBAAAAAAAAliLBAgAAAACQyLoCAAAAANcBAAAAAAAAwzvCAgAAAACWIsECAAAAAJTfIQwAAAAAYVSlZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
},
I'll give base64+zstd a try too. Even though we'll probably land on a single encoding here, it probably makes sense to record the account data encoding format at the top of the file as well. I'll probably write
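For readers who want to inspect such a file programmatically, here is a minimal Python sketch. It assumes the layout shown in the sample above: a top-level `accounts` list whose entries carry `pubkey`, `lamports`, and base64-encoded `data`. The function name `summarize_accounts` and the tiny inline document are illustrative only and are not part of this PR.

```python
import base64
import json


def summarize_accounts(details: dict) -> list[tuple[str, int, int]]:
    """Return (pubkey, lamports, data length in bytes) for each account."""
    summary = []
    for account in details.get("accounts", []):
        # Assumption: "data" is plain base64, as in the sample above.
        raw = base64.b64decode(account["data"])
        summary.append((account["pubkey"], account["lamports"], len(raw)))
    return summary


# Made-up document in the same shape as the sample; values are not real.
SAMPLE = json.loads("""
{
  "slot": 1,
  "accounts": [
    {"pubkey": "ExamplePubkey1", "lamports": 10, "data": "AQAAAA=="},
    {"pubkey": "ExamplePubkey2", "lamports": 20, "data": ""}
  ]
}
""")

if __name__ == "__main__":
    for pubkey, lamports, size in summarize_accounts(SAMPLE):
        print(f"{pubkey}: {lamports} lamports, {size} data bytes")
```

If the encoding changes (for example to base64+zstd, as floated above), only the decoding line would need to change, which is one reason to record the encoding in the file header.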
Various diffs for how we capture useful information while debugging firedancer versus solana...
• diff against master
▪ More actively in use
• banks.rs
• src/account.rs
• src/fee.rs
• ledger/src/leader_schedule.rs
• programs/bpf_loader/src/lib.rs
• runtime/src/accounts_hash.rs
• invoke_context.rs
• system_instruction_processor.rs
Summing up a call:
|
This looks great. Do you have an estimate of when this PR will land on master?
can you rebase on master since |
r+ rebase |
Also allow the data to be written from ledger-tool in order to create a file to compare against. The generated files are human-readable (JSON) and diff-friendly.
Keep the serialize/deserialize trait impls a bit slimmer
Done! |
…32632)
When a consensus divergence occurs, the current workflow involves a handful of manual steps to home in on the offending slot and transaction. This process isn't overly difficult to execute; however, it is tedious and currently involves creating and parsing logs.
This change introduces functionality to output a debug file that contains the components that go into the bank hash. The file can be generated in two ways:
- Via solana-validator when the node realizes it has diverged
- Via solana-ledger-tool verify by passing a flag
When a divergence occurs now, the steps to debug would be:
- Grab the file from the node that diverged
- Generate a file for the same slot with ledger-tool with a known good version
- Diff the files; they are pretty-printed JSON
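The last debugging step can also be done structurally rather than with a text diff. The following is a minimal Python sketch, assuming both files follow the JSON layout from the sample earlier in this thread (top-level fields such as `hash` and `accounts_delta_hash`, plus an `accounts` list keyed by `pubkey`). The helper names `load_details` and `compare_details` are hypothetical; `version` is ignored by default because a known good build is expected to report a different version string.

```python
import json


def load_details(path: str) -> dict:
    """Read one bank hash details file (pretty-printed JSON)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def compare_details(good: dict, bad: dict, ignore=("version",)) -> dict:
    """Report which top-level fields and which accounts differ."""
    skip = set(ignore) | {"accounts"}
    fields = sorted(
        key for key in (set(good) | set(bad)) - skip
        if good.get(key) != bad.get(key)
    )
    # Index accounts by pubkey so ordering in the file does not matter.
    good_accounts = {a["pubkey"]: a for a in good.get("accounts", [])}
    bad_accounts = {a["pubkey"]: a for a in bad.get("accounts", [])}
    return {
        "fields": fields,
        "only_in_good": sorted(good_accounts.keys() - bad_accounts.keys()),
        "only_in_bad": sorted(bad_accounts.keys() - good_accounts.keys()),
        "changed_accounts": sorted(
            pk for pk in good_accounts.keys() & bad_accounts.keys()
            if good_accounts[pk] != bad_accounts[pk]
        ),
    }
```

A typical call would be `compare_details(load_details(good_path), load_details(bad_path))`, where the two paths point at the ledger-tool output and the file taken from the diverged node. The accounts listed under `changed_accounts` are the natural starting point for finding the offending transaction.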
Backports to the stable branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. |
When a consensus divergence occurs, the current workflow involves a handful of manual steps to home in on the offending slot and transaction. This process isn't overly difficult to execute; however, it is tedious and currently involves creating and parsing logs.
This change introduces functionality to output a debug file that contains the components that go into the bank hash. The file can be generated in two ways:
- Via solana-validator when the node realizes it has diverged
- Via solana-ledger-tool verify by passing a flag
When a divergence occurs now, the steps to debug would be:
- Grab the file from the node that diverged
- Generate a file for the same slot with ledger-tool with a known good version
- Diff the files; they are pretty-printed JSON
(cherry picked from commit 6bbf514)
# Conflicts:
# Cargo.lock
# ledger-tool/src/args.rs
# ledger-tool/src/main.rs
# programs/sbf/Cargo.lock
# runtime/Cargo.toml
# runtime/src/accounts_db.rs
# validator/src/main.rs
…34257)
* Add ability to output components that go into Bank hash (#32632)
When a consensus divergence occurs, the current workflow involves a handful of manual steps to home in on the offending slot and transaction. This process isn't overly difficult to execute; however, it is tedious and currently involves creating and parsing logs.
This change introduces functionality to output a debug file that contains the components that go into the bank hash. The file can be generated in two ways:
- Via solana-validator when the node realizes it has diverged
- Via solana-ledger-tool verify by passing a flag
When a divergence occurs now, the steps to debug would be:
- Grab the file from the node that diverged
- Generate a file for the same slot with ledger-tool with a known good version
- Diff the files; they are pretty-printed JSON
(cherry picked from commit 6bbf514)
# Conflicts:
# Cargo.lock
# ledger-tool/src/args.rs
# ledger-tool/src/main.rs
# programs/sbf/Cargo.lock
# runtime/Cargo.toml
# runtime/src/accounts_db.rs
# validator/src/main.rs
* Merge conflict
* Reorder base_wotking with accounts_hash to match other branches
---------
Co-authored-by: steviez <[email protected]>
Problem
When a consensus mismatch occurs, the current workflow involves a handful of manual steps to narrow the problem down to the offending slot/transaction. This process is loosely defined below, and while not difficult to execute, it is somewhat tedious and involves creating and parsing ad hoc logs:
https://github.com/solana-labs/solana/wiki/Debugging-Consensus-Failures
Summary of Changes
Add functionality to dump the information that goes into a bank hash to a file. Additionally, generate that file when a node deviates from consensus (purges a slot in ReplayStage). With this file in hand, someone who has observed a node deviating only needs to generate the same file via solana-ledger-tool with a known good version to serve as a comparison to the one generated by the validator. The file is currently printed as JSON with the accounts sorted, which makes it both 1) human-readable and 2) diff-friendly.
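To illustrate why sorted accounts plus pretty-printing give a small, readable diff, here is a short Python sketch using the standard `difflib` module. It is an illustration under assumptions, not the code in this PR: the helper names `render` and `text_diff` are hypothetical, and sorting by `pubkey` is assumed as the ordering. With a stable order and one field per line, a single changed value shows up as a single removed line and a single added line.

```python
import difflib
import json


def render(details: dict) -> list[str]:
    """Pretty-print with accounts sorted by pubkey so output is stable."""
    normalized = dict(details)
    normalized["accounts"] = sorted(
        details.get("accounts", []), key=lambda a: a["pubkey"]
    )
    return json.dumps(normalized, indent=2).splitlines()


def text_diff(validator: dict, ledger_tool: dict) -> list[str]:
    """Unified diff of the two rendered documents, one string per line."""
    return list(
        difflib.unified_diff(
            render(validator),
            render(ledger_tool),
            fromfile="validator",
            tofile="ledger-tool",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Made-up documents: same accounts in a different order, one value changed.
    a = {"slot": 1, "accounts": [{"pubkey": "B", "lamports": 1},
                                 {"pubkey": "A", "lamports": 2}]}
    b = {"slot": 1, "accounts": [{"pubkey": "A", "lamports": 2},
                                 {"pubkey": "B", "lamports": 5}]}
    print("\n".join(text_diff(a, b)))
```

Without the sort step, the reordering alone would appear as changes and bury the one value that actually differs.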