This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Add special handling for snapshot root slot in get_confirmed_block #7430

Merged
CriesofCarrots merged 2 commits into solana-labs:master from fix-gcb-panic
Dec 11, 2019

@CriesofCarrots CriesofCarrots commented Dec 11, 2019

Problem

The testnet RPC node panics when a downstream user calls the getConfirmedBlock RPC on every slot. (Once all the RPC threads have panicked because of retries, the node becomes unresponsive to further queries.)

Specifically, the panic occurs when getConfirmedBlock is called on the first blocktree root (aside from slot 0). Blocktree::get_confirmed_block assumes a root always has a valid parent slot, but when a node is booted from a snapshot (as the RPC node is on testnet boot), the snapshot root has no entries and no valid parent root in blocktree.

Summary of Changes

  • Add handling for the snapshot root slot (no parent root, no entries; returns BlocktreeError::SlotNotRooted) and for the slot after it (its parent slot has no entries, and therefore no blockhash; returns previous_blockhash: Hash::default())

Technically, this means the results of getConfirmedBlock for snapshot root and snapshot root+1 are slightly incorrect. But we expect our main downstream use case to run a node without snapshots, so this should not be an issue.
If it becomes an issue, the more general solution would be to preserve the entries of snapshot root in each snapshot.
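
For readers without the diff at hand, the two special cases can be sketched roughly as follows. This is a minimal toy model, not the actual Blocktree code: only BlocktreeError::SlotNotRooted, previous_blockhash, and Hash::default() come from the description above, while ToyBlocktree, its fields, ConfirmedBlock's layout, and the 32-byte array used for Hash are assumptions made for illustration.

```rust
use std::collections::{BTreeSet, HashMap};

// Hypothetical stand-ins for the real types; names and layouts are assumptions.
type Slot = u64;
type Hash = [u8; 32];

#[derive(Debug, PartialEq)]
enum BlocktreeError {
    SlotNotRooted,
}

#[derive(Debug, PartialEq)]
struct ConfirmedBlock {
    previous_blockhash: Hash,
    blockhash: Hash,
    parent_slot: Slot,
}

// Toy in-memory model: rooted slots, parent links, and per-slot entry hashes.
#[derive(Default)]
struct ToyBlocktree {
    roots: BTreeSet<Slot>,
    parents: HashMap<Slot, Slot>,
    entry_hashes: HashMap<Slot, Vec<Hash>>,
}

impl ToyBlocktree {
    // Hash of the last entry in a slot, or None if the slot holds no entries.
    fn last_hash(&self, slot: Slot) -> Option<Hash> {
        self.entry_hashes.get(&slot).and_then(|hashes| hashes.last().copied())
    }

    fn get_confirmed_block(&self, slot: Slot) -> Result<ConfirmedBlock, BlocktreeError> {
        if !self.roots.contains(&slot) {
            return Err(BlocktreeError::SlotNotRooted);
        }
        // Case 1: a snapshot root is rooted but has no entries and no parent,
        // so report it as not rooted instead of panicking.
        let blockhash = self.last_hash(slot).ok_or(BlocktreeError::SlotNotRooted)?;
        let parent_slot = *self.parents.get(&slot).ok_or(BlocktreeError::SlotNotRooted)?;
        // Case 2: the parent is the snapshot root, which has no entries and
        // therefore no blockhash; fall back to the default (all-zero) hash.
        let previous_blockhash = self.last_hash(parent_slot).unwrap_or_default();
        Ok(ConfirmedBlock {
            previous_blockhash,
            blockhash,
            parent_slot,
        })
    }
}
```

In this model, a tree with roots 10, 11, and 12 where slot 10 is the snapshot root (no entries, no parent) returns an error for slot 10, a block with an all-zero previous_blockhash for slot 11, and a normal block for slot 12. Slot 0 is not modeled.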

This should fix testnet.solana.com, when deployed there.

Fixes #7418

mvines previously approved these changes Dec 11, 2019

@mvines mvines left a comment

r+ but no test? 😬

codecov bot commented Dec 11, 2019

Codecov Report

Merging #7430 into master will increase coverage by 6.4%.
The diff coverage is 97.8%.

@@           Coverage Diff            @@
##           master   #7430     +/-   ##
========================================
+ Coverage    74.2%   80.6%   +6.4%     
========================================
  Files         245     245             
  Lines       52559   48466   -4093     
========================================
+ Hits        39003   39075     +72     
+ Misses      13556    9391   -4165

@mergify mergify bot dismissed mvines’s stale review December 11, 2019 21:32

Pull request has been modified.

@CriesofCarrots CriesofCarrots commented

> r+ but no test? 😬

Had hoped to beat you there!

@CriesofCarrots CriesofCarrots merged commit 1d0ba0d into solana-labs:master Dec 11, 2019
mergify bot pushed a commit that referenced this pull request Dec 11, 2019
…7430)

* Add special handling for snapshot root slot

* Improve test

(cherry picked from commit 1d0ba0d)

# Conflicts:
#	Cargo.lock
@CriesofCarrots CriesofCarrots deleted the fix-gcb-panic branch January 15, 2020 00:59
Successfully merging this pull request may close these issues.

0.21: rpc threads are panicking on testnet.solana.com and bringing down RPC