feat(state-sync): Download state header from external storage #10515

Merged 9 commits on Jan 30, 2024

Contributor

@VanBarbascu VanBarbascu commented Jan 27, 2024

A previous PR introduced the option to upload state parts to external storage.
This PR changes how state sync headers are requested by adding the option to download them
from external storage. The motivation is that peers sometimes do not track all shards,
which makes it impossible for them to share the state sync header.

The most important change is in commit #3

I had an RPC node running on mainnet and testnet, and it was able to sync (Grafana). The long testnet sync header download happened because no node was uploading the state to the external bucket. The node started downloading as soon as I started the uploading node.

@VanBarbascu VanBarbascu added the A-stateless-validation Area: stateless validation label Jan 27, 2024
@VanBarbascu VanBarbascu requested a review from a team as a code owner January 27, 2024 09:30
@VanBarbascu VanBarbascu force-pushed the state-sync-download-headers branch 3 times, most recently from 58d8e48 to 79478f4 Compare January 27, 2024 10:44

codecov bot commented Jan 27, 2024

Codecov Report

Attention: 211 lines in your changes are missing coverage. Please review.

Comparison is base (95c80bf) 71.92% compared to head (1e4ed4f) 71.83%.
Report is 2 commits behind head on master.

| Files | Patch % | Lines |
| --- | --- | --- |
| chain/client/src/sync/state.rs | 15.76% | 184 Missing and 3 partials ⚠️ |
| chain/client/src/sync/external.rs | 44.44% | 9 Missing and 1 partial ⚠️ |
| tools/state-parts-dump-check/src/cli.rs | 0.00% | 5 Missing ⚠️ |
| chain/client/src/metrics.rs | 0.00% | 4 Missing ⚠️ |
| tools/state-viewer/src/state_parts.rs | 0.00% | 4 Missing ⚠️ |
| chain/client-primitives/src/types.rs | 83.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10515      +/-   ##
==========================================
- Coverage   71.92%   71.83%   -0.09%     
==========================================
  Files         720      723       +3     
  Lines      146747   146893     +146     
  Branches   146747   146893     +146     
==========================================
- Hits       105553   105526      -27     
- Misses      36331    36495     +164     
- Partials     4863     4872       +9     
| Flag | Coverage Δ |
| --- | --- |
| backward-compatibility | 0.08% <0.00%> (-0.01%) ⬇️ |
| db-migration | 0.08% <0.00%> (-0.01%) ⬇️ |
| genesis-check | 1.25% <0.00%> (-0.01%) ⬇️ |
| integration-tests | 36.81% <4.63%> (-0.08%) ⬇️ |
| linux | 71.06% <18.53%> (-0.08%) ⬇️ |
| linux-nightly | 71.20% <18.53%> (-0.09%) ⬇️ |
| macos | 54.76% <15.44%> (-0.24%) ⬇️ |
| pytests | 1.47% <0.00%> (-0.01%) ⬇️ |
| sanity-checks | 1.26% <0.00%> (-0.01%) ⬇️ |
| unittests | 67.81% <15.44%> (-0.09%) ⬇️ |
| upgradability | 0.13% <0.00%> (-0.01%) ⬇️ |

@VanBarbascu VanBarbascu force-pushed the state-sync-download-headers branch 2 times, most recently from 2d94fc1 to ad4dfd0 Compare January 27, 2024 12:48
Contributor

@wacban wacban left a comment

LGTM, but I have some readability suggestions and nits. Generally speaking, the state sync code is hard to read: it's very nested and convoluted. I would suggest refactoring it later into smaller methods and using early returns to cut down on the nesting.
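
To illustrate the early-return style suggested here, a minimal sketch follows. The types and function names (`Download`, `Action`, `next_action_nested`, `next_action_flat`) are hypothetical stand-ins and are not taken from the state sync code. Both functions return the same result for every input; only the control flow differs.

```rust
/// Hypothetical, simplified stand-in for a per-shard download record.
#[derive(Debug, Clone, PartialEq)]
pub struct Download {
    pub done: bool,
    pub error: bool,
    pub retries_left: u32,
}

/// Hypothetical decision produced for one download.
#[derive(Debug, PartialEq)]
pub enum Action {
    Nothing,
    Retry,
    GiveUp,
}

/// Nested version: every condition adds one level of indentation.
pub fn next_action_nested(download: Option<&Download>) -> Action {
    if let Some(d) = download {
        if !d.done {
            if d.error {
                if d.retries_left > 0 {
                    Action::Retry
                } else {
                    Action::GiveUp
                }
            } else {
                Action::Nothing
            }
        } else {
            Action::Nothing
        }
    } else {
        Action::Nothing
    }
}

/// Flat version: early returns dispose of the uninteresting cases first.
pub fn next_action_flat(download: Option<&Download>) -> Action {
    let Some(d) = download else {
        return Action::Nothing;
    };
    if d.done || !d.error {
        return Action::Nothing;
    }
    if d.retries_left == 0 {
        return Action::GiveUp;
    }
    Action::Retry
}
```

In the flat version the interesting path sits at the lowest indentation level, which tends to make later extraction into smaller methods easier.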

It's fine for me but I'll let someone more familiar with state sync approve. cc @telezhnaya or @posvyatokum

Comment on lines 1123 to 1124
state_parts_arbiter_handle: &ArbiterHandle,
state_parts_mpsc_tx: Sender<StateSyncGetResult>,
Contributor

nit: maybe rename to state_files_*? But let's put the renaming of existing structs in a new PR.

@VanBarbascu VanBarbascu force-pushed the state-sync-download-headers branch from ad4dfd0 to 7679583 Compare January 30, 2024 08:27
Contributor Author

@VanBarbascu VanBarbascu left a comment

Thanks for taking a look, I have addressed the comments I got so far.

}
match header_result {
Ok(header) => {
if !shard_sync_download.downloads[0].done {
Contributor Author

done. I separated the header store logic and the error handling for part and header.

epoch_id: &EpochId,
epoch_height: EpochHeight,
chain_id: &str,
semaphore: Arc<Semaphore>,
Contributor Author

As we only have one file to download, we don't need it. I have removed the semaphore for the header.

@@ -541,34 +593,58 @@ impl StateSync {
/// Makes a StateRequestHeader header to one of the peers.
Contributor Author

done

}
match header_result {
Ok(header) => {
if !shard_sync_download.downloads[0].done {
Contributor Author

done

}),
);
}
StateSyncInner::PartsFromExternal { chain_id, semaphore, external } => {
Contributor Author

done

@VanBarbascu VanBarbascu requested a review from wacban January 30, 2024 08:30
@VanBarbascu VanBarbascu force-pushed the state-sync-download-headers branch from 7679583 to 4d1151f Compare January 30, 2024 12:08
Contributor

@wacban wacban left a comment

LGTM

Now that we have both upload and download, can you also add a test for it? Either a nayduck or an integration test would probably work.

Comment on lines +110 to +111
part_id: Option<PartId>,
result: Result<StateSyncFileDownloadResult, String>,
Contributor

Can you move the part id to the inside of StateSyncFileDownloadResult::StatePart?

Contributor Author

No, because in case of error, we would not know which part failed.
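
A minimal sketch of this point, under simplifying assumptions: the names `FileDownloadResult`, `GetResult`, and `describe` are hypothetical, the part id is a plain `u64` rather than `PartId`, and the payloads are reduced to lengths. It only shows that a part id stored next to the `Result` stays readable on the `Err` branch, whereas an id stored inside the success variant would not exist there.

```rust
/// Hypothetical, simplified success payloads; the real types carry more data.
#[derive(Debug, PartialEq)]
pub enum FileDownloadResult {
    StateHeader { header_length: u64 },
    StatePart { part_length: u64 },
}

/// The part id sits beside the result, so both branches can read it.
#[derive(Debug)]
pub struct GetResult {
    /// `None` stands for the header in this sketch.
    pub part_id: Option<u64>,
    pub result: Result<FileDownloadResult, String>,
}

/// Builds a log-style message that names the file even when the download failed.
pub fn describe(msg: &GetResult) -> String {
    let what = match msg.part_id {
        Some(id) => format!("part {id}"),
        None => "header".to_string(),
    };
    match &msg.result {
        Ok(_) => format!("{what}: downloaded"),
        Err(err) => format!("{what}: failed ({err}), will retry"),
    }
}
```

For example, a `GetResult` with `part_id: Some(7)` and an `Err` result still reports that part 7 is the one to retry.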

Comment on lines 417 to 421
let (file_type, download_idx) = if let Some(part_id) = part_id {
(StateFileType::part_str(), part_id.idx)
} else {
(StateFileType::header_str(), 0)
};
Contributor

Here the information about the file type is duplicated: it is encoded both in the result enum and in the optionality of the part id. Ideally it should live in only one place.

Contributor Author

fixed
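
The thread does not show how this was fixed. One possible shape, sketched here purely as an assumption, is to carry the file type in a single enum that also holds the part index, so the error case still identifies the file and nothing has to be kept in sync. `FileType`, `DownloadMessage`, `label`, and `download_idx` are hypothetical names; the header maps to index 0 as in the snippet under review.

```rust
/// Hypothetical file-type enum: the part index lives inside the variant,
/// so there is no separate `Option` that could disagree with it.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FileType {
    Header,
    Part { part_id: u64 },
}

impl FileType {
    /// Label used for metrics or logs in this sketch.
    pub fn label(&self) -> &'static str {
        match self {
            FileType::Header => "header",
            FileType::Part { .. } => "part",
        }
    }

    /// Index into the downloads list; the header uses slot 0.
    pub fn download_idx(&self) -> u64 {
        match self {
            FileType::Header => 0,
            FileType::Part { part_id } => *part_id,
        }
    }
}

/// The file type travels next to the result, so it is known even on error.
pub struct DownloadMessage {
    pub file_type: FileType,
    pub result: Result<Vec<u8>, String>,
}
```

With this layout the success payload can be plain bytes, because the enum alone says whether they are a header or a part.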

@VanBarbascu VanBarbascu force-pushed the state-sync-download-headers branch from 4d1151f to f5adcca Compare January 30, 2024 14:20
@VanBarbascu VanBarbascu force-pushed the state-sync-download-headers branch from f5adcca to 1e4ed4f Compare January 30, 2024 14:27
@VanBarbascu
Contributor Author

The external storage state sync is covered by the pytest state_sync_then_catchup.py, and it passes with my changes.

@VanBarbascu VanBarbascu added this pull request to the merge queue Jan 30, 2024
Merged via the queue into near:master with commit 9685ef3 Jan 30, 2024
22 of 27 checks passed
@VanBarbascu VanBarbascu deleted the state-sync-download-headers branch January 30, 2024 16:13