Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up old account snapshot directories to avoid the file existing hardlink error #30426

Merged
merged 5 commits into from
Feb 23, 2023

Conversation

xiangzhu70
Copy link
Contributor

@xiangzhu70 xiangzhu70 commented Feb 21, 2023

Problem

hard-link appendvec file failed. Error: File exists (os error 17)

When generating account states for a snapshot, the accouts files are hardlinked to snapshot directories. This requires the hardlink destinations are clean. When generating a new bank snapshot, we first clean up these directories. The existing code uses the existence of the bank snapshot directory to judge if all related directories are cleaned up. This turned out not be solid.
The bug happens here is in the situation that the bank snapshot directory does not exist (so remove_bank_snapshot is not called to clean up all related directories), but the related account snapshot hardlink directories still exist, causing the hardlink failure later on with the error "file exists".

It looks like there is code somewhere in the unarchiving process which removes the bank snapshot directory directly (not calling remove_bank_snapshot()), so the corresponding snapshot hardlink directories under <account_path> are not removed.

So, it is not solid to skip cleaning up account snapshot directories when the bank snapshot directory does not exist.

Summary of Changes

Check and clean up the account snapshot hardlink directories even when the bank snapshot directory does not exist.

This is in the group of the PRs split from #28745, and is fixing the bug generated by #29496.

Fixes #
hard-link appendvec file failure. Error: File exists (os error 17)

@xiangzhu70 xiangzhu70 marked this pull request as ready for review February 22, 2023 05:07
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good; some nits

runtime/src/snapshot_utils.rs Outdated Show resolved Hide resolved
runtime/src/snapshot_utils.rs Outdated Show resolved Hide resolved
for account_path in account_paths {
let account_snapshot_path = account_path
.parent()
.ok_or(SnapshotError::InvalidAppendVecPath(account_path.clone()))?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about SnapshotError::IoWithSourceAndFile instead? If we can't get parent() here, it doesn't feel like an AppendVec path issue. It'd be because the account_path here isn't within run/, right (basically)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it is probably better to use SnapshotError::IoWithSourceAndFile here. I doubt we'll actually see this though - parent() would only return None if the account path was root or an empty string here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IoWithSourceAndFile need an IO error.

  #[error("source({1}) - I/O error: {0}, file: {2}")]
    IoWithSourceAndFile(#[source] std::io::Error, &'static str, PathBuf),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not an IO error. There is no std::io::Error supplied.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed offline, added a new error InvalidAccountPath

Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with minor suggestion + second of @brooksprumo comment

for account_path in account_paths {
let account_snapshot_path = account_path
.parent()
.ok_or(SnapshotError::InvalidAppendVecPath(account_path.clone()))?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it is probably better to use SnapshotError::IoWithSourceAndFile here. I doubt we'll actually see this though - parent() would only return None if the account path was root or an empty string here.

runtime/src/snapshot_utils.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - having this additional clean up check is good

@xiangzhu70 xiangzhu70 merged commit d6da019 into solana-labs:master Feb 23, 2023
nickfrosty pushed a commit to nickfrosty/solana that referenced this pull request Mar 12, 2023
…hardlink error (solana-labs#30426)

* Fix the hardlink failure

* minor comment cleanup

* use ? and slot_str

* &slot_str

* Add InvalidAccountPath
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants