-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a bug that causes iterator to return wrong result in a rare data race #6973
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @siying for the fix.
@siying has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
// to this snapshot, but in that case it can see all the data in the | ||
// super version, which is a valid consistent state after the user | ||
// calls NewIterator(). | ||
snapshot = versions_->LastSequence(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use a couple SyncPoint
s at the end of the LastSequence()
function to enforce the flush happens immediately after sequence number is obtained. RIght now the sync points are in the right places but it's not obvious to maintainers how they should preserve the order in the future (esp. since their names are just 1, 2, 3, 4).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave it a try and found that it would make the code much more complicated. The reason is that write also calls versions_->LastSequence(). The sync-point would be complicated if we place inside. I can move the sync point to here anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SyncPoint marker could be useful as the write/flush happen in a different thread. Up to you.
@siying has updated the pull request. Re-import the pull request |
@siying has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
// to this snapshot, but in that case it can see all the data in the | ||
// super version, which is a valid consistent state after the user | ||
// calls NewIterator(). | ||
snapshot = versions_->LastSequence(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SyncPoint marker could be useful as the write/flush happen in a different thread. Up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@siying has updated the pull request. Re-import the pull request |
Although the test failures are worth investigating, I don't think it can be caused by this PR. Will land it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…case. Summary: The bug fixed in facebook#1816 is applicable to iterator too. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong. Fix it in the same way as the fix in facebook#1816, that is to get the sequence number after referencing the super version. Test Plan: Will run stress tests for a while to make sure there is no general regression.
@siying has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…race (#6973) Summary: The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong. Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version. Pull Request resolved: #6973 Test Plan: Will run stress tests for a while to make sure there is no general regression. Reviewed By: ajkr Differential Revision: D22029348 fbshipit-source-id: 94390f93630906796d6e2fec321f44a920953fd1
This pull request has been merged in d6b7b77. |
069b29cae backport of facebook/rocksdb#6973 f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization (cockroachdb#5304) Release note: None
069b29cae backport of facebook/rocksdb#6973 f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization (cockroachdb#5304) Release note (bug fix): Fixed a RocksDB that could result in inconsistencies in rare circumstances.
50394: vendor: bump Pebble to bc7bb07c r=jbowens a=jbowens ``` bc7bb07 tool: Enhance lsm visualization to show L0 sublevels 6b96ab6 internal/manifest: rename L0Sublevels.Files to L0Sublevels.Levels 0b22ec2 internal/manifest: rename Version.Files to Version.Levels 3b241b7 db: implement min-overlapping ratio heuristic 7ad569c use lockfree rand for skiplist eb0a851 db: fix race in NewSnapshot e941125 db: fix Checkpoint race condition 81f4ebe cmd/pebble: replace keys/sec with ops/sec(cum) ``` 50395: c-deps/rocksdb: update to 2512e96f r=jbowens a=jbowens ``` 069b29cae backport of facebook/rocksdb#6973 f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization (#5304) ``` Co-authored-by: Jackson Owens <[email protected]>
…race (#6973) Summary: The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong. Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version. Pull Request resolved: #6973 Test Plan: Will run stress tests for a while to make sure there is no general regression. Reviewed By: ajkr Differential Revision: D22029348 fbshipit-source-id: 94390f93630906796d6e2fec321f44a920953fd1
069b29cae backport of facebook/rocksdb#6973 f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization Release note (bug fix): Fixed a RocksDB that could result in inconsistencies in rare circumstances.
364ab2b1 backport of facebook/rocksdb#6973 Release note (bug fix): Fixed a RocksDB that could result in inconsistencies in rare circumstances.
Summary:
The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong.
Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version.
Test Plan: Will run stress tests for a while to make sure there is no general regression.