-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: fix seek bug in pebbleMVCCScanner #60460
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like it must have been unpleasant to track down. Would it be worthwhile to have a debug version of MVCCIterator
that randomly exercises this behavior of mutating the returned unsafe key after a positioning operation.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @nvanbenschoten)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something we'll want to backport, or are we confident that it can't cause issues with any of the existing iterator implementations?
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)
pkg/storage/pebble_mvcc_scanner.go, line 648 at r1 (raw file):
// the MVCCIterator will be at a different version of the same key, it is // free to mutate the backing for p.curUnsafeKey.Key in an arbitrary manner. // So assign to this copy, to make is stable.
s/is/it/
f0074df
to
c9215c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
That looks like it must have been unpleasant to track down.
yep, 3.5 days
Would it be worthwhile to have a debug version of MVCCIterator that randomly exercises this behavior of mutating the returned unsafe key after a positioning operation.
For some reason I was vacillating about this before, because this is the only place I know of in CockroachDB that uses the next optimization, and anyone else using an old key when the key has changed would be immediately wrong. But you suggesting this has convinced me -- I'll add it in a followup PR.
Is this something we'll want to backport, or are we confident that it can't cause issues with any of the existing iterator implementations?
This won't occur with the existing iterator, so I don't think we need a backport.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @itsbilal)
pkg/storage/pebble_mvcc_scanner.go, line 648 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/is/it/
Done
|
bors r+ |
Build failed: |
ce7fc08
to
2e5bae5
Compare
The temporary checking I added around reuse of pebbleMVCCScanner.keyBuf did not reveal any bugs. And now the CI is also green -- I am betting this is unrelated flakiness. I did find an UnsafeKey() reuse bug in fad6a33 when running with mangling of unsafe keys, that was causing a test failure there, that I am going to add to this PR. |
The seek key in pebbleMVCCScanner.seekVersion was an unsafe key that preceded a number of calls to Next. This used to work because the byte slice in the MVCCKey stayed the same as we were at a different version of the key. With the intentInterleavingIter this is not always true: the unsafe byte slice may have been constructed using the intentIter, which may no longer be in the same position. This was responsible for failures in the cdc/bank roachtest and occasional failures in TestChangefeedNemeses/sinkless, when separated intents were enabled. Some testing with mangling of unsafe keys revealed another bug, in rangefeed.registration, that is also fixed here. Informs cockroachdb#41720 Release note: None
2e5bae5
to
b0c70b6
Compare
bors r+ |
Build succeeded: |
The seek key in pebbleMVCCScanner.seekVersion was an
unsafe key that preceded a number of calls to Next.
This used to work because the byte slice in the
MVCCKey stayed the same as we were at a different
version of the key. With the intentInterleavingIter
this is not always true: the unsafe byte slice may
have been constructed using the intentIter, which may
no longer be in the same position.
This was responsible for failures in the cdc/bank
roachtest and occasional failures in
TestChangefeedNemeses/sinkless, when separated
intents were enabled.
Some testing with mangling of unsafe keys revealed
another bug, in rangefeed.registration, that is also
fixed here.
Informs #41720
Release note: None