-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: consult intent history for all read-your-own-write transactions #97471
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @arulajmani!
Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @jbowens)
pkg/storage/engine.go
line 1744 at r1 (raw file):
// [1] The supplied txnID may be empty (uuid.Nil) if the request on behalf of // which the scan is being performed is non-transactional. func ScanConflictingIntentsForDroppingLatchesEarly(
It's unfortunate that we don't have any unit testing that directly exercises this function. What do you think about adding one? This is already extracted out as a free function in the storage
package, so it shouldn't be too much of a lift.
In fact, you could consider adding one in an initial commit so that you can easily demonstrate the cases that change in the bug fix commit.
pkg/sql/logictest/testdata/logic_test/savepoints
line 1 at r1 (raw file):
# Regression test for https://github.com/cockroachdb/cockroach/issues/94337.
Could you add an explanation in this file about why the pg_sleeps were necessary to trigger the issue in #94337?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jbowens and @nvanbenschoten)
pkg/storage/engine.go
line 1744 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's unfortunate that we don't have any unit testing that directly exercises this function. What do you think about adding one? This is already extracted out as a free function in the
storage
package, so it shouldn't be too much of a lift.In fact, you could consider adding one in an initial commit so that you can easily demonstrate the cases that change in the bug fix commit.
Added a couple of tests in an initial commit. I found it easier to structure these as "read your own writes" and "others", so there are two tests. Mostly because there's so many dimensions to consider in the "read your own writes" case.
I also chose to keep the in-line expectations as how an optimized determination for needsIntentHistory
would look like, and wholesale asserted all cases return true for needsIntentHistory
. I'm open to changing this if it feels a bit awkward, but I felt there's some value in keeping these expectations around if we ever do try to optimize this in the future. Let me know what you think.
pkg/sql/logictest/testdata/logic_test/savepoints
line 1 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Could you add an explanation in this file about why the pg_sleeps were necessary to trigger the issue in #94337?
Good call, added some words.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r2, 10 of 11 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani and @jbowens)
-- commits
line 7 at r2:
We should make the tests pass with just this commit, but then change the expectations so that they still pass in the following commit. That way, it's clear which test cases are changed by the second commit. This also ensures that git bisects won't get tripped up by the intermediate state.
pkg/storage/engine.go
line 1744 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Added a couple of tests in an initial commit. I found it easier to structure these as "read your own writes" and "others", so there are two tests. Mostly because there's so many dimensions to consider in the "read your own writes" case.
I also chose to keep the in-line expectations as how an optimized determination for
needsIntentHistory
would look like, and wholesale asserted all cases return true forneedsIntentHistory
. I'm open to changing this if it feels a bit awkward, but I felt there's some value in keeping these expectations around if we ever do try to optimize this in the future. Let me know what you think.
It's a little awkward to have an unused expected value, but I understand why you ended up with this. It seems fine to me, assuming you address the previous comment about not having a failing test in the first commit.
pkg/storage/engine_test.go
line 1965 at r2 (raw file):
}, { name: "higher {intent seq number} equal{epoch} lower {intent timestamp}",
nit: here and below equal {epoch}
. There are a few other cases where equal
is missing a space after it too.
pkg/storage/engine_test.go
line 2116 at r2 (raw file):
// TestScanConflictingIntentsForDroppingLatchesEarly tests // ScanConflictingIntentsForDroppingLatchesEarly for all non read-your-own-write // cases. Read-your-own-write cases are tested separately.
nit: separately in ...
Also, super minor, but consider placing this test above the other one, since it's more general.
pkg/storage/engine_test.go
line 2203 at r2 (raw file):
end: keyC, expNeedsIntentHistory: false, expNumFoundIntents: 0,
nit: we seem to be explicit about expNumFoundIntents: 0
here, so we should do so in other cases as well.
This commit adds two tests, one for various read-your-own-write cases, and another for other cases. A few subtests for TestScanConflictingIntentsForDroppingLatchesEarlyReadYourOwnWrites have incorrect expectations as of this commit because of existing bugs, and are commented as such. They'll be fixed in the following commit. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r=nvanbenschoten
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jbowens and @nvanbenschoten)
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We should make the tests pass with just this commit, but then change the expectations so that they still pass in the following commit. That way, it's clear which test cases are changed by the second commit. This also ensures that git bisects won't get tripped up by the intermediate state.
Makes sense, done.
pkg/storage/engine.go
line 1744 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's a little awkward to have an unused expected value, but I understand why you ended up with this. It seems fine to me, assuming you address the previous comment about not having a failing test in the first commit.
To make it slightly better, I ended up defining a alwaysFallbackToIntentInterleavingIteratorForReadYourOwnWrites
and using that to decide whether to use the expectation or always assert true.
This way, if we decide to introduce this optimization back in the future behind some sort of cluster setting, we can run this test for both true and false variations.
pkg/storage/engine_test.go
line 2116 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit:
separately in ...
Also, super minor, but consider placing this test above the other one, since it's more general.
Done.
pkg/storage/engine_test.go
line 2203 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: we seem to be explicit about
expNumFoundIntents: 0
here, so we should do so in other cases as well.
Done.
Build failed (retrying...): |
Build succeeded: |
Read-only requests that qualify for dropping latches early scan the lock table under latches to perform conflict resolution. If they don't conflict with any concurrent transactions, they drop their latches and scan the MVCC keyspace. However, if a request is trying to read a key that was written to in its transaction, we may need access to the key's intent history when performing the scan.
If scanning the lock table uncovers an intent by our own transaction, we need access to the intent history if:
Prior to this patch, we weren't accounting for 3 and 4. Worse yet, our determination for 2 was a less than equal to -- which meant we would end up consulting the intent history in the common case. This meant that existing tests which would have exercised 3 and 4 weren't failing on us.
Instead of correcting the determination, this patch makes it such that all read-your-own-write transactions consult intent history. This avoids the subtle complexity of getting all these cases right and keeping them in sync with the MVCC code that reads keys. It also seems like scanning the lock table twice (once during conflict resolution and once during the MVCC read) for read-your-own-write transactions in the common case did not cause noticable regressions. We can always optimize this later, if we find there's a need to.
Epic: none
Fixes: #94337
Release note: None