release-23.1: kvserver: avoid load based splits in middle of SQL row #103876
Force-pushed from dd7da1a to a1a5540.
Force-pushed from a1a5540 to 1e6cb0a.
Added #104082 on top of the original PR commits. Should I merge it into the previous commit `kvserver: avoid load-based splitting between rows` or leave as is? Also rebased on master. I think the patch has sufficiently baked for a backport to 23.1. cc @nvanbenschoten.
Created tracking issue for backports: #104353
Should I merge into the previous commit `kvserver: avoid load-based splitting between rows` or leave as is?
You can leave as is to preserve the commit history.
I think the patch has sufficiently baked for a backport to 23.1
Agreed. LGTM
// 1. Within [startKey,endKey).
// 2. No less than desiredSplitKey.
// 3. Greater than the first key in [startKey,endKey]; or greater than all the
// first row's keys if a table range. .
"range. ."
Feel free to clean this up if you happen to be working in this area on master. No need for the fix to be backported, of course.
Previously, there was no way to peek at the contents of the load-based splitter samples when inspecting nodes. This commit adds string methods for the `UnweightedFinder`, `WeightedFinder` and `Decider`.

This commit also swaps the order of the should-split check to avoid computation. As a result, the output of `cpu_decider_cartesian` changed slightly, as the no-split-key logging message is now ordered differently.

Informs: #103672
Informs: #103483
Release note: None
It was possible for a SQL row to be torn across two ranges due to the load-based splitter not rejecting potentially unsafe split keys. With keys sampled from response spans, it is impossible to determine whether a key is certainly unsafe or safe.

This commit sidesteps this problem by re-using the `adminSplitWithDescriptor` command to find the first real key at or after the provided `args.SplitKey`. This ensures that the split key will always be a real key, without requiring any checks in the splitter itself.

The updated `adminSplitWithDescriptor` is local only and requires opting into finding the first safe key by setting `findFirstSafeKey` to `true`. As such, all safe split key checks are also removed from the `split` pkg, with a warning added that any split key returned is unsafe.

Resolves: #103483
Release note (bug fix): It was possible for a SQL row to be split across two ranges. When this occurred, SQL queries could return unexpected errors. This bug is resolved by these changes, as we now inspect the real keys, rather than just request keys, to determine load-based split points.
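
For readers following along, here is a minimal sketch of the "first real key at or after the desired split key" idea, honoring the three conditions quoted in the review thread above. It is a toy model and not the CockroachDB implementation: keys are plain strings of the assumed form `/<table>/<row>/<family>`, and `rowPrefix` and `firstSafeSplitKey` are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rowPrefix strips the trailing "/<family>" segment from a toy key of the
// assumed form "/<table>/<row>/<family>", yielding the row's prefix.
func rowPrefix(key string) string {
	if i := strings.LastIndex(key, "/"); i > 0 {
		return key[:i]
	}
	return key
}

// firstSafeSplitKey (hypothetical) scans the sorted real keys in
// [start, end) and returns the first row boundary that is no less than
// desired and that lies beyond the first row, so that no row is torn and
// the left-hand side of the split is non-empty.
func firstSafeSplitKey(keys []string, start, end, desired string) (string, bool) {
	// Restrict the scan to keys within [start, end).
	lo := sort.SearchStrings(keys, start)
	hi := sort.SearchStrings(keys, end)
	keys = keys[lo:hi]
	if len(keys) == 0 {
		return "", false
	}
	firstRow := rowPrefix(keys[0])
	// Begin at the first real key at or after the desired split key.
	for _, k := range keys[sort.SearchStrings(keys, desired):] {
		p := rowPrefix(k)
		if p >= desired && p > firstRow {
			return p, true
		}
	}
	return "", false
}

func main() {
	keys := []string{"/t/r1/c0", "/t/r1/c1", "/t/r2/c0", "/t/r2/c1", "/t/r3/c0"}
	// The desired key falls in the middle of row r2, so the split moves
	// forward to the start of row r3.
	fmt.Println(firstSafeSplitKey(keys, "/t", "/u", "/t/r2/c1"))
}
```

In this model a sampled key that lands mid-row is never used directly; the split point is always derived from a key that actually exists in the range, which is the property the commit relies on.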
It was possible that a load-based split was suggested for `meta1`, which would call `MVCCFirstSplitKey` and panic, as the `meta1` start key `\Min` is local, whilst the `meta1` end key is global `0x02 0xff 0xff`.

Add a check that the start key is greater than the `meta1` end key before processing in `MVCCFirstSplitKey` to prevent the panic.

Note `meta1` would never be split regardless, as `storage.IsValidSplitKey` would fail after finding a split key. Also note that if the desired split key is a local key, the same problem doesn't exist, as the minimum split key would be used to seek the first split key instead.

Fixes: #104007
Release note: None
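
The guard described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual change: `canSeekFirstSplitKey` is a hypothetical name, keys are raw byte slices, and treating a start key equal to the `meta1` end key as acceptable is an assumption (the commit message says "greater than").

```go
package main

import (
	"bytes"
	"fmt"
)

// meta1End is the meta1 end key quoted in the commit message.
var meta1End = []byte{0x02, 0xff, 0xff}

// canSeekFirstSplitKey (hypothetical) reports whether a range's start key
// lies at or beyond the meta1 end key. Ranges starting below it, such as
// meta1 itself, are rejected before any split-key seek is attempted.
// Accepting equality is an assumption made for this sketch.
func canSeekFirstSplitKey(startKey []byte) bool {
	return bytes.Compare(startKey, meta1End) >= 0
}

func main() {
	fmt.Println(canSeekFirstSplitKey([]byte{}))          // empty start key, standing in for \Min
	fmt.Println(canSeekFirstSplitKey([]byte{0x03, 'a'})) // a key past the meta1 end key
}
```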
Force-pushed from 1e6cb0a to 7f93988.
Backport 3/3 commits from #103690 on behalf of @kvoli.
Backport 1/1 commits from #104082 on behalf of @kvoli.
/cc @cockroachdb/release
It was possible for a SQL row to be torn across two ranges due to the
load-based splitter not rejecting potentially unsafe split keys. It is
impossible to determine with just the sampled request keys, whether a
key is certainly unsafe or safe, so a split key is returned regardless of error.
This PR sidesteps this problem by re-using the `adminSplitWithDescriptor`
command to find the first real key, after or at the provided
`args.SplitKey`. This ensures that the split key will always be a real key
whilst not requiring any checks in the splitter itself.
The updated `adminSplitWithDescriptor` is local only and requires opting
into finding the first safe key by setting `findFirstSafeKey` to `true`.
As such, all safe split key checks are also removed from the `split`
pkg, with a warning added that any split key returned is unsafe.
Note that the weighted load-based split finder, used for CPU splits,
did not suffer from returning potentially unsafe splits due to e4f003b.
However, it was possible that no load-based split key was ever found
when using the weighted finder. This was because we discard potentially
unsafe samples, which could have been safe split points.
This PR reverts commit e4f003b, as the
safe split key is enforced elsewhere, mentioned above.
Resolves: #103483
Resolves: #103672
Release note (bug fix): It was possible for a SQL row to be split across
two ranges. When this occurred, SQL queries could return unexpected
errors. This bug is resolved by these changes, as we now sample the real
keys, rather than just request keys to determine load-based split points.
Release justification: Serious bug fix.