-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-20.1: sql: fix UPSERT in some cases #51944
Merged
yuzefovich
merged 1 commit into
cockroachdb:release-20.1
from
yuzefovich:backport51631-20.1
Jul 27, 2020
Merged
release-20.1: sql: fix UPSERT in some cases #51944
yuzefovich
merged 1 commit into
cockroachdb:release-20.1
from
yuzefovich:backport51631-20.1
Jul 27, 2020
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
`optTableUpserter.resultCount` tracks the number of rows in the current batch (and mirrors `rowsUpserted.Len()` if it's collecting rows). Previously, `resultCount` was never reset although `rowsUpserted` was. This resulted in a "unsynchronization" of `BatchedCount` and `BatchedValues` methods which - I think - could lead to incorrect results being returned by RETURNING clause of UPSERT statement (possibly attempting to request a "value" that is "out of bounds"). Release note: None
RaduBerinde
approved these changes
Jul 27, 2020
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)
yuzefovich
added a commit
to yuzefovich/cockroach
that referenced
this pull request
Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original bug was that if an UPSERT has RETURNING clause, we could have returned incorrect values because internally `resultCount` and `rowsUpserted` could be not synchronized. It was fixed by resetting `resultCount`, however, the resetting was done unconditionally. This is incorrect because `resultCount` variable is used by `upsertNode.BatchedNext` to indicate whether there is more work to do (and if `resultCount==0`, then we should finish). This bug would result in an UPSERT with or without RETURNING clause of more than 10k rows actually process only 10k and exit early. This is now fixed. Relatedly, an UPSERT with RETURNING clause would incorrectly return no rows when it was processing more than 10k rows. Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which exposed another bug when RETURNING clause is present - we were clearing `rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch` call), but that clear happens too early - we are accessing it after `BatchedNext` returns with `BatchedValues`. This would lead to an index out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we always had a single batch and `flushAndStartNewBatch` was never called (to reset the row container too early). This is also now fixed. Note that this second bug was impossible to run into because of the first one (we would never reach Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could finish UPSERT operation too early - namely, it would correctly insert only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with RETURNING clause in such scenario would return no rows (it would only process 10k rows but return 0 rows).
yuzefovich
added a commit
to yuzefovich/cockroach
that referenced
this pull request
Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original bug was that if an UPSERT has RETURNING clause, we could have returned incorrect values because internally `resultCount` and `rowsUpserted` could be not synchronized. It was fixed by resetting `resultCount`, however, the resetting was done unconditionally. This is incorrect because `resultCount` variable is used by `upsertNode.BatchedNext` to indicate whether there is more work to do (and if `resultCount==0`, then we should finish). This bug would result in an UPSERT with or without RETURNING clause of more than 10k rows actually process only 10k and exit early. This is now fixed. Relatedly, an UPSERT with RETURNING clause would incorrectly return no rows when it was processing more than 10k rows. Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which exposed another bug when RETURNING clause is present - we were clearing `rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch` call), but that clear happens too early - we are accessing it after `BatchedNext` returns with `BatchedValues`. This would lead to an index out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we always had a single batch and `flushAndStartNewBatch` was never called (to reset the row container too early). This is also now fixed. Note that this second bug was impossible to run into because of the first one (we would never reach Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could finish UPSERT operation too early - namely, it would correctly insert only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with RETURNING clause in such scenario would return no rows (it would only process 10k rows but return 0 rows).
yuzefovich
added a commit
to yuzefovich/cockroach
that referenced
this pull request
Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original bug was that if an UPSERT has RETURNING clause, we could have returned incorrect values because internally `resultCount` and `rowsUpserted` could be not synchronized. It was fixed by resetting `resultCount`, however, the resetting was done unconditionally. This is incorrect because `resultCount` variable is used by `upsertNode.BatchedNext` to indicate whether there is more work to do (and if `resultCount==0`, then we should finish). This bug would result in an UPSERT with or without RETURNING clause of more than 10k rows actually process only 10k and exit early. This is now fixed. Relatedly, an UPSERT with RETURNING clause would incorrectly return no rows when it was processing more than 10k rows. Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which exposed another bug when RETURNING clause is present - we were clearing `rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch` call), but that clear happens too early - we are accessing it after `BatchedNext` returns with `BatchedValues`. This would lead to an index out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we always had a single batch and `flushAndStartNewBatch` was never called (to reset the row container too early). This is also now fixed. Note that this second bug was impossible to run into because of the first one (we would never reach this code). Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could finish UPSERT operation too early - namely, it would correctly insert only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with RETURNING clause in such scenario would return no rows (it would only process 10k rows but return 0 rows).
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
optTableUpserter.resultCount
tracks the number of rows in the currentbatch (and mirrors
rowsUpserted.Len()
if it's collecting rows).Previously,
resultCount
was never reset althoughrowsUpserted
was.This resulted in a "unsynchronization" of
BatchedCount
andBatchedValues
methods which - I think - could lead to incorrectresults being returned by RETURNING clause of UPSERT statement (possibly
attempting to request a "value" that is "out of bounds").
This is a "backport" of the fix in #51631 without other cleanup stuff.
Release note: None