-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-20.1: sql: fix pagination in UPSERT #51626
Merged
yuzefovich
merged 2 commits into
cockroachdb:release-20.1
from
yuzefovich:backport20.1-51608
Jul 21, 2020
Merged
release-20.1: sql: fix pagination in UPSERT #51626
yuzefovich
merged 2 commits into
cockroachdb:release-20.1
from
yuzefovich:backport20.1-51608
Jul 21, 2020
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This commit removes a couple of duplicated "init" calls as well as some unused parameters around upsert code. Release note: None
`optTableUpserter` was incorrectly overriding `curBatchSize` method by returning `insertRows.Len`, but that container is not actually used anywhere. As a result, `curBatchSize` was always considered 0, and we didn't perform the pagination on the UPSERTs. The bug was introduced in 33339 (in 19.1.0). Release note (bug fix): Previously, CockroachDB could hit a "command is too large" error when performing UPSERT operation with many values. Internally, we attempt to perform such operation by splitting it into "batches", but the batching mechanism was broken.
RaduBerinde
approved these changes
Jul 21, 2020
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)
yuzefovich
added a commit
to yuzefovich/cockroach
that referenced
this pull request
Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original bug was that if an UPSERT has RETURNING clause, we could have returned incorrect values because internally `resultCount` and `rowsUpserted` could be not synchronized. It was fixed by resetting `resultCount`, however, the resetting was done unconditionally. This is incorrect because `resultCount` variable is used by `upsertNode.BatchedNext` to indicate whether there is more work to do (and if `resultCount==0`, then we should finish). This bug would result in an UPSERT with or without RETURNING clause of more than 10k rows actually process only 10k and exit early. This is now fixed. Relatedly, an UPSERT with RETURNING clause would incorrectly return no rows when it was processing more than 10k rows. Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which exposed another bug when RETURNING clause is present - we were clearing `rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch` call), but that clear happens too early - we are accessing it after `BatchedNext` returns with `BatchedValues`. This would lead to an index out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we always had a single batch and `flushAndStartNewBatch` was never called (to reset the row container too early). This is also now fixed. Note that this second bug was impossible to run into because of the first one (we would never reach Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could finish UPSERT operation too early - namely, it would correctly insert only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with RETURNING clause in such scenario would return no rows (it would only process 10k rows but return 0 rows).
yuzefovich
added a commit
to yuzefovich/cockroach
that referenced
this pull request
Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original bug was that if an UPSERT has RETURNING clause, we could have returned incorrect values because internally `resultCount` and `rowsUpserted` could be not synchronized. It was fixed by resetting `resultCount`, however, the resetting was done unconditionally. This is incorrect because `resultCount` variable is used by `upsertNode.BatchedNext` to indicate whether there is more work to do (and if `resultCount==0`, then we should finish). This bug would result in an UPSERT with or without RETURNING clause of more than 10k rows actually process only 10k and exit early. This is now fixed. Relatedly, an UPSERT with RETURNING clause would incorrectly return no rows when it was processing more than 10k rows. Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which exposed another bug when RETURNING clause is present - we were clearing `rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch` call), but that clear happens too early - we are accessing it after `BatchedNext` returns with `BatchedValues`. This would lead to an index out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we always had a single batch and `flushAndStartNewBatch` was never called (to reset the row container too early). This is also now fixed. Note that this second bug was impossible to run into because of the first one (we would never reach Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could finish UPSERT operation too early - namely, it would correctly insert only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with RETURNING clause in such scenario would return no rows (it would only process 10k rows but return 0 rows).
yuzefovich
added a commit
to yuzefovich/cockroach
that referenced
this pull request
Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original bug was that if an UPSERT has RETURNING clause, we could have returned incorrect values because internally `resultCount` and `rowsUpserted` could be not synchronized. It was fixed by resetting `resultCount`, however, the resetting was done unconditionally. This is incorrect because `resultCount` variable is used by `upsertNode.BatchedNext` to indicate whether there is more work to do (and if `resultCount==0`, then we should finish). This bug would result in an UPSERT with or without RETURNING clause of more than 10k rows actually process only 10k and exit early. This is now fixed. Relatedly, an UPSERT with RETURNING clause would incorrectly return no rows when it was processing more than 10k rows. Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which exposed another bug when RETURNING clause is present - we were clearing `rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch` call), but that clear happens too early - we are accessing it after `BatchedNext` returns with `BatchedValues`. This would lead to an index out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we always had a single batch and `flushAndStartNewBatch` was never called (to reset the row container too early). This is also now fixed. Note that this second bug was impossible to run into because of the first one (we would never reach this code). Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could finish UPSERT operation too early - namely, it would correctly insert only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with RETURNING clause in such scenario would return no rows (it would only process 10k rows but return 0 rows).
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport 2/2 commits from #51608.
/cc @cockroachdb/release
sql: minor cleanup around upsert
This commit removes a couple of duplicated "init" calls as well as some
unused parameters around upsert code.
Release note: None
sql: fix pagination in UPSERT
optTableUpserter
was incorrectly overridingcurBatchSize
method byreturning
insertRows.Len
, but that container is not actually usedanywhere. As a result,
curBatchSize
was always considered 0, and wedidn't perform the pagination on the UPSERTs. The bug was introduced in
#33339 (in 19.1.0).
Fixes: #51391.
Release note (bug fix): Previously, CockroachDB could hit a "command is
too large" error when performing UPSERT operation with many values.
Internally, we attempt to perform such operation by splitting it into
"batches", but the batching mechanism was broken.