Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.1: sql: fix UPSERT in some cases #51944

Merged
merged 1 commit into from
Jul 27, 2020

Conversation

yuzefovich
Copy link
Member

optTableUpserter.resultCount tracks the number of rows in the current
batch (and mirrors rowsUpserted.Len() if it's collecting rows).
Previously, resultCount was never reset although rowsUpserted was.
This resulted in a "unsynchronization" of BatchedCount and
BatchedValues methods which - I think - could lead to incorrect
results being returned by RETURNING clause of UPSERT statement (possibly
attempting to request a "value" that is "out of bounds").

This is a "backport" of the fix in #51631 without other cleanup stuff.

Release note: None

`optTableUpserter.resultCount` tracks the number of rows in the current
batch (and mirrors `rowsUpserted.Len()` if it's collecting rows).
Previously, `resultCount` was never reset although `rowsUpserted` was.
This resulted in a "unsynchronization" of `BatchedCount` and
`BatchedValues` methods which - I think - could lead to incorrect
results being returned by RETURNING clause of UPSERT statement (possibly
attempting to request a "value" that is "out of bounds").

Release note: None
@yuzefovich yuzefovich requested a review from RaduBerinde July 27, 2020 21:21
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)

@yuzefovich yuzefovich merged commit 2250647 into cockroachdb:release-20.1 Jul 27, 2020
@yuzefovich yuzefovich deleted the backport51631-20.1 branch July 27, 2020 23:02
yuzefovich added a commit to yuzefovich/cockroach that referenced this pull request Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original
bug was that if an UPSERT has RETURNING clause, we could have returned
incorrect values because internally `resultCount` and `rowsUpserted`
could be not synchronized. It was fixed by resetting `resultCount`,
however, the resetting was done unconditionally. This is incorrect
because `resultCount` variable is used by `upsertNode.BatchedNext` to
indicate whether there is more work to do (and if `resultCount==0`, then
we should finish). This bug would result in an UPSERT with or without
RETURNING clause of more than 10k rows actually process only 10k and
exit early. This is now fixed.

Relatedly, an UPSERT with RETURNING clause would incorrectly return no
rows when it was processing more than 10k rows.

Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which
exposed another bug when RETURNING clause is present - we were clearing
`rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch`
call), but that clear happens too early - we are accessing it after
`BatchedNext` returns with `BatchedValues`. This would lead to an index
out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we
always had a single batch and `flushAndStartNewBatch` was never called
(to reset the row container too early). This is also now fixed. Note
that this second bug was impossible to run into because of the first one
(we would never reach

Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could
finish UPSERT operation too early - namely, it would correctly insert
only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with
RETURNING clause in such scenario would return no rows (it would only
process 10k rows but return 0 rows).
yuzefovich added a commit to yuzefovich/cockroach that referenced this pull request Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original
bug was that if an UPSERT has RETURNING clause, we could have returned
incorrect values because internally `resultCount` and `rowsUpserted`
could be not synchronized. It was fixed by resetting `resultCount`,
however, the resetting was done unconditionally. This is incorrect
because `resultCount` variable is used by `upsertNode.BatchedNext` to
indicate whether there is more work to do (and if `resultCount==0`, then
we should finish). This bug would result in an UPSERT with or without
RETURNING clause of more than 10k rows actually process only 10k and
exit early. This is now fixed.

Relatedly, an UPSERT with RETURNING clause would incorrectly return no
rows when it was processing more than 10k rows.

Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which
exposed another bug when RETURNING clause is present - we were clearing
`rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch`
call), but that clear happens too early - we are accessing it after
`BatchedNext` returns with `BatchedValues`. This would lead to an index
out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we
always had a single batch and `flushAndStartNewBatch` was never called
(to reset the row container too early). This is also now fixed. Note
that this second bug was impossible to run into because of the first one
(we would never reach

Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could
finish UPSERT operation too early - namely, it would correctly insert
only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with
RETURNING clause in such scenario would return no rows (it would only
process 10k rows but return 0 rows).
yuzefovich added a commit to yuzefovich/cockroach that referenced this pull request Sep 16, 2020
In cockroachdb#51944 I fixed a bug but introduced an even worse one. The original
bug was that if an UPSERT has RETURNING clause, we could have returned
incorrect values because internally `resultCount` and `rowsUpserted`
could be not synchronized. It was fixed by resetting `resultCount`,
however, the resetting was done unconditionally. This is incorrect
because `resultCount` variable is used by `upsertNode.BatchedNext` to
indicate whether there is more work to do (and if `resultCount==0`, then
we should finish). This bug would result in an UPSERT with or without
RETURNING clause of more than 10k rows actually process only 10k and
exit early. This is now fixed.

Relatedly, an UPSERT with RETURNING clause would incorrectly return no
rows when it was processing more than 10k rows.

Additionally, cockroachdb#51626 fixed a bug with pagination of UPSERTs which
exposed another bug when RETURNING clause is present - we were clearing
`rowsUpserted` in `BatchedNext` (as part of `flushAndStartNewBatch`
call), but that clear happens too early - we are accessing it after
`BatchedNext` returns with `BatchedValues`. This would lead to an index
out of bounds crush. Before cockroachdb#51626 there was no pagination done, so we
always had a single batch and `flushAndStartNewBatch` was never called
(to reset the row container too early). This is also now fixed. Note
that this second bug was impossible to run into because of the first one
(we would never reach this code).

Release note (bug fix): CockroachDB in 20.1.4 and 20.1.5 releases could
finish UPSERT operation too early - namely, it would correctly insert
only up to 10000 rows and ignoring the rest. Furthermore, an UPSERT with
RETURNING clause in such scenario would return no rows (it would only
process 10k rows but return 0 rows).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants