sql: change UPSERT to not require pre-fetching everything upfront #23698
Note this currently implements "only 1 result returned by RETURNING" as discussed on the mailing list (option 1) and thus still precludes batching. The code would be further simplified and batchability reached if I could change it to "return the rows as they are processed". |
I have added a second commit that implements option 2 and explains in the commit message the semantics of RETURNING. RFAL |
I think once the dust settles on option 1 vs option 2 (I'm assuming option 2 will prevail), I can squash the two commits. Thoughts? |
Yeah I would squash. I have a few minor comments. Looks good overall but I'm pretty unfamiliar with this code so I would get another review. Review status: 0 of 3 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful. pkg/sql/tablewriter.go, line 294 at r2 (raw file):
[nit] indices pkg/sql/tablewriter.go, line 408 at r2 (raw file):
This 200+ line function is very hard to follow, can you think of a way to break things up a little? pkg/sql/tablewriter.go, line 528 at r2 (raw file):
excerpt pkg/sql/tablewriter.go, line 600 at r2 (raw file):
[nit] Using the pkg/sql/tablewriter.go, line 603 at r2 (raw file):
[nit] usually having the pkg/sql/tablewriter.go, line 606 at r2 (raw file):
This code is pretty hard to understand. We are (to some extent) using the map as a proxy for comparing pkg/sql/tablewriter.go, line 761 at r2 (raw file):
[nit] it would help to give these return values some names Comments from Reviewable |
The commit message says:
What are these upserts that were valid in PG? It was my understanding after your recent email thread that Postgres disallows this too. The following corroborates that understanding:
We're not planning on cherry-picking this into 2.0, are we? Reviewed 1 of 3 files at r1. pkg/sql/tablewriter.go, line 656 at r1 (raw file):
Not all your words, but we're pluralizing primary key in a few places here, which makes this confusing to read. Please singularize, like pkg/sql/tablewriter.go, line 659 at r1 (raw file):
I'd revert the move from pkg/sql/tablewriter.go, line 691 at r1 (raw file):
Add a comment here. pkg/sql/tablewriter.go, line 749 at r1 (raw file):
This is redundant. Also, you should name these return values and then refer to them by name in this comment. EDIT: just noticed Radu's comment below. |
Indeed. I wrote the commit message before I understood that pg also rejects this. Thanks for spotting it. I will respond to your other comments and Radu's after I simplify the code as requested. Review status: 1 of 3 files reviewed at latest revision, 11 unresolved discussions, all commit checks successful. |
Force-pushed from b30f4f3 to 5a7da37.
I have squashed the commits and further clarified the code by:
RFAL. To the other question:
This discussion has been punted until after this PR merges (if/when it does). Review status: 1 of 3 files reviewed at latest revision, 11 unresolved discussions. pkg/sql/tablewriter.go, line 656 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/sql/tablewriter.go, line 659 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/sql/tablewriter.go, line 691 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/sql/tablewriter.go, line 749 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/sql/tablewriter.go, line 294 at r2 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/tablewriter.go, line 408 at r2 (raw file): Previously, RaduBerinde wrote…
Absolutely! Done, also made the naming of variables more consistent. pkg/sql/tablewriter.go, line 528 at r2 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/tablewriter.go, line 600 at r2 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/tablewriter.go, line 603 at r2 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/tablewriter.go, line 606 at r2 (raw file): Previously, RaduBerinde wrote…
I guess we could, but there's no simple operator (that I know of) to compare two byte slices in Go. Want to make a suggestion? pkg/sql/tablewriter.go, line 761 at r2 (raw file): Previously, RaduBerinde wrote…
Done. |
Friendly request to look at this this week, so that I can enjoy my vacation without fearing some merge conflicts during a rebase after I come back. |
pkg/sql/tablewriter.go, line 606 at r2 (raw file): Previously, knz (kena) wrote…
bytes.Compare? |
pkg/sql/tablewriter.go, line 606 at r2 (raw file): Previously, RaduBerinde wrote…
even better, bytes.Equal |
Review status: 0 of 3 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful. pkg/sql/tablewriter.go, line 606 at r2 (raw file): Previously, RaduBerinde wrote…
TIL. Thanks! I didn't know about the Now, that said, let's reconsider in context. The Once we've checked the map, we have an index. We also have the index from the row we're inserting. I agree that these two indexes are a "proxy" for the PK, but since they are there anyway, isn't it preferable for efficiency to just compare that? These two vars have at most 4-8 bytes, whereas a PK can be arbitrarily long. |
pkg/sql/tablewriter.go, line 606 at r2 (raw file): Previously, knz (kena) wrote…
Sure. Still, I would separate the deletion of the old PK from the two cases of updating existingRows. For example, have a |
Thanks for the update! Just some nits; still need to make a more in-depth pass. Review status: 0 of 3 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful. pkg/sql/tablewriter.go, line 443 at r3 (raw file):
we determined or it was determined pkg/sql/tablewriter.go, line 447 at r3 (raw file):
nit: blank line pkg/sql/tablewriter.go, line 470 at r3 (raw file):
[nit] This combo of elses and continue is confusing. I would separate this out and check for this first, just to get it out of the way (even if the condition becomes pkg/sql/tablewriter.go, line 484 at r3 (raw file):
[nit] """? you could also just indent the quote two spaces Comments from Reviewable |
Review status: 0 of 3 files reviewed at latest revision, 9 unresolved discussions. pkg/sql/tablewriter.go, line 606 at r2 (raw file): Previously, RaduBerinde wrote…
Oooh now I understand what you mean. Thank you! pkg/sql/tablewriter.go, line 443 at r3 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/tablewriter.go, line 447 at r3 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/tablewriter.go, line 470 at r3 (raw file): Previously, RaduBerinde wrote…
Done. pkg/sql/tablewriter.go, line 484 at r3 (raw file): Previously, RaduBerinde wrote…
Done. |
Review status: 0 of 3 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful. pkg/sql/tablewriter.go, line 606 at r2 (raw file): Previously, knz (kena) wrote…
Thanks, this is much easier to understand! pkg/sql/tablewriter.go, line 419 at r4 (raw file):
"each PK" is not right is it? It's just those PKs that have conflicts pkg/sql/tablewriter.go, line 660 at r4 (raw file):
indices |
**tl;dr:**
- lifts limitation, slight perf boost.
- clarifies the semantics of RETURNING when the same PK is updated multiple times.
- Initially motivated by work on a separate patch to introduce semantically correct behavior for mutation statements.

Details follow.

Prior to this patch, UPSERT with secondary indexes and INSERT ... ON CONFLICT DO UPDATE SET ... would proceed as follows:

- it would load all the inserted rows into memory.
- it would pre-determine in one scan which of these already have data in KV (a conflict).
- after this initialization, it would then proceed to resolve the upsert, by either inserting new rows (for each source row for which no conflict was seen) or updating an existing row (for each source row where there was a conflict).

This situation caused three unrelated problems.

1) The first problem is that if there are many rows, the KV batch could blow up and fail / make everything slow. In comparison, we like the DELETE logic that was recently modified to operate using multiple short(er) batches. The UPSERT code should be able to evolve to do the same.

2) The second problem is that this code is unable to process two subsequent upserts to the same PK. For example, suppose that a table has no data to start with, and the following statement is issued:

    INSERT INTO kv(k) VALUES (1), (1) ON CONFLICT(k) DO UPDATE SET v = 2

The initial processing described above would determine that there is no conflict for either value being inserted. Then after this initialization, processing would start. The first iteration would see there was no conflict, and insert a fresh row. The second iteration would see there was no conflict (again, the conflict information was computed initially), and (try to) insert a fresh row and fail with an error (row already exists).

Because the conflict detection is done initially, it becomes impossible to properly upsert multiple rows with the same PK.
Even worse, if the conflicting index is a secondary index, it is possible for UPSERT to complete without error but without processing the upsert properly, which is a plain correctness bug.

To avoid both this latter bug and the conflict error, the previous code was equipped with a detection mechanism to reject multiple updates to the same PK. This check was necessary for correctness, but it was unfortunate because it rejects upserts that seem useful from a UX perspective.

3) The third problem is that, for consistency with the check required by problem 2, the "fast path" of UPSERT was equipped with a check for the same condition. This is also rather unfortunate because the fast path uses different code that is not otherwise subject to the correctness problem above and could already support multiple updates to the same PK outright. The check was thus there only for UX consistency, so the user sees the same error regardless of which code path is used.

This patch aims to resolve problem 1, and ends up solving problems 2 and 3 as a side effect. It changes the main processing to be like this:

- like before, it loads the rows and pre-loads the existing row information during the initialization.
- however, during the main loop, any insert or update feeds back the new row into the existing row information, so that subsequent iterations can access the updated data.

Because this new processing enables incremental upsert resolution, this resolves problem 1: the processing can now operate in smaller batches. It also does so without the risk of correctness problems, by feeding earlier updates into later updates. So the check for duplicates can be removed and the 2nd problem above is resolved as well.

And then since the slow/general code doesn't need this check, the fast path code doesn't need it either any more (it was only implemented on both sides for consistency). So this patch removes the check on the fast path too and problem 3 is resolved as well.
Very small performance boost at large row counts:

```
name                                old time/op    new time/op    delta
SQL/Cockroach/Upsert/count=1-16        833µs ± 9%     805µs ± 4%     ~     (p=0.518 n=5+9)
SQL/Cockroach/Upsert/count=10-16      1.09ms ± 1%    1.09ms ± 2%     ~     (p=1.000 n=5+10)
SQL/Cockroach/Upsert/count=100-16     3.99ms ± 3%    3.77ms ± 9%     ~     (p=0.099 n=5+10)
SQL/Cockroach/Upsert/count=1000-16    28.0ms ±12%    25.4ms ± 4%   -9.47%  (p=0.003 n=5+8)

name                                old alloc/op   new alloc/op   delta
SQL/Cockroach/Upsert/count=1-16        113kB ± 1%     112kB ± 1%     ~     (p=0.206 n=5+10)
SQL/Cockroach/Upsert/count=10-16       155kB ± 1%     155kB ± 1%     ~     (p=0.768 n=5+10)
SQL/Cockroach/Upsert/count=100-16      716kB ± 2%     716kB ± 2%     ~     (p=0.859 n=5+10)
SQL/Cockroach/Upsert/count=1000-16    6.31MB ± 1%    6.29MB ± 1%     ~     (p=0.594 n=5+10)

name                                old allocs/op  new allocs/op  delta
SQL/Cockroach/Upsert/count=1-16        1.02k ± 1%     0.99k ± 1%   -2.32%  (p=0.001 n=5+10)
SQL/Cockroach/Upsert/count=10-16       1.39k ± 3%     1.35k ± 5%     ~     (p=0.218 n=5+10)
SQL/Cockroach/Upsert/count=100-16      5.39k ± 4%     5.23k ±15%     ~     (p=0.129 n=5+10)
SQL/Cockroach/Upsert/count=1000-16     45.7k ± 6%     43.5k ±11%     ~     (p=0.129 n=5+10)
```

This patch also makes RETURNING on an upserting operation return all the rows in the data source that have caused a database update. If multiple rows in the data source cause an update to the same row in the database, there will be a RETURNING result for each update separately.

This _extends_ the behavior of PostgreSQL. Postgres merely [specifies](https://www.postgresql.org/docs/10/static/sql-insert.html):

> The optional RETURNING clause causes INSERT to compute and return
> value(s) based on each row actually inserted (or updated, if an ON
> CONFLICT DO UPDATE clause was used). This is primarily useful for
> obtaining values that were supplied by defaults, such as a serial
> sequence number. However, any expression using the table's columns
> is allowed. The syntax of the RETURNING list is identical to that of
> the output list of SELECT.
> Only rows that were successfully inserted or updated will be
> returned. For example, if a row was locked but not updated because
> an ON CONFLICT DO UPDATE ... WHERE clause condition was not
> satisfied, the row will not be returned.

However Postgres does not support multiple updates to the same row. The previous patch introduced that support; this patch clarifies and strengthens the semantics of RETURNING in that case.

For example, starting with:

```sql
CREATE TABLE kv(k INT PRIMARY KEY, v INT);
```

The following results are returned:

```sql
> UPSERT INTO kv(k) VALUES (1), (1), (2) RETURNING k,v;
+---+------+
| k |  v   |
+---+------+
| 1 | NULL |
| 2 | NULL |
+---+------+
(2 rows)
```

The data source specifies only k, so this query leaves `v` unchanged. The first upsert of `(1)` creates a new row `(1, NULL)`; the second upsert of `(1)` doesn't change anything, so doesn't produce a RETURNING row; the last upsert of `(2)` adds a new row so is reported.

Another example:

```sql
> UPDATE kv SET v = 1 WHERE k = 1;
> INSERT INTO kv(k) VALUES (1), (5) ON CONFLICT(k) DO UPDATE SET v = kv.v + 1 RETURNING k,v;
+---+------+
| k |  v   |
+---+------+
| 1 |  2   |
| 5 | NULL |
+---+------+
(2 rows)
```

Both rows caused an update; RETURNING always reflects the values post-update.

A more subtle situation:

```sql
> UPDATE kv SET v = 1 WHERE k = 1;
> UPSERT INTO kv VALUES (1), (1) RETURNING k,v;
+---+------+
| k |  v   |
+---+------+
| 1 | NULL |
| 1 | NULL |
+---+------+
(2 rows)
```

The query does not specify the set of target columns, so it is an alias for `UPSERT INTO kv(k,v) VALUES (1, DEFAULT), (1, DEFAULT)`. The DEFAULT value for `v` is NULL. So this query overwrites `v` for the first row and that row is returned. The second row is also an overwrite, so it is also returned.
```sql
> INSERT INTO kv VALUES (1), (4) ON CONFLICT(k) DO NOTHING RETURNING k,v;
+---+------+
| k |  v   |
+---+------+
| 4 | NULL |
+---+------+
(1 row)
```

The query does not specify the set of target columns, so it is an alias for `INSERT INTO kv(k,v) VALUES (1,DEFAULT), (4,DEFAULT) ON CONFLICT (k) DO NOTHING`. The conflict on the first row prevents the update, so the RETURNING results don't contain it.

Release note (sql change): lifted a limitation that prevented UPSERTs or INSERT ON CONFLICT DO UPDATE over multiple values with the same primary key.

Release note (sql change): The behavior of `UPSERT` and `INSERT ... ON CONFLICT` when a `RETURNING` clause is present is made more consistent when an update touches the same row twice or more. This is a CockroachDB-specific extension.
Review status: 0 of 3 files reviewed at latest revision, 7 unresolved discussions, all commit checks successful. pkg/sql/tablewriter.go, line 419 at r4 (raw file): Previously, RaduBerinde wrote…
Yes correct, fixed. pkg/sql/tablewriter.go, line 660 at r4 (raw file): Previously, RaduBerinde wrote…
Done. |
Okay this is ready! Thank you for your insightful and astute remarks, recommendations and requests. I have learned a bunch here. |
This commit was pulled out of #23373 to ease reviewing.
Fixes #23660.
tl;dr: lifts limitation, slight perf boost.
Change required by a separate patch (#23373) that introduces
semantically correct behavior for mutation statements.
Prior to this patch, UPSERT with secondary indexes and INSERT ... ON
CONFLICT DO UPDATE SET ... would proceed as follows:

- it would load all the inserted rows into memory.
- it would pre-determine in one scan which of these
  already have data in KV (a conflict).
- after this initialization, it would then proceed to resolve the
  upsert, by either inserting new rows (for each source row for which
  no conflict was seen) or updating an existing row (for each source
  row where there was a conflict).
This situation caused three unrelated problems.
1) The first problem is that if there are many rows, the KV batch could
blow up and fail / make everything slow. In comparison, we like the
DELETE logic that was recently modified to operate using multiple
short(er) batches. The UPSERT code should be able to evolve to do the
same.
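The shorter-batch idea borrowed from the DELETE logic can be sketched roughly as follows. This is a hypothetical illustration, not the actual KV batching code; the function name, the `batchSize` parameter, and the `flush` callback are all made up:

```go
package main

import "fmt"

// flushInBatches splits pending writes into fixed-size chunks so that
// a single statement never accumulates one huge KV batch.
func flushInBatches(rows []string, batchSize int, flush func(batch []string)) {
	for len(rows) > 0 {
		n := batchSize
		if n > len(rows) {
			n = len(rows) // final, possibly short, batch
		}
		flush(rows[:n])
		rows = rows[n:]
	}
}

func main() {
	// Five pending rows flushed two at a time: batches of 2, 2, 1.
	flushInBatches([]string{"a", "b", "c", "d", "e"}, 2, func(b []string) {
		fmt.Println("flushing", b)
	})
}
```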
2) The second problem is that this code is unable to process two
subsequent upserts to the same PK. For example, suppose that a table
has no data to start with, and the following statement is issued:

    INSERT INTO kv(k) VALUES (1), (1) ON CONFLICT(k) DO UPDATE SET v = 2
The initial processing described above would determine that there is
no conflict for either value being inserted. Then after this
initialization, processing would start. The first iteration would see
there was no conflict, and insert a fresh row. The second iteration
would see there was no conflict (again, the conflict information was
computed initially), and (try to) insert a fresh row and fail with an
error (row already exists).
Because the conflict detection is done initially, it becomes impossible
to properly upsert multiple rows with the same PK.
Even worse, if the conflicting index is a secondary index, it is
possible for UPSERT to complete without error but without processing
the upsert properly, which is a plain correctness bug.
To avoid both this latter bug and the conflict error, the previous
code was equipped with a detection mechanism to reject multiple
updates to the same PK. This check was necessary for correctness, but
it was unfortunate because it rejects upserts that seem useful from
a UX perspective.
3) The third problem is that, for consistency with the check required by problem 2, the "fast
path" of UPSERT was equipped with a check for the same
condition. This is also rather unfortunate because the fast path
uses different code that is not otherwise subject to the
correctness problem above and could already support multiple
updates to the same PK outright. The check was there thus only for
UX consistency, so the user sees the same error regardless of which
code path is used.
This patch aims to resolve problem 1, and ends up solving problems 2
and 3 as a side effect. It changes the main processing to be like
this:

- like before, it loads the rows and pre-loads the existing
  row information during the initialization.
- however, during the main loop, any insert or update feeds back
  the new row into the existing row information, so that
  subsequent iterations can access the updated data.
Because this new processing enables incremental upsert resolution,
this resolves problem 1, because now the processing can operate
in smaller batches. (This change will be performed separately in #23373)
Also it does so without the risk of correctness problems, by feeding
earlier updates into later updates. So the check for duplicates can be
removed and the 2nd problem above is resolved as well.
And then since the slow/general code doesn't need this check, the fast
path code doesn't need it either any more (it was only implemented on
both sides for consistency). So this patch removes the check on the
fast path too and problem 3 is resolved as well.
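The feed-back loop described above can be sketched with a plain map standing in for the pre-fetched existing-row information. This is a simplified, hypothetical model (the names `row`, `upsertAll`, and the in-memory map are inventions for illustration, not the tablewriter code; among other simplifications it reports every source row, whereas the real code suppresses no-op updates for RETURNING):

```go
package main

import "fmt"

// row models a KV row with primary key k and value v.
type row struct{ k, v int }

// upsertAll sketches the revised UPSERT loop: existing rows are
// pre-loaded into a map keyed by PK, and every insert or update is
// written back into the map, so a later source row with the same PK
// sees the earlier result instead of the stale pre-fetched state.
func upsertAll(existing map[int]row, source []row, update func(old, next row) row) []row {
	var processed []row
	for _, src := range source {
		if old, conflict := existing[src.k]; conflict {
			merged := update(old, src)
			existing[src.k] = merged // feed the update back
			processed = append(processed, merged)
		} else {
			existing[src.k] = src // feed the insert back
			processed = append(processed, src)
		}
	}
	return processed
}

func main() {
	existing := map[int]row{} // empty table to start with
	src := []row{{k: 1}, {k: 1}, {k: 2}}
	// The second {k: 1} is seen as a conflict against the first one's
	// insert, instead of failing with "row already exists".
	out := upsertAll(existing, src, func(old, next row) row { return next })
	fmt.Println(out)
}
```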
Very small performance boost at large row counts:
Release note (sql change): lifted a limitation that prevented UPSERTs
or INSERT ON CONFLICT DO UPDATE over multiple values with the same
primary key.