opt: Add optimizer support for UPSERT and INSERT..ON CONFLICT statements #33339

andy-kimball · 2018-12-23T23:39:04Z

This commit adds support for three additional mutation operator cases:

UPSERT statement
INSERT..ON CONFLICT DO NOTHING
INSERT..ON CONFLICT..DO UPDATE

Each of these has overlapping functionality with INSERT, UPDATE, and
with each other, but also have important differences. Similar to the
other mutation operators, the Upsert operator accepts a relational
input expression that projects columns that will be used as part of
executing the mutation:

Columns containing existing values fetched from the target table
and used to detect conflicts and to formulate the key/value update
commands.
Columns containing updated values to set when a conflict is
detected, as specified by the user.
Computed columns which will be updated when a conflict is detected
and that are dependent on one or more updated columns.

For example, if this is the schema and INSERT..ON CONFLICT statement:

CREATE TABLE abc (a INT PRIMARY KEY, b INT, c INT)
INSERT INTO abc VALUES (1, 2) ON CONFLICT (a) DO UPDATE SET b=10

Then an input expression equivalent to this would be built:

SELECT ins_a, ins_b, ins_c, fetch_a, fetch_b, fetch_c, 10 AS upd_b
FROM (VALUES (1, 2, NULL)) AS ins(ins_a, ins_b, ins_c)
LEFT OUTER JOIN abc AS fetch(fetch_a, fetch_b, fetch_c)
ON ins_a = fetch_a

At runtime, the Upsert execution operator will test the fetch_a column.
If it is null, then there is no existing row, so the operator inserts a
new row using the (ins_a, ins_b, ins_c) columns. Otherwise, the operator
formulates an update using the (upd_b) column, along with any needed
fetch columns.

The CBO Upsert support is disabled by default behind a feature flag:
experimental_optimizer_updates.

cockroach-teamcity · 2018-12-23T23:39:14Z

This change is

knz

First, second and last commit LGTM modulo some nits and a missing release note.

There is a change in the left joiner exec code in the last commit which I do not understand - does it belong there? It probably merits explanation in the commit message.

Also I'd like to understand (preferably from the commit/PR description) what your view is on the transitory period where the two implementations of UPSERT have different behavior:

different conflict handling
different fast path

Which one is going to be the default? If the CBO becomes the default this needs to be communicated amply to beta users. If not we need to have a path to exercise it thoroughly.

Reviewed 10 of 10 files at r1, 8 of 8 files at r2, 26 of 27 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained

pkg/sql/plan.go, line 369 at r6 (raw file):

	p.curPlan = planTop{AST: stmt.AST}

	//s := stmt.String()

nit: stray comments

pkg/sql/tablewriter_upsert_opt.go, line 214 at r6 (raw file):

			rowIndex, ok = tu.ru.FetchColIDtoRowIndex[colID]
			if !ok {
				panic("no existing value is available for column")

return pgerror.NewAssertionErrorf(...)

(this will go to telemetry where we can pick it up without bothering users with dead nodes)

pkg/sql/logictest/testdata/logic_test/check_constraints, line 157 at r6 (raw file):

# only checks constraints if an insert or update actually occurs. In this case,
# the DO NOTHING clause skips the update, so there is no need to check the
# constraint.

Can you check whether pg does the same or different
In either case, document the behavior change in a release note.

pkg/sql/logictest/testdata/logic_test/computed, line 734 at r6 (raw file):


# Upsert -> update
# NOTE: The CBO cannot show the name of the computed column in the error message

Please remove the explanatory comment from here and instead add it to #33345.

pkg/sql/logictest/testdata/logic_test/privileges_table, line 245 at r6 (raw file):

# Grant SELECT and INSERT privilege.
# TODO(andyk): The SELECT privilege is required by the optimizer to fetch
# existing columns. This matches PG behavior, but differs from planner behavior.

This explanatory comment is not needed because we'll fix #33346.

pkg/sql/logictest/testdata/logic_test/txn, line 942 at r6 (raw file):

UPDATE kv SET v = 'foo'

# The planner uses INSERT, the optimizer uses UPSERT; accept both for now.

s/planner/heuristic planner/

pkg/sql/logictest/testdata/logic_test/upsert_opt, line 1 at r6 (raw file):

# LogicTest: local-opt fakedist-opt

Could you paste a diff of the two files in the commit message, so that our future selves can satisfy themselves nothing important was lost in the copy/paste.

pkg/sql/opt/exec/execbuilder/relational_builder.go, line 1198 at r6 (raw file):

	//
	// TODO(andyk): Using ensureColumns here can result in an extra Render.
	// Upgrade execution engine to not require this.

Can you spell out the required change? Better yet spell it out in a linked issue in the TODO text.

rytaft

Nice work! Mostly minor comments below...

Reviewed 10 of 10 files at r1, 8 of 8 files at r2, 3 of 3 files at r3, 9 of 9 files at r4, 16 of 16 files at r5, 27 of 27 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained

pkg/sql/opt_catalog.go, line 509 at r2 (raw file):

}

// IsIndex is part of the cat.Index interface.

IsIndex -> IsUnique

pkg/sql/logictest/testdata/logic_test/upsert_opt, line 1 at r6 (raw file):

Previously, knz (kena) wrote…

Could you paste a diff of the two files in the commit message, so that our future selves can satisfy themselves nothing important was lost in the copy/paste.

It would also help to add a comment in this file explaining what's different

pkg/sql/opt/ops/statement.opt, line 127 at r5 (raw file):

# into a target table. If a conflicting row already exists, then Upsert will
# instead update the existing row. The Upsert operator uses insertion, existing,
# and new values constructed by the input to insert rows, update indexes, and to

I think this sentence is trying to do a bit too much... it would help to break this up and explain the difference between an "insertion" value and a "new" value.

pkg/sql/opt/optbuilder/insert.go, line 32 at r5 (raw file):

// excludedTableName is the name of a special Upsert data source containing
// columns that could not be inserted due to a conflict with an existing row:

do you mean "rows that could not be inserted..."?

pkg/sql/opt/optbuilder/insert.go, line 474 at r5 (raw file):

	//   2. Assign name to each column
	//   3. Add id of each column to the insertColList
	mb.insertColList = make(opt.ColList, cap(mb.targetColList))

Wasn't this already done at the top of the function?

pkg/sql/opt/optbuilder/insert.go, line 533 at r5 (raw file):

// filter that discards rows that have a conflict (by checking a not-null table
// column to see if it was null-extended by the left join). See the comment
// header for buildInsert for an example.

It would be good to specify Builder.buildInsert to differentiate from mutationBuilder.buildInsert

pkg/sql/opt/optbuilder/insert.go, line 538 at r5 (raw file):

	var conflictIndex cat.Index
	if len(onConflict.Columns) != 0 {
		// Check that the ON CONFLICT columns reference at most one target row.

This is a bit confusing - how can columns reference at most one row?

pkg/sql/opt/optbuilder/insert.go, line 556 at r5 (raw file):

		// If conflict columns were explicitly specified, then only check for a
		// conflict one a single index. Otherwise, check on all indexes.

one -> on

pkg/sql/opt/optbuilder/insert.go, line 580 at r5 (raw file):

		// Remember the column ID of a scan column that is not null. This will be
		// used to detect whether a conflict was detected for a row.
		notNullColID := scanScope.cols[findNotNullIndexCol(index)].id

would be good to explain why this always exists and no canary col is necessary

pkg/sql/opt/optbuilder/insert.go, line 602 at r5 (raw file):

		mb.outScope.expr = mb.b.factory.ConstructProject(
			mb.b.factory.ConstructSelect(
				mb.b.factory.ConstructLeftJoin(

Why do you need one left join for each unique index? It seems like you could get away with a single left join, with the join conditions for different indexes OR-ed together.

pkg/sql/opt/optbuilder/insert.go, line 644 at r5 (raw file):

	}

	// Build the right side of the left outer join. Include mutation columns

Seems like a lot of this code is duplicated from the function above. Is there any way to consolidate?

pkg/sql/opt/optbuilder/insert.go, line 735 at r5 (raw file):

	if len(insertCols) != 0 {
		for _, name := range insertCols {
			if ord := cat.FindTableColumnByName(mb.tab, name); ord != -1 {

do you need to throw an error if the column is not found?

pkg/sql/opt/optbuilder/insert.go, line 796 at r5 (raw file):

	}
	panic(builderError{errors.New(
		"there is no unique or exclusion constraint matching the ON CONFLICT specification")})

Is there a pgcode for this error (since you said Postgres also requires this)?

pkg/sql/opt/optbuilder/testdata/upsert, line 258 at r5 (raw file):

                └── const: 1 [type=int]

# Use RETURNING INSERT..ON CONFLICT as a FROM clause.

This example doesn't have ON CONFLICT

andy-kimball

Regarding the default, I already have a note in the commit message:

The CBO Upsert support is disabled by default behind a feature flag:
experimental_optimizer_updates.

I will only make it the default once the equivalent fast path(s) are in place. In addition, I believe your separate PR has changed the conflict handling to be equivalent, so that shouldn't be a problem either.

Reviewable status: complete! 0 of 0 LGTMs obtained

pkg/sql/opt_catalog.go, line 509 at r2 (raw file):

Previously, rytaft wrote…

IsIndex -> IsUnique

Done.

pkg/sql/tablewriter_upsert_opt.go, line 214 at r6 (raw file):

Previously, knz (kena) wrote…

return pgerror.NewAssertionErrorf(...)

(this will go to telemetry where we can pick it up without bothering users with dead nodes)

Ah, nice. Done.

pkg/sql/logictest/testdata/logic_test/check_constraints, line 157 at r6 (raw file):

Previously, knz (kena) wrote…

Can you check whether pg does the same or different

In either case, document the behavior change in a release note.

PG behaves the same way that the HP does. Note that I expect this difference in behavior to be temporary, until I'm able to incorporate check constraints into the CBO. I added a bit to the comment:

# TODO(andyk): The CBO behaves differently than the HP and PG in this case. It
# only checks constraints if an insert or update actually occurs. In this case,
# the DO NOTHING clause skips the update, so there is no need to check the
# constraint. Once the CBO incorporates check constraints, this difference in
# behavior should go away.

Since the CBO is off-by-default, there's no need for a release note.

pkg/sql/logictest/testdata/logic_test/computed, line 734 at r6 (raw file):

Previously, knz (kena) wrote…

Please remove the explanatory comment from here and instead add it to #33345.

How would that issue address this difference in behavior between the CBO and HP? The CBO implements computed columns as projections, and they get evaluated as part of the larger SQL expression. Adding context fields to the pgerror won't have any effect on how computed columns work in the CBO.

pkg/sql/logictest/testdata/logic_test/txn, line 942 at r6 (raw file):

Previously, knz (kena) wrote…

s/planner/heuristic planner/

Done.

pkg/sql/opt/exec/execbuilder/relational_builder.go, line 1198 at r6 (raw file):

Previously, knz (kena) wrote…

Can you spell out the required change? Better yet spell it out in a linked issue in the TODO text.

I don't have more detail on the changes that will need to be made. The TODO is there to remind us to delve more deeply into this when we have time.

pkg/sql/opt/ops/statement.opt, line 127 at r5 (raw file):

Previously, rytaft wrote…

I think this sentence is trying to do a bit too much... it would help to break this up and explain the difference between an "insertion" value and a "new" value.

I got rid of the sentence altogether. There is a ton of other information already in MutationPrivate comments, as well as Builder.buildInsert comments.

pkg/sql/opt/optbuilder/insert.go, line 32 at r5 (raw file):

Previously, rytaft wrote…

do you mean "rows that could not be inserted..."?

I rewrote the sentence.

pkg/sql/opt/optbuilder/insert.go, line 474 at r5 (raw file):

Previously, rytaft wrote…

Wasn't this already done at the top of the function?

Good catch. Removed.

pkg/sql/opt/optbuilder/insert.go, line 533 at r5 (raw file):

Previously, rytaft wrote…

It would be good to specify Builder.buildInsert to differentiate from mutationBuilder.buildInsert

Done.

pkg/sql/opt/optbuilder/insert.go, line 538 at r5 (raw file):

Previously, rytaft wrote…

This is a bit confusing - how can columns reference at most one row?

Added a bit of extra text to clarify.

pkg/sql/opt/optbuilder/insert.go, line 556 at r5 (raw file):

Previously, rytaft wrote…

one -> on

Done.

pkg/sql/opt/optbuilder/insert.go, line 580 at r5 (raw file):

Previously, rytaft wrote…

would be good to explain why this always exists and no canary col is necessary

Done.

pkg/sql/opt/optbuilder/insert.go, line 602 at r5 (raw file):

Previously, rytaft wrote…

Why do you need one left join for each unique index? It seems like you could get away with a single left join, with the join conditions for different indexes OR-ed together.

It's possible to create a big predicate that does all the analysis, but then we'd select a bad plan for it. We'd end up just doing a primary index scan, and evaluate the big predicate on each row. Instead, we need to do a series of lookup joins (that's what the current Upsert code does).

pkg/sql/opt/optbuilder/insert.go, line 644 at r5 (raw file):

Previously, rytaft wrote…

Seems like a lot of this code is duplicated from the function above. Is there any way to consolidate?

There are definitely similarities, but there are also important differences that make it painful to consolidate. This is what I arrived at after trying out various ways of consolidating. I don't know how to make it better without making it harder to understand.

pkg/sql/opt/optbuilder/insert.go, line 735 at r5 (raw file):

Previously, rytaft wrote…

do you need to throw an error if the column is not found?

They should always exist, because the existence of insertCols has already been checked. I changed the code to get rid of the unnecessary if and added a comment.

pkg/sql/opt/optbuilder/insert.go, line 796 at r5 (raw file):

Previously, rytaft wrote…

Is there a pgcode for this error (since you said Postgres also requires this)?

Possibly, but I'm trying to match the behavior of the existing code as closely as possible, and it doesn't use a PG error code.

pkg/sql/opt/optbuilder/testdata/upsert, line 258 at r5 (raw file):

Previously, rytaft wrote…

This example doesn't have ON CONFLICT

Done.

pkg/sql/logictest/testdata/logic_test/privileges_table, line 245 at r6 (raw file):

Previously, knz (kena) wrote…

This explanatory comment is not needed because we'll fix #33346.

Done.

pkg/sql/logictest/testdata/logic_test/upsert_opt, line 1 at r6 (raw file):

Previously, rytaft wrote…

It would also help to add a comment in this file explaining what's different

The main reason for adding this file was due to differences in conflict handling between the CBO and HP, specifically in the case where the same row is inserted/updated multiple times in the same statement. However, after discussion, Raphael and I have agreed that we should not make any semantic guarantees in that case, and should instead require users to use DISTINCT ON to remove any duplicates:

cockroachdb/docs#4215

Therefore, I've updated the upsert logictest to reflect this decision, and I've removed upsert_opt altogether.

knz

I think a release note should also mention the existence of this flag (I don't recall seeing such a release note earlier in the first PR where you added it).

Reviewed 3 of 62 files at r7, 3 of 25 files at r12.
Reviewable status: complete! 0 of 0 LGTMs obtained

pkg/sql/logictest/testdata/logic_test/check_constraints, line 157 at r6 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

PG behaves the same way that the HP does. Note that I expect this difference in behavior to be temporary, until I'm able to incorporate check constraints into the CBO. I added a bit to the comment:
# TODO(andyk): The CBO behaves differently than the HP and PG in this case. It
# only checks constraints if an insert or update actually occurs. In this case,
# the DO NOTHING clause skips the update, so there is no need to check the
# constraint. Once the CBO incorporates check constraints, this difference in
# behavior should go away.
Since the CBO is off-by-default, there's no need for a release note.

Yeah so the situation with release notes is that if you don't write something about this now, nobody will notice the change in behavior later when we switch the default. We're not creating release notes out of thin air about all the behavior that becomes different when a flag is switched. Someone has to write it, and I don't see any record of this behavior change anywhere for the future release note writer to pick up.

My recommendation would be for a release note that says "when the experimental flag is set, the behavior is ..."

pkg/sql/logictest/testdata/logic_test/computed, line 734 at r6 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

How would that issue address this difference in behavior between the CBO and HP? The CBO implements computed columns as projections, and they get evaluated as part of the larger SQL expression. Adding context fields to the pgerror won't have any effect on how computed columns work in the CBO.

I read your response to my review comment as an additional requirement on how we're going to produce meaningful contextual strings. It's a non-trivial problem but it's solvable.
no matter the outcome of that discussion, I don't think your comment in the test file here is doing any favor to future readers of this test. How does it help understand the test in any way?

Your comment is really a rationale for the change in the PR, not suitable documentation for the end result -- as such it should be part of the commit message, if anything.

pkg/sql/opt/exec/execbuilder/relational_builder.go, line 1198 at r6 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I don't have more detail on the changes that will need to be made. The TODO is there to remind us to delve more deeply into this when we have time.

acknowledged.

This return value is not ever used by callers, so this commit removes it. Release note: None

The INSERT..ON CONFLICT statement needs to know if an index was declared as UNIQUE by the user. This commit adds an IsUnique() method to the cat.Index interface that exposes that information. It also adds a cat.FindTableColumnByName method, which will also be used by upsert. Release note: None

Move "ForInsert" and "ForUpdate" methods from mutationBuilder to the insert.go and update.go files. Leave shared utility methods in the mutation_builder.go file. This sets the pattern for locating Upsert and Delete specific files as well. Release note: None

Use the word "alias" rather than "label" more consistently, in order to match the new Metadata terminology. Release note: None

This commit adds support for three additional mutation operator cases: 1. UPSERT statement 2. INSERT..ON CONFLICT DO NOTHING 3. INSERT..ON CONFLICT..DO UPDATE Each of these has overlapping functionality with INSERT, UPDATE, and with each other, but also have important differences. Similar to the other mutation operators, the Upsert operator accepts a relational input expression that projects columns that will be used as part of executing the mutation: 1. Columns containing existing values fetched from the target table and used to detect conflicts and to formulate the key/value update commands. 2. Columns containing updated values to set when a conflict is detected, as specified by the user. 3. Computed columns which will be updated when a conflict is detected and that are dependent on one or more updated columns. For example, if this is the schema and INSERT..ON CONFLICT statement: CREATE TABLE abc (a INT PRIMARY KEY, b INT, c INT) INSERT INTO abc VALUES (1, 2) ON CONFLICT (a) DO UPDATE SET b=10 Then an input expression equivalent to this would be built: SELECT ins_a, ins_b, ins_c, fetch_a, fetch_b, fetch_c, 10 AS upd_b FROM (VALUES (1, 2, NULL)) AS ins(ins_a, ins_b, ins_c) LEFT OUTER JOIN abc AS fetch(fetch_a, fetch_b, fetch_c) ON ins_a = fetch_a At runtime, the Upsert execution operator will test the fetch_a column. If it is null, then there is no existing row, so the operator inserts a new row using the (ins_a, ins_b, ins_c) columns. Otherwise, the operator formulates an update using the (upd_b) column, along with any needed fetch columns. The CBO Upsert support is disabled by default behind a feature flag: experimental_optimizer_updates. Release note (sql change): The new experimental_optimizer_updates feature flag controls whether updates and upserts are planned by the cost-based optimizer rather than the heuristic planner.

rytaft

Reviewed 7 of 62 files at r7, 3 of 8 files at r8, 9 of 9 files at r10, 15 of 16 files at r11, 22 of 25 files at r12.
Reviewable status: complete! 1 of 0 LGTMs obtained

rytaft

Reviewed 4 of 62 files at r7, 3 of 25 files at r12.
Reviewable status: complete! 1 of 0 LGTMs obtained

Build the upsertNode for the CBO Upsert operator. Also, create a new upsert tableWriter that is custom built for the CBO. The CBO can use a much simpler upserter because it incorporates conflict detection, update and computed column evaluation, and other upsert operations into the input query, rather than requiring the upserter to do it. Release note (sql change): when the experimental_optimizer_updates feature flag is set, check constraints are not checked for rows skipped by the INSERT..DO NOTHING clause.

andy-kimball

OK, I added a release note, though I question whether we should publish it, given that this flag will be removed before our final 2.2 release.

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)

pkg/sql/logictest/testdata/logic_test/check_constraints, line 157 at r6 (raw file):

Previously, knz (kena) wrote…

Yeah so the situation with release notes is that if you don't write something about this now, nobody will notice the change in behavior later when we switch the default. We're not creating release notes out of thin air about all the behavior that becomes different when a flag is switched. Someone has to write it, and I don't see any record of this behavior change anywhere for the future release note writer to pick up.

My recommendation would be for a release note that says "when the experimental flag is set, the behavior is ..."

Done

pkg/sql/logictest/testdata/logic_test/computed, line 734 at r6 (raw file):

Previously, knz (kena) wrote…

I read your response to my review comment as an additional requirement on how we're going to produce meaningful contextual strings. It's a non-trivial problem but it's solvable.

no matter the outcome of that discussion, I don't think your comment in the test file here is doing any favor to future readers of this test. How does it help understand the test in any way?

Your comment is really a rationale for the change in the PR, not suitable documentation for the end result -- as such it should be part of the commit message, if anything.

I found it useful, but I don't really care much either way, so I removed it.

andy-kimball · 2019-01-04T23:49:56Z

bors r+

craig · 2019-01-05T00:09:28Z

Build failed

GitHub CI (Cockroach)

andy-kimball · 2019-01-05T00:22:34Z

bors r+

33339: opt: Add optimizer support for UPSERT and INSERT..ON CONFLICT statements r=andy-kimball a=andy-kimball This commit adds support for three additional mutation operator cases: 1. UPSERT statement 2. INSERT..ON CONFLICT DO NOTHING 3. INSERT..ON CONFLICT..DO UPDATE Each of these has overlapping functionality with INSERT, UPDATE, and with each other, but also have important differences. Similar to the other mutation operators, the Upsert operator accepts a relational input expression that projects columns that will be used as part of executing the mutation: 1. Columns containing existing values fetched from the target table and used to detect conflicts and to formulate the key/value update commands. 2. Columns containing updated values to set when a conflict is detected, as specified by the user. 3. Computed columns which will be updated when a conflict is detected and that are dependent on one or more updated columns. For example, if this is the schema and INSERT..ON CONFLICT statement: CREATE TABLE abc (a INT PRIMARY KEY, b INT, c INT) INSERT INTO abc VALUES (1, 2) ON CONFLICT (a) DO UPDATE SET b=10 Then an input expression equivalent to this would be built: SELECT ins_a, ins_b, ins_c, fetch_a, fetch_b, fetch_c, 10 AS upd_b FROM (VALUES (1, 2, NULL)) AS ins(ins_a, ins_b, ins_c) LEFT OUTER JOIN abc AS fetch(fetch_a, fetch_b, fetch_c) ON ins_a = fetch_a At runtime, the Upsert execution operator will test the fetch_a column. If it is null, then there is no existing row, so the operator inserts a new row using the (ins_a, ins_b, ins_c) columns. Otherwise, the operator formulates an update using the (upd_b) column, along with any needed fetch columns. The CBO Upsert support is disabled by default behind a feature flag: experimental_optimizer_updates. Co-authored-by: Andrew Kimball <[email protected]>

craig · 2019-01-05T00:42:53Z

Build succeeded

GitHub CI (Cockroach)

RaduBerinde

Sorry I didn't get a chance to review this in a timely manner. This is fantastic work! I am sure it was a huge effort to get all these details in place.

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)

A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page to add: - Add FILTER per cockroachdb/cockroach#34077 - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html

A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page as follows: - Add FILTER per cockroachdb/cockroach#34077 - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 - Remove `experimental_optimizer_updates` cluster setting (can't find a commit for this, but I don't see it in `SHOW ALL` output on my local build of yesterday's `master`, version number is `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`. [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html

A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page as follows: - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 - Add FILTER clause on aggregate functions per cockroachdb/cockroach#34077 - Remove `experimental_optimizer_updates` cluster setting (can't find a commit for this, but I don't see it in `SHOW ALL` output on my local build of yesterday's `master`, version number is `v2.2.0-alpha.20181217-1096-gd104dcee69-dirty`. [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html

A bit more work to address #3998. Summary of changes: - Edit the 'Supported statements' section of the ['Cost-based Optimizer'][1] page as follows: - Add DELETE per cockroachdb/cockroach#34522 - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339 - Add `SELECT`, `VALUES`, and `UNION` statements that do not include window functions - Add FILTER clause on aggregate functions per cockroachdb/cockroach#34077 - Remove `experimental_optimizer_updates` cluster setting [1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html

51581: opt: improve error message for failing to force a partial index r=mgartner a=mgartner This commit expands the error message when a user tries to force a partial index that cannot satisfy the query. This error is emitted when the query filter does not imply the partial index predicate. Release note: None 51608: sql: fix pagination in UPSERT r=yuzefovich a=yuzefovich **sql: minor cleanup around upsert** This commit removes a couple of duplicated "init" calls as well as some unused parameters around upsert code. Release note: None **sql: fix pagination in UPSERT** `optTableUpserter` was incorrectly overriding `curBatchSize` method by returning `insertRows.Len`, but that container is not actually used anywhere. As a result, `curBatchSize` was always considered 0, and we didn't perform the pagination on the UPSERTs. The bug was introduced in #33339 (in 19.1.0). Fixes: #51391. Release note (bug fix): Previously, CockroachDB could hit a "command is too large" error when performing UPSERT operation with many values. Internally, we attempt to perform such operation by splitting it into "batches", but the batching mechanism was broken. Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>

andy-kimball requested review from justinj, rytaft, RaduBerinde and a team December 23, 2018 23:39

andy-kimball requested a review from a team as a code owner December 23, 2018 23:39

andy-kimball requested review from a team December 23, 2018 23:39

knz reviewed Dec 24, 2018

View reviewed changes

rytaft requested changes Jan 1, 2019

View reviewed changes

andy-kimball force-pushed the upsert branch from 70e37d0 to b250f03 Compare January 4, 2019 15:24

andy-kimball commented Jan 4, 2019

View reviewed changes

knz reviewed Jan 4, 2019

View reviewed changes

andy-kimball added 5 commits January 4, 2019 11:30

sql: Remove tree.Datums return value from tableWriter.row method

811db02

This return value is not ever used by callers, so this commit removes it. Release note: None

opt: Standardize on using "alias" as column name

01893a3

Use the word "alias" rather than "label" more consistently, in order to match the new Metadata terminology. Release note: None

rytaft approved these changes Jan 4, 2019

View reviewed changes

andy-kimball force-pushed the upsert branch from b250f03 to 16fbd5b Compare January 4, 2019 21:17

andy-kimball commented Jan 4, 2019

View reviewed changes

craig bot merged commit 16fbd5b into cockroachdb:master Jan 5, 2019

andy-kimball deleted the upsert branch January 5, 2019 00:43

RaduBerinde reviewed Jan 7, 2019

View reviewed changes

jseldess mentioned this pull request Jan 15, 2019

opt: Add optimizer support for UPSERT and INSERT..ON CONFLICT statements cockroachdb/docs#4264

Closed

rmloveland mentioned this pull request Feb 7, 2019

Update CBO docs to list more supported statements cockroachdb/docs#4340

Merged

This was referenced Jul 20, 2020

sql: fix pagination in UPSERT #51608

Merged

release-19.2: sql: fix pagination in UPSERT #51612

Merged

release-19.1: sql: fix pagination in UPSERT #51625

Merged

yuzefovich mentioned this pull request Jul 20, 2020

release-20.1: sql: fix pagination in UPSERT #51626

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

opt: Add optimizer support for UPSERT and INSERT..ON CONFLICT statements #33339

opt: Add optimizer support for UPSERT and INSERT..ON CONFLICT statements #33339

andy-kimball commented Dec 23, 2018

cockroach-teamcity commented Dec 23, 2018

knz left a comment

rytaft left a comment

andy-kimball left a comment

knz left a comment

rytaft left a comment

rytaft left a comment

andy-kimball left a comment

andy-kimball commented Jan 4, 2019

craig bot commented Jan 5, 2019

andy-kimball commented Jan 5, 2019

craig bot commented Jan 5, 2019

RaduBerinde left a comment

opt: Add optimizer support for UPSERT and INSERT..ON CONFLICT statements #33339

opt: Add optimizer support for UPSERT and INSERT..ON CONFLICT statements #33339

Conversation

andy-kimball commented Dec 23, 2018

cockroach-teamcity commented Dec 23, 2018

knz left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

andy-kimball left a comment

Choose a reason for hiding this comment

knz left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

andy-kimball left a comment

Choose a reason for hiding this comment

andy-kimball commented Jan 4, 2019

craig bot commented Jan 5, 2019

Build failed

andy-kimball commented Jan 5, 2019

craig bot commented Jan 5, 2019

Build succeeded

RaduBerinde left a comment

Choose a reason for hiding this comment