Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: implement ON UPDATE expressions #69091

Merged
merged 1 commit into from
Aug 24, 2021
Merged

Conversation

pawalt
Copy link
Contributor

@pawalt pawalt commented Aug 18, 2021

This PR implements ON UPDATE expressions.

Resolves #69057

Release note (sql change): This PR adds ON UPDATE expression to allow users to
evaluate expressions when a row is updated.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@pawalt pawalt requested review from mgartner and arulajmani August 18, 2021 14:48
@pawalt pawalt marked this pull request as ready for review August 18, 2021 14:48
@pawalt pawalt requested a review from a team August 18, 2021 14:48
@pawalt pawalt requested a review from a team as a code owner August 18, 2021 14:48
@mgartner mgartner requested review from a team and removed request for a team August 18, 2021 18:33
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool!

We probably want to add support for ON UPDATE to the optimizer's test catalog. This will allow you to add tests for your changes in optbuilder to pkg/sql/opt/optbuilder/testdata. Here's an example commit for adding support for a new column type to the test catalog: 2045ba2

Reviewed 3 of 16 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @pawalt)


-- commits, line 9 at r1 ([raw file](https://github.com/cockroachdb/cockroach/blob/eb65355f13e4433f28efeb14d07d916675b7184a/-- commits#L9)):
nit: A release note should read as if it's included in the release notes for a DB version. So mentioning "PR" may not be the correct perspective. Here's a suggestion you can play with: "An ON UPDATE expression can now be added to a column. Whenever a row is modified, a column's ON UPDATE expression is re-evaluated and the column is updated to the result."


pkg/sql/alter_table.go, line 364 at r1 (raw file):

			case *tree.ForeignKeyConstraintTableDef:
				// We want to reject uses of ON UPDATE actions where there is already an
				// ON UPDATE expression for the column.

nit: make it clear the first ON UPDATE you mention is FK related.


pkg/sql/alter_table.go, line 1199 at r1 (raw file):

		if col.NumUsesSequences() > 0 {
			if err := params.p.removeSequenceDependencies(params.ctx, tableDesc, col); err != nil {

I don't understand the need for this block. Can you add a comment explaining what it does?


pkg/sql/alter_table.go, line 1204 at r1 (raw file):

		}
		if t.Expr == nil {
			col.ColumnDesc().OnUpdateExpr = nil

When would t.Expr be nil? Do you have a logictest for this case?


pkg/sql/catalog/descpb/structured.proto, line 145 at r1 (raw file):

  // have been serialized in a internal format. Instead, use one of the
  // schemaexpr.FormatExpr* functions.
  optional string on_update_expr = 18;

Consider adding some validation for this field here and testing that validation here.


pkg/sql/logictest/testdata/logic_test/on_update, line 110 at r1 (raw file):


statement error pq: cannot specify a foreign key update action and an ON UPDATE expression on the same column
ALTER TABLE alter_update ADD CONSTRAINT fk FOREIGN KEY (j) REFERENCES test_fk_base (j) ON UPDATE SET DEFAULT

Can you add some tests with SHOW CREATE TABLE ...?


pkg/sql/opt/optbuilder/mutation_builder.go, line 1205 at r1 (raw file):

// computed value expression for the given table column, and caches it for
// reuse.
func (mb *mutationBuilder) parseOnUpdateOrComputedExpr(colID opt.ColumnID) tree.Expr {

Is there a need to parse a computed expression here? A column is not allowed to be both computed and have an ON UPDATE, correct?

It'd also be confusing to now have both parseDefaultOrComputedExpr and parseOnUpdate. Maybe this should be refactored to parseDefaultExpr, parseOnUpdateExpr, and parseComputedExpr which all call a function that does most of the work like parseColumnExpr(colID opt.ColumnID, exprStr string) tree.Expr) { ... }. The caching might be a slight hurdle... parseColumnExpr could also take a map to use for caching.

@mgartner mgartner requested a review from a team August 18, 2021 19:30
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, we need optbuilder testcases.

We should also have a TODO somewhere to avoid updating the column unnecessarily (e.g. if the expression is immutable and the input columns don't change).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @pawalt)

Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing on behalf of @cockroachdb/sql-schema I just have a few small requests but nothing major.

"cannot specify both ON UPDATE expression and a foreign key"+
" ON UPDATE action for column %s(%d)",
col.GetName(),
col.GetID(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error also doesn't need to be an assertion failure, you can return a pgerror.New(pgcode.InvalidTableDefinition, ...
The validation framework will wrap these into assertion failures when required (at descriptor read and write time).

Also, ideally we try to capture as many validation failures as possible, not just the first encountered. I understand you want to use this in create_table.go as well. Perhaps there's a way. If you look at other validation code you'll notice that we're not particularly disciplined in doing this but better late than never.

return ""
}
return *w.desc.OnUpdateExpr
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This OnUpdateExpr field should be visited by ForEachExprStringInTableDesc in tabledesc.

Please add tests to cover a missing cross-reference in validate.go in the tabledesc package. By that I mean, a test case in which the ON UPDATE expression refers to a descriptor which doesn't exist, like an user-defined enum type value for example.

@@ -1821,6 +1824,7 @@ func newOptVirtualTable(
cat.MaybeHidden(d.IsHidden()),
d.ColumnDesc().DefaultExpr,
d.ColumnDesc().ComputeExpr,
d.ColumnDesc().OnUpdateExpr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware this is continuing an existing pattern but we really should avoid looking stuff up in ColumnDesc() or IndexDesc() when possible and instead delegate to the catalog.Column interface.

Perhaps this is out of scope for this change. Also I'm aware that the opt-catalog predates the schema-catalog but at some point we'll have to do this change so might as well start now and make it happen gradually?

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 16 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @pawalt)


pkg/sql/opt/cat/column.go, line 108 at r1 (raw file):

// HasOnUpdate returns true if the column has an on update expression.
// OnUpdateStr will be set to the SQL expression string in that case.

nit: on update expression -> ON UPDATE expression
nit: OnUpdateStr -> OnUpdateExprStr?


pkg/sql/opt/cat/column.go, line 114 at r1 (raw file):

// OnUpdateExprStr is set to the SQL expression string that describes the column's
// on update expression. It is used when the user does not provide a value for

nit: ON UPDATE


pkg/sql/opt/optbuilder/mutation_builder.go, line 600 at r1 (raw file):

			// Always include WriteOnly columns.
		} else if tabCol.HasOnUpdate() && applyOnUpdate {
			// Use ON UPDATE columns if specified

nit: end comment with a period.

Copy link
Contributor Author

@pawalt pawalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @mgartner, @postamar, and @rytaft)


pkg/sql/alter_table.go, line 1199 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I don't understand the need for this block. Can you add a comment explaining what it does?

Done.

I realized I needed to take into account both DEFAULT and ON UPDATE expressions when recalculating sequence dependencies, so I've factored that logic out into an updateSequenceDependencies function. I think I'm doing the updating properly, but if someone could sanity-check it, that'd be great.

I added tests for this case at the bottom of my on_update logic test.


pkg/sql/alter_table.go, line 1204 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

When would t.Expr be nil? Do you have a logictest for this case?

t.Expr is nil in the case where we drop an ON UPDATE expression.


pkg/sql/opt_catalog.go, line 1827 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

I'm aware this is continuing an existing pattern but we really should avoid looking stuff up in ColumnDesc() or IndexDesc() when possible and instead delegate to the catalog.Column interface.

Perhaps this is out of scope for this change. Also I'm aware that the opt-catalog predates the schema-catalog but at some point we'll have to do this change so might as well start now and make it happen gradually?

I think this will have to be a one-shot migration. The Init function takes in a string pointer and knows that the expression is empty if the pointer is nil. The catalog behavior, however, is to output an empty string if there is no expression. We'll have to change Init to match this behavior. This shouldn't be too hard of a change to make, but I think it's better for a later PR.


pkg/sql/catalog/descpb/structured.proto, line 145 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Consider adding some validation for this field here and testing that validation here.

I added validation by updating ForEachExprStringInTableDesc and adding a test. Is there other validation I should be doing? The test I added covers referencing nonexistent descriptors.


pkg/sql/catalog/tabledesc/column.go, line 123 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

This OnUpdateExpr field should be visited by ForEachExprStringInTableDesc in tabledesc.

Please add tests to cover a missing cross-reference in validate.go in the tabledesc package. By that I mean, a test case in which the ON UPDATE expression refers to a descriptor which doesn't exist, like an user-defined enum type value for example.

Done.


pkg/sql/catalog/tabledesc/validate.go, line 670 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

This error also doesn't need to be an assertion failure, you can return a pgerror.New(pgcode.InvalidTableDefinition, ...
The validation framework will wrap these into assertion failures when required (at descriptor read and write time).

Also, ideally we try to capture as many validation failures as possible, not just the first encountered. I understand you want to use this in create_table.go as well. Perhaps there's a way. If you look at other validation code you'll notice that we're not particularly disciplined in doing this but better late than never.

Ah didn't know that. Changed.

I could build up a single error string with all the ON UPDATE validation errors in it, but I'm not sure how idiomatic that is. Does that sound reasonable? It's the only way I can think of to reuse this code off the top of my head.


pkg/sql/logictest/testdata/logic_test/on_update, line 110 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you add some tests with SHOW CREATE TABLE ...?

Added. Any specific other cases I should be looking out for?


pkg/sql/opt/optbuilder/mutation_builder.go, line 1205 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Is there a need to parse a computed expression here? A column is not allowed to be both computed and have an ON UPDATE, correct?

It'd also be confusing to now have both parseDefaultOrComputedExpr and parseOnUpdate. Maybe this should be refactored to parseDefaultExpr, parseOnUpdateExpr, and parseComputedExpr which all call a function that does most of the work like parseColumnExpr(colID opt.ColumnID, exprStr string) tree.Expr) { ... }. The caching might be a slight hurdle... parseColumnExpr could also take a map to use for caching.

Gave this a shot; Let me know how it looks

@pawalt pawalt force-pushed the on_update_logic branch 2 times, most recently from 12bcb7f to 8eb91fe Compare August 19, 2021 20:04
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opt stuff :lgtm: nice work! (just some nits below)

Reviewed 8 of 17 files at r2, 4 of 5 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani, @mgartner, @pawalt, and @postamar)


pkg/sql/opt/optbuilder/mutation_builder.go, line 148 at r4 (raw file):

	subqueries []*scope

	// parsedColComputedExprs is a cached set of parsed default expressions

nit: default expressions -> computed expressions


pkg/sql/opt/optbuilder/mutation_builder.go, line 152 at r4 (raw file):

	parsedColComputedExprs []tree.Expr

	// parsedColComputedExprs is a cached set of parsed computed expressions

nit: parsedColComputedExprs -> parsedColDefaultExprs
nit: computed expressions -> default expressions


pkg/sql/opt/optbuilder/mutation_builder.go, line 156 at r4 (raw file):

	parsedColDefaultExprs []tree.Expr

	// parsedColUpdateExprs is a cached set of parsed on update expressions from

nit: on update -> ON UPDATE


pkg/sql/opt/optbuilder/mutation_builder.go, line 158 at r4 (raw file):

	// parsedColUpdateExprs is a cached set of parsed on update expressions from
	// the table schema. These are parsed once and cached for reuse.
	parsedColUpdateExprs []tree.Expr

nit: I'd call this parsedColOnUpdateExprs


pkg/sql/opt/optbuilder/mutation_builder.go, line 1237 at r4 (raw file):

			}
			return datum
		} else {

nit: don't need else here (you'll probably get a lint error for this)


pkg/sql/opt/optbuilder/testdata/insert, line 1326 at r4 (raw file):


build
INSERT INTO on_update_with_default VALUES (1)

nit: add a similar test for on_update_bare and check that NULL is inserted


pkg/sql/opt/optbuilder/testdata/update, line 1713 at r4 (raw file):

      │         └── 1 [as=a_new:11]
      └── projections
           └── 10 [as=v_default:12]

nit: also add a couple tests where you explicitly update v

Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exciting stuff!

Reviewed all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @pawalt, and @postamar)


pkg/sql/alter_table.go, line 1186 at r4 (raw file):

		}

	case *tree.AlterTableSetOnUpdate:

We should gate this (and the CREATE codepath) behind a cluster version. Checkout pkg/clusterversion/cockroach_versions.go, you probably want to add something in there.


pkg/sql/logictest/testdata/logic_test/on_update, line 109 at r4 (raw file):

pk4  should_not_change

# ON UPDATE usage with foreign key cascading

nit: Can we spell out this test case in a bit more detail?


pkg/sql/logictest/testdata/logic_test/on_update, line 138 at r4 (raw file):

pk2  val2       def

# ON UPDATE error cases - conflict with a FK ON UPDATE

Same as above.


pkg/sql/logictest/testdata/logic_test/on_update, line 204 at r4 (raw file):

)

# Sequence tests

Let's add something similar for enums as well? Separately, would this work with dropping enum values? Say an enum value is used in an ON UPDATE expression. Do we track/validate it correctly when trying to drop it?

Copy link
Contributor Author

@pawalt pawalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @arulajmani, @mgartner, @postamar, and @rytaft)


pkg/sql/alter_table.go, line 1186 at r4 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

We should gate this (and the CREATE codepath) behind a cluster version. Checkout pkg/clusterversion/cockroach_versions.go, you probably want to add something in there.

This PR is getting a little big, so I'll add this check in after this lands. Issue here:
#69196


pkg/sql/logictest/testdata/logic_test/on_update, line 138 at r4 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Same as above.

Done.


pkg/sql/logictest/testdata/logic_test/on_update, line 204 at r4 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Let's add something similar for enums as well? Separately, would this work with dropping enum values? Say an enum value is used in an ON UPDATE expression. Do we track/validate it correctly when trying to drop it?

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @arulajmani, @mgartner, and @postamar)

Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @arulajmani, @mgartner, @pawalt, and @postamar)


pkg/sql/logictest/testdata/logic_test/on_update, line 297 at r5 (raw file):

pk2  x

# Make sure that our ON UPDATE has a dependency on test_seq

You mean test_enum?


pkg/sql/logictest/testdata/logic_test/on_update, line 299 at r5 (raw file):

# Make sure that our ON UPDATE has a dependency on test_seq
statement error pq: cannot drop type "test_enum" because other objects \(\[test.public.test_table_enum\]\) still depend on it
DROP TYPE test_enum

Can we test dropping enum values using ALTER TYPE test_enum ... DROP TYPE as well? I was hoping for a test like:

CREATE TYPE abc AS ENUM ('a', 'b', 'c'); 
CREATE TABLE t(k INT PRIMARY KEY, v abc ON UPDATE 'a'); 
ALTER TYPE abc DROP VALUE 'b' <-- should work because the enum isn't being used in a row/an expression.
ALTER TYPE abc DROP VALUE 'a' <--- should not work because 'a' is being used in the `ON UPDATE` expression of t.

Copy link
Contributor Author

@pawalt pawalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @arulajmani, @mgartner, @postamar, and @rytaft)


pkg/sql/alter_table.go, line 1199 at r1 (raw file):

Previously, ajwerner wrote…

I've been mulling this a bit and I think the ugliness comes from trying to avoid parsing and type-checking the expression again inside of updateSequenceDependencies. I don't think you need to worry about that. This logic where we remove them all first is a bit sad but it was pre-existing so whatever. That can get pushed down into your helper too. So, given that, imagine:

func updateNonComputedColumnExpr(
    params runParams, tab *tabledesc.Mutable, col catalog.Column,  op string, newExpr tree.Expr, exprToUpdate **string,
) error {
    	// If a DEFAULT or ON UPDATE expression starts using a sequence and is then
	// modified to not use that sequence, we need to drop the dependency from
	// the sequence to the column. The way this is done is by wiping all
	// sequence dependencies on the column and then recalculating the
	// dependencies after the new expression has been parsed.
	if col.NumUsesSequences() > 0 {
		if err := params.p.removeSequenceDependencies(params.ctx, tableDesc, col); err != nil {
			return err
		}
	}
        if expr == nil {
		*exprToUpdate = nil
	} else {
 		_, s, err = sanitizeColumnExpression(params, t.Expr, col, op)
		if err != nil {
			return err
		}
		*exprToUpdate = &s
 	}
        if err := updateSequenceDependencies(params, tab, col); err != nil {
 	 	return err
 	}
}

func updateSequenceDependencies(
	params runParams,
	tableDesc *tabledesc.Mutable,
	colDesc catalog.Column,
) error {
    var changedSeqDescs []*tabledesc.Mutable
    mergeIntoChanged := func(toAdd []*tabledesc.Mutable) {
        changedSeqDescs = append(changedSeqDescs, toAdd...)
        sort.Slice(
            changedSeqDescs,
            func(i, j int)  bool { return changedSeqDescs[i].GetID() < changedSeqDescs[j].GetID() },
        )
        truncated := changedSeqDescs[:0]
        for i, d := range changedSeqDescs {
            if i == 0 || changedSeqDescs[i-1].GetID() != d.GetID() {
                truncated = append(truncated, d)
            }
        }
        changedSeqDescs = truncated
    }
    for _, colExpr := range []struct {
        name string
        exists func(catalog.Column) bool
        get func(catalog.Column) string
    } {
        {"ON UPDATE", (catalog.Column).HasOnUpdate, (catalog.Column).GetOnUpdate},
        {"DEFAULT", (catalog.Column).HasDefault, (catalog.Column).GetDefault},
    } {
                if !colExpr.exists(col) {
                    continue
                }
		expr, _, err := sanitizeColumnExpression(params, colExpr.get(col), col, colExpr.name)
		if err != nil {
			return err
		}
        	// Get sequence references from DEFAULT expression.
		changedSeqDescsFromExpr, err := maybeAddSequenceDependencies(
			params.ctx,
			params.p.ExecCfg().Settings,
			params.p,
			tableDesc,
			colDesc.ColumnDesc(),
			expr,
			nil, /* backrefs */
		)
		if err != nil {
			return err
		}
                mergeIntoChanged(changedSeqDescsFromExpr)
    }
	for _, changedSeqDesc := range changedSeqDescs {
		if err := params.p.writeSchemaChange(
			params.ctx, changedSeqDesc, descpb.InvalidMutationID,
			fmt.Sprintf("updating dependent sequence %s(%d) for table %s(%d)",
				changedSeqDesc.Name, changedSeqDesc.ID, tableDesc.Name, tableDesc.ID,
			)); err != nil {
			return err
     }

}

Ah thanks for the help; that makes sense. Implemented.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting very close. Just tiny things from me left.

Reviewed 1 of 2 files at r7.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @arulajmani, @mgartner, @pawalt, @postamar, and @rytaft)


pkg/sql/alter_table.go, line 1144 at r7 (raw file):

	case *tree.AlterTableSetDefault:
		err := updateNonComputedColExpr(

nit: scope err to the conditional here and at the other invocation site.

if err := updateNonComputedColExpr(
    /* ... */,
); err != nil {
   return err
}

pkg/sql/alter_table.go, line 1291 at r7 (raw file):

// updateNonComputedColExpr updates an ON UPDATE or DEFAULT column expression
// and recalculates sequence dependencies for the column.

nit: say more here about what is exprField


pkg/sql/alter_table.go, line 1341 at r7 (raw file):

	s := tree.Serialize(typedExpr)

nit: remove this stray line


pkg/sql/create_table.go, line 2198 at r5 (raw file):

Previously, pawalt (Peyton Walters) wrote…

Before returning the table descriptor, we need to ensure that ON UPDATE expressions don't conflict with ON UPDATE FK actions. We need to return an error instead of failing with a vea.Report because we want this error reported to the user.

I put the validation here instead of at a higher level because if you just inspect the create statement, you have to handle both inline FKs and explicit FKs. If I wait until the descriptor is created, I can just iterate over all FKs.

Ack, give this a bit of commentary.


pkg/sql/catalog/tabledesc/validate.go, line 670 at r1 (raw file):

Previously, pawalt (Peyton Walters) wrote…

Ah didn't know that. Changed.

I could build up a single error string with all the ON UPDATE validation errors in it, but I'm not sure how idiomatic that is. Does that sound reasonable? It's the only way I can think of to reuse this code off the top of my head.

Use errors.CombineErrors to combine them all.


pkg/sql/catalog/tabledesc/validate.go, line 654 at r7 (raw file):

// ValidateOnUpdate returns an error if there is a column with both a foreign
// key constraint and an ON UPDATE expression, nil otherwise.
func ValidateOnUpdate(cols []catalog.Column, outboundFks []descpb.ForeignKeyConstraint) error {

how would you feel about using catalog.TableColSet to construct the set of columns which have an OnUpdate clause so we only iterate each fk one time?

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 16 files at r1, 14 of 17 files at r2, 1 of 5 files at r3, 3 of 4 files at r5, 3 of 5 files at r6, 2 of 2 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @arulajmani, @mgartner, @pawalt, @postamar, and @ZhouXing19)


pkg/sql/catalog/descpb/structured.proto, line 145 at r1 (raw file):

Previously, pawalt (Peyton Walters) wrote…

I added validation by updating ForEachExprStringInTableDesc and adding a test. Is there other validation I should be doing? The test I added covers referencing nonexistent descriptors.

We should add validation for invalid ON UPDATE column descriptors in validateColumns. For example, we should validate that a column is not both computed and has an ON UPDATE expression. We probably want to disallow GENERATED [ALWAYS | BY DEFAULT] AS IDENTITY (currently being worked on by @ZhouXing19, so this can be added by one of you once both syntaxes land in master) and ON UPDATE. Are there other cases where an ON UPDATE column descriptor would be invalid?


pkg/sql/logictest/testdata/logic_test/on_update, line 110 at r1 (raw file):

Previously, pawalt (Peyton Walters) wrote…

Added. Any specific other cases I should be looking out for?

A common one that is missed is CREATE TABLE cpy (LIKE t INCLUDING ...). For example, see the CREATE TABLE ... LIKE expression index tests:

# Tests for CREATE TABLE ... LIKE.
statement ok
CREATE TABLE src (
k INT PRIMARY KEY,
a INT,
b INT,
j JSON,
comp INT8 AS (1 + 10) VIRTUAL,
INDEX ((a + b)),
INDEX named_idx ((a + 1)),
UNIQUE INDEX ((a + 10)),
INVERTED INDEX ((a + b), (j->'a'))
);
CREATE TABLE copy (LIKE src);
CREATE TABLE copy_generated (LIKE src INCLUDING GENERATED);
CREATE TABLE copy_indexes (LIKE src INCLUDING INDEXES);
CREATE TABLE copy_all (LIKE src INCLUDING ALL)
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy]
----
CREATE TABLE public.copy (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL,
rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
FAMILY "primary" (k, a, b, j, comp, rowid)
)
# Inaccessible expression index columns should not be copied if the indexes are
# not copied. copy should not have any crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
0
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy_generated]
----
CREATE TABLE public.copy_generated (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL AS (1:::INT8 + 10:::INT8) VIRTUAL,
rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
FAMILY "primary" (k, a, b, j, rowid)
)
# Inaccessible expression index columns should not be copied if the indexes are
# not copied. copy_generated should not have any crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy_generated'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
0
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy_indexes]
----
CREATE TABLE public.copy_indexes (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL,
CONSTRAINT "primary" PRIMARY KEY (k ASC),
INDEX src_expr_idx ((a + b) ASC),
INDEX named_idx ((a + 1:::INT8) ASC),
UNIQUE INDEX src_expr_key ((a + 10:::INT8) ASC),
INVERTED INDEX src_expr_expr1_idx ((a + b), (j->'a':::STRING)),
FAMILY "primary" (k, a, b, j, comp)
)
# Inaccessible expression index columns should be copied if the indexes are
# copied. copy_indexes should have crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy_indexes'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
5
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy_all]
----
CREATE TABLE public.copy_all (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL AS (1:::INT8 + 10:::INT8) VIRTUAL,
CONSTRAINT "primary" PRIMARY KEY (k ASC),
INDEX src_expr_idx ((a + b) ASC),
INDEX named_idx ((a + 1:::INT8) ASC),
UNIQUE INDEX src_expr_key ((a + 10:::INT8) ASC),
INVERTED INDEX src_expr_expr1_idx ((a + b), (j->'a':::STRING)),
FAMILY "primary" (k, a, b, j)
)
# Inaccessible expression index columns should be copied if the indexes are
# copied. copy_all should have crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy_all'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
5

I'm not sure which INCLUDING options should copy the ON UPDATE to the new table. DEFAULT expressions are only copied if INCLUDING DEFAULTS or INCLUDING ALL is provided. I'd suggest creating an issue for this and addressing it in a follow-up PR to avoid bloating this one and because there's some questions here with unclear answers.


pkg/sql/opt/optbuilder/insert.go, line 386 at r7 (raw file):

		}
		if mb.tab.Column(i).HasOnUpdate() {
			return true

I'm struggling to understand why an ON UPDATE columns requires existing columns to be fetched in all cases.


pkg/sql/opt/optbuilder/mutation_builder.go, line 1205 at r1 (raw file):

Previously, pawalt (Peyton Walters) wrote…

Gave this a shot; Let me know how it looks

I like it!


pkg/sql/opt/optbuilder/mutation_builder.go, line 1220 at r7 (raw file):

		exprStr = tabCol.DefaultExprStr()
	case columnParseTypeOnUpdate:
		exprStr = tabCol.OnUpdateExprStr()

nit: you could add exprStr as an argument to parseColExpr and remove columnParseType entirely. Then parseColExpr is truly general and I think the code would be cleaned up a bit too.


pkg/sql/opt/optbuilder/testdata/upsert, line 61 at r7 (raw file):

exec-ddl
CREATE TABLE on_update_bare (
    a INT NOT NULL UNIQUE,

UPSERT with a rowid PK is making the tests confusing, and maybe not testing what you want. Is there a reason you omitted an explicit PK here?

Copy link
Contributor Author

@pawalt pawalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @arulajmani, @mgartner, @postamar, and @ZhouXing19)


pkg/sql/catalog/descpb/structured.proto, line 145 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

We should add validation for invalid ON UPDATE column descriptors in validateColumns. For example, we should validate that a column is not both computed and has an ON UPDATE expression. We probably want to disallow GENERATED [ALWAYS | BY DEFAULT] AS IDENTITY (currently being worked on by @ZhouXing19, so this can be added by one of you once both syntaxes land in master) and ON UPDATE. Are there other cases where an ON UPDATE column descriptor would be invalid?

The only column-local validation that is needed are the computed column check and the GENERATED check. We need to do the foreign key check too, but that requires a tabledesc.


pkg/sql/catalog/tabledesc/validate.go, line 670 at r1 (raw file):

Previously, ajwerner wrote…

Use errors.CombineErrors to combine them all.

Done.


pkg/sql/catalog/tabledesc/validate.go, line 654 at r7 (raw file):

Previously, ajwerner wrote…

how would you feel about using catalog.TableColSet to construct the set of columns which have an OnUpdate clause so we only iterate each fk one time?

Makes sense. Done.


pkg/sql/logictest/testdata/logic_test/on_update, line 110 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

A common one that is missed is CREATE TABLE cpy (LIKE t INCLUDING ...). For example, see the CREATE TABLE ... LIKE expression index tests:

# Tests for CREATE TABLE ... LIKE.
statement ok
CREATE TABLE src (
k INT PRIMARY KEY,
a INT,
b INT,
j JSON,
comp INT8 AS (1 + 10) VIRTUAL,
INDEX ((a + b)),
INDEX named_idx ((a + 1)),
UNIQUE INDEX ((a + 10)),
INVERTED INDEX ((a + b), (j->'a'))
);
CREATE TABLE copy (LIKE src);
CREATE TABLE copy_generated (LIKE src INCLUDING GENERATED);
CREATE TABLE copy_indexes (LIKE src INCLUDING INDEXES);
CREATE TABLE copy_all (LIKE src INCLUDING ALL)
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy]
----
CREATE TABLE public.copy (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL,
rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
FAMILY "primary" (k, a, b, j, comp, rowid)
)
# Inaccessible expression index columns should not be copied if the indexes are
# not copied. copy should not have any crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
0
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy_generated]
----
CREATE TABLE public.copy_generated (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL AS (1:::INT8 + 10:::INT8) VIRTUAL,
rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
CONSTRAINT "primary" PRIMARY KEY (rowid ASC),
FAMILY "primary" (k, a, b, j, rowid)
)
# Inaccessible expression index columns should not be copied if the indexes are
# not copied. copy_generated should not have any crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy_generated'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
0
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy_indexes]
----
CREATE TABLE public.copy_indexes (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL,
CONSTRAINT "primary" PRIMARY KEY (k ASC),
INDEX src_expr_idx ((a + b) ASC),
INDEX named_idx ((a + 1:::INT8) ASC),
UNIQUE INDEX src_expr_key ((a + 10:::INT8) ASC),
INVERTED INDEX src_expr_expr1_idx ((a + b), (j->'a':::STRING)),
FAMILY "primary" (k, a, b, j, comp)
)
# Inaccessible expression index columns should be copied if the indexes are
# copied. copy_indexes should have crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy_indexes'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
5
query T
SELECT create_statement FROM [SHOW CREATE TABLE copy_all]
----
CREATE TABLE public.copy_all (
k INT8 NOT NULL,
a INT8 NULL,
b INT8 NULL,
j JSONB NULL,
comp INT8 NULL AS (1:::INT8 + 10:::INT8) VIRTUAL,
CONSTRAINT "primary" PRIMARY KEY (k ASC),
INDEX src_expr_idx ((a + b) ASC),
INDEX named_idx ((a + 1:::INT8) ASC),
UNIQUE INDEX src_expr_key ((a + 10:::INT8) ASC),
INVERTED INDEX src_expr_expr1_idx ((a + b), (j->'a':::STRING)),
FAMILY "primary" (k, a, b, j)
)
# Inaccessible expression index columns should be copied if the indexes are
# copied. copy_all should have crdb_internal_idx_expr columns.
query I
SELECT count(*) FROM (
SELECT json_array_elements(
crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor, false)->'table'->'columns'
) AS desc FROM system.descriptor WHERE id = 'copy_all'::REGCLASS
) AS cols WHERE cols.desc->>'name' LIKE 'crdb_internal_idx_expr%'
----
5

I'm not sure which INCLUDING options should copy the ON UPDATE to the new table. DEFAULT expressions are only copied if INCLUDING DEFAULTS or INCLUDING ALL is provided. I'd suggest creating an issue for this and addressing it in a follow-up PR to avoid bloating this one and because there's some questions here with unclear answers.

#69258


pkg/sql/opt/optbuilder/insert.go, line 386 at r7 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I'm struggling to understand why an ON UPDATE columns requires existing columns to be fetched in all cases.

They don't; this was leftover from some testing. Thanks for the catch.


pkg/sql/opt/optbuilder/mutation_builder.go, line 1220 at r7 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: you could add exprStr as an argument to parseColExpr and remove columnParseType entirely. Then parseColExpr is truly general and I think the code would be cleaned up a bit too.

Good point. Done.


pkg/sql/opt/optbuilder/testdata/upsert, line 61 at r7 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

UPSERT with a rowid PK is making the tests confusing, and maybe not testing what you want. Is there a reason you omitted an explicit PK here?

Nope no reason. Fixed.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 9 of 9 files at r8, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @arulajmani, @mgartner, @pawalt, @postamar, and @ZhouXing19)


pkg/sql/catalog/tabledesc/validate.go, line 747 at r8 (raw file):

			return errors.Newf(
				"computed column %q cannot also have an ON UPDATE expression",
				column.GetName(),

nit: you can add a unit test for this here:

func TestValidateTableDesc(t *testing.T) {


pkg/sql/opt/optbuilder/mutation_builder.go, line 1222 at r8 (raw file):

	// If we were not able to find the expression type we were looking for, return
	// NULL or a default value.
	if exprStr == "" {

I'm thinking this block may be better suited within parseDefaultExpr, since I believe it only applies in that case. It'll also be safer to not assume some default values for empty computed columns and ON UPDATE columns. Sorry for not catching this earlier.


pkg/sql/opt/optbuilder/testdata/upsert, line 1966 at r8 (raw file):


build
UPSERT INTO on_update_bare VALUES (1, 2)

Does it make sense to add a test like UPSERT INTO on_update_bare VALUES (1)?

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @arulajmani, @mgartner, @pawalt, @postamar, and @ZhouXing19)

@ajwerner ajwerner requested review from postamar and removed request for postamar August 24, 2021 05:11
@pawalt pawalt requested review from ajwerner and mgartner August 24, 2021 14:16
Copy link
Contributor Author

@pawalt pawalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @arulajmani, @mgartner, @postamar, and @ZhouXing19)


pkg/sql/opt/optbuilder/testdata/upsert, line 1966 at r8 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Does it make sense to add a test like UPSERT INTO on_update_bare VALUES (1)?

Yep

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r9, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 4 stale) (waiting on @ajwerner, @arulajmani, @mgartner, @pawalt, @postamar, and @ZhouXing19)


pkg/sql/catalog/tabledesc/validate_test.go, line 1129 at r9 (raw file):

					{ID: 0, Name: "primary", ColumnIDs: []descpb.ColumnID{1}, ColumnNames: []string{"bar"}},
				},
				PrimaryIndex: descpb.IndexDescriptor{ID: 1, Name: "bar", KeyColumnIDs: []descpb.ColumnID{1},

nit: I think you can omit PrimaryIndex and NextIndexID for brevity.


pkg/sql/opt/optbuilder/mutation_builder.go, line 1192 at r9 (raw file):

	// If we were not able to find the expression type we were looking for, return
	// NULL or a default value.

nit: update the comment to match the logic: "If there is no DEFAULT expression return NULL or a default value."

@pawalt
Copy link
Contributor Author

pawalt commented Aug 24, 2021

Failing test is a flake (didn't appear in previous runs, and I haven't changed anything). Going to bors this in.

bors r=arulajmani,mgartner,ajwerner,rytaft

This PR implements ON UPDATE expressions.

Resolves cockroachdb#69057

Release note (sql change): An ON UPDATE expression can now be added to a
column. Whenever a row is updated without modifying the ON UPDATE column,
the column's ON UPDATE expression is re-evaluated, and the column is
updated to the result.

Release justification: This design has been reviewed in RFC and code
form as well as extensively tested.
@craig
Copy link
Contributor

craig bot commented Aug 24, 2021

Canceled.

@pawalt
Copy link
Contributor Author

pawalt commented Aug 24, 2021

bors r=arulajmani,mgartner,ajwerner,rytaft

@craig
Copy link
Contributor

craig bot commented Aug 24, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: implement ON UPDATE in optimizer
8 participants