-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: check equivalent constraint when creating hash index #74179
sql: check equivalent constraint when creating hash index #74179
Conversation
fc13e13
to
0263058
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/create_table.go, line 1645 at r1 (raw file):
} if checkConstraintDesc.Expr == inputCheckConstraintDesc.Expr { inputCheckConstraint.Hidden = true
I kept questioning myself why we want to make a user defined constraint hidden.
I think my answer is that we may want to have a consistent SHOW CREATE
experience for tables with hash sharded index, and it should be super rare that a super smart user defined a equivalent constraint for a shard column if we don't show the constraint at all >_> But if there would be a super smart user, showing the constraint would give them a chance to drop the constraint, do we want that? Dropping the constraint in this case could be fine since we trust the shard column so much that we skip validation during backfill already. The only thing is how optimizer gonna feel, I think it's also fine since it's user's choice.
However step back a bit I also questioned myself, why do we need all these equivalent constraint check at all. Why can't we just try to create the constraint anyway without checking? In this case, if a smart user defined an exactly same constraint, it would fail due to duplicated constraint. I think it's also an ok experience for user and this whole block of code would be gone.
pkg/sql/show_create_clauses.go, line 566 at r1 (raw file):
) error { for _, e := range desc.AllActiveAndInactiveChecks() { if e.Hidden {
Let me know if you have a strong feeling that it should be in a separate PR.
0263058
to
675f6e4
Compare
02f2f77
to
7ac4cb8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look at this. Please bear in mind that my review is very superficial. I didn't find anything obviously wrong in this PR. You should probably have someone more knowledgeable take a look. Please bear in mind that it's OK to annoy people for reviews 👍
Reviewed 1 of 8 files at r1, 1 of 4 files at r2, 1 of 2 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @chengxiong-ruan)
-- commits, line 2 at r3:
s/quivalent/equivalent/
-- commits, line 16 at r3:
Can you please reword this sentence?
pkg/sql/create_table.go, line 1645 at r1 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
I kept questioning myself why we want to make a user defined constraint hidden.
I think my answer is that we may want to have a consistent
SHOW CREATE
experience for tables with hash sharded index, and it should be super rare that a super smart user defined a equivalent constraint for a shard column if we don't show the constraint at all >_> But if there would be a super smart user, showing the constraint would give them a chance to drop the constraint, do we want that? Dropping the constraint in this case could be fine since we trust the shard column so much that we skip validation during backfill already. The only thing is how optimizer gonna feel, I think it's also fine since it's user's choice.However step back a bit I also questioned myself, why do we need all these equivalent constraint check at all. Why can't we just try to create the constraint anyway without checking? In this case, if a smart user defined an exactly same constraint, it would fail due to duplicated constraint. I think it's also an ok experience for user and this whole block of code would be gone.
I may be wrong here, and I don't understand any of this too deeply, however I believe that anything that has crdb_internal
in its name is assumed to not have been created by a user. This isn't enforced anywhere, but it is a convention.
I believe the goal here is that we want the output of a SHOW CREATE TABLE
statement for a hash-sharded table to round-trip correctly. In other words, that executing the resulting CREATE TABLE
statement would produce the same table.
I may be completely wrong here.
pkg/sql/create_table.go, line 1623 at r3 (raw file):
} // if there is an equivalent check constraint from the CREATE TABLE (should be rare since we hide the constraint of shard column)
style: comments should wrap around at the 80th character column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, and @postamar)
pkg/sql/create_index.go, line 237 at r3 (raw file):
} columns = newColumns if err := params.p.maybeSetupConstraintForShard(params.ctx, tableDesc, shardCol, indexDesc.Sharded.ShardBuckets); err != nil {
nit: wrap this after maybeSetupConstraintForShard(
if err := params.p.maybeSetupConstraintForShard(
params.ctx, tableDesc, shardCol, indexDesc.Sharded.ShardBuckets,
); err != nil {
pkg/sql/create_index.go, line 476 at r3 (raw file):
// setupShardedIndex creates a shard column for the given index descriptor. It // returns the shard column and the new column list for the index If the shard
nit: missing period. Can you say "is new, it is added to the tableDesc as a mutation in DELETE_ONLY"?
pkg/sql/create_table.go, line 1645 at r1 (raw file):
The only thing is how optimizer gonna feel, I think it's also fine since it's user's choice.
The optimizer is going to feel sad. It absolutely relies on the constraint. If a user drops it, I'm inclined to not care that much. Alternatively, one rather dire thing we could do is fail validation if we're missing the check constraint. If we do that, we should also add code to disallow the user from dropping it.
I believe the goal here is that we want the output of a SHOW CREATE TABLE statement for a hash-sharded table to round-trip correctly. In other words, that executing the resulting CREATE TABLE statement would produce the same table.
This is definitely a goal.
pkg/sql/show_create_clauses.go, line 566 at r1 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
Let me know if you have a strong feeling that it should be in a separate PR.
I'm happy with this.
pkg/sql/create_table.go, line 1645 at r1 (raw file): Previously, ajwerner wrote…
Thanks @postamar and @ajwerner , yeah I think I'm also inclined to not care too much if a user drop the constraint, I'll remove this line of code to make user defined constraint visible. |
7ac4cb8
to
0bfea0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
Previously, postamar (Marius Posta) wrote…
s/quivalent/equivalent/
Done.
Previously, postamar (Marius Posta) wrote…
Can you please reword this sentence?
Reworded. Let me know if you're happy with it.
pkg/sql/create_index.go, line 476 at r3 (raw file):
Previously, ajwerner wrote…
nit: missing period. Can you say "is new, it is added to the tableDesc as a mutation in DELETE_ONLY"?
Done. It actually has two different cases.
pkg/sql/create_table.go, line 1623 at r3 (raw file):
Previously, postamar (Marius Posta) wrote…
style: comments should wrap around at the 80th character column
Done, thanks for catching this~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, and @postamar)
pkg/sql/logictest/testdata/logic_test/hash_sharded_index, line 805 at r4 (raw file):
CONSTRAINT t_pkey PRIMARY KEY (a ASC) USING HASH WITH BUCKET_COUNT = 8, FAMILY "primary" (a) )
I feel like this test is setting out to demonstrate that the table created from show create also gets a good plan. That's nice and what you want, but I think it can be done more elegantly. Also, please comment the intention of the test.
Copy the statement to a variable
let $create_statement
SELECT create_statement FROM [SHOW CREATE TABLE t]
Use the statement
statement ok
$create_statement
Code quote:
SELECT @2 FROM [SHOW CREATE TABLE t]
----
CREATE TABLE public.t (
crdb_internal_a_shard_8 INT4 NOT VISIBLE NOT NULL AS (mod(fnv32(crdb_internal.datums_to_bytes(a)), 8:::INT8)) VIRTUAL,
a INT8 NOT NULL,
CONSTRAINT t_pkey PRIMARY KEY (a ASC) USING HASH WITH BUCKET_COUNT = 8,
FAMILY "primary" (a)
)
0bfea0f
to
8755407
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/logictest/testdata/logic_test/hash_sharded_index, line 805 at r4 (raw file):
Previously, ajwerner wrote…
I feel like this test is setting out to demonstrate that the table created from show create also gets a good plan. That's nice and what you want, but I think it can be done more elegantly. Also, please comment the intention of the test.
Copy the statement to a variable
let $create_statement SELECT create_statement FROM [SHOW CREATE TABLE t]Use the statement
statement ok $create_statement
nice, thanks for suggesting that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r1, 1 of 4 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, and @postamar)
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
Reworded. Let me know if you're happy with it.
I still think the docs writers will struggle with this. Consider:
Release note (sql change): Before this change, the check constraint on shard
columns used by hash sharded index was printed in the corresponding
`SHOW CREATE TABLE`. The constraint had been shown because cockroach
lacked logic to ensure that such a constraint always exists for shard columns.
The database now ensures that shard columns which are part of hash sharded
indexes always have the corresponding check constraint which the optimizer relies
on to achieve properly optimized query plans on hash sharded indexes. We no
longer display this constraint in `SHOW CREATE TABLE` as it is now implied by
the `USING HASH` clause on the relevant index.
-- commits, line 11 at r5:
Grammar nits:
constraint no matter if a shard column is newly created or not. With
Code quote:
string is the same. With this logic we can try to create the
constraint no matter a shard column is newly created or not. With
-- commits, line 15 at r5:
created, or skipped if one already exists
Fixes cockroachdb#68031 Previously we only try to create constraint for shard column if it's newly created. We check duplicate constraint for shard column when `Alter Primary Key` and `Create Index`, however the check is simply a name check. This pr adds logic to check equivalent constraint by checking if the formatted expression string is the same. With this logic we can try to create the constraint no matter if a shard column is newly created or not. With this fix, we also don't need to expose the constraint through `SHOW CREATE TABLE` result since we make sure the constraint is created or skipped if one already exists. Release note (sql change): Before this change, the check constraint on shard column used by hash sharded index was printed in the corresponding `SHOW CREATE TABLE`. The constraint had been shown because cockroach lacked logic to ensure that shard columns which are part of hash sharded indexs always have the check constraint which the optimizer relies on to achieve properly optimized plans on hash sharded indexes. We no longer display this constraint in `SHOW CREATE TABLE` as it is now implied by the `USING HASH` clause on the relevant index.
8755407
to
053d361
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @postamar)
Previously, ajwerner wrote…
I still think the docs writers will struggle with this. Consider:
Release note (sql change): Before this change, the check constraint on shard columns used by hash sharded index was printed in the corresponding `SHOW CREATE TABLE`. The constraint had been shown because cockroach lacked logic to ensure that such a constraint always exists for shard columns. The database now ensures that shard columns which are part of hash sharded indexes always have the corresponding check constraint which the optimizer relies on to achieve properly optimized query plans on hash sharded indexes. We no longer display this constraint in `SHOW CREATE TABLE` as it is now implied by the `USING HASH` clause on the relevant index.
Thank you!
bors r+ |
TFTR! |
Build failed (retrying...): |
Build succeeded: |
Fixes #68031
Previously we only try to create constraint for shard column if
it's newly created. We check duplicate constraint for shard
column when
Alter Primary Key
andCreate Index
, however thecheck is simply a name check. This pr adds logic to check
equivalent constraint by checking if the formatted expression
string is the same. With this logic we can try to create the
constraint no matter if a shard column is newly created or not.
With this fix, we also don't need to expose the constraint through
SHOW CREATE TABLE
result since we make sure the constraint iscreated or skipped if one already exists.
Release note (sql change): Before this change, the check constraint
on shard column used by hash sharded index was printed in the
corresponding
SHOW CREATE TABLE
. The constraint had been shownbecause cockroach lacked logic to ensure that shard columns which
are part of hash sharded indexs always have the check constraint
which the optimizer relies on to achieve properly optimized plans
on hash sharded indexes. We no longer display this constraint in
SHOW CREATE TABLE
as it is now implied by theUSING HASH
clauseon the relevant index.