Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: set hash-sharded index column type to INT8 #76930

Merged
merged 1 commit into from
Mar 9, 2022

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Feb 23, 2022

Shard columns of hash-sharded indexes are computed columns with an
expression in the form mod(fnv32(crdb_internal.datums_to_bytes(...))).
The mod() builtin returns a value of type INT8, while the shard
column's type was previously INT4. Because these types did not match,
mod expressions were wrapped in the assignment cast function,
crdb_internal.assignment_cast, in mutation query plans.

This commit changes the type of newly created shard columns to INT8,
eliminating the need for an assignment cast. Because all integers are
encoded as varints in keys and values, this will not increase the amount
of space on disk required for these columns.

Release justification: This is a minor change to hash-sharded indexes,
which are a newly un-experimentalized feature in the upcoming release.

Release note (sql change): The type of shard columns created for
hash-sharded indexes have changed from INT4 to INT8. This should
have no effect on behavior or performance.

@mgartner mgartner requested review from ajwerner, chengxiong-ruan and a team February 23, 2022 16:08
@mgartner mgartner requested a review from a team as a code owner February 23, 2022 16:08
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@chengxiong-ruan chengxiong-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome~

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

@mgartner mgartner force-pushed the use-int8-for-shard-column branch from 4722ea9 to 0c154e8 Compare February 23, 2022 17:13
Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we could / should use fnv64 instead of fnv32?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we could / should use fnv64 instead of fnv32?

I think this would only be useful when the bucket count is greater than 2^32 - 1, which seems unlikely. In fact, I just tried to create a hash-sharded index with a bucket count of 999999999 (which is less than 2^32 - 1), and the CREATE TABLE statement locked up. I created an issue to track this: #77000.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

@mgartner mgartner force-pushed the use-int8-for-shard-column branch 4 times, most recently from b1d64e9 to 6772437 Compare February 25, 2022 16:46
@mgartner mgartner force-pushed the use-int8-for-shard-column branch from 6772437 to ccd582c Compare March 7, 2022 15:06
Shard columns of hash-sharded indexes are computed columns with an
expression in the form `mod(fnv32(crdb_internal.datums_to_bytes(...)))`.
The `mod()` builtin returns a value of type `INT8`, while the shard
column's type was previously `INT4`. Because these types did not match,
`mod` expressions were wrapped in the assignment cast function,
`crdb_internal.assignment_cast`, in mutation query plans.

This commit changes the type of newly created shard columns to `INT8`,
eliminating the need for an assignment cast. Because all integers are
encoded as varints in keys and values, this will not increase the amount
of space on disk required for these columns.

Release justification: This is a minor change to hash-sharded indexes,
which are a newly un-experimentalized feature in the upcoming release.

Release note (sql change): The type of shard columns created for
hash-sharded indexes have changed from `INT4` to `INT8`. This should
have no effect on behavior or performance.
@mgartner mgartner force-pushed the use-int8-for-shard-column branch from ccd582c to fde3881 Compare March 8, 2022 21:02
@mgartner
Copy link
Collaborator Author

mgartner commented Mar 8, 2022

@ajwerner can you take a look at this and sign-off if you're ok with it.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@mgartner
Copy link
Collaborator Author

mgartner commented Mar 9, 2022

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 9, 2022

🕐 Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

@mgartner
Copy link
Collaborator Author

mgartner commented Mar 9, 2022

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 9, 2022

Stopped waiting for PR status (Github check) without running due to duplicate requests to run. You may check Bors to see that this PR is included in a batch by one of the other requests.

@craig
Copy link
Contributor

craig bot commented Mar 9, 2022

Build succeeded:

@craig craig bot merged commit a161490 into cockroachdb:master Mar 9, 2022
@mgartner mgartner deleted the use-int8-for-shard-column branch March 9, 2022 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants