Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: make hashed OIDs identifiable for optimal virtual indexes #103399

Merged
merged 1 commit into from
May 16, 2023

Conversation

fqazi
Copy link
Collaborator

@fqazi fqazi commented May 16, 2023

Due to the recent increase in table sizes for builtins, certain tables like pg_class are now joining on hundreds of rows. While these tables contain virtual indexes, these virtual indexes will return to full table scan if an OID is not found. That's because they have no way of knowing if the OID is a hashed index or a descriptor ID. To address this, these changes are going to alter OID hashing, so that OID hashes are generated at values greater than CockroachPredefinedOIDMax. This then allows virtual indexes to avoid full table scans if the value matched is not found (i.e. if its a builtin function reference).

Fixes: #102851

Release note (sql change): OID generation for pg_catalog has changed. For example column, index and constraint OID's will have different values. Relation, type and function OID's are unchanged.

Benchmark difference after this change:

Before:
BenchmarkORMQueries/introspection_description_join-16                          1        43943592221 ns/op              134.0 roundtrips

After:
BenchmarkORMQueries/introspection_description_join-16                          1        8972831474 ns/op               134.0 roundtrips
goarch: arm64
                                             │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.ae796e15493 │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.a5ad91451a4 │
                                             │                                      sec/op                                       │                          sec/op                            vs base                │
ORMQueries/introspection_description_join-10                                                                         26.639 ± 3%                                                  4.874 ± 3%  -81.70% (p=0.000 n=10)

                                             │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.ae796e15493 │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.a5ad91451a4 │
                                             │                                    roundtrips                                     │                          roundtrips                            vs base            │
ORMQueries/introspection_description_join-10                                                                          134.0 ± 0%                                                      134.0 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

                                             │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.ae796e15493 │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.a5ad91451a4 │
                                             │                                       B/op                                        │                           B/op                             vs base                │
ORMQueries/introspection_description_join-10                                                                       13.033Gi ± 0%                                                3.704Gi ± 0%  -71.58% (p=0.000 n=10)

                                             │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.ae796e15493 │ /var/folders/p3/c61_z_vd3r7dr1_hnztm3ryr0000gq/T/tmp.BxbnSBiT4e/bench.a5ad91451a4 │
                                             │                                     allocs/op                                     │                         allocs/op                          vs base                │
ORMQueries/introspection_description_join-10                                                                        189.16M ± 0%                                                 39.85M ± 0%  -78.93% (p=0.000 n=10)

Due to the recent increase in table sizes for builtins,
certain tables like pg_class are now joining on hundreds
of rows. While these tables contain virtual indexes, these
virtual indexes will return to full table scan if an OID is
not found. That's because they have no way of knowing if
the OID is a hashed index or a descriptor ID. To address this,
these changes are going to alter OID hashing, so that OID hashes
are generated at values greater than CockroachPredefinedOIDMax.
This then allows virtual indexes to avoid full table scans
if the value matched is not found (i.e. if its a builtin
function reference).

Fixes: cockroachdb#102851

Release note (sql change): OID generation for pg_catalog
has changed. For example column, index and constraint
OID's will have different values. Relation, type and function
OID's are unchanged.
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@fqazi fqazi marked this pull request as ready for review May 16, 2023 12:27
@fqazi fqazi requested a review from a team as a code owner May 16, 2023 12:27
Copy link
Contributor

@chengxiong-ruan chengxiong-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@fqazi fqazi added the backport-23.1.x Flags PRs that need to be backported to 23.1 label May 16, 2023
@fqazi
Copy link
Collaborator Author

fqazi commented May 16, 2023

@chengxiong-ruan TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented May 16, 2023

Build succeeded:

@craig craig bot merged commit 46e1fe9 into cockroachdb:master May 16, 2023
@blathers-crl
Copy link

blathers-crl bot commented May 16, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from a5ad914 to blathers/backport-release-23.1-103399: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to write an rttanalysis test that shows how this helps?

@@ -4740,6 +4747,10 @@ func (h oidHasher) writeTypeTag(tag oidTypeTag) {

func (h oidHasher) getOid() *tree.DOid {
i := h.h.Sum32()
// Ensure generated OID hashes are above the pre-defined max limit,
// this gives us a cheap filter.
i = i%(math.MaxUint32-oidext.CockroachPredefinedOIDMax) +
Copy link
Collaborator

@rafiss rafiss May 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like it would be quicker to do

if i < oidext.CockroachPredefinedOIDMax {
  i += oidext.CockroachPredefinedOIDMax
}

since it's fewer arithmetic operations. this would have also reduced the size of the diff in tests. but more importantly, it's easier to understand what the code is doing.

@rafiss
Copy link
Collaborator

rafiss commented May 17, 2023

blathers backport 23.1

@blathers-crl
Copy link

blathers-crl bot commented May 17, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from a5ad914 to blathers/backport-release-23.1-103399: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@rafiss
Copy link
Collaborator

rafiss commented May 17, 2023

blathers backport 23.1

1 similar comment
@rafiss
Copy link
Collaborator

rafiss commented May 17, 2023

blathers backport 23.1

@rafiss
Copy link
Collaborator

rafiss commented May 17, 2023

blathers backport 23.1

craig bot pushed a commit that referenced this pull request May 17, 2023
101485: roachtest: add 32TB, 400 incremental restore roachtest r=rhu713 a=rhu713

Add a 32TB, 400 incremental restore roachtest. The restore in this roachtest has been verified to fail on v22.2.7 due to OOM, but should succeed for v23.1+ due to the slim manifest changes.

Release note: None

103487: ccl/sqlproxyccl: default tenants to both public and private connectivities r=pjtatlow a=jaylim-crl

Previously, we had introduced 3 states (ALLOW_NONE, ALLOW_PUBLIC, and ALLOW_PRIVATE) with the intention that ALLOW_NONE would be useful. Thinking about it further, it does not make sense for a cluster to support ALLOW_NONE (i.e. what would that actually mean?). This commit updates the connectivity types to support ALLOW_ALL, ALLOW_PUBLIC_ONLY, and ALLOW_PRIVATE_ONLY. By default, ConnectivityType=0, which maps to ALLOW_ALL.

Release note: None

Epic: none

103540: roachtest: fix drain test to account for jobs_wait r=rafiss a=rafiss

fixes #103396

The change in e8b075b made this test start failing. Since we don't need this test to do anything with jobs, we can fix it by turning off the jobs waiting period.

This also fixes the node drain CLI command to be aware of jobs_wait.

Release note: None

103547: sql: use a simpler method for clamping OID hash r=rafiss a=rafiss

This reduces the amount of OID churn and also is simpler, so might perform better. Having less churn is better for backports.

informs #103399
Release note: None

Co-authored-by: Rui Hu <[email protected]>
Co-authored-by: Jay <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

75% increase in Django test suite run time between CockroachDB 22.2 and 23.1
4 participants