Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builtins: mark some pg_.* builtins as strict #96397

Merged
merged 1 commit into from
Feb 24, 2023

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Feb 2, 2023

Builtins defined using the UDF Body field will be wrapped in a CASE
expression if they are strict, i.e., CalledOnNullInput=false. When the
builtin is inlined, the CASE expression prevents decorrelation,
leaving a slow apply-join in the query plan. This caused a significant
regression of some ORM introspection queries.

Some of these builtins have filters that cause the SQL body to return no rows
if any of the arguments is NULL. In this case, the builtin will have the same
behavior whether or not it is defined as being strict. We can safely optimize
these builtins by setting CalledOnNullInput=true.

The following conditions are sufficient to prove that CalledOnNullInput can
be set for a builtin function with a SQL body:

  1. The WHERE clause of the SQL query null-rejects every argument of the
    builtin. Operators like = and < null-reject their operands because
    they filter rows for which an operand is NULL.

  2. The arguments are not used elsewhere in the query. This is not strictly
    necessary, but simplifies the proof because it ensures NULL arguments will
    not cause the builtin to error.

Examples of SQL statements that would allow CalledOnNullInput to be set:

SELECT * FROM tab WHERE $1=1 AND $2='two';

SELECT * FROM tab WHERE $1 > 0;

Fixes #96218
Fixes #95569

Epic: None

Release note: None

@mgartner mgartner requested review from rafiss and a team February 2, 2023 00:12
@blathers-crl
Copy link

blathers-crl bot commented Feb 2, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@mgartner
Copy link
Collaborator Author

mgartner commented Feb 2, 2023

In a cockroach demo with ~1000 tables, the introspection query mentioned in #96218 takes ~225ms, must faster than the ~1.5s latency on v22.2.2.

One non-obvious side-effect of this change is that these builtins could potentially result in errors when presented with a NULL argument, whereas before they'd always result in NULL. In PG, these functions are listed as leakproof, so I don't think this should be a problem, but I want to spend a bit more time thinking about this before merging it.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lgtm! i think the same change should be made to the following, since they have the same pattern of using a single WHERE clause filtering by the first parameter:

  • makePGGetViewDef
  • makePGGetConstraintDef
  • pg_get_functiondef
  • pg_get_function_arguments
  • pg_get_function_result
  • pg_get_function_identity_arguments
  • pg_get_indexdef
  • col_description
  • obj_description
  • shobj_description
  • _pg_index_position

@rafiss rafiss mentioned this pull request Feb 3, 2023
@rafiss
Copy link
Collaborator

rafiss commented Feb 15, 2023

Will we be able to merge either this PR or #96545?

The performance regression is causing nightlies to fail (see #95569 (comment)), since the latency allows a different bug to manifest (#94337).

@DrewKimball DrewKimball force-pushed the 96218-fix-pg-builtin-regression branch 2 times, most recently from bb130f7 to 7b33861 Compare February 24, 2023 07:15
@DrewKimball DrewKimball changed the title builtins: mark pg_.*_is_visible builtins as not strict builtins: mark some pg_.* builtins as strict Feb 24, 2023
@DrewKimball
Copy link
Collaborator

I've added all the pg_.* builtins that I could prove would return the expected result when called with NULL arguments, and added corresponding tests.

Builtins defined using the UDF `Body` field will be wrapped in a `CASE`
expression if they are strict, i.e., `CalledOnNullInput=false`. When the
builtin is inlined, the `CASE` expression prevents decorrelation,
leaving a slow apply-join in the query plan. This caused a significant
regression of some ORM introspection queries.

Some of these builtins have filters that cause the SQL body to return no rows
if any of the arguments is NULL. In this case, the builtin will have the same
behavior whether or not it is defined as being strict. We can safely optimize
these builtins by setting `CalledOnNullInput=true`.

The following conditions are sufficient to prove that `CalledOnNullInput` can
be set for a builtin function with a SQL body:

  1. The WHERE clause of the SQL query *null-rejects* every argument of the
     builtin. Operators like `=` and `<` *null-reject* their operands because
     they filter rows for which an operand is NULL.

  2. The arguments are not used elsewhere in the query. This is not strictly
     necessary, but simplifies the proof because it ensures NULL arguments will
     not cause the builtin to error.

Examples of SQL statements that would allow `CalledOnNullInput` to be set:
```
SELECT * FROM tab WHERE $1=1 AND $2='two';

SELECT * FROM tab WHERE $1 > 0;
```

Fixes cockroachdb#96218
Fixes cockroachdb#95569

Epic: None

Release note: None
@DrewKimball DrewKimball force-pushed the 96218-fix-pg-builtin-regression branch from 7b33861 to 649f219 Compare February 24, 2023 17:36
@DrewKimball DrewKimball requested review from a team and removed request for a team February 24, 2023 17:36
Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Thanks for wrapping this up Drew!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@DrewKimball
Copy link
Collaborator

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 24, 2023

Build succeeded:

@craig craig bot merged commit ae2e8b8 into cockroachdb:master Feb 24, 2023
@mgartner mgartner deleted the 96218-fix-pg-builtin-regression branch February 28, 2023 16:35
@mgartner
Copy link
Collaborator Author

Thanks @DrewKimball!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: severe regression in ORM query performance {django} roachtest: django failed
5 participants