-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: inverted-index accelerate queries with filters in the form json->'a' IN (...) #87856
Closed
Tracked by
#59331
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-queries
SQL Queries Team
Comments
mgartner
added
the
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
label
Sep 12, 2022
Shivs11
added a commit
to Shivs11/cockroach
that referenced
this issue
Feb 2, 2023
…>'field' IN (...) Previously, the optimizer did not plan inverted index scans for queries consisting of the IN operator. To address this, we now build JSON arrays from fetch value expressions consisting of both integers and strings. Each entity present inside of the tuple following the IN operator has it's own seperate JSON array being created. From these JSON arrays, inverted spans are built. Each inverted span generated is OR'ed together to replicate the functionality of the IN operator. In other words, a filter of the form: j->0 IN ('1', '2') is thought of as (j->0 = '1') OR (j->0 = '2') and the generated inverted span replicates this behaviour. Epic: CRDB-3301 Fixes: cockroachdb#87856 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->), consisting of both string and integer index values, alongside the IN operator, e.g, json_col->0 IN ('1', '2') and json_col->'a' IN ('1', '2')
Shivs11
added a commit
to Shivs11/cockroach
that referenced
this issue
Feb 6, 2023
…>'field' IN (...) Previously, the optimizer did not plan inverted index scans for queries consisting of the IN operator. To address this, we now build JSON objects from fetch value expressions consisting of both integers and strings. Each entity present inside of the tuple following the IN operator has its own seperate JSON array being created. From these JSON arrays, inverted spans are built. Each inverted span generated is OR'ed together to replicate the functionality of the IN operator. In other words, a filter of the form: j->0 IN ('1', '2') is thought of as (j->0 = '1') OR (j->0 = '2') and the generated inverted span replicates this behaviour. Epic: CRDB-3301 Fixes: cockroachdb#87856 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->), consisting of both string and integer index values, alongside the IN operator, e.g, json_col->0 IN ('1', '2') and json_col->'a' IN ('1', '2')
Shivs11
added a commit
to Shivs11/cockroach
that referenced
this issue
Feb 7, 2023
…>'field' IN (...) Previously, the optimizer did not plan inverted index scans for queries consisting of the IN operator. To address this, we now build JSON objects from fetch value expressions consisting of both integers and strings. Each entity present inside of the tuple following the IN operator has its own seperate JSON array being created. From these JSON arrays, inverted spans are built. Each inverted span generated is OR'ed together to replicate the functionality of the IN operator. In other words, a filter of the form: j->0 IN ('1', '2') is thought of as (j->0 = '1') OR (j->0 = '2') and the generated inverted span replicates this behaviour. Epic: CRDB-3301 Fixes: cockroachdb#87856 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->), consisting of both string and integer index values, alongside the IN operator, e.g, json_col->0 IN ('1', '2') and json_col->'a' IN ('1', '2')
Shivs11
added a commit
to Shivs11/cockroach
that referenced
this issue
Feb 8, 2023
…>'field' IN (...) Previously, the optimizer did not plan inverted index scans for queries consisting of the IN operator. To address this, we now build JSON objects from fetch value expressions consisting of both integers and strings. Each entity present inside of the tuple following the IN operator has its own seperate JSON array being created. From these JSON arrays, inverted spans are built. Each inverted span generated is OR'ed together to replicate the functionality of the IN operator. In other words, a filter of the form: j->0 IN ('1', '2') is thought of as (j->0 = '1') OR (j->0 = '2') and the generated inverted span replicates this behaviour. Epic: CRDB-3301 Fixes: cockroachdb#87856 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->), consisting of both string and integer index values, alongside the IN operator, e.g, json_col->0 IN ('1', '2') and json_col->'a' IN ('1', '2')
Shivs11
added a commit
to Shivs11/cockroach
that referenced
this issue
Feb 9, 2023
…>'field' IN (...) Previously, the optimizer did not plan inverted index scans for queries consisting of the IN operator. To address this, we now build JSON objects from fetch value expressions consisting of both integers and strings. Each entity present inside of the tuple following the IN operator has its own seperate JSON array being created. From these JSON arrays, inverted spans are built. Each inverted span generated is OR'ed together to replicate the functionality of the IN operator. In other words, a filter of the form: j->0 IN ('1', '2') is thought of as (j->0 = '1') OR (j->0 = '2') and the generated inverted span replicates this behaviour. Epic: CRDB-3301 Fixes: cockroachdb#87856 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->), consisting of both string and integer index values, alongside the IN operator, e.g, json_col->0 IN ('1', '2') and json_col->'a' IN ('1', '2')
Shivs11
added a commit
to Shivs11/cockroach
that referenced
this issue
Feb 14, 2023
…>'field' IN (...) Previously, the optimizer did not plan inverted index scans for queries consisting of the IN operator. To address this, we now build JSON objects from fetch value expressions consisting of both integers and strings. Each entity present inside of the tuple following the IN operator has its own seperate JSON array being created. From these JSON arrays, inverted spans are built. Each inverted span generated is OR'ed together to replicate the functionality of the IN operator. In other words, a filter of the form: j->0 IN ('1', '2') is thought of as (j->0 = '1') OR (j->0 = '2') and the generated inverted span replicates this behaviour. Epic: CRDB-3301 Fixes: cockroachdb#87856 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->), consisting of both string and integer index values, alongside the IN operator, e.g, json_col->0 IN ('1', '2') and json_col->'a' IN ('1', '2')
craig bot
pushed a commit
that referenced
this issue
Feb 14, 2023
96471: opt: inverted-index accelerate queries with filters in the form `json_col ->'field' IN (...)` r=Shivs11 a=Shivs11 Previously, the optimizer did not plan inverted index scans for queries consisting of the IN operator. To address this, we now build JSON arrays from fetch value expressions consisting of both integers and strings. Each entity present inside of the tuple following the IN operator has it's own seperate JSON array being created. From these JSON arrays, inverted spans are built. Each inverted span generated is `OR'ed` together to replicate the functionality of the `IN` operator. In other words, a filter of the form: `j->0 IN ('1', '2')` is thought of as `(j->0 = '1') OR (j->0 = '2')` and the generated inverted span replicates this behaviour. Epic: CRDB-3301 Fixes: #87856 Release note (performance improvement): The optimizer now plans inverted index scans for queries that filter by JSON fetch value operators (->), consisting of both string and integer index values, alongside the IN operator, e.g, `json_col->0 IN ('1', '2')` and `json_col->'a' IN ('1', '2')` 96696: sql: handle implicit record return types better r=rharding6373 a=rharding6373 This PR validates the UDF return type at build time if it is a user-defined type. If the return type is no longer compatible with what the UDF body returns, we return an error instead. This is more in line with postgres behavior. Fixes: 95558 Epic: CRDB-19496 Release note (sql change): UDFs with implicit record return types will return an error when called if the return type has been altered and is no longer compatible with the body of the UDF. 96698: sql: allow UDFs with SETOF return types r=mgartner a=mgartner #### distsql: check projectSetNode exprs for distribution recommendation Previously, the scalar expressions within a `projectSetNode` were not checked for their ability to be distributed. This commit fixes this. Release note: None #### sql: use `eval.ValueGenerator` for evaluation of routines This commit creates a new implementation of the `eval.ValueGenerator` that is used to evaluate `tree.Routine` expressions. This will allow for a more unified implementation of both scalar and set-returning UDFs in a future commit. Release note: None #### sql: allow UDFs with SETOF return types Set-returning UDFs can now be created by specifying a `SETOF` return type. For example: CREATE FUNCTION fn() RETURNS SETOF INT LANGUAGE SQL AS $$ SELECT a FROM abc; $$ Fixes #86391 Release note (sql change): Set-returning user-defined functions can now be created by specific a `SETOF` return type. Co-authored-by: Shivam Saraf <[email protected]> Co-authored-by: rharding6373 <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-queries
SQL Queries Team
For a query like
SELECT * FROM t WHERE json_col->'field' IN ('"val1"', '"val2"')
, the optimizer should explore using an inverted index scan on anINVERTED INDEX (json_col)
, if it one exists.We should also handle filters like
json_col->0 IN (...)
.Epic: CRDB-24501
Jira issue: CRDB-19574
The text was updated successfully, but these errors were encountered: