sql: composable inverted indexes #109302
Labels: C-enhancement, T-sql-queries (SQL Queries Team)
(All credit goes to @lin-crl for this idea.)
Suppose we had a table with a JSON column, and we wanted to quickly find all the rows with a certain path equal to "abc". The path might vary depending on the query. We can do that today with an inverted index:
The plan looks like:
But now suppose we want to find all the rows with a certain path LIKE "%abc%". Normally we would use a trigram inverted index for this. We could create a trigram inverted index on the entire JSON column cast to a string:
That plan looks like:
But that plan does not guarantee that the match comes from one specific path in the JSON column; the trigrams could occur anywhere in the stringified value. To restrict the match to one path we need to add back the predicate as an additional filter:
This means we'll have to perform the LIKE operation multiple times on a single row to confirm a match, and we might be searching many more rows than we need to. Here's the plan:
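The cost described above can be illustrated with a small in-memory sketch. This is not how CockroachDB implements trigram indexes; the rows, the path a -> b, and the unpadded trigram function are all assumptions made for illustration. It indexes the whole serialized document, so the index returns every row containing "abc" anywhere, and the path-specific predicate must then be rechecked per candidate.

```python
import json

def trigrams(s):
    """Return the set of 3-character substrings of s (no padding, for simplicity)."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

# Hypothetical rows: primary key -> JSON document.
rows = {
    1: {"a": {"b": "xxabcxx"}, "c": "other"},
    2: {"a": {"b": "nothing"}, "c": "zzabczz"},  # "abc" occurs, but under another path
    3: {"a": {"b": "nothing"}, "c": "nothing"},
}

# Trigram inverted index over the entire document cast to a string.
whole_doc_index = {}
for pk, doc in rows.items():
    for tri in trigrams(json.dumps(doc)):
        whole_doc_index.setdefault(tri, set()).add(pk)

def candidates(pattern):
    """Rows whose serialized document contains every trigram of the pattern."""
    tris = trigrams(pattern)
    if not tris:
        raise ValueError("pattern must be at least 3 characters long")
    return set.intersection(*(whole_doc_index.get(t, set()) for t in tris))

cands = candidates("abc")

# Additional filter: re-run the substring test against the one path we care about.
matches = {pk for pk in cands if "abc" in rows[pk]["a"]["b"]}

print(sorted(cands))
print(sorted(matches))
```

Row 2 is fetched as a candidate even though its match is under a different path, and only the extra filter discards it.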
Instead of using one inverted index plus a filter, it would be nice if we could combine the power of the two inverted indexes to only perform a trigram search on the contents of the path we want. In other words, we would compose the two inversions. Something like this:
CREATE INVERTED INDEX ON t (j, (j_inverted_key::string) gin_trgm_ops);
Where j_inverted_key is the name of the inverted JSON value. So for each row of the table there would be multiple rows (for each JSON path) and then for each of these there would be multiple rows (for each trigram).

Jira issue: CRDB-30858
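To make the "rows per JSON path, then rows per trigram" idea concrete, here is a minimal in-memory sketch of the composed key space. It is only an illustration under assumptions: the rows, the tuple representation of a path, the unpadded trigram function, and the helper names (leaf_paths, lookup) are hypothetical and say nothing about how CockroachDB would actually encode such an index.

```python
def trigrams(s):
    """Return the set of 3-character substrings of s (no padding, for simplicity)."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def leaf_paths(value, prefix=()):
    """Yield (path, leaf) for every non-object leaf of a nested dict (first inversion)."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from leaf_paths(child, prefix + (key,))
    else:
        yield prefix, value

# Hypothetical rows: primary key -> JSON document.
rows = {
    1: {"a": {"b": "xxabcxx"}, "c": "other"},
    2: {"a": {"b": "nothing"}, "c": "zzabczz"},
    3: {"a": {"b": "nothing"}, "c": "nothing"},
}

# Composed inversion: one entry per (JSON path, trigram of the string at that path).
composed_index = {}
for pk, doc in rows.items():
    for path, leaf in leaf_paths(doc):
        if not isinstance(leaf, str):
            continue  # only string leaves are trigram-indexed in this sketch
        for tri in trigrams(leaf):
            composed_index.setdefault((path, tri), set()).add(pk)

def lookup(path, pattern):
    """Candidate rows whose value at `path` contains every trigram of `pattern`."""
    tris = trigrams(pattern)
    if not tris:
        raise ValueError("pattern must be at least 3 characters long")
    return set.intersection(*(composed_index.get((path, t), set()) for t in tris))

print(lookup(("a", "b"), "abc"))
print(lookup(("c",), "abc"))
```

Because the path is part of the key, a lookup for one path never touches entries produced by other paths. For patterns longer than one trigram the result is still a candidate set, since sharing all trigrams does not prove the substring is contiguous, so a final recheck would remain necessary.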