sql: support INVERTED INDEX range scans #24960
Assigning to @awoods187 for prioritization
@RaduBerinde is this related to the computed index ideas we've been discussing?
I think so, we could have an index on a computed value that extracts the json field.
A similar use case is for prefix matches on inverted index string columns:
Today, you can do exact matches:
With some way to do range scans, you could do something like (syntax doesn't work, but this is the intent):
|
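To make the exact-match versus prefix-match distinction concrete, here is a minimal sketch that models an inverted index as a sorted list of (token, primary key) entries. It is only an illustration of the idea: the function names are hypothetical, the model is not CockroachDB's actual key encoding, and `prefix_end` assumes a non-empty prefix whose last character is not the maximum code point.

```python
from bisect import bisect_left

# Hypothetical model: the inverted index is a sorted list of (token, pk) entries.
def build_inverted_index(rows):
    """rows maps a primary key to the list of string tokens stored in that row."""
    return sorted((tok, pk) for pk, toks in rows.items() for tok in toks)

def scan_span(index, start, end):
    """Return the primary keys of all entries with start <= token < end."""
    lo = bisect_left(index, (start,))
    hi = bisect_left(index, (end,))
    return [pk for _, pk in index[lo:hi]]

def prefix_end(prefix):
    """Smallest string that sorts after every string starting with prefix."""
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)

def exact_match(index, token):
    # What works today: a span that covers exactly one token.
    return scan_span(index, token, token + "\x00")

def prefix_match(index, prefix):
    # What a range scan would allow: a span covering every token with the prefix.
    return scan_span(index, prefix, prefix_end(prefix))

rows = {1: ["foo", "bar"], 2: ["foobar"], 3: ["baz"]}
index = build_inverted_index(rows)
print(exact_match(index, "foo"))
print(prefix_match(index, "foo"))
```

Because the entries are sorted by token, a prefix match is the same kind of contiguous span as an exact match, only wider, which is why range scan support would cover this use case.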
We also got bitten by this: a filter on a JSON column with an inverted index uses that index, but if you add another field to the filter, a PK for example, the inverted index is ignored. Thanks for raising this @danhhz
DROP TABLE foo;
CREATE TABLE foo (A INT PRIMARY KEY, B jsonb, C VARCHAR);
INSERT INTO foo (A, B, C) SELECT generate_series(1,100) AS A, '{"values": ["foo", "bar", "baz"]}' AS B, md5(random()::text) AS C;
CREATE INVERTED INDEX foo_inv ON foo(B);
CREATE INDEX foo_idx ON foo(C);
EXPLAIN SELECT * FROM foo WHERE B @> '{"values": ["baz"]}';
Looks good:
[
{
"tree": "",
"field": "distributed",
"description": "false"
},
{
"tree": "",
"field": "vectorized",
"description": "false"
},
{
"tree": "index-join",
"field": "",
"description": ""
},
{
"tree": " │",
"field": "table",
"description": "foo@primary"
},
{
"tree": " │",
"field": "key columns",
"description": "a"
},
{
"tree": " └── scan",
"field": "",
"description": ""
},
{
"tree": "",
"field": "table",
"description": "foo@foo_inv"
},
{
"tree": "",
"field": "spans",
"description": "/\"values\"/Arr/\"baz\"-/\"values\"/Arr/\"baz\"/PrefixEnd"
}
]
But:
EXPLAIN SELECT * FROM foo WHERE B @> '{"values": ["baz"]}' AND C = 'someAutogenID';
Doesn't:
[
{
"tree": "",
"field": "distributed",
"description": "false"
},
{
"tree": "",
"field": "vectorized",
"description": "false"
},
{
"tree": "filter",
"field": "",
"description": ""
},
{
"tree": " │",
"field": "filter",
"description": "b @> '{\"values\": [\"baz\"]}'"
},
{
"tree": " └── index-join",
"field": "",
"description": ""
},
{
"tree": " │",
"field": "table",
"description": "foo@primary"
},
{
"tree": " │",
"field": "key columns",
"description": "a"
},
{
"tree": " └── scan",
"field": "",
"description": ""
},
{
"tree": "",
"field": "table",
"description": "foo@foo_idx"
},
{
"tree": "",
"field": "spans",
"description": "/\"78455d02293f0f16ab5e519c244a70dc\"-/\"78455d02293f0f16ab5e519c244a70dc\"/PrefixEnd"
}
]
@lopezator - in the second case, it's much better to use the primary index since we scan at most one row (for a=3). Using the inverted index would be worse in most cases.
@RaduBerinde you are right, bad example. I've updated the example above to be clearer.
@lopezator Your updated example is scanning
From the forum: https://forum.cockroachlabs.com/t/multi-tenant-custom-fields-saas-app/1565/2
So, let’s say all tenants’ products are in one table and there’s foreign key tenant_id. Then we have a json field, custom_data. A tenant might have a custom field price. Then the tenant wants to search all his products where price > 100. An index on the foreign key, possibly compound with other “static” fields will speed up the query. But an index on json field will not be useful in this case, right?
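The forum question can be tied back to the computed-index idea mentioned earlier in the thread. The sketch below models, in plain Python, an ordinary sorted index on (tenant_id, price) where price is a value computed from the JSON field. All names here are hypothetical and this is not how CockroachDB stores or plans anything; it only shows why such an index turns `price > 100` for one tenant into a single contiguous range scan.

```python
import json
from bisect import bisect_right

INF = float("inf")

def build_computed_index(products):
    """products is a list of (product_id, tenant_id, custom_data_json) tuples.

    Returns sorted (tenant_id, price, product_id) entries, where price is
    computed by extracting the "price" field from the JSON document.
    """
    entries = []
    for product_id, tenant_id, custom_data in products:
        price = json.loads(custom_data).get("price")
        # Rows without a numeric price are simply left out of this index.
        if isinstance(price, (int, float)) and not isinstance(price, bool):
            entries.append((tenant_id, price, product_id))
    return sorted(entries)

def products_with_price_above(index, tenant_id, min_price):
    """Range scan: product ids of one tenant with price strictly above min_price."""
    lo = bisect_right(index, (tenant_id, min_price, INF))
    hi = bisect_right(index, (tenant_id, INF, INF))
    return [product_id for _, _, product_id in index[lo:hi]]

products = [
    (1, 10, '{"price": 150}'),
    (2, 10, '{"price": 50}'),
    (3, 20, '{"price": 500}'),
    (4, 10, '{"color": "red"}'),
    (5, 10, '{"price": 100}'),
]
index = build_computed_index(products)
print(products_with_price_above(index, 10, 100))
```

An inverted index limited to containment lookups cannot answer this query, which is the gap this issue asks to close.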
Jira issue: CRDB-5739