-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: new functions crdb_internal.ranges_in_span
and crdb_internal.tenant_ranges_per_table
#96011
Conversation
35e6926
to
3e7c95b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)
pkg/sql/logictest/testdata/logic_test/ranges_in_span
line 1 at r3 (raw file):
# LogicTest: local
When new functions are added, we should make sure they are secondary-tenant compatible.
It looks like you are already using rangedesc.Iterator
which is good, but # LogicTest: local
will prevent the test from being run under a config that uses a secondary tenant (for example the 3node-tenant
config).
What happens if you remove this and run it under the 3node-tenant
config?
3e7c95b
to
13074d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ecwall)
pkg/sql/logictest/testdata/logic_test/ranges_in_span
line 1 at r3 (raw file):
Previously, ecwall (Evan Wall) wrote…
When new functions are added, we should make sure they are secondary-tenant compatible.
It looks like you are already using
rangedesc.Iterator
which is good, but# LogicTest: local
will prevent the test from being run under a config that uses a secondary tenant (for example the3node-tenant
config).What happens if you remove this and run it under the
3node-tenant
config?
Looks like a test failure:
RangeIterator failed to seek to /Meta2/Table/3/NULL: rpc error: code = Unauthenticated desc = requested key /Meta2/Table/3/NULL not fully contained in tenant keyspace /Tenant/1{0-1}
Would it be acceptable to swallow this error and continue iterating, effectively skip any iterations we aren't authed for? Or is there a better way to handle this, I'm not familiar with how we handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @THardy98)
pkg/sql/sem/builtins/generator_builtins.go
line 2941 at r6 (raw file):
} func getRangeIteratorWithinSpan(
Can you add GetRangeIterator(context.Context, roachpb.Span)
to eval.Planner
(see GetRangeDescByID
)? Then you don't need to pass in the db
from here
pkg/sql/logictest/testdata/logic_test/ranges_in_span
line 1 at r3 (raw file):
Previously, THardy98 (Thomas Hardy) wrote…
Looks like a test failure:
RangeIterator failed to seek to /Meta2/Table/3/NULL: rpc error: code = Unauthenticated desc = requested key /Meta2/Table/3/NULL not fully contained in tenant keyspace /Tenant/1{0-1}
Would it be acceptable to swallow this error and continue iterating, effectively skip any iterations we aren't authed for? Or is there a better way to handle this, I'm not familiar with how we handle this.
That means you are trying to access keys outside of the tenant's keyspace which is /Tenant/1{0-1}
meaning [/Tenant/10, /Tenant/11)
. Can you use a span like '/Tenant/10', '/Tenant/11'
instead (you have to figure out the bytes for this)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the direction this is going but there are two main design questions remaining open:
- how do you intend this to behave when a single large table contains thousands of ranges?
- there's memory accounting missing in this approach
Reviewed 1 of 6 files at r1, 1 of 5 files at r3, 1 of 1 files at r5, 5 of 5 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @THardy98)
-- commits
line 9 at r6:
Can you make this more granular, so that it takes as argument which tables to retrieve ranges for. If there's 100 databases and you inspect just 1, you want this SRF to produce only the data for that db.
pkg/sql/sem/builtins/generator_builtins.go
line 3032 at r6 (raw file):
// Start implements the tree.ValueGenerator interface. func (trpti *tenantRangesPerTableIterator) Start(ctx context.Context, txn *kv.Txn) error { rangeIter, err := getRangeIteratorWithinSpan(ctx, txn.DB(), trpti.span)
This method would be better served by a call into a function implemented in pkg/sql off (*planner)
that inspects the descriptor collection and retrieves descriptors one by one. As implemented here, it still retrieves all descriptors in RAM (via the vtable) before it starts iterating.
pkg/sql/sem/builtins/generator_builtins.go
line 3085 at r6 (raw file):
End: interval.Comparable(tableStartKey.PrefixEnd()), } tableOverlappingRanges := trpti.rangeIntervalTree.Get(tableRange)
Wowzer, what do you intend to do here if there are 10000 ranges in 1 table? E.g. a large table?
I think this code would be better served by a range iterator in the mapping from table -> range, with 1 result row per range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @knz)
pkg/sql/logictest/testdata/logic_test/ranges_in_span
line 1 at r3 (raw file):
Previously, ecwall (Evan Wall) wrote…
That means you are trying to access keys outside of the tenant's keyspace which is
/Tenant/1{0-1}
meaning[/Tenant/10, /Tenant/11)
. Can you use a span like'/Tenant/10', '/Tenant/11'
instead (you have to figure out the bytes for this)?
I'm confused as to why this happens though, particularly for tenant_ranges_per_table
where we explicitly pass in the tenant's span. Why would we be accessing keys outside the tenant's keyspace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @THardy98)
pkg/sql/logictest/testdata/logic_test/ranges_in_span
line 1 at r3 (raw file):
Previously, THardy98 (Thomas Hardy) wrote…
I'm confused as to why this happens though, particularly for
tenant_ranges_per_table
where we explicitly pass in the tenant's span. Why would we be accessing keys outside the tenant's keyspace?
Yeah /Meta2
is not inside the tenant's keyspace so I am not sure why that is being read. What is the stack trace when this error happens?
…tenant_ranges_per_table` Part of: cockroachdb#94332 Part of: cockroachdb#94330 This PR introduces two new SRFs: `crdb_internal.ranges_in_span(start_key, end_key)` - returns the set of ranges encompassing this span in the form of `(range_id, start_key, end_key)` `crdb_internal.tenant_ranges_per_table()` - returns the tenant's tables, each table row contains the set of ranges encompassing the table, each row takes the form of `(database_name, database_id, table_name, table_id, range_count, range_ids)` The former SRF is a QOL improvement. The latter SRF is intended to be used as a method to retrieve a large number of tables' range data quickly, particularly for the database details page (where we surface the range count of each table). The latter SRF is able to the range ids & range count for ~10,000 tables in ~1.5s. Release note(sql change): introduce two SRFs `crdb_internal.ranges_in_span` and `crdb_internal.tenant_ranges_per_table`
13074d8
to
e9d6f7e
Compare
Part of: cockroachdb#96011 This PR introduces a new SRF: `crdb_internal.ranges_in_span(start_key, end_key)` - returns the set of ranges encompassing this span in the form of `(range_id, start_key, end_key)` Release note (sql change): introduce new builtin function `crdb_internal.ranges_in_span(start_key, end_key), returns the set of ranges encompassing the given start and end key.
Closing this PR in favour of opening a new PR using the new Appreciate the time spent on reviews for this PR. I've separated builtins into their own PRs here for your review:
|
Part of: cockroachdb#96011 This PR introduces a new SRF: `crdb_internal.ranges_in_span(start_key, end_key)` - returns the set of ranges encompassing this span in the form of `(range_id, start_key, end_key)` Release note (sql change): introduce new builtin function `crdb_internal.ranges_in_span(start_key, end_key), returns the set of ranges encompassing the given start and end key.
Part of: cockroachdb#96011 This PR introduces a new SRF: `crdb_internal.ranges_in_span(start_key, end_key)` - returns the set of ranges encompassing this span in the form of `(range_id, start_key, end_key)` Release note (sql change): introduce new builtin function `crdb_internal.ranges_in_span(start_key, end_key), returns the set of ranges encompassing the given start and end key.
Part of: #94332
Part of: #94330
This PR introduces two new SRFs:
crdb_internal.ranges_in_span(start_key, end_key)
- returns the set of ranges encompassing this span in the form of(range_id, start_key, end_key)
Example response:
crdb_internal.tenant_ranges_per_table()
- returns the tenant's tables, each table row contains the set of ranges encompassing the table, each row takes the form of(database_name, database_id, table_name, table_id, range_count, range_ids)
Example response:
crdb_internal.ranges_in_span
is a QOL improvement, for convienence.crdb_internal.tenant_ranges_per_table
is intended to be used as a method to retrieve a large number of tables' range data quickly, particularly for the database details page (where we surface the range count of each table).Tested with 100 databases, each with 100 tables, each table having 1 range,
crdb_internal.tenant_ranges_per_table
is able to retrieve the range ids & range count for the ~10,000 tables in ~1.5s.Release note(sql change): introduce two SRFs
crdb_internal.ranges_in_span
andcrdb_internal.tenant_ranges_per_table