-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: leasing of virtual table privileges adds to latency on first use #93182
Comments
This turns out to be a less-than-ideal approach because there are 290 virtual tables. It's a lot of work to fetch them all in parallel if you go "the long way" by executing queries against each of them. It's hard to get at the cache directly to have it do the cheaper thing. |
Ideally we could load all of the data for all the virtual tables in one scan. |
This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run `SHOW TABLES` it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be *very* slow in an MR cluster. This patch attempts to somewhat alleviate this pain by kicking off goroutines to fetch these privileges eagerly at server startup. This is okay, but is pretty expensive. Each privilege fetch itself internally runs a query which means we're running 290 statements in 290 transactions to fetch this information. Ideally we'd do something better, like pre-warm the cache with one scan against the system table. Partially addresses cockroachdb#93182. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup, with some parallelism, to mitigate the latency when accessing pg_catalog right after the server boots up.
This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run `SHOW TABLES` it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be *very* slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Partially addresses cockroachdb#93182. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.
fixes cockroachdb#93182 This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run `SHOW TABLES` it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be *very* slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.
fixes cockroachdb#93182 This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run `SHOW TABLES` it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be *very* slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.
fixes cockroachdb#93182 This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run `SHOW TABLES` it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be *very* slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.
Fixes cockroachdb#93182 This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run SHOW TABLES it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be very slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.
Fixes cockroachdb#93182 This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run SHOW TABLES it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be very slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.
93153: rttanalysis: don't count leasing the database desc r=andreimatei a=andreimatei A bunch of rtt-analysis tests were counting a request for leasing the database descriptor. This is not interesting. This patch makes the test framework lease it first through a "USE" statement. The number of KV requests required for leasing is currently mis-counted. We count 1, but in reality it's 4. A different patch will correct the miscounting that, at which point that would be too significant for the tests. Release note: None Epic: None 93325: multitenant: retain range splits after TRUNCATE for secondary tenants r=knz a=ecwall Fixes #69499 Fixes #82944 Existing split points are preserved after a TRUNCATE statement is executed by a secondary tenant. Release note: None 93354: tracing: disallow children of sterile span with different Tracer r=andreimatei a=andreimatei Before this patch, creating a "child" of a sterile span with a different Tracer than the one used to create the sterile span was tolerated - on the argument that sterile spans don't actually get children (the would-be child span is created as a root), so the arguments for not allowing a children to be created with different tracers don't apply. At the same time, creating a child of a noop span with a different Tracer than the noop span's Tracer was documented to not be allowed. In practice, it was, because the code was confused [1]. This patch disallows creating children of sterile spans with a different tracer, for consistency with all the other spans. The patch also makes it a panic for the children of noop spans to be created with a different Tracer. This is all meant as a cleanup / code simplification. [1] WithParent(sp) meant to treat sterile spans differently than noop spans but it was using sp.IsSterile(), which returns true for both. As such, it was unintentionally returning an empty parent option. startSpanGeneric() meant to check the tracer of parent noop spans, but it was failing to actually do so because it was going through the opts.Parent.empty(). Release note: None Epic: None 93545: sql: make SHOW RANGES FOR TABLE include all indexes r=ajwerner a=knz Informs #80906. Fixes #93546. Fixes #82995. Release note (backward-incompatible change): `SHOW RANGES FOR TABLE` now includes rows for all indexes that support the table. Prior to this change, `SHOW RANGES FOR TABLE foo` was an alias for `SHOW RANGES FOR INDEX foo@primary`. This was causing confusion, as it would miss data for secondary indexes. It is still possible to filter to just the primary index using `SHOW RANGES FOR INDEX foo@primary`. The statement output now also includes the index name. 93557: syntheticprivilegecache: scan all privileges at startup r=ajwerner a=ajwerner #### syntheticprivilegecache: move caching logic out of sql This is a pure refactor to move the logic for caching synthetic privileges from the sql package. This will make it easier to add features later. #### syntheticprivilegecache: scan all privileges at startup Fixes #93182 This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run SHOW TABLES it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be very slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up. 93563: pgwire: handle decoding Geometry/Geography in binary r=rafiss a=otan Resolves #81066 Resolves #93352 Release note (bug fix): Previously, CockroachDB would error when receiving Geometry/Geography using binary parameters. This is now resolved. 93618: cli,cliccl: move some mt commands to cliccl r=ajwerner a=ajwerner Part of #91714 Epic: none Release note: None Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: Evan Wall <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Oliver Tan <[email protected]>
Fixes cockroachdb#93182 This commit attempts to alleviate the pain caused by synthetic virtual table privileges introduced in 22.2. We need to fetch privileges for virtual tables to determine whether the user has access. This is done lazily. However, when a user attempts to read a virtual table like pg_class or run SHOW TABLES it will force the privileges to be determined for each virtual table (of which there are 290 at the time of writing). This sequential process can be somewhat slow in a single region cluster and will be very slow in an MR cluster. This patch attempts to somewhat alleviate this pain by scanning the table eagerly during server startup. Release note (performance improvement): In 22.2 we introduced privileges on virtual tables (system catalogs like pg_catalog, information_schema, and crdb_internal). A problem with this new feature is that we now must fetch those privileges into a cache before we can use those tables or determine their visibility in other system catalogs. This process used to occur on-demand, when the privilege was needed. Now we'll fetch these privileges eagerly during startup to mitigate the latency when accessing pg_catalog right after the server boots up.
Describe the problem
In 22.2 we added synthetic privileges which are used for things like external connections, and, critically, virtual tables. These privileges need to be checked when iterating through various system catalog schemas (
pg_catalog
,information schema
,SHOW TABLES
, ...).We populate these caches lazily.
Describe the solution you'd like
We could eagerly launch routines to populate these caches for the different virtual tables concurrently rather than serially.
Describe alternatives you've considered
We could do some sort of special initialization to preload data. These tables are almost always empty and are small when populated. We could scan it all and avoid the cost of all the parallel work.
Epic: CRDB-18596
Jira issue: CRDB-22207
The text was updated successfully, but these errors were encountered: