Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: allow fine-grained control of index visibility #82363

Open
ajwerner opened this issue Jun 2, 2022 · 2 comments
Open

sql: allow fine-grained control of index visibility #82363

ajwerner opened this issue Jun 2, 2022 · 2 comments
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team

Comments

@ajwerner
Copy link
Contributor

ajwerner commented Jun 2, 2022

Is your feature request related to a problem? Please describe.
Invisible indexes logically hide indexes from application traffic so that query plans do not use these indexes but writes keep them up-to-date. This is useful to validate that an index should be dropped without risking the cost of rebuilding the index if dropping it was a mistake. This can still be risky for tier-zero services which are not eager to shoulder the risk of a needed index being dropped even if for a short amount of time.

Describe the solution you'd like

We could make the index visibility fine-grained such that sessions of a certain user or application have different index visibility.

Note that we also already support using a float value for visibility to allow an index to be "partially visible". For example, if the visibility value for an index is 0.1, the optimizer has a 10% chance of considering the index when planning.

Additional context

Perhaps the best way for risk-sensitive workloads to know that they can drop an index is to ensure that the index isn't being used, see #68542.

Jira issue: CRDB-16312

@ajwerner ajwerner added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Jun 2, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Jun 2, 2022
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Jun 29, 2022
This PR added an RFC draft for the invisible index feature. The main
purpose of this RFC is to introduce the feature, to document different
choices of SQL syntaxes, and ultimately to justify the decision.

Related issue: cockroachdb#72576,
cockroachdb#82363

Release Note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Jun 29, 2022
This PR added an RFC draft for the invisible index feature. The main
purpose of this RFC is to introduce the feature, to document different
choices of SQL syntaxes, and ultimately to justify the decision.

Related issue: cockroachdb#72576,
cockroachdb#82363

Release Note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Jun 29, 2022
This PR added an RFC draft for the invisible index feature. The main
purpose of this RFC is to introduce the feature, to document different
choices of SQL syntaxes, and ultimately to justify the decision.

Related issue: cockroachdb#72576,
cockroachdb#82363

Release Note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Jul 26, 2022
This commit adds an RFC draft for the invisible index feature.

Related issue: cockroachdb#72576,
cockroachdb#82363

Release Note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Jul 28, 2022
This commit adds an RFC draft for the invisible index feature.

Related issue: cockroachdb#72576,
cockroachdb#82363

Release Note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Aug 13, 2022
Prior to this commit, not visible indexes were always ignored by the optimizer
unless it is explicitly selected with index hinting or used for constraint
check. This commit adds a session variable along with a cluster setting to
control whether the optimizer can still choose to use not visible indexes for
the query plan.

Note that even if the the session variable is enabled and optimizer can choose
to use not visible indexes, not visible indexes remain as not visible in index
descriptors.

Cluster setting: `sql.defaults.optimizer_use_not_visible_indexes.enabled`
Session setting: `optimizer_use_not_visible_indexes` By default, the setting is
disabled, meaning not visible index will be properly ignored. When the setting
is enabled, optimizer treats not visible indexes as they are visible and can
choose to use them for query plan.

Assists: cockroachdb#82363

Release note (sql change): Session setting `optimizer_use_not_visible_indexes`
and cluster setting `sql.defaults.optimizer_use_not_visible_indexes.enabled` can
be used to disable not visible index feature. When they are enabled, optimizer
treats not visible indexes as they are visible and can choose to use them for
query plan. By default, they are disabled.
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Aug 13, 2022
Prior to this commit, not visible indexes were always ignored by the optimizer
unless it is explicitly selected with index hinting or used for constraint
check. This commit adds a session variable along with a cluster setting to
control whether the optimizer can still choose to use not visible indexes for
the query plan.

Note that even if the the session variable is enabled and optimizer can choose
to use not visible indexes, not visible indexes remain as not visible in index
descriptors.

Cluster setting: `sql.defaults.optimizer_use_not_visible_indexes.enabled`
Session setting: `optimizer_use_not_visible_indexes` By default, the setting is
disabled, meaning not visible index will be properly ignored. When the setting
is enabled, optimizer treats not visible indexes as they are visible and can
choose to use them for query plan.

Assists: cockroachdb#82363

Release note (sql change): Session setting `optimizer_use_not_visible_indexes`
and cluster setting `sql.defaults.optimizer_use_not_visible_indexes.enabled` can
be used to disable not visible index feature. When they are enabled, optimizer
treats not visible indexes as they are visible and can choose to use them for
query plan. By default, they are disabled.
craig bot pushed a commit that referenced this issue Aug 13, 2022
86033: sql: setting to disable NotVisible index feature r=wenyihu6 a=wenyihu6

Prior to this commit, not visible indexes were always ignored by the optimizer
unless it is explicitly selected with index hinting or used for constraint
check. This commit adds a session variable along with a cluster setting to
control whether the optimizer can still choose to use not visible indexes for
the query plan.

Note that even if the the session variable is enabled and optimizer can choose
to use not visible indexes, not visible indexes remain as not visible in index
descriptors.

Cluster setting: `sql.defaults.optimizer_use_not_visible_indexes.enabled`
Session setting: `optimizer_use_not_visible_indexes` By default, the setting is
disabled, meaning not visible index will be properly ignored. When the setting
is enabled, optimizer treats not visible indexes as they are visible and can
choose to use them for query plan.

Assists: #82363

Release note (sql change): Session setting `optimizer_use_not_visible_indexes`
and cluster setting `sql.defaults.optimizer_use_not_visible_indexes.enabled` can
be used to disable not visible index feature. When they are enabled, optimizer
treats not visible indexes as they are visible and can choose to use them for
query plan. By default, they are disabled.

Co-authored-by: wenyihu3 <[email protected]>
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Aug 22, 2022
This commit adds an RFC for the invisible index feature.

Related issue: cockroachdb#72576,
cockroachdb#82363

Release justification: low risk to the existing functionality; this commit just
adds rfc.

Release Note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this issue Aug 22, 2022
This commit adds an RFC for the invisible index feature.

Related issue: cockroachdb#72576,
cockroachdb#82363

Release justification: low risk to the existing functionality; this commit just
adds rfc.

Release Note: none
craig bot pushed a commit that referenced this issue Aug 26, 2022
83531: rfc: add rfc for invisible index feature r=wenyihu6 a=wenyihu6

This commit adds an RFC for the invisible index feature.

Related issue: #72576, #82363

Release justification: low risk to the existing functionality; this commit just
adds rfc.

Release Note: none

86267: allocator: select a good enough store for decom/recovery r=lidorcarmel a=lidorcarmel

Until now, when decommissioning a node, or when recovering from a dead
node, the allocator tries to pick one of the best possible stores as
the target for the recovery.

Because of that, we sometimes see multiple stores recover replicas
to the same store, for example, when decommissioning a node and
at the same time adding a new node.

This PR changes the way we select a destination store by choosing
a random store out of all the stores that are "good enough" for
the replica. The risk diversity is still enforced, but we may
recover a replica to a store that is considered "over full", for
example.

Note that during upreplication the allocator will still try to use
one of the "best" stores as targets.

Fixes: #86265

Release note: None

Release justification: a relatively small change, and it can be
reverted by setting kv.allocator.recovery_store_selector=best.

86345: clusterversion: prevent upgrades from development versions r=ajwerner a=dt

This change defines a new "unstableVersionsAbove" point on the cluster
version  line, above which any cluster versions are considered unstable
development-only versions which are still subject to change.

Performing an upgrade to a version while it is still unstable leaves a
cluster in a state where it persists a version that claims it has done
that upgrade and all prior, however those upgrades are still subject to
change by nature of being unstable. If it subsequently upgraded to a
stable version, this could result in subtle and nearly impossible to
detect issues, as being at or above a particular version is used to
assume that all subsequent version upgrades _as released_ were run; on a
cluster that ran an earlier iteration of an upgrade this does not hold.

Thus to prevent clusters which upgrade to development versions from
subsequently upgrading to a stable version, we offset all development
versions -- those above the unstableVersionsAbove point -- into the far
future by adding one million to their major version e.g. v22.x-y becomes
1000022.x-y. This means an attempt to subsequently "upgrade" to a stable
version -- such as v22.2 -- will look like a downgrade and be forbidden.

On the release branch, prior to starting to publish upgradable releases,
the unstableVersionsAbove value should be set to invalidVersionKey to
reflect that all version upgrades in that release branch are now
considered to be stable, meaning they must be treated as immutable and
append-only.

Release note (ops change): clusters that are upgraded to an alpha or
other manual build from the development branch will not be able to be
subsequently upgraded to a release build.

Release justification: high-priority change to existing functionality,
to allow releasing alphas with known version upgrade bugs while ensuring
they do not subsequently upgrade into stable version but silently
corrupted clusters.

86630: kvserver: add additional testing to multiqueue r=AlexTalks a=AlexTalks

Add testing for cancelation of multi-queue requests
and fix a bug where the channel wasn't closed on task
cancelation.

Release justification: Test-only change.
Release note: None

86801: ttl: add queries to job details r=otan a=rafiss

fixes #81905

This helps with observability so users can understand what the TTL job
is doing behind the scenes.

The job details in the DB console will show:
```
ttl for defaultdb.public.t
-- for each range, iterate to find rows:
SELECT id FROM [108 AS tbl_name]
AS OF SYSTEM TIME '30s'
WHERE <crdb_internal_expiration OR ttl_expiration_expression> <= $1
AND (id) > (<range start>) AND (id) < (<range end>)
ORDER BY id
LIMIT <ttl_select_batch_size>
-- then delete with:
DELETE FROM [108 AS tbl_name]
WHERE <crdb_internal_expiration OR ttl_expiration_expression> <= $1
AND (id) IN (<rows selected above>)
```

Release note: None

Release justification: low risk change

86876: kv: Include error information in `crdb_internal.active_range_feeds` r=miretskiy a=miretskiy

Include error count, and the last error information in
`crdb_internal.active_range_feeds` table whenever rangefeed
disconnects due to an error.

Release justification: observability improvement.
Release note: None

86901: sql: fix cluster_execution_insights priority level r=j82w a=j82w

This fixes the priority level and converts it to be a string.

closes: #86900, closes #86867

Release justification: Category 2: Bug fixes and
low-risk updates to new functionality

Release note (sql change): Fix the insight
execution priority to have correct value
instead of always being default. Changed
the column to be a string to avoid
converting it in the ui.

86921: kvserver: incorporate sending queue priority into snapshot requests r=AlexTalks a=AlexTalks

This change modifies the `(Delegated)SnapshotRequest` Raft RPCs in order
to incorporate the name of the sending queue, as well as the sending
queue's priority, in order to be used to prioritize queued snapshots on
a receiving store.

Release justification: Low-risk change to existing functionality.
Release note: None

86923: dev: bump version r=yuzefovich a=yuzefovich

This commit bumps version since there appears to have been a "merge
skew" between #85095 and #86167, and somehow I had a `dev` binary that
didn't include the benchmark fix from the latter.

Release justification: test-only change.

Release note: None

Co-authored-by: wenyihu3 <[email protected]>
Co-authored-by: Lidor Carmel <[email protected]>
Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Alex Sarkesian <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: j82w <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
craig bot pushed a commit that referenced this issue Apr 18, 2023
101222: roachprod: `c.ParallelE` can be made optionally made to fail slow r=renatolabs a=smg260

however sometimes it is desirable for the command to complete running on all nodes despite failures.

e.g. `roachprod run <cluster> 'ls /mnt/data1/cockroach'` or when collecting dmesg logs after a test.

This commit has also switched to functional options for `c.Parallel` for more flexibility and allow calls to specify WithFailSlow().

[TeamCity nightly](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestNightlyGceBazel/9551260?showRootCauses=true&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true)

Epic: none
Fixes: #101150

Release note: None

101334: sql: add visibility column to virtual tables and SHOW output r=rytaft a=rytaft

This commit adds a new column `visibility` to `crdb_internal.table_indexes`
and `information_schema.statistics`. It also adds a new column `visibility`
to the output of following SQL statements:

```
SHOW INDEX FROM (table_name)
SHOW INDEXES FROM (table_name)
SHOW KEYS FROM (table_name)
SHOW INDEX FROM DATABASE (database_name)
SHOW INDEXES FROM DATABASE (database_name)
SHOW KEYS FROM DATABASE (database_name)
```

To minimize changes to the existing functionality, the existing columns
"is_visible", "visible" remain there. They are only set to true if the index
is fully visible. Any index with visibility [0.0, 1.0) would be false.

Informs #82363

Release note (sql change): Added a new column `visibility` to
`crdb_internal.table_indexes` and `information_schema.statistics`. Also added
a new column `visibility` to the output of following SQL statements:

```
SHOW INDEX FROM (table_name)
SHOW INDEXES FROM (table_name)
SHOW KEYS FROM (table_name)
SHOW INDEX FROM DATABASE (database_name)
SHOW INDEXES FROM DATABASE (database_name)
SHOW KEYS FROM DATABASE (database_name)
```
This new column contains a floating point number specifying the level of
visibility of the index, from 0 (not visible) to 1 (fully visible). If
the value is between 0 and 1, the index will be visible to the corresponding
fraction of queries.

Co-authored-by: wenyihu3 <[email protected]>

101505: ui: fix txn label r=maryliag a=maryliag

The label for the count was showing as Statement on the Transactions page.
This commit changes to the correct label.

Before
<img width="713" alt="Screenshot 2023-04-13 at 6 43 34 PM" src="https://user-images.githubusercontent.com/1017486/231899022-8cd6c713-6f42-488c-99f2-9d4e03f6b7a0.png">


After
<img width="736" alt="Screenshot 2023-04-13 at 6 43 47 PM" src="https://user-images.githubusercontent.com/1017486/231899046-270df02f-d82d-485b-89d5-f1851283a609.png">


Epic: None
Release note: None

101552: ui: add back option to Advanced Debug pages r=maryliag a=maryliag

Previously, all the pages inside the Advanced Debug had no back option, meaning the user would have to user the back from the browser or click again on the side menu. This commit adds the back option to all the pages.

https://www.loom.com/share/c4b12c0dfdfd4137b5cd6d8a75053c0c

Epic: None
Release note: None

101696: roachtest: don't schedule backups for perf tests r=yuzefovich a=yuzefovich

This commit disables scheduling backups for tests that are somewhat perf sensitive (meaning that we internally use them for perf regressions).

Informs: #101526.

Epic: None

Release note: None

Co-authored-by: Miral Gadani <[email protected]>
Co-authored-by: Rebecca Taft <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
rytaft added a commit to wenyihu6/cockroach that referenced this issue Apr 18, 2023
This commit adds support to `ALTER INDEX ... VISIBILITY ...`. `Visibility` is
currently not supported with `CREATE INDEX` or `CREATE TABLE`. Users would need
to create the index as it is fully not visible or fully visible first. And they
can change the index visibility to any float [0.0, 1.0] using `ALTER INDEX ...
VISIBILITY ...` Invisibility specifies the invisibility of an index to the
optimizer and can be any float64 between [0.0, 1.0]. An index with invisibility
0.0 means that the index is visible. An index with invisibility 1.0 means that
the index is fully not visible. By default, an index should be visible or
invisibility 0.0. An index with invisibility 1.0 is ignored by the optimizer
unless it is used for constraint check or is explicitly selected with index
hinting. An index with invisibility between (0.0, 1.0) would be made fully not
visible to a corresponding fraction of the queries. By convention, we will
refer any indexes with invisibility == 0.0 as visible, any indexes with
invisibility == 1.0 as fully not visible, and any indexes with index visibility
in-between as partially not visible.

Informs cockroachdb#82363

Release note (sql change): `ALTER INDEX ... VISIBILITY ...` is now supported.
It can change an index visibility to any visibility within [0.0, 1.0].
Visibility 0.0 means the index is not visible to the optimizer, while
visibility 1.0 means the index is fully visible. A value in the range
(0.0, 1.0) means the index will be visibile to the corresponding fraction of
queries.

Co-authored-by: Rebecca Taft <[email protected]>
craig bot pushed a commit that referenced this issue Apr 18, 2023
87301: sql: support ALTER INDEX … VISIBILITY ... r=rytaft a=wenyihu6

This commit adds support to `ALTER INDEX ... VISIBILITY ...`. `Visibility` is
currently not supported with `CREATE INDEX` or `CREATE TABLE`. Users would need
to create the index as it is fully not visible or fully visible first. And they
can change the index visibility to any float [0.0, 1.0] using `ALTER INDEX ... VISIBILITY ...`.
Invisibility specifies the invisibility of an index to the
optimizer and can be any float64 between [0.0, 1.0]. An index with invisibility
0.0 means that the index is visible. An index with invisibility 1.0 means that
the index is fully not visible. By default, an index should be visible or
invisibility 0.0. An index with invisibility 1.0 is ignored by the optimizer
unless it is used for constraint check or is explicitly selected with index
hinting. An index with invisibility between (0.0, 1.0) would be made fully not
visible to a corresponding fraction of the queries. By convention, we will
refer any indexes with invisibility == 0.0 as visible, any indexes with
invisibility == 1.0 as fully not visible, and any indexes with index visibility
in-between as partially not visible.

Informs #82363

Release note (sql change): `ALTER INDEX ... VISIBILITY ...` is now supported.
It can change an index visibility to any visibility within [0.0, 1.0].
Visibility 0.0 means the index is not visible to the optimizer, while
visibility 1.0 means the index is fully visible. A value in the range
(0.0, 1.0) means the index will be visibile to the corresponding fraction of
queries.

Co-authored-by: wenyihu3 <[email protected]>
rytaft added a commit to rytaft/cockroach that referenced this issue Apr 19, 2023
This commit adds support for CREATE INDEX ... VISIBILITY ... as well as
CREATE TABLE ... (... INDEX (...) VISIBILITY ...). This allows users to set
the index visibility to any float [0.0, 1.0] upon creation of the index.

ALTER INDEX ... VISIBILITY ... was previously supported in cockroachdb#87301.

Informs cockroachdb#82363

Release note (sql change): CREATE INDEX ... VISIBILITY ... as well as
CREATE TABLE ... (... INDEX (...) VISIBILITY ...) are now supported. This
allows a user to set the index visibility to any visibility within [0.0, 1.0].
Visibility 0.0 means the index is not visible to the optimizer, while
visibility 1.0 means the index is fully visible. A value in the range
(0.0, 1.0) means the index will be visibile to the corresponding fraction of
queries.
craig bot pushed a commit that referenced this issue Apr 20, 2023
101812: sql: support setting index visibility in CREATE INDEX and CREATE TABLE r=rytaft a=rytaft

This commit adds support for `CREATE INDEX ... VISIBILITY ...` as well as `CREATE TABLE ... (... INDEX (...) VISIBILITY ...)`. This allows users to set the index visibility to any float [0.0, 1.0] upon creation of the index.

`ALTER INDEX ... VISIBILITY ...` was previously supported in #87301.

Informs #82363

Release note (sql change): `CREATE INDEX ... VISIBILITY ...` as well as `CREATE TABLE ... (... INDEX (...) VISIBILITY ...)` are now supported. This allows a user to set the index visibility to any visibility within [0.0, 1.0]. Visibility 0.0 means the index is not visible to the optimizer, while visibility 1.0 means the index is fully visible. A value in the range (0.0, 1.0) means the index will be visibile to the corresponding fraction of queries.

Co-authored-by: Rebecca Taft <[email protected]>
rytaft added a commit to rytaft/cockroach that referenced this issue Apr 20, 2023
Sometimes generate a partially invisible index instead of a fully
invisible index.

Informs cockroachdb#82363

Release note: None
rytaft added a commit to rytaft/cockroach that referenced this issue Apr 20, 2023
Sometimes generate a partially invisible index instead of a fully
invisible index.

Informs cockroachdb#82363

Release note: None
craig bot pushed a commit that referenced this issue Apr 20, 2023
101861: sql: add sqlsmith support to test partial visibility of indexes r=rytaft a=rytaft

Sometimes generate a partially invisible index instead of a fully invisible index.

Informs #82363

Release note: None

Co-authored-by: Rebecca Taft <[email protected]>
@mgartner mgartner moved this to 23.2 Release in SQL Queries Jul 24, 2023
@mgartner mgartner moved this from 23.2 Release to New Backlog in SQL Queries Aug 3, 2023
rytaft added a commit to rytaft/cockroach that referenced this issue Aug 11, 2023
…indexes

Prior to this commit, index recommendations could change randomly between runs
of the same query if any partially visible indexes existed in the tables used
by the query. This was because the logic to randomly use a partially visible
index for some fraction of queries was also being applied when generating index
recommendations. This commit makes index recommendations deterministic in this
case, so that we treat partially visible indexes as fully invisible for the
purpose of index recommendations, and we always recommend making a partially
visible index fully visible if it can be used to improve a query plan.

Informs cockroachdb#82363

There is no release note since partially visible indexes are not yet part of
any release.

Release note: None
@rytaft
Copy link
Collaborator

rytaft commented Aug 11, 2023

I've updated the description of this issue since partially visible indexes are now fully supported. The only remaining unimplemented idea in this issue is "We could make the index visibility fine-grained such that sessions of a certain user or application have different index visibility."

rytaft added a commit to rytaft/cockroach that referenced this issue Aug 12, 2023
…indexes

Prior to this commit, index recommendations could change randomly between runs
of the same query if any partially visible indexes existed in the tables used
by the query. This was because the logic to randomly use a partially visible
index for some fraction of queries was also being applied when generating index
recommendations. This commit makes index recommendations deterministic in this
case, so that we treat partially visible indexes as fully invisible for the
purpose of index recommendations, and we always recommend making a partially
visible index fully visible if it can be used to improve a query plan.

Informs cockroachdb#82363

There is no release note since partially visible indexes are not yet part of
any release.

Release note: None
craig bot pushed a commit that referenced this issue Aug 12, 2023
108601: opt: make index recommendations deterministic with partially visible indexes r=rytaft a=rytaft

Prior to this commit, index recommendations could change randomly between runs
of the same query if any partially visible indexes existed in the tables used
by the query. This was because the logic to randomly use a partially visible
index for some fraction of queries was also being applied when generating index
recommendations. This commit makes index recommendations deterministic in this
case, so that we treat partially visible indexes as fully invisible for the
purpose of index recommendations, and we always recommend making a partially
visible index fully visible if it can be used to improve a query plan.

Informs #82363

There is no release note since partially visible indexes are not yet part of
any release.

Release note: None

Co-authored-by: Rebecca Taft <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

3 participants