forked from cockroachdb/cockroach
rfc: added an rfc draft for invisible index feature
This PR added an RFC draft for the invisible index feature. The main purpose of this RFC is to introduce the feature, to document different choices of SQL syntaxes, and ultimately to justify the decision.

Related issue: cockroachdb#72576, cockroachdb#82363

Release Note: none
Showing 1 changed file with 295 additions and 0 deletions.
- Feature Name: Invisible Index
- Status: draft
- Start Date: 2022-06-28
- Authors: Wenyi Hu
- RFC PR: (TODO (wenyihu6): link PR later)
- Cockroach Issue: https://github.com/cockroachdb/cockroach/issues/72576,
  https://github.com/cockroachdb/cockroach/issues/82363

# Summary

This new feature introduces the option to make an index invisible. An
invisible index is an index that is kept up to date but is ignored by the
optimizer unless it is explicitly selected with [index
hinting](https://www.cockroachlabs.com/docs/v22.1/table-expressions#force-index-selection).
Users can create an index as invisible or alter an existing index to be
invisible. For now, primary indexes cannot be invisible, but unique indexes
can. Specifically, the unique constraint still prevents insertion of duplicates
into a column regardless of whether the index is invisible; the optimizer
simply ignores the index when planning queries.

The main purpose of this RFC is to introduce the feature, to document different
choices of potential SQL syntaxes, and ultimately to justify the decision.

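The behavior described above might look roughly like the following. This is only a sketch: it assumes the `NOT VISIBLE` spelling is the one adopted (the keyword is still undecided in this RFC), and the table `users` and index `users_email_idx` are hypothetical names.

```sql
-- Hypothetical table and index names. NOT VISIBLE is only one of the
-- candidate keywords discussed in this RFC.
CREATE TABLE users (id INT8 PRIMARY KEY, email STRING);
CREATE UNIQUE INDEX users_email_idx ON users (email) NOT VISIBLE;

-- The unique constraint is still enforced, so the second insert is expected
-- to be rejected as a duplicate.
INSERT INTO users VALUES (1, 'a@example.com');
INSERT INTO users VALUES (2, 'a@example.com');

-- The optimizer ignores users_email_idx when planning this query...
SELECT id FROM users WHERE email = 'a@example.com';

-- ...unless the index is forced with an index hint.
SELECT id FROM users@users_email_idx WHERE email = 'a@example.com';
```
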
# Motivation

Currently, users are not able to observe the impact of removing an index without
risking the cost of rebuilding the index. This new feature would allow users to
validate whether an index should be dropped by changing it to invisible first.
If a drop in query performance is observed, the index can be quickly toggled
back to visible without rebuilding it.

Similarly, this new feature would also allow users to roll out new indexes with
more confidence. Currently, some users with large production deployments are
concerned that introducing a new index could significantly affect their
applications. With this feature, users can create new indexes and, if problems
appear, easily toggle them to invisible without the cost of dropping them.

This new feature would also be useful if we want to make an index visible only
to specific queries. By using index hinting, users can force an invisible index
to be used by parts of their applications without affecting the rest of the
application.

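The workflows above could be sketched as follows. The statements assume the `NOT VISIBLE` / `VISIBLE` spellings and an `ALTER INDEX ... VISIBLE` form for toggling an index back; neither is settled by this RFC, and the table and index names are hypothetical.

```sql
-- Validating a drop: hide the index first instead of dropping it.
ALTER INDEX orders@orders_customer_idx NOT VISIBLE;

-- If query performance regresses, restore the index without a rebuild.
-- (Assumed syntax for toggling back to visible.)
ALTER INDEX orders@orders_customer_idx VISIBLE;

-- If nothing regresses, the index can be dropped for real.
DROP INDEX orders@orders_customer_idx;

-- Rolling out a new index: if it causes problems, hide it rather than
-- paying the cost of dropping it.
CREATE INDEX orders_status_idx ON orders (status);
ALTER INDEX orders@orders_status_idx NOT VISIBLE;

-- A specific query can still opt in to the hidden index with a hint.
SELECT * FROM orders@orders_status_idx WHERE status = 'open';
```
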
# Technical design

## SQL Syntax

The following section discusses different SQL syntax choices. PostgreSQL
does not support invisible indexes yet, so we will use MySQL and Oracle as
references for how invisible index syntax is commonly supported. The
points below outline the different choices and examples of their use.

For reference, the following section shows what the SQL syntax currently looks
like. The parts surrounded by *** [] *** are the different options that we can
consider.

[//]: # (CREATE INDEX)
1. `CREATE INDEX`
- Create an invisible index by using `CREATE INDEX` in index definition.
- Create a unique invisible index by using `CREATE UNIQUE INDEX` in index definition.
```sql
CREATE [UNIQUE | INVERTED] INDEX [CONCURRENTLY] [IF NOT EXISTS] [<idxname>]
        ON <tablename> ( <colname> [ASC | DESC] [, ...] )
        [USING HASH] [STORING ( <colnames...> )]
        [PARTITION BY <partition params>]
        [WITH <storage_parameter_list>] [WHERE <where_conds...>]
        *** [INVISIBLE | NOT VISIBLE | VISIBLE | HIDDEN] ***
```

```sql
CREATE INDEX a ON b.c (d) VISIBLE
CREATE INDEX a ON b.c (d) INVISIBLE
CREATE INDEX a ON b.c (d) HIDDEN
CREATE INDEX a ON b.c (d) NOT VISIBLE

CREATE INDEX a ON b (c) WITH (fillfactor = 100, y_bounds = 50) VISIBLE
CREATE INDEX a ON b (c) WITH (fillfactor = 100, y_bounds = 50) INVISIBLE
CREATE INDEX a ON b (c) WITH (fillfactor = 100, y_bounds = 50) HIDDEN
CREATE INDEX a ON b (c) WITH (fillfactor = 100, y_bounds = 50) NOT VISIBLE

CREATE INDEX geom_idx ON t USING GIST(geom) WITH (s2_max_cells = 20, s2_max_level = 12, s2_level_mod = 3) HIDDEN
CREATE INDEX geom_idx ON t USING GIST(geom) WITH (s2_max_cells = 20, s2_max_level = 12, s2_level_mod = 3) INVISIBLE
CREATE INDEX geom_idx ON t USING GIST(geom) WITH (s2_max_cells = 20, s2_max_level = 12, s2_level_mod = 3) NOT VISIBLE

CREATE UNIQUE INDEX IF NOT EXISTS a ON b (c) WHERE d > 3 HIDDEN
CREATE UNIQUE INDEX IF NOT EXISTS a ON b (c) WHERE d > 3 INVISIBLE
CREATE UNIQUE INDEX IF NOT EXISTS a ON b (c) WHERE d > 3 NOT VISIBLE
```
[//]: # (CREATE TABLE)
2. `CREATE TABLE`
- Create an invisible index by adding an index in a `CREATE TABLE...(INDEX)` definition.
- Create a unique invisible index by adding an index in a `CREATE TABLE...(UNIQUE INDEX)` definition.
- Create a unique invisible index by adding a unique constraint within the table constraint definition of a `CREATE TABLE ...(CONSTRAINT ...)` definition.
- Create a unique invisible index by adding a unique constraint within the column definition of a `CREATE TABLE ...(UNIQUE...)` definition.

```sql
CREATE [[GLOBAL | LOCAL] {TEMPORARY | TEMP}] TABLE [IF NOT EXISTS] <tablename> [table_element_list] [<on_commit>]
```

<blockquote><details>
<summary>table_element_list</summary>

<details>
<summary>Index Definition</summary>

```sql
[UNIQUE | INVERTED] INDEX [<name>] ( <colname> [ASC | DESC] [, ...] )
[USING HASH] [{STORING | INCLUDE | COVERING} ( <colnames...> )]
[PARTITION BY <partition params>]
[WITH <storage_parameter_list>] [WHERE <where_conds...>]
*** [INVISIBLE | NOT VISIBLE | VISIBLE | HIDDEN] ***
```

```sql
CREATE TABLE a (b INT8, c STRING, INDEX (b ASC, c DESC) STORING (c) INVISIBLE)
CREATE TABLE a (b INT8, c STRING, INDEX (b ASC, c DESC) STORING (c) NOT VISIBLE)

CREATE TABLE a (b INT, UNIQUE INDEX foo (b) WHERE c > 3 INVISIBLE)
CREATE TABLE a (b INT, UNIQUE INDEX foo (b) WHERE c > 3 NOT VISIBLE)
```

</details>

<details>
<summary>Column Constraint Definition</summary>

```sql
[CONSTRAINT <constraintname>]
{ NULL | NOT NULL | NOT VISIBLE |
UNIQUE [WITHOUT INDEX | *** WITH INVISIBLE INDEX | WITH NOT VISIBLE INDEX ***] |
PRIMARY KEY *** [WITH INVISIBLE INDEX | WITH NOT VISIBLE INDEX | WITHOUT VISIBLE INDEX] *** | CHECK (<expr>) | DEFAULT <expr> | ON UPDATE <expr> | GENERATED { ALWAYS | BY DEFAULT }
AS IDENTITY [( <opt_sequence_option_list> )] }
-- Note: a primary index cannot be invisible. The PRIMARY KEY rule is introduced only so that a semantic error can be thrown later on.
```

```sql
CREATE TABLE a (b INT8 CONSTRAINT c UNIQUE WITHOUT INDEX)
CREATE TABLE a (b INT8 CONSTRAINT c UNIQUE WITH INVISIBLE INDEX)
CREATE TABLE a (b INT8 CONSTRAINT c UNIQUE WITH HIDDEN INDEX)
CREATE TABLE a (b INT8 CONSTRAINT c UNIQUE WITHOUT VISIBLE INDEX)
CREATE TABLE a (b INT8 CONSTRAINT c UNIQUE WITH NOT VISIBLE INDEX)

CREATE TABLE a (b INT8 CONSTRAINT c PRIMARY KEY INVISIBLE) -- semantic error
CREATE TABLE a (b INT8 CONSTRAINT c PRIMARY KEY NOT VISIBLE) -- semantic error
CREATE TABLE a (b INT8 CONSTRAINT c PRIMARY KEY HIDDEN) -- semantic error
```
</details>

<details>
<summary>Table Constraint Definition</summary>

```sql
UNIQUE [WITHOUT INDEX | *** WITH INVISIBLE INDEX | WITH NOT VISIBLE INDEX ***] ( <colnames...> ) [{STORING | INCLUDE | COVERING} ( <colnames...> )]
PRIMARY KEY ( <colnames...> ) [USING HASH] *** [WITH INVISIBLE INDEX | WITH NOT VISIBLE INDEX] ***
-- Note: a primary index cannot be invisible. The PRIMARY KEY rule is introduced only so that a semantic error can be thrown later on.
```
```sql
CREATE TABLE a (b INT8, c STRING, CONSTRAINT d UNIQUE WITH INVISIBLE INDEX (b, c))
CREATE TABLE a (b INT8, c STRING, CONSTRAINT d UNIQUE WITH HIDDEN INDEX (b, c))
CREATE TABLE a (b INT8, c STRING, CONSTRAINT d UNIQUE WITH NOT VISIBLE INDEX (b, c))
CREATE TABLE a (b INT8, c STRING, CONSTRAINT d UNIQUE WITHOUT VISIBLE INDEX (b, c))

CREATE TABLE a (b INT8, c STRING, CONSTRAINT d UNIQUE (b) INVISIBLE INDEX)
CREATE TABLE a (b INT8, c STRING, CONSTRAINT d UNIQUE (b) HIDDEN INDEX)
CREATE TABLE a (b INT8, c STRING, CONSTRAINT d UNIQUE (b) NOT VISIBLE INDEX)

CREATE TABLE a (b INT8, c STRING, PRIMARY KEY (b, c, "0") INVISIBLE) -- semantic error
CREATE TABLE a (b INT8, c STRING, PRIMARY KEY (b, c, "0") HIDDEN) -- semantic error
CREATE TABLE a (b INT8, c STRING, PRIMARY KEY (b, c, "0") NOT VISIBLE) -- semantic error
```
</details>

</details></blockquote>

[//]: # (Alter Table)
3. `ALTER TABLE`
- Create a unique invisible index by adding a unique constraint within the table constraint definition of an `ALTER TABLE <name> ADD CONSTRAINT ...` statement.
- Create a unique invisible index by adding a unique constraint within the column definition of `ALTER TABLE <name> ADD <coldef>`, `ALTER TABLE <name> ADD IF NOT EXISTS <coldef>`, `ALTER TABLE <name> ADD COLUMN <coldef>`, or `ALTER TABLE <name> ADD COLUMN IF NOT EXISTS <coldef>`.

```sql
ALTER TABLE ... ADD [COLUMN] [IF NOT EXISTS] <colname> <type> [<constraint...>]
ALTER TABLE ... ADD <constraint>
ALTER TABLE ... ALTER PRIMARY KEY USING INDEX <name>
-- Note: a primary index cannot be invisible. The ALTER PRIMARY KEY rule is introduced only so that a semantic error can be thrown later on.
```

<blockquote><details>
<summary>constraint</summary>

```sql
[CONSTRAINT <constraintname>] {NULL | NOT NULL | UNIQUE [WITHOUT INDEX | *** WITH INVISIBLE INDEX | WITH NOT VISIBLE INDEX ***] | PRIMARY KEY [*** WITH INVISIBLE INDEX | WITH NOT VISIBLE INDEX ***] | CHECK (<expr>) | DEFAULT <expr>}
```

```sql
ALTER TABLE a ADD CONSTRAINT a_idx UNIQUE WITH INVISIBLE INDEX (a)
ALTER TABLE a ADD CONSTRAINT a_idx UNIQUE WITH HIDDEN INDEX (a)
ALTER TABLE a ADD CONSTRAINT a_idx UNIQUE WITHOUT VISIBLE INDEX (a)
ALTER TABLE a ADD CONSTRAINT a_idx UNIQUE WITH NOT VISIBLE INDEX (a)

ALTER TABLE IF EXISTS a ADD COLUMN b INT8 UNIQUE WITH INVISIBLE INDEX, ADD CONSTRAINT a_no_idx UNIQUE WITH INVISIBLE INDEX (a)
ALTER TABLE IF EXISTS a ADD COLUMN b INT8 UNIQUE WITH HIDDEN INDEX, ADD CONSTRAINT a_no_idx UNIQUE WITH HIDDEN INDEX (a)
ALTER TABLE IF EXISTS a ADD COLUMN b INT8 UNIQUE WITH NOT VISIBLE INDEX, ADD CONSTRAINT a_no_idx UNIQUE WITH NOT VISIBLE INDEX (a)
ALTER TABLE IF EXISTS a ADD COLUMN b INT8 UNIQUE WITHOUT VISIBLE INDEX, ADD CONSTRAINT a_no_idx UNIQUE WITHOUT VISIBLE INDEX (a)
```
</details></blockquote>

[//]: # (Alter Index)
4. `ALTER INDEX`
```sql
ALTER INDEX [IF EXISTS] <idxname> [INVISIBLE | NOT VISIBLE | HIDDEN]
```

```sql
ALTER INDEX a@b INVISIBLE
ALTER INDEX a@b NOT VISIBLE
ALTER INDEX a@b HIDDEN
```

[//]: # (Show Constraint)
5. `SHOW CONSTRAINT`
```sql
SHOW CONSTRAINT FROM table_name with_comment
```

```sql
table_name            constraint_name      constraint_type  details (*** add invisible index details ***)            validated
unique_without_index  my_partial_unique_f  UNIQUE           UNIQUE WITH INVISIBLE INDEX (f) WHERE (f > 0)            true
unique_without_index  my_partial_unique_f  UNIQUE           UNIQUE WITH HIDDEN INDEX (f) WHERE (f > 0)               true
unique_without_index  my_partial_unique_f  UNIQUE           UNIQUE WITH NOT VISIBLE INDEX (f) WHERE (f > 0)          true
```

[//]: # (Show Index)
6. A new column needs to be added to `crdb_internal.table_indexes`. The candidate names for the new column are stacked below its position.
```
descriptor_id  descriptor_name  index_id  index_name  index_type  is_unique  is_inverted  is_sharded  ***is_hidden***         shard_bucket_count  created_at
                                                                                                      ***is_invisible***
                                                                                                      ***is_not_invisible***
                                                                                                      ***visibility***
```

7. A new column needs to be added to the output of the following SQL statements:
```sql
SHOW INDEX FROM (table_name)
SHOW INDEXES FROM (table_name)
SHOW KEYS FROM (table_name)

SHOW INDEX FROM DATABASE (database_name)
SHOW INDEXES FROM DATABASE (database_name)
SHOW KEYS FROM DATABASE (database_name)
```

```sql
table_name  index_name  non_unique  seq_in_index  column_name  direction  storing  implicit  ***is_hidden***
                                                                                             ***is_invisible***
                                                                                             ***is_not_invisible***
                                                                                             ***visibility***
```

8. Note that neither `CREATE CONSTRAINT` nor `ALTER CONSTRAINT` is supported by the parser.

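Once a name is chosen for the new column, index visibility could be inspected along these lines. This is only a sketch: it assumes the column ends up being called `is_hidden` (one of the candidates listed above), and the table name `orders` is hypothetical.

```sql
-- Assumption: the new column is named is_hidden; the final name is undecided.
SELECT index_name, is_unique, is_hidden
FROM crdb_internal.table_indexes
WHERE descriptor_name = 'orders';

-- The same information would also appear as an extra column in this output.
SHOW INDEXES FROM orders;
```
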
### Discussion
CockroachDB already supports invisible columns. That feature uses `NOT VISIBLE`
in its SQL syntax and `is_hidden` for the column added to `SHOW COLUMNS`. It
would be nice to stay consistent with it. However,
[MySQL](https://dev.mysql.com/doc/refman/8.0/en/invisible-indexes.html) and
[Oracle](https://oracle-base.com/articles/11g/invisible-indexes-11gr1) both
use `INVISIBLE` for invisible indexes, and both
([MySQL](https://dev.mysql.com/doc/refman/8.0/en/invisible-columns.html),
[Oracle](https://oracle-base.com/articles/12c/invisible-columns-12cr1)) also use
`INVISIBLE` for invisible columns. To report index visibility,
[MySQL](https://dev.mysql.com/doc/refman/8.0/en/invisible-indexes.html) adds a
`Visible` column to the output of `SHOW INDEX` and an `IS_VISIBLE` column to
`INFORMATION_SCHEMA.STATISTICS`.
[Oracle](http://www.dba-oracle.com/t_11g_new_index_features.htm) has no
`SHOW INDEX` statement; it exposes a `VISIBILITY` column in its index dictionary
views such as `USER_INDEXES`.

I am not sure why `NOT VISIBLE` was chosen for the invisible column feature.
When I tried changing it to `INVISIBLE` in `sql.y`, it caused conflicts in the
grammar, which may have been the reason. PostgreSQL currently supports neither
invisible indexes nor invisible columns, so it offers no precedent. We could
also try supporting both `NOT VISIBLE` and `INVISIBLE`, but more work would be
needed to find a grammar rule that allows both.

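For comparison, the MySQL and Oracle forms referred to in the discussion look roughly as follows. This is a sketch based on my reading of the linked documentation, with hypothetical table and index names; the two halves target different databases and are not meant to run together.

```sql
-- MySQL 8.0: visibility is part of the index definition and is changed
-- through ALTER TABLE ... ALTER INDEX.
CREATE INDEX j_idx ON t1 (j) INVISIBLE;
ALTER TABLE t1 ALTER INDEX j_idx VISIBLE;
SELECT INDEX_NAME, IS_VISIBLE
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_NAME = 't1';

-- Oracle: visibility is changed through ALTER INDEX and reported in the
-- VISIBILITY column of the index dictionary views.
CREATE INDEX j_idx ON t1 (j) INVISIBLE;
ALTER INDEX j_idx VISIBLE;
SELECT index_name, visibility
FROM user_indexes
WHERE table_name = 'T1';
```
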
## Later Discussion: Fine-Grained Control of Index Visibility
Later on, we want to extend this feature and allow more fine-grained control
of index visibility by introducing the two features listed below.

For now, the plan is to introduce the general invisible index feature first.
The design and implementation details for fine-grained control of index
visibility will be added later on.

1. Indexes are not restricted to just being visible or invisible; users can experiment
   with different levels of visibility. In other words, instead of a boolean
   invisible flag, users can set a floating-point invisibility value between 0.0
   and 1.0. The index would be made invisible only to the corresponding fraction
   of queries.
   Related: https://github.com/cockroachdb/cockroach/issues/72576#issuecomment-1034301996

2. Different sessions of a given user or application can set different
   visibilities for the same indexes.
   Related: https://github.com/cockroachdb/cockroach/issues/82363