REDACT option for EXPLAIN, EXPLAIN ANALYZE #16929

Merged
merged 4 commits into from
May 12, 2023
80 changes: 74 additions & 6 deletions v23.1/explain-analyze.md
@@ -24,8 +24,11 @@ The `EXPLAIN ANALYZE` [statement](sql-statements.html) **executes a SQL query**
Parameter | Description
-------------------|-----------
`PLAN` | _(Default)_ Execute the statement and return a statement plan with planning and execution time for an [explainable statement](sql-grammar.html#preparable_stmt). See [`PLAN` option](#plan-option).
`DISTSQL` | Execute the statement and return a statement plan and performance statistics as well as a generated link to a graphical distributed SQL physical statement plan tree. See [`DISTSQL` option](#distsql-option).
`VERBOSE` | Execute the statement and show as much information as possible about the statement plan.
`TYPES` | Execute the statement and include the intermediate [data types](data-types.html) CockroachDB chooses to evaluate intermediate SQL expressions.
`DEBUG` | Execute the statement and generate a ZIP file containing files with detailed information about the query and the database objects referenced in the query. See [`DEBUG` option](#debug-option).
`REDACT` | Execute the statement and redact constants, literal values, parameter values, and personally identifiable information (PII) from the output. See [`REDACT` option](#redact-option).
`DISTSQL` | Execute the statement and return a statement plan and performance statistics as well as a generated link to a graphical distributed SQL physical statement plan tree. See [`DISTSQL` option](#distsql-option).
`preparable_stmt` | The [statement](sql-grammar.html#preparable_stmt) you want to execute and analyze. All preparable statements are explainable.

## Required privileges
@@ -90,9 +93,9 @@ Statement plan tree properties | Description

By default, `EXPLAIN ANALYZE` uses the `PLAN` option. `EXPLAIN ANALYZE` and `EXPLAIN ANALYZE (PLAN)` produce the same output.

### `PLAN` options
### `PLAN` suboptions

The `PLAN` options `VERBOSE` and `TYPES` described in [`EXPLAIN` options](explain.html#options) are also supported. For an example, see [`EXPLAIN ANALYZE (VERBOSE)`](#explain-analyze-verbose).
The `PLAN` suboptions `VERBOSE` and `TYPES` described in [`EXPLAIN` options](explain.html#options) are also supported. For an example, see [`EXPLAIN ANALYZE (VERBOSE)`](#explain-analyze-verbose).

## `DISTSQL` option

@@ -165,6 +168,14 @@ You can obtain this ZIP file by following the link provided in the `EXPLAIN ANAL

{% include common/sql/statement-bundle-warning.md %}

## `REDACT` option

`EXPLAIN ANALYZE (REDACT)` executes a query and redacts constants, literal values, parameter values, and personally identifiable information (PII) in the output, replacing each with `‹×›`.

You can use the `REDACT` flag in combination with the [`PLAN`](#plan-option) option (including the `VERBOSE` and `TYPES` [suboptions](#plan-suboptions)) to redact sensitive values in the physical statement plan, and with the [`DEBUG`](#debug-option) option to redact values in the statement bundle.
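
As a minimal sketch of the combinations described above, the following statements pair `REDACT` with the `VERBOSE` and `TYPES` suboptions. They reuse the `rides` query from the examples on this page; the order of options inside the parentheses is an assumption, and the output is omitted because it depends on the cluster.

{% include_cached copy-clipboard.html %}
~~~ sql
-- Redact literals while showing the additional VERBOSE execution statistics.
EXPLAIN ANALYZE (REDACT, VERBOSE) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;

-- Redact literals while showing the intermediate data types.
EXPLAIN ANALYZE (REDACT, TYPES) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~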

For an example, see [`EXPLAIN ANALYZE (REDACT)`](#explain-analyze-redact).

## Examples

The following examples use the [`movr` example dataset](cockroach-demo.html#datasets).
@@ -179,7 +190,7 @@ For example, the following `EXPLAIN ANALYZE` statement executes a simple query a

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN ANALYZE SELECT city, AVG(revenue) FROM rides GROUP BY city;
EXPLAIN ANALYZE SELECT city, AVG(revenue) FROM rides GROUP BY city;
~~~

~~~
@@ -263,7 +274,7 @@ EXPLAIN ANALYZE SELECT * FROM vehicles JOIN rides ON rides.vehicle_id = vehicles

### `EXPLAIN ANALYZE (VERBOSE)`

The `VERBOSE` option displays the physical statement plan with additional execution statistics.
Use the `VERBOSE` suboption of `PLAN` to execute a query and display the physical statement plan with additional execution statistics.

{% include_cached copy-clipboard.html %}
~~~ sql
@@ -369,7 +380,7 @@ Use the [`DEBUG`](#debug-option) option to generate a ZIP file containing files

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN ANALYZE (DEBUG) SELECT city, AVG(revenue) FROM rides GROUP BY city;
EXPLAIN ANALYZE (DEBUG) SELECT city, AVG(revenue) FROM rides GROUP BY city;
~~~

~~~
@@ -387,6 +398,63 @@ Use the [`DEBUG`](#debug-option) option to generate a ZIP file containing files

To download the ZIP file containing the statement diagnostics, open the URL after **Direct link**, run the `\statement-diag download` command, or run `cockroach statement-diag download`. You can also obtain the bundle by activating [statement diagnostics](ui-statements-page.html#diagnostics) in the DB Console.

### `EXPLAIN ANALYZE (REDACT)`

Use the [`REDACT` option](#redact-option) to execute a query and redact constants, literal values, parameter values, and personally identifiable information (PII) in the physical statement plan or statement bundle. Each redacted value appears as `‹×›`.

{% include_cached copy-clipboard.html %}
~~~ sql
EXPLAIN ANALYZE (REDACT) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

~~~
info
----------------------------------------------------------------------------------------------
planning time: 836µs
execution time: 148ms
distribution: full
vectorized: true
rows read from KV: 125,000 (21 MiB, 3 gRPC calls)
cumulative time spent in KV: 125ms
maximum memory usage: 14 MiB
network usage: 0 B (0 messages)
sql cpu time: 85ms
estimated RUs consumed: 0

• sort
│ nodes: n1
│ actual row count: 11,105
│ estimated max memory allocated: 4.0 MiB
│ estimated max sql temp disk usage: 0 B
│ sql cpu time: 8ms
│ estimated row count: 12,156
│ order: +revenue
└── • filter
│ nodes: n1
│ actual row count: 11,105
│ sql cpu time: 9ms
│ estimated row count: 12,156
│ filter: revenue > ‹×›
└── • scan
nodes: n1
actual row count: 125,000
KV time: 125ms
KV contention time: 0µs
KV rows read: 125,000
KV bytes read: 21 MiB
KV gRPC calls: 3
estimated max memory allocated: 10 MiB
sql cpu time: 68ms
estimated row count: 125,000 (100% of the table; stats collected 16 minutes ago)
table: rides@rides_pkey
spans: FULL SCAN
(40 rows)
~~~

In the preceding output, the `revenue` comparison value is redacted as `‹×›`.
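
To redact values in a statement bundle rather than in the plan output, `REDACT` can be combined with the [`DEBUG`](#debug-option) option. The following is a minimal sketch that reuses the query above; the option order is an assumption, and the output is omitted because the bundle link is specific to each cluster.

{% include_cached copy-clipboard.html %}
~~~ sql
-- Generate a statement bundle in which sensitive values are redacted.
EXPLAIN ANALYZE (DEBUG, REDACT) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~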

## See also

- [`ALTER TABLE`](alter-table.html)
76 changes: 51 additions & 25 deletions v23.1/explain.md
@@ -42,6 +42,9 @@ The user requires the appropriate [privileges](security-reference/authorization.
`VERBOSE` | Show as much information as possible about the statement plan. See [`VERBOSE` option](#verbose-option).
`TYPES` | Include the intermediate [data types](data-types.html) CockroachDB chooses to evaluate intermediate SQL expressions. See [`TYPES` option](#types-option).
`OPT` | Display the statement plan tree generated by the [cost-based optimizer](cost-based-optimizer.html). See [`OPT` option](#opt-option).
`ENV` | Include all details used by the optimizer, including statistics. See [`ENV` suboption](#opt-env-option).
`MEMO` | Print a representation of the optimizer memo with the best plan. See [`MEMO` suboption](#opt-memo-option).
`REDACT` | Redact constants, literal values, parameter values, and personally identifiable information (PII) from the output. See [`REDACT` suboption](#opt-redact-option).
`VEC` | Show detailed information about the [vectorized execution](vectorized-execution.html) plan for a query. See [`VEC` option](#vec-option).
`DISTSQL` | Generate a URL to a [distributed SQL physical statement plan diagram](explain-analyze.html#distsql-plan-diagram). See [`DISTSQL` option](#distsql-option).
`preparable_stmt` | The [statement](sql-grammar.html#preparable_stmt) you want details about. All preparable statements are explainable.
@@ -87,7 +90,7 @@ By default, `EXPLAIN` includes the least detail about the statement plan but can

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
EXPLAIN SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

~~~
@@ -206,7 +209,7 @@ If you run `EXPLAIN` on a [join](joins.html) query, the output will display whic

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT * FROM rides AS r
EXPLAIN SELECT * FROM rides AS r
JOIN users AS u ON r.rider_id = u.id;
~~~

@@ -245,7 +248,7 @@ The following output shows that the query will perform a cross join:

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT * FROM rides AS r
EXPLAIN SELECT * FROM rides AS r
JOIN users AS u ON r.city = 'new york';
~~~

@@ -279,7 +282,7 @@ Time: 2ms total (execution 2ms / network 0ms)

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN INSERT INTO users(id, city, name) VALUES ('c28f5c28-f5c2-4000-8000-000000000026', 'new york', 'Petee');
EXPLAIN INSERT INTO users(id, city, name) VALUES ('c28f5c28-f5c2-4000-8000-000000000026', 'new york', 'Petee');
~~~

~~~
@@ -304,14 +307,14 @@ For more complex types of `INSERT` queries, `EXPLAIN` output can include more in

{% include_cached copy-clipboard.html %}
~~~ sql
> CREATE UNIQUE INDEX ON users(city, id, name);
CREATE UNIQUE INDEX ON users(city, id, name);
~~~

To display the `EXPLAIN` output for an [`INSERT ... ON CONFLICT` statement](insert.html#on-conflict-clause), which inserts some data that might conflict with the `UNIQUE` constraint imposed on the `name`, `city`, and `id` columns, run:

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN INSERT INTO users(id, city, name) VALUES ('c28f5c28-f5c2-4000-8000-000000000026', 'new york', 'Petee') ON CONFLICT DO NOTHING;
EXPLAIN INSERT INTO users(id, city, name) VALUES ('c28f5c28-f5c2-4000-8000-000000000026', 'new york', 'Petee') ON CONFLICT DO NOTHING;
~~~

~~~
@@ -406,7 +409,7 @@ The `VERBOSE` option includes:

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (VERBOSE) SELECT * FROM rides AS r
EXPLAIN (VERBOSE) SELECT * FROM rides AS r
JOIN users AS u ON r.rider_id = u.id
WHERE r.city = 'new york'
ORDER BY r.revenue ASC;
@@ -454,7 +457,7 @@ The `TYPES` option includes

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (TYPES) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
EXPLAIN (TYPES) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

~~~
@@ -491,7 +494,7 @@ To display the statement plan tree generated by the [cost-based optimizer](cost-

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (OPT) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
EXPLAIN (OPT) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

~~~
@@ -507,15 +510,15 @@ To display the statement plan tree generated by the [cost-based optimizer](cost-
Time: 1ms total (execution 1ms / network 0ms)
~~~

`OPT` has four suboptions: [`VERBOSE`](#opt-verbose-option), [`TYPES`](#opt-types-option), [`ENV`](#opt-env-option), [`MEMO`](#opt-memo-option).
`OPT` has five suboptions: [`VERBOSE`](#opt-verbose-option), [`TYPES`](#opt-types-option), [`ENV`](#opt-env-option), [`MEMO`](#opt-memo-option), [`REDACT`](#opt-redact-option).

##### `OPT, VERBOSE` option

To include cost details used by the optimizer in planning the query, use the `OPT, VERBOSE` option:

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (OPT, VERBOSE) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
EXPLAIN (OPT, VERBOSE) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

~~~
@@ -571,7 +574,7 @@ To include cost and type details, use the `OPT, TYPES` option:

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (OPT, TYPES) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
EXPLAIN (OPT, TYPES) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

~~~
@@ -629,7 +632,7 @@ To include all details used by the optimizer, including statistics, use the `OPT

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (OPT, ENV) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
EXPLAIN (OPT, ENV) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

The output of `EXPLAIN (OPT, ENV)` is a URL with the data encoded in the fragment portion. Encoding the data makes it easier to share debugging information across different systems without encountering formatting issues. Opening the URL shows a page with the decoded data. The data is processed in the local browser session and is never sent out over the network. Keep in mind that if you are using any browser extensions, they may be able to access the data locally.
@@ -729,7 +732,6 @@ sort

The `MEMO` suboption prints a representation of the optimizer memo with the best plan. You can use the `MEMO` flag in combination with other flags. For example, `EXPLAIN (OPT, MEMO, VERBOSE)` prints the memo along with verbose output for the best plan.


~~~sql
EXPLAIN (OPT, MEMO) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~
@@ -771,14 +773,38 @@ EXPLAIN (OPT, MEMO) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
Time: 2ms total (execution 2ms / network 1ms)
~~~

##### `OPT, REDACT` option

The `REDACT` suboption causes constants, literal values, parameter values, and personally identifiable information (PII) to be redacted as `‹×›` in the physical statement plan.

You can also use the `REDACT` suboption in combination with the [`VERBOSE`](#opt-verbose-option), [`TYPES`](#opt-types-option), and [`MEMO`](#opt-memo-option) suboptions.

{% include_cached copy-clipboard.html %}
~~~ sql
EXPLAIN (OPT, REDACT) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

~~~
info
-------------------------------------
distribute
└── sort
└── select
├── scan rides
└── filters
└── revenue > ‹×›
(6 rows)
~~~

In the preceding output, the `revenue` comparison value is redacted as `‹×›`.
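
As a minimal sketch of combining `REDACT` with the other `OPT` suboptions, the following statements reuse the same query. The option order is an assumption, and the output is omitted because cost and memo details vary by cluster.

{% include_cached copy-clipboard.html %}
~~~ sql
-- Show optimizer cost details with literals redacted.
EXPLAIN (OPT, REDACT, VERBOSE) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;

-- Print the optimizer memo with literals redacted.
EXPLAIN (OPT, REDACT, MEMO) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~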

#### `VEC` option

To view details about the [vectorized execution plan](vectorized-execution.html#how-vectorized-execution-works) for the query, use the `VEC` option.

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (VEC) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
EXPLAIN (VEC) SELECT * FROM rides WHERE revenue > 90 ORDER BY revenue ASC;
~~~

The output shows the different internal functions that will be used to process each batch of column-oriented data.
@@ -806,7 +832,7 @@ For example, the following `EXPLAIN (DISTSQL)` statement generates a physical pl

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (DISTSQL) SELECT l_shipmode, AVG(l_extendedprice) FROM lineitem GROUP BY l_shipmode;
EXPLAIN (DISTSQL) SELECT l_shipmode, AVG(l_extendedprice) FROM lineitem GROUP BY l_shipmode;
~~~

The output of `EXPLAIN (DISTSQL)` is a URL for a graphical diagram that displays the processors and operations that make up the physical statement plan. For details about the physical statement plan, see [DistSQL plan diagram](explain-analyze.html#distsql-plan-diagram).
@@ -825,7 +851,7 @@ To include the data types of the input columns in the physical plan, use `EXPLAI

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN (DISTSQL, TYPES) SELECT l_shipmode, AVG(l_extendedprice) FROM lineitem GROUP BY l_shipmode;
EXPLAIN (DISTSQL, TYPES) SELECT l_shipmode, AVG(l_extendedprice) FROM lineitem GROUP BY l_shipmode;
~~~

~~~
@@ -844,14 +870,14 @@ You can use `EXPLAIN` to understand which indexes and key ranges queries use, wh

{% include_cached copy-clipboard.html %}
~~~ sql
> CREATE TABLE kv (k INT PRIMARY KEY, v INT);
CREATE TABLE kv (k INT PRIMARY KEY, v INT);
~~~

Because column `v` is not indexed, queries filtering on it alone scan the entire table:

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT * FROM kv WHERE v BETWEEN 4 AND 5;
EXPLAIN SELECT * FROM kv WHERE v BETWEEN 4 AND 5;
~~~

~~~
@@ -878,12 +904,12 @@ When `disallow_full_table_scans=on`, attempting to execute a query with a plan t

{% include_cached copy-clipboard.html %}
~~~ sql
> SET disallow_full_table_scans=on;
SET disallow_full_table_scans=on;
~~~

{% include_cached copy-clipboard.html %}
~~~ sql
> SELECT * FROM kv WHERE v BETWEEN 4 AND 5;
SELECT * FROM kv WHERE v BETWEEN 4 AND 5;
~~~

~~~
@@ -896,12 +922,12 @@ If there were an index on `v`, CockroachDB would be able to avoid scanning the e

{% include_cached copy-clipboard.html %}
~~~ sql
> CREATE INDEX v ON kv (v);
CREATE INDEX v ON kv (v);
~~~

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN SELECT * FROM kv WHERE v BETWEEN 4 AND 5;
EXPLAIN SELECT * FROM kv WHERE v BETWEEN 4 AND 5;
~~~

~~~
@@ -932,15 +958,15 @@ Suppose you have a table of key-value pairs:

{% include_cached copy-clipboard.html %}
~~~ sql
> CREATE TABLE IF NOT EXISTS kv (k INT PRIMARY KEY, v INT);
CREATE TABLE IF NOT EXISTS kv (k INT PRIMARY KEY, v INT);
UPSERT INTO kv (k, v) VALUES (1, 5), (2, 10), (3, 15);
~~~

You can use `EXPLAIN` to determine whether the following `UPDATE` is using `SELECT FOR UPDATE` locking.

{% include_cached copy-clipboard.html %}
~~~ sql
> EXPLAIN UPDATE kv SET v = 100 WHERE k = 1;
EXPLAIN UPDATE kv SET v = 100 WHERE k = 1;
~~~

The following output contains a `locking strength` field, which means that `SELECT FOR UPDATE` locking is being used. If the `locking strength` field does not appear, the statement is not using `SELECT FOR UPDATE` locking.