sql: redact statement bundle #68570

Closed
jordanlewis opened this issue Aug 7, 2021 · 3 comments

@jordanlewis
Member

jordanlewis commented Aug 7, 2021

Statement bundles contain several bits of unredacted user data, ranked here from most to least sensitive:

  1. Histograms contain real data samples
  2. Placeholders can contain real data samples
  3. The statement can contain data
  4. The schema could be sensitive

The histograms are very sensitive. Arbitrary user data is contained in them. We should redact these values, along with the placeholders.

Ideally, the redaction would be done using some method of obfuscation that preserves the meaning and usability of the histogram while removing the real data values.
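
One way to picture the obfuscation idea above is to keep each bucket's row counts and replace the sampled boundary value with its position in the histogram. The sketch below is only an illustration under that assumption: `Bucket`, `ObfuscatedBucket`, and `Obfuscate` are hypothetical names, not CockroachDB types, and the input is assumed to be already ordered by upper bound.

```go
package main

// Bucket is a hypothetical, simplified histogram bucket. It is not the
// CockroachDB statistics type.
type Bucket struct {
	UpperBound string // real sampled value (sensitive)
	NumEq      int64  // rows equal to UpperBound
	NumRange   int64  // rows between the previous bound and this one
}

// ObfuscatedBucket keeps the counts but replaces the sampled value with
// its ordinal position in the histogram.
type ObfuscatedBucket struct {
	Rank     int
	NumEq    int64
	NumRange int64
}

// Obfuscate assumes buckets are already sorted by UpperBound. It drops the
// real values and keeps only ordering and counts.
func Obfuscate(buckets []Bucket) []ObfuscatedBucket {
	out := make([]ObfuscatedBucket, len(buckets))
	for i, b := range buckets {
		out[i] = ObfuscatedBucket{Rank: i, NumEq: b.NumEq, NumRange: b.NumRange}
	}
	return out
}
```

This keeps the shape of the distribution but loses the distance between boundary values, so it would only partly preserve the usability of the histogram.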

Jira issue: CRDB-9095

@jordanlewis jordanlewis added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-postmortem Originated from a Postmortem action item. N-followup Needs followup. labels Aug 7, 2021
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Aug 7, 2021
@mgartner
Collaborator

I was under the impression that this is by design and something that users need to be aware of before they share statement bundles, just like a debug zip.

Perhaps the action item here is to make this clear in the EXPLAIN ANALYZE docs, like we do in the cockroach debug zip docs.

@rytaft
Collaborator

rytaft commented Aug 17, 2021

cc @ianjevans for the docs-todo @mgartner identified above

cc @vy-ton for prioritization of the top-level issue

michae2 added a commit to michae2/cockroach that referenced this issue Jan 9, 2023
Add a new explain flag, `REDACT`, which can be used to collect a
redacted statement bundle with `EXPLAIN ANALYZE (DEBUG, REDACT)`.
Initially this is the only variant of `EXPLAIN` that supports `REDACT`
but the possibility of other variants using `REDACT` is left open.

This first commit plumbs the redact flag into explain_bundle.go but does
not implement redaction for any of the files, instead simply omitting
files which could contain user data. Following commits will add
redaction support for each file.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note (sql change): Add a new `REDACT` flag to `EXPLAIN` which
causes constants, literal values, parameter values, and any other user
data to be redacted in explain output. Redacted statement diagnostics
bundles can now be collected with `EXPLAIN ANALYZE (DEBUG, REDACT)`.
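
The "omit anything that cannot be redacted yet" approach described in this commit can be sketched as a filter over the bundle's file list. This is a minimal illustration, not the code in explain_bundle.go: `bundleFile` and `filesToInclude` are hypothetical names, and the `redactable` field is an assumption standing in for "redaction has been implemented for this file".

```go
package main

// bundleFile is a hypothetical description of one file in a statement
// diagnostics bundle.
type bundleFile struct {
	name       string
	redactable bool // true once redaction is implemented for this file
}

// filesToInclude returns the names of the files to write into the bundle.
// Without redact, every file is included. With redact, files that cannot
// yet be redacted are omitted entirely rather than written unredacted.
func filesToInclude(files []bundleFile, redact bool) []string {
	var names []string
	for _, f := range files {
		if redact && !f.redactable {
			continue
		}
		names = append(names, f.name)
	}
	return names
}
```

Under this scheme, each follow-up commit only has to implement redaction for one file and flip its `redactable` bit to bring that file back into redacted bundles.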
michae2 added a commit to michae2/cockroach that referenced this issue Jan 9, 2023
Support redaction of statement.sql, and add it back to redacted
statement diagnostics bundles.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note: None
michae2 added a commit to michae2/cockroach that referenced this issue Jan 11, 2023
Support redaction of `EXPLAIN (PLAN)` and add plan.txt back to redacted
statement diagnostics bundles.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note (sql change): Add support for the `REDACT` flag to the
following variants of `EXPLAIN`:
- `EXPLAIN`
- `EXPLAIN (PLAN)`
- `EXPLAIN (VEC)`
- `EXPLAIN ANALYZE`
- `EXPLAIN ANALYZE (PLAN)`
These explain statements (along with `EXPLAIN ANALYZE (DEBUG)`, which
already supported `REDACT`) will have constants, literal values,
parameter values, and any other user data redacted in output.
craig bot pushed a commit that referenced this issue Jan 14, 2023
94950: sql: stub implementation of EXPLAIN ANALYZE (DEBUG, REDACT) r=RaduBerinde,yuzefovich a=michae2

**sql: stub implementation of EXPLAIN ANALYZE (DEBUG, REDACT)**

Add a new explain flag, `REDACT`, which can be used to collect a
redacted statement bundle with `EXPLAIN ANALYZE (DEBUG, REDACT)`.
Initially this is the only variant of `EXPLAIN` that supports `REDACT`
but the possibility of other variants using `REDACT` is left open.

This first commit plumbs the redact flag into explain_bundle.go but does
not implement redaction for any of the files, instead simply omitting
files which could contain user data. Following commits will add
redaction support for each file.

Part of: #68570

Epic: CRDB-19756

Release note (sql change): Add a new `REDACT` flag to `EXPLAIN` which
causes constants, literal values, parameter values, and any other user
data to be redacted in explain output. Redacted statement diagnostics
bundles can now be collected with `EXPLAIN ANALYZE (DEBUG, REDACT)`.

**sql: add statement.sql to EXPLAIN ANALYZE (DEBUG, REDACT)**

Support redaction of statement.sql, and add it back to redacted
statement diagnostics bundles.

Part of: #68570

Epic: CRDB-19756

Release note: None

95232: randident: add some escape sequences r=j82w a=knz

Jake found out that we have some API boundaries that don't deal well with identifiers containing things that get interpreted during string formatting. This patch extends the name generator to include those too.

Release note: None
Epic: None

95241: sem/tree: fix the formatting of backup options r=adityamaru a=knz

Fixes #89054.
Fixes #95235.

Found using TestRandomSyntax.

Release note: None

Co-authored-by: Michael Erickson
Co-authored-by: Raphael 'kena' Poss
michae2 added a commit to michae2/cockroach that referenced this issue Feb 1, 2023
Now that we have a new `RedactValues` field in `explain.Flags` there is
some confusion with the existing `RedactFlags`. Rename these to
`DeflakeFlags`.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note: None
michae2 added a commit to michae2/cockroach that referenced this issue Mar 16, 2023
Add a new `create_redactable` column to the
`crdb_internal.create_statements` virtual table which provides the
`CREATE` statement for the table or view with all constants and literals
surrounded by redaction markers. Combined with the
`crdb_internal.redact` function this can be used to obtain a redacted
`CREATE` statement for any table.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note: None
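
The two-step scheme in this commit (first surround constants with redaction markers, then redact whatever is marked) can be sketched as follows. This is a simplified stand-in written for illustration; it is not the implementation of `crdb_internal.redact` or of the redact library. The marker characters `‹` and `›` and the redacted marker `‹×›` are taken from this thread, and `redactMarked` is a hypothetical name.

```go
package main

import "strings"

const (
	startMarker    = "‹"
	endMarker      = "›"
	redactedMarker = "‹×›"
)

// redactMarked replaces every span delimited by startMarker and endMarker
// with redactedMarker, leaving unmarked text untouched.
func redactMarked(s string) string {
	var b strings.Builder
	for {
		start := strings.Index(s, startMarker)
		if start < 0 {
			// No more marked spans: keep the rest as-is.
			b.WriteString(s)
			return b.String()
		}
		b.WriteString(s[:start])
		b.WriteString(redactedMarker)
		end := strings.Index(s[start:], endMarker)
		if end < 0 {
			// Unbalanced marker: drop the remainder to stay on the safe side.
			return b.String()
		}
		s = s[start+end+len(endMarker):]
	}
}
```

For example, `redactMarked("a INT DEFAULT ‹42›")` yields `a INT DEFAULT ‹×›`, while a string with no markers is returned unchanged.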
michae2 added a commit to michae2/cockroach that referenced this issue Mar 16, 2023
Add a new `WITH REDACT` option to `SHOW CREATE` statements which reads
from `create_redactable` instead of `create_statement` when delegating
`SHOW CREATE` to `crdb_internal.create_statements`.

(The `WITH REDACT` syntax is intended to allow for additional options in
the future (if we want them) such as `REDACTABLE`)

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note (sql change): Add a new `WITH REDACT` option to the
following statements:

- `SHOW CREATE`
- `SHOW CREATE TABLE`
- `SHOW CREATE VIEW`

which, when used, replaces constants and literals in the printed
`CREATE` statement with the redacted marker, '‹×›'.
michae2 added a commit to michae2/cockroach that referenced this issue Mar 16, 2023
Use `SHOW CREATE ... WITH REDACT` to generate `CREATE` statements for
schema.sql in redacted statement diagnostics bundles.

Also use statistics without histograms in redacted bundles.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note: None
craig bot pushed a commit that referenced this issue Mar 16, 2023
98251: sql: add schema.sql and stats.sql back to redacted statement bundles r=msirek,rharding6373 a=michae2

**sql: add create_redactable column to crdb_internal.create_statements**

Add a new `create_redactable` column to the
`crdb_internal.create_statements` virtual table which provides the
`CREATE` statement for the table or view with all constants and literals
surrounded by redaction markers. Combined with the
`crdb_internal.redact` function this can be used to obtain a redacted
`CREATE` statement for any table.

Part of: #68570

Epic: CRDB-19756

Release note: None
 
**sql: add WITH REDACT option to SHOW CREATE**

Add a new `WITH REDACT` option to `SHOW CREATE` statements which reads
from `create_redactable` instead of `create_statement` when delegating
`SHOW CREATE` to `crdb_internal.create_statements`.

(The `WITH REDACT` syntax is intended to allow for additional options in
the future (if we want them) such as `REDACTABLE`)

Part of: #68570

Epic: CRDB-19756

Release note (sql change): Add a new `WITH REDACT` option to the
following statements:

- `SHOW CREATE`
- `SHOW CREATE TABLE`
- `SHOW CREATE VIEW`

which, when used, replaces constants and literals in the printed
`CREATE` statement with the redacted marker, '‹×›'.

**sql: add schema.sql and stats.sql back to redacted statement bundles**

Use `SHOW CREATE ... WITH REDACT` to generate `CREATE` statements for
schema.sql in redacted statement diagnostics bundles.

Also use statistics without histograms in redacted bundles.

Part of: #68570

Epic: CRDB-19756

Release note: None

98290: sql: avoid panic by not applying AvoidBuffering to InternalExecutor r=rafiss a=stevendanna

The InternalExecutor creates a streamingCommandResult, which does not support DisableBuffering.

Here, we skip calling DisableBuffering() if the request is from an internal executor.

Fixes: #98204

Release note (bug fix): Fixes a bug in which `SET avoid_buffering = true` could produce a crash on subsequent operations.

98800: sql: add name resolver to constraint validator for legacy schema changer r=chengxiong-ruan a=chengxiong-ruan

Fixes #91697

Previously, in the legacy schema changer, adding a constraint whose expression contained an OID datum caused a panic, because the validator was not given a proper name resolver to resolve sequence names when deserializing constraint expressions. The deserialization logic tries to resolve anything that is an OID datum, even if it is just a scalar, in which case we may fail to find a sequence and simply skip it.

Release note (sql change): this commit fixes a bug where a check constraint on an OID type column panics in the legacy schema changer.

Co-authored-by: Michael Erickson
Co-authored-by: Steven Danna
Co-authored-by: Chengxiong Ruan
michae2 added a commit to michae2/cockroach that referenced this issue Mar 17, 2023
Support redaction of `EXPLAIN (OPT)` and add opt*.txt back to redacted
statement diagnostics bundles. This is achieved by adding a new
`redactableValues` option to `cat.FormatTable`,
`xform.(*Optimizer).FormatMemo`, and `memo.ExprFmtCtx` which causes
user-provided constants and literals to be surrounded by redaction
markers in formatted output. Then `redact.(RedactableString).Redact` is
used to further redact the redactable output.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note (sql change): Add support for the `REDACT` flag to the
following variants of `EXPLAIN`:

- `EXPLAIN (OPT)`
- `EXPLAIN (OPT, CATALOG)`
- `EXPLAIN (OPT, MEMO)`
- `EXPLAIN (OPT, TYPES)`
- `EXPLAIN (OPT, VERBOSE)`

These explain statements will have constants, literal values, parameter
values, and any other user data redacted in output.
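
The formatter-side half of this change, an option that makes the formatter surround user-provided constants with redaction markers, can be pictured with a toy formatting context. The sketch below only borrows the option name `redactableValues` from the commit message; `fmtCtx`, `formatConst`, and `formatEq` are hypothetical and are not the `memo.ExprFmtCtx` API. Replacing stray marker characters inside the value is an assumption of this sketch about how to keep spans well formed.

```go
package main

import "strings"

// fmtCtx is a hypothetical formatting context with a single option.
type fmtCtx struct {
	redactableValues bool
}

// formatConst renders a user-provided constant. When redactableValues is
// set, the value is surrounded by redaction markers so that a later pass
// can redact it.
func (c fmtCtx) formatConst(v string) string {
	if !c.redactableValues {
		return v
	}
	// Replace marker characters inside the value so that it cannot close
	// the marked span early.
	v = strings.NewReplacer("‹", "?", "›", "?").Replace(v)
	return "‹" + v + "›"
}

// formatEq renders a simple "column = constant" filter. Column names are
// treated as schema, not user data, and are never marked here.
func (c fmtCtx) formatEq(col, v string) string {
	return col + " = " + c.formatConst(v)
}
```

With the option set, `formatEq("a", "10")` produces `a = ‹10›`; without it, the output is the usual `a = 10`. A separate redaction pass then turns the marked form into `a = ‹×›`.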
michae2 added a commit to michae2/cockroach that referenced this issue Mar 17, 2023
Create another version of TestExplainRedact which also tests
`EXPLAIN (REDACT)` of DDL statements. This version is in sqlccl to allow
for DDL statements using partitions.

Part of: cockroachdb#68570

Epic: CRDB-19756

Release note: None
michae2 added a commit to michae2/cockroach that referenced this issue Mar 20, 2023
Create another version of TestExplainRedact which also tests
`EXPLAIN (REDACT)` of DDL statements. This version is in sqlccl to allow
for DDL statements using partitions.

Part of: cockroachdb#68570
Informs: cockroachdb#98746

Epic: CRDB-19756

Release note: None
craig bot pushed a commit that referenced this issue Mar 20, 2023
97549: sql: support redaction of EXPLAIN (OPT) r=RaduBerinde a=michae2

**sql: support redaction of EXPLAIN (OPT)**

Support redaction of `EXPLAIN (OPT)` and add opt*.txt back to redacted
statement diagnostics bundles. This is achieved by adding a new
`redactableValues` option to `cat.FormatTable`,
`xform.(*Optimizer).FormatMemo`, and `memo.ExprFmtCtx` which causes
user-provided constants and literals to be surrounded by redaction
markers in formatted output. Then `redact.(RedactableString).Redact` is
used to further redact the redactable output.

Part of: #68570

Epic: CRDB-19756

Release note (sql change): Add support for the `REDACT` flag to the
following variants of `EXPLAIN`:

- `EXPLAIN (OPT)`
- `EXPLAIN (OPT, CATALOG)`
- `EXPLAIN (OPT, MEMO)`
- `EXPLAIN (OPT, TYPES)`
- `EXPLAIN (OPT, VERBOSE)`

These explain statements will have constants, literal values, parameter
values, and any other user data redacted in output.

**sql, tests, sqlccl: split TestExplainRedactDDL out of TestExplainRedact**

Create another version of TestExplainRedact which also tests
`EXPLAIN (REDACT)` of DDL statements. This version is in sqlccl to allow
for DDL statements using partitions.

Part of: #68570
Informs: #98746

Epic: CRDB-19756

Release note: None

Co-authored-by: Michael Erickson
@michae2
Collaborator

michae2 commented Mar 21, 2023

Marking this as fixed, as of v23.1. Further improvements will be tracked by #98817 and other issues.
