Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: the util/pretty code is incorrectly exposed through user-facing features #91197

Open
knz opened this issue Nov 3, 2022 · 23 comments
Open
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting T-sql-queries SQL Queries Team

Comments

@knz
Copy link
Contributor

knz commented Nov 3, 2022

Description

The util/pretty code was designed for use in one-off commands (like a command line program), not to be served over the network.

In particular:

  • some combination of parameters will cause a super-quadratic / exponential explosion in computation time / memory usage. This is a side-effect of the algorithm. It was deemed acceptable for one-off commands.
  • it does not know about memory monitors / query cancellation.

This makes this code profoundly inadequate for inclusion in CRDB in a way that can be triggered by users through an API.

I've already identified the following places where this code is used:

  • prettify_statement() SQL built-in function, the user can choose the options.
  • combined statement stats (uses PrettyAlignAndDeIndent, subject to quadratic explosion)
  • statement insights in UI front-end, via a SQL call to prettify_statement (uses PrettyAlignAndDeIndent, subject to quadratic explosion)
  • statement bundles (uses PrettyNoAlign, somewhat fine)
  • SHOW CREATE for view (uses PrettyNoAlign, somewhat fine)

Expected behavior

  1. The code in util/pretty is modified/enhanced to support memory monitors and recognize query cancellation. This is absolutely required at least for the use in prettify_statement() where the user is in control.
  2. the callers for stmt bundles / combined statements stats avoid the combination of parameters that result in super-quadratic behavior
  3. the callers also define a timeout on the length of the operation (via context cancellation)

Jira issue: CRDB-21145

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting A-sql-observability Related to observability of the SQL layer labels Nov 3, 2022
@knz
Copy link
Contributor Author

knz commented Nov 3, 2022

Additionally, this code in server/combined_statement_stats.go:

  query = fmt.Sprintf(
    `SELECT prettify_statement($1, %d, %d, %d)`,
    tree.ConsoleLineWidth, tree.PrettyNoAlign, tree.UpperCase)
  row, ⊙ = ie.QueryRowEx(⋄, "combined-stmts-details-format-query", ∅,
    sessiondata.InternalExecutorOverride{
      User: username.NodeUserName(),
    }, query, args…)

can be simplified to just a call to the Go function tree.Pretty() cc @maryliag

@maryliag
Copy link
Contributor

maryliag commented Nov 3, 2022

@knz the example above can't use the tree.Pretty() because I don't actually have the statement in a node tree format at that point. The prior steps is retrieving the query from the database, which returns a string, so I would need to convert to NodeFormatter to be able to use that function. So to improve the usage on combined statements and statement UI, I will update the parameter of alignment from 2 to 1.

I'll leave the improvements of the function to the queries team

maryliag added a commit to maryliag/cockroach that referenced this issue Nov 3, 2022
Previously, we were using `PrettyAlignAndDeindent`
parameter option for the usage of `prettify_statement`
on statement details endpoint and insights details
endpoints, which was subjected to a quadratic explosion.

This commit updates those uses to parameter
`PrettyAlignOnly` (1).

Part Of cockroachdb#91197

Release note: None
@ajwerner
Copy link
Contributor

ajwerner commented Nov 3, 2022

so I would need to convert to NodeFormatter to be able to use that function.

You ought to be able to use parser.ParseOne to turn a statement into a parser.Statement and AST is a tree.Statement which implements NodeFormatter.

@ajwerner
Copy link
Contributor

ajwerner commented Nov 3, 2022

Oh, I don't know what I'm talking about, prettyStatement does exactly the right thing.

func prettyStatement(p tree.PrettyCfg, stmt string) (string, error) {
stmts, err := parser.Parse(stmt)
if err != nil {
return "", err
}
var formattedStmt strings.Builder
for idx := range stmts {
formattedStmt.WriteString(p.Pretty(stmts[idx].AST))
if len(stmts) > 1 {
formattedStmt.WriteString(";")
}
formattedStmt.WriteString("\n")
}
return formattedStmt.String(), nil
}

@knz
Copy link
Contributor Author

knz commented Nov 3, 2022

if you know there's just 1 stmt, ParseOne is adequate

@maryliag
Copy link
Contributor

maryliag commented Nov 3, 2022

Options:

  • I parse the string and then call the pretty function using Go, which uses default parameter
  • I call the prettify function using sql, which does the same parser of the string, but let me use different parameters

Since I do want different parameters than the default (I want different value for the line width and I do want some alignment. and not the PrettyNoAlign), I will keep the SQL function, with the adjustments of the parameters

@knz
Copy link
Contributor Author

knz commented Nov 3, 2022

(I want different value for the line width and I do want some alignment. and not the PrettyNoAlign

You do not need the SQL function for that. You can do this:

cfg := tree.DefaultPrettyCfg()
cfg.XX = YY
cfg.ZZ = WW
cfg.Pretty(<yourstmt>)

let's avoid the layers of indirections through SQL executor which are not needed here.

maryliag added a commit to maryliag/cockroach that referenced this issue Nov 3, 2022
Previously, we were using `PrettyAlignAndDeindent`
parameter option for the usage of `prettify_statement`
on statement details endpoint and insights details
endpoints, which was subjected to a quadratic explosion.

This commit updates those uses to parameter
`PrettyAlignOnly` (1) on sql api and changes the
statement details endpoint to use the Go function of
Pretty instead, so it doesn't required a SQL
connection/execution.

Part Of cockroachdb#91197

Release note: None
maryliag added a commit to maryliag/cockroach that referenced this issue Nov 3, 2022
Previously, we were using `PrettyAlignAndDeindent`
parameter option for the usage of `prettify_statement`
on statement details endpoint and insights details
endpoints, which was subjected to a quadratic explosion.

This commit updates those uses to parameter
`PrettyAlignOnly` (1) on sql api and changes the
statement details endpoint to use the Go function of
Pretty instead, so it doesn't required a SQL
connection/execution.

Part Of cockroachdb#91197

Release note: None
craig bot pushed a commit that referenced this issue Nov 8, 2022
91214: server, ui: update prettify parameter r=maryliag a=maryliag

Previously, we were using `PrettyAlignAndDeindent`
parameter option for the usage of `prettify_statement`
on statement details endpoint and insights details
endpoints, which was subjected to a quadratic explosion.

This commit updates those uses to parameter
`PrettyAlignOnly` (1) on sql api and changes the
statement details endpoint to use the Go function of
Pretty instead, so it doesn't required a SQL
connection/execution.

Part Of #91197

Release note: None

91438: server, sql: version gate idx recs in persisted stats iterator, status server r=ericharmeling a=ericharmeling

Fixes #91346.

This PR version gates the persisted stats iterator and status server statement details queries.

Release note: None

Co-authored-by: maryliag <[email protected]>
Co-authored-by: Eric Harmeling <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Nov 8, 2022
Previously, we were using `PrettyAlignAndDeindent`
parameter option for the usage of `prettify_statement`
on statement details endpoint and insights details
endpoints, which was subjected to a quadratic explosion.

This commit updates those uses to parameter
`PrettyAlignOnly` (1) on sql api and changes the
statement details endpoint to use the Go function of
Pretty instead, so it doesn't required a SQL
connection/execution.

Part Of #91197

Release note: None
@rytaft
Copy link
Collaborator

rytaft commented Nov 8, 2022

Is there any urgent action needed from SQL Queries for this issue? Drew mentions that we may want to limit the number of rows returned

@michae2
Copy link
Collaborator

michae2 commented Sep 9, 2023

  • statement bundles (uses PrettyNoAlign, somewhat fine)

Even without the exponential blowup, it looks like this usage can cause stack overflows when collecting statement diagnostics bundles on queries with very large IN sets. For example, something like this can cause a stack overflow:

CREATE TABLE foo (id INT PRIMARY KEY, v STRING);
SELECT crdb_internal.request_statement_bundle('SELECT * FROM foo WHERE id IN (_, _, __more1000_plus__)', 0.0, '0', '0');
SELECT * FROM foo WHERE id IN (0, 1, 2, 3, ..., 999998, 999999);

Here's a demonstration using python:

# michae2@michae2-crl-mbp ~crdb % python <<EOF | cockroach demo
print("CREATE TABLE foo (id INT PRIMARY KEY, v STRING);")
print("SELECT crdb_internal.request_statement_bundle('SELECT * FROM foo WHERE id IN (_, _, __more1000_plus__)', 0.0, '0', '0');")
print("SELECT * FROM foo WHERE id IN (")
print(*range(1000000), sep=", ")
print(");")
EOF

CREATE TABLE

Time: 3ms

  crdb_internal.request_statement_bundle
------------------------------------------
                    t
(1 row)

Time: 4ms

runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc053780380 stack=[0xc053780000, 0xc073780000]
fatal error: stack overflow

runtime stack:
runtime.throw({0x105df1916?, 0x10a833da0?})
	GOROOT/src/runtime/panic.go:1047 +0x5d fp=0x70000f8f6d78 sp=0x70000f8f6d48 pc=0x10003dd1d
runtime.newstack()
	GOROOT/src/runtime/stack.go:1105 +0x5bd fp=0x70000f8f6f28 sp=0x70000f8f6d78 pc=0x10005877d
runtime.morestack()
	src/runtime/asm_amd64.s:574 +0x8b fp=0x70000f8f6f30 sp=0x70000f8f6f28 pc=0x100073a8b

goroutine 4062 [running]:
runtime.mapaccess2(0x1054271e0, 0xc0053fc540, 0xc0537803d0?)
	GOROOT/src/runtime/map.go:456 +0x217 fp=0xc053780390 sp=0xc053780388 pc=0x100014e17
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).iDoc(0xc07377d590, {0xe62b?, 0xf8?}, {0x1073e0a80?, 0xc04edd8c20?}, 0x10d900108?)
	github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:210 +0x72 fp=0xc053780400 sp=0xc053780390 pc=0x100f8fc52
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b19fa0)
	github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:132 +0x145 fp=0xc0537805e0 sp=0xc053780400 pc=0x100f8ee25
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b19fe0)
	github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:141 +0x2c5 fp=0xc0537807c0 sp=0xc0537805e0 pc=0x100f8efa5
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b1a000)
	github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:141 +0x2c5 fp=0xc0537809a0 sp=0xc0537807c0 pc=0x100f8efa5
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b19fc0)
	github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:132 +0x188 fp=0xc053780b80 sp=0xc0537809a0 pc=0x100f8ee68
...

@knz
Copy link
Contributor Author

knz commented Sep 10, 2023

The question however is how realistic that is in practice?

Unfortunately this came up in practice (see link above) so at least somewhat realistic.

@knz
Copy link
Contributor Author

knz commented Sep 11, 2023

Making this better will be a non-trivial programming exercise. The power of the current framework comes directly from the conciseness of the rendering functions. Making this better by adding an extra mandatory argument (to count the call depth), or even worse by attempting to "flatten" the control flow by moving the recursion to a loop, would make the code unmaintainable and unextensible.

My recommendation here would be to use code generation. Implement the rules in some kind of DSL then auto-generate the recursive code with the extra counting data structure passed explicitly by the code generator.

mgartner added a commit to mgartner/cockroach that referenced this issue Sep 11, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 100,000. While still an
internal error, this is preferable to a stack overflow which will crash
the process.

Informs cockroachdb#91197

Release note: None
@mgartner
Copy link
Collaborator

Making this better will be a non-trivial programming exercise.

I came up with #110374, which catches the case @michae2 described, but curious what you think.

@mgartner
Copy link
Collaborator

I've also created #110375 because I don't believe we strictly need to pretty-print statements in statement bundles.

@knz
Copy link
Contributor Author

knz commented Sep 11, 2023

I came up with #110374, which catches the case @michae2 described, but curious what you think.

👍 let's continue the convo over there.

mgartner added a commit to mgartner/cockroach that referenced this issue Sep 11, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 100,000. While still an
internal error, this is preferable to a stack overflow which will crash
the process.

Informs cockroachdb#91197

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 12, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs cockroachdb#91197

Release note: None
@michae2 michae2 moved this from 23.2 Release to 24.1 Release in SQL Queries Sep 12, 2023
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 13, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs cockroachdb#91197

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 14, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs cockroachdb#91197

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 15, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs cockroachdb#91197

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 15, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs cockroachdb#91197

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 19, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs cockroachdb#91197

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 19, 2023
This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs cockroachdb#91197

Release note: None
craig bot pushed a commit that referenced this issue Sep 20, 2023
110374: util/pretty: mitigate stack overflows of Pretty r=mgartner a=mgartner

#### errorutil: moved kv-specific errors out of errorutil

Errors for missing store and node descriptors have been moved from the
errorutil package to the kvpb package, because the errorutil package is
a low-level package to aid in working with errors in general. It should
not contain facilities for creating specific errors as this muddles the
package and can lead to import cycles.

Release note: None

#### errorutil: move SendReport to new sentryutil package

The `SendReport` function has been moved out of the errorutil package
and into a new sentryutil package. This avoids muddling the errorutil
package with a Sentry-specific function, and it breaks errorutil's
dependence on `pkg/settings` and `pkg/util/log/logcrash`.

Release note: None

#### util/pretty: mitigate stack overflows of Pretty

This commit reduces the chance of a stack overflow from recursive calls
of `*beExec.be`. The `Pretty` function will now return an internal error
if the recursive depth of `be` surpasses 10,000.

Informs #91197

Release note: None


Co-authored-by: Marcus Gartner <[email protected]>
@mgartner mgartner moved this from 24.1 Release to 24.2 Release in SQL Queries Nov 28, 2023
@mgartner mgartner moved this from 24.2 Release to Backlog in SQL Queries May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

8 participants