Skip to content

Commit

Permalink
sql: avoid a RSG error on tenant specs containing a cast
Browse files Browse the repository at this point in the history
Prior to this commit, the RSG parse-after-format test was choking
on `ALTER TENANT 'foo' MINUTE`. This commit fixes that.

This is achieved by introducing a new rule during pretty-printing: if
the expression in a TenantSpec has the property "well-enclosed
expression" (`alreadyDelimitedAsSyntacticDExpr` in code), it can be
pretty-printed without enclosing parentheses. Otherwise, parentheses
must be added.

More details follow for the interested reader.

----

There are two salient aspects of this patch.

- why it works (what needed to be done). This is explained by the
  definition of this new interface type
  `alreadyDelimitedAsSyntacticDExpr`, included for reference below.
- why we only needed to do this now, i.e. what makes ALTER TENANT
  special (and why we haven't needed to do so earlier).

The second point is perhaps the most interesting of the two.

----

**As to why we only needed to do this now, and why we may not do
this in too many other places.**

The problem, and its solution, exists every time we would like
to include a syntactic expression inside a grammar rule, where
the expression is followed by more keywords.

As of this writing, *we do not have other syntax rules that do this*
besides ALTER/DROP TENANT. This is why ALTER/DROP TENANT was the first
time we actually had to deal with this problem.

This, in turn, raises two additional questions.

The first is why we found it so important to make ALTER/DROP TENANT
accept an expression in third position. The reason for this is that we
are building a service control plane where we want to pass the
identity of the tenant as a variable (via a placeholder), and we will
in the future want to place subqueries in that position. This will
make us able to compose the SQL syntax with a more complex
orchestration logic.

The other question is why we haven't tried to use expressions in
other places.

For example, we know there is appetite to make the table name in
ALTER TABLE a variable so it can be passed via a placeholder. However,
in doing so we still should be careful: the mechanism that ALTER
TENANT uses, that of parsing the expression as a `d_expr` instead of
`a_expr` or `b_expr`, is only possible when *every syntax form
following the expression only uses a keyword that will never be part
of the expression*. That is, if/when someone chooses to define a
grammar rule:

```
rule: A B d_expr C D
```

... they are committing to a future where the keyword C will _never_
be used as an extension form for one of the `d_expr` expression syntax.

We control this future for ALTER/DROP TENANT, which is a CockroachDB
oddity, but we can't know what other SQL databases will ever add to
their CREATE TABLE and other syntax  which we may want to emulate
later.

----

**How the special case is defined**

Quoth the docstring for the new interface:

```
// alreadyDelimitedAsSyntacticDExpr is an interface that marks
// Expr types for which there is never an ambiguity when
// the expression syntax is followed by a non-reserved
// keyword. When this property is true, that expression
// can be pretty-printed without enclosing parentheses in
// a context followed by more non-reserved keywords, and
// result in syntax that is still unambiguous.
// That is, given an expression E and an arbitrary following
// word X, the syntax "E X" is always unambiguously parsed
// as "(E) X".
//
// This property is obviously true of "atomic" expressions such as
// string and number literals, and also obviously true of
// well-enclosed expressions "(...)" / "[...]". However, it is not
// always true of other composite expression types. For example,
// "A::B" (CastExpr) is not well-delimited because there are
// identifiers/keywords such that "A::B C" can be parsed as "A::(B
// C)". Consider "'a'::INTERVAL" and the non-reserved keyword
// "MINUTE".
//
// This property is closely related to the d_expr syntactic rule in
// the grammar, hence its name. *Approximately* the expression types
// produced by the d_expr rule tend to exhibit the "well-delimited"
// property. However, this is not a proper equivalence: certain Expr
// types are _also_ produced by other parsing rules than d_expr, so
// inspection of the contents of the Expr object is necessary to
// determine whether it is well-delimited or not (for example, some
// FuncExpr objects are well-delimited, and others are not).
// Therefore, it is not generally correct to assign the property to
// all the d_expr expression *types*. We can only do so for a few
// types for which we know that *all possible objects* of that type
// are well-delimited, such as Subquery, NumVal or Placeholder.
```

Release note: None
  • Loading branch information
knz committed Jan 24, 2023
1 parent 15266fd commit cca03b2
Show file tree
Hide file tree
Showing 2 changed files with 64 additions and 0 deletions.
9 changes: 9 additions & 0 deletions pkg/sql/parser/testdata/alter_tenant
Original file line number Diff line number Diff line change
Expand Up @@ -190,3 +190,12 @@ ALTER TENANT 'foo' RENAME TO bar
ALTER TENANT ('foo') RENAME TO (bar) -- fully parenthesized
ALTER TENANT '_' RENAME TO bar -- literals removed
ALTER TENANT 'foo' RENAME TO _ -- identifiers removed

# Regression test for #95612
parse
ALTER TENANT INTERVAL 'string' MINUTE RESET CLUSTER SETTING ident
----
ALTER TENANT ('string'::INTERVAL MINUTE) SET CLUSTER SETTING ident = DEFAULT -- normalized!
ALTER TENANT ((('string')::INTERVAL MINUTE)) SET CLUSTER SETTING ident = (DEFAULT) -- fully parenthesized
ALTER TENANT ('_'::INTERVAL MINUTE) SET CLUSTER SETTING ident = DEFAULT -- literals removed
ALTER TENANT ('string'::INTERVAL MINUTE) SET CLUSTER SETTING ident = DEFAULT -- identifiers removed
55 changes: 55 additions & 0 deletions pkg/sql/sem/tree/alter_tenant.go
Original file line number Diff line number Diff line change
Expand Up @@ -55,12 +55,67 @@ type TenantSpec struct {
All bool
}

// alreadyDelimitedAsSyntacticDExpr is an interface that marks
// Expr types for which there is never an ambiguity when
// the expression syntax is followed by a non-reserved
// keyword. When this property is true, that expression
// can be pretty-printed without enclosing parentheses in
// a context followed by more non-reserved keywords, and
// result in syntax that is still unambiguous.
// That is, given an expression E and an arbitrary following
// word X, the syntax "E X" is always unambiguously parsed
// as "(E) X".
//
// This property is obviously true of "atomic" expressions such as
// string and number literals, and also obviously true of
// well-enclosed expressions "(...)" / "[...]". However, it is not
// always true of other composite expression types. For example,
// "A::B" (CastExpr) is not well-delimited because there are
// identifiers/keywords such that "A::B C" can be parsed as "A::(B
// C)". Consider "'a'::INTERVAL" and the non-reserved keyword
// "MINUTE".
//
// This property is closely related to the d_expr syntactic rule in
// the grammar, hence its name. *Approximately* the expression types
// produced by the d_expr rule tend to exhibit the "well-delimited"
// property. However, this is not a proper equivalence: certain Expr
// types are _also_ produced by other parsing rules than d_expr, so
// inspection of the contents of the Expr object is necessary to
// determine whether it is well-delimited or not (for example, some
// FuncExpr objects are well-delimited, and others are not).
// Therefore, it is not generally correct to assign the property to
// all the d_expr expression *types*. We can only do so for a few
// types for which we know that *all possible objects* of that type
// are well-delimited, such as Subquery, NumVal or Placeholder.
type alreadyDelimitedAsSyntacticDExpr interface {
Expr
alreadyDelimitedAsSyntacticDExpr()
}

func (*UnresolvedName) alreadyDelimitedAsSyntacticDExpr() {}
func (*ParenExpr) alreadyDelimitedAsSyntacticDExpr() {}
func (*Subquery) alreadyDelimitedAsSyntacticDExpr() {}
func (*Placeholder) alreadyDelimitedAsSyntacticDExpr() {}
func (*NumVal) alreadyDelimitedAsSyntacticDExpr() {}
func (*StrVal) alreadyDelimitedAsSyntacticDExpr() {}
func (dNull) alreadyDelimitedAsSyntacticDExpr() {}

// Format implements the NodeFormatter interface.
func (n *TenantSpec) Format(ctx *FmtCtx) {
if n.All {
ctx.WriteString("ALL")
} else if n.IsName {
// Beware to enclose the expression within parentheses if it is
// not a simple identifier and is not already enclosed in
// parentheses.
_, canOmitParentheses := n.Expr.(alreadyDelimitedAsSyntacticDExpr)
if !canOmitParentheses {
ctx.WriteByte('(')
}
ctx.FormatNode(n.Expr)
if !canOmitParentheses {
ctx.WriteByte(')')
}
} else {
ctx.WriteByte('[')
ctx.FormatNode(n.Expr)
Expand Down

0 comments on commit cca03b2

Please sign in to comment.