Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback to identifier parsing if expression parsing fails #1513

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

yoavcloud
Copy link
Contributor

@yoavcloud yoavcloud commented Nov 11, 2024

This fixes the following issue: #1496

The parser encounters situations when the next keyword indicates an expression, but in fact it should be parsed as an identifier. Example from Snowflake: SELECT MAX(interval) from tbl. When the parser encounters the interval word it tries to parse an Expr::Interval but it fails because in the context of this query, interval is an identifier that Snowflake (unlike most other dialects) allows in unquoted form.

The suggested approach is to try to parse the expression, but if that fails, fallback to parse an identifier under certain conditions. In addition, each dialect can now declare which keywords it reserves for use as an identifier in unquoted form.

Lastly, changed the parsing of the DEFAULT word to Expr::Default instead of an identifier.

@yoavcloud yoavcloud changed the title Revert to identifier parsing if expression parsing fails Fallback to identifier parsing if expression parsing fails Nov 11, 2024
@yoavcloud yoavcloud force-pushed the expr_ident_mixup branch 2 times, most recently from 77243eb to 6685cce Compare November 13, 2024 12:03
src/parser/mod.rs Outdated Show resolved Hide resolved
src/parser/mod.rs Show resolved Hide resolved
src/ast/mod.rs Outdated Show resolved Hide resolved
}
}

fn parse_ident_expr(&mut self, w: &Word) -> Result<Expr, ParserError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm thinking we would need to rename the function, it seems to be doing more than identifiers (and based on the name its unclear what the word w input argument is for)

tests/sqlparser_common.rs Outdated Show resolved Hide resolved
src/parser/mod.rs Outdated Show resolved Hide resolved
src/parser/mod.rs Outdated Show resolved Hide resolved
Ok(Some(expr)) => Ok(expr),
// This word indicated an expression prefix but parsing failed. Two options:
// 1. Malformed statement
// 2. The dialect may allow this word as identifier as well as indicating an expression
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments are indeed helpful to follow thanks! Thinking here we could clarify 2. even further with the SELECT MAX(interval) from tbl example?

src/parser/mod.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me overall! Left some minor comments then one regarding tests to demonstrate the new behavior

Ok(None) => Ok(self.parse_expr_prefix_by_unnreserved_word(&w)?),

// If parsing of the word as a special expression failed, we are facing two options:
// 1. The statement is malformed, e.g. `SELECT INTERVAL '1 DAI`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// 1. The statement is malformed, e.g. `SELECT INTERVAL '1 DAI`
// 1. The statement is malformed, e.g. `SELECT INTERVAL '1 DAY`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional, DAI is what makes this statement malformed :-)

Comment on lines 1403 to 1404
/// Parse an expression prefix.
pub fn parse_prefix2(&mut self) -> Result<Expr, ParserError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this version seems unused we can remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shit... Yes, on it.

}

// Tries to parse an expression by a word that is not known to have a special meaning in the dialect.
fn parse_expr_prefix_by_unnreserved_word(&mut self, w: &Word) -> Result<Expr, ParserError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn parse_expr_prefix_by_unnreserved_word(&mut self, w: &Word) -> Result<Expr, ParserError> {
fn parse_expr_prefix_by_unreserved_word(&mut self, w: &Word) -> Result<Expr, ParserError> {

// special expression (to maintain backwards compatibility of parsing errors).
Err(e) => {
if !self.dialect.is_reserved_for_identifier(w.keyword) {
if let Ok(expr) = self.maybe_parse_internal(|parser| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if let Ok(expr) = self.maybe_parse_internal(|parser| {
if let Some(expr) = self.maybe_parse(|parser| {

it looks like we can use the normal maybe_parse here since it doesn't have the special requirement?

}

/// Run a parser method `f`, reverting back to the current position if unsuccessful.
pub fn maybe_parse_internal<T, F>(&mut self, mut f: F) -> Result<T, ParserError>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn maybe_parse_internal<T, F>(&mut self, mut f: F) -> Result<T, ParserError>
pub fn try_parse<T, F>(&mut self, mut f: F) -> Result<T, ParserError>

We can probably call try_parse which could hint that it returns an error (vs maybe_parse which hints at an option)

@@ -214,6 +216,16 @@ impl Dialect for SnowflakeDialect {
fn supports_show_like_before_in(&self) -> bool {
true
}

fn is_reserved_for_identifier(&self, kw: Keyword) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh are we missing tests for the new behavior (I can't seem to find them in the current PR if so)?

Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! cc @alamb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants