-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added parsing for PostgreSQL operations #267
Conversation
Pull Request Test Coverage Report for Build 278078999
💛 - Coveralls |
@nickolay can you please take a look? |
This seems good. Could you fix the conflicts with the recent merge, and use the |
5e10970
to
0df8011
Compare
@nickolay can you have a look? |
src/tokenizer.rs
Outdated
@@ -406,7 +434,18 @@ impl<'a> Tokenizer<'a> { | |||
'|' => { | |||
chars.next(); // consume the '|' | |||
match chars.peek() { | |||
Some('|') => self.consume_and_return(chars, Token::StringConcat), | |||
Some('/') if dialect_of!(self is PostgreSqlDialect) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nickolay, am I right that this is not going to work with custom dialect as you suggest here #243 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, but I don't follow. In the comment you mention I said I didn't think that simply allowing $
to start an identifier matched what the PosgreSQL documentation said about this.
In this PR you make |/
conditionally parse as SquareRoot
, which makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have custom Dialect
https://github.com/alex-dukhno/database/blob/master/src/parser/src/lib.rs#L221-L231, is it true that it wouldn't parse |/
into SquareRoot
because of if dialect_of!(self is PostgreSqlDialect)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see now. Yes, I assumed |/
is something very Postgres-specific, so limiting it to PostgreSqlDialect (and GenericDialect, which I didn't mention explicitly) seemed to make sense.
In the older comment you're referring to I described a stop-gap measure that would be used until someone figured out the proper way to deal with $...
in the parser.
We could make the new operators parse unconditionally for now if that makes your life easier, but without a clear use-case (and accompanying testcase) this will likely break in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to have dialect_of
checks and have a solution for dealing with $...
. Having my knowledge of crate I'd suggest to add Variable(Ident)
or something similar to ast::Expr
enum. It has to be done in a separate PR of course. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have @-mentioned you in #265, where this discussion can continue.
The hard part is describing what specifically we're trying to implement and why. Your suggestion, for example, solves the problem of representing $
(but not another leader) followed by an identifier (quoted or not; consisting of characters from an unspecified set) anywhere an arbitrary expression is allowed (but not in more restricted contexts, like OFFSET __ ROWS
). I'm not sure if this matches Postgres or the "common denominator" of the usual dialects; answering that requires some research. From the examples I've seen so far, a new Value
variant with {leader, ident}
fields might be more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry for taking so long!
src/tokenizer.rs
Outdated
@@ -406,7 +434,18 @@ impl<'a> Tokenizer<'a> { | |||
'|' => { | |||
chars.next(); // consume the '|' | |||
match chars.peek() { | |||
Some('|') => self.consume_and_return(chars, Token::StringConcat), | |||
Some('/') if dialect_of!(self is PostgreSqlDialect) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, but I don't follow. In the comment you mention I said I didn't think that simply allowing $
to start an identifier matched what the PosgreSQL documentation said about this.
In this PR you make |/
conditionally parse as SquareRoot
, which makes sense to me.
src/parser.rs
Outdated
Token::CubeRoot => Ok(Expr::UnaryOp { | ||
op: UnaryOperator::PGCbrt, | ||
expr: Box::new(self.parse_subexpr(0)?), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably not right, as this parses WHERE ||/27 = 3
as ||/ ( 27=3 )
, which is not a valid expression. I don't see precedence mentioned in the documentation you linked to, and I think a better guess would be to default all the unary operators to the PLUS_MINUS_PREC
(moving the new ops to the tok @
case below).
094bad4
to
1f2b8d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ready to merge this, waiting for your decision on removing the dialect-specific parsing.
} else if Token::DoubleColon == tok { | ||
self.parse_pg_cast(expr) | ||
} else if Token::ExclamationMark == tok { | ||
// PostgreSQL factorial operation | ||
Ok(Expr::UnaryOp { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we'd consider how postfix factorial interacted with other operators (the postfix ::type
, another factorial, and prefix operators), but I don't think this has to block merging this PR.
Excellent, thank you! |
Added support to parse mathematical operations for PostgreSQL
See table 9.4