added parsing for PostgreSQL operations #267

alex-dukhno · 2020-08-30T14:02:39Z

Added support for parsing PostgreSQL mathematical operators.
See Table 9.4 of the PostgreSQL documentation.

coveralls · 2020-08-30T14:04:22Z

Pull Request Test Coverage Report for Build 278078999

80 of 90 (88.89%) changed or added relevant lines in 6 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.04%) to 91.872%

Changes Missing Coverage       Covered Lines   Changed/Added Lines   %
src/parser.rs                  24              26                    92.31%
tests/sqlparser_postgres.rs    25              28                    89.29%
src/tokenizer.rs               17              22                    77.27%

Totals
Change from base Build 242635691:	-0.04%
Covered Lines:	4668
Relevant Lines:	5081

💛 - Coveralls

alex-dukhno · 2020-08-30T14:45:48Z

@nickolay can you please take a look?

nickolay · 2020-09-07T02:06:21Z

This seems good. Could you fix the conflicts with the recent merge, and use the dialect_of! macro to limit the new operations to the PostgreSQL dialect?

alex-dukhno · 2020-09-07T17:06:54Z

@nickolay can you have a look?
thank you

alex-dukhno · 2020-09-21T15:20:58Z

src/tokenizer.rs

@@ -406,7 +434,18 @@ impl<'a> Tokenizer<'a> {
                '|' => {
                    chars.next(); // consume the '|'
                    match chars.peek() {
-                        Some('|') => self.consume_and_return(chars, Token::StringConcat),
+                        Some('/') if dialect_of!(self is PostgreSqlDialect) => {


@nickolay, am I right that this is not going to work with a custom dialect, as you suggested in #243 (comment)?

I'm sorry, but I don't follow. In the comment you mention I said I didn't think that simply allowing $ to start an identifier matched what the PostgreSQL documentation said about this.

In this PR you make |/ conditionally parse as SquareRoot, which makes sense to me.

I have a custom Dialect (https://github.com/alex-dukhno/database/blob/master/src/parser/src/lib.rs#L221-L231). Is it true that it wouldn't parse |/ into SquareRoot because of the if dialect_of!(self is PostgreSqlDialect) guard?

Ah, I see now. Yes, I assumed |/ is something very Postgres-specific, so limiting it to PostgreSqlDialect (and GenericDialect, which I didn't mention explicitly) seemed to make sense.
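To make the concern concrete, here is a minimal, self-contained sketch of a type-based dialect check gating the |/ and ||/ tokens. The `Dialect` trait, the `is_pg_or_generic` helper, `MyDialect`, and the reduced `Token` enum are simplified stand-ins invented for this illustration; they are not the crate's real definitions, and the actual dialect_of! macro may be implemented differently.

```rust
use std::any::TypeId;

// Hypothetical stand-in for the crate's dialect trait.
trait Dialect: 'static {
    fn dialect_id(&self) -> TypeId;
}

struct PostgreSqlDialect;
struct GenericDialect;
struct MyDialect; // a user-defined dialect living outside the crate

impl Dialect for PostgreSqlDialect {
    fn dialect_id(&self) -> TypeId { TypeId::of::<Self>() }
}
impl Dialect for GenericDialect {
    fn dialect_id(&self) -> TypeId { TypeId::of::<Self>() }
}
impl Dialect for MyDialect {
    fn dialect_id(&self) -> TypeId { TypeId::of::<Self>() }
}

// Assumed equivalent of `dialect_of!(self is PostgreSqlDialect | GenericDialect)`:
// the check compares concrete types, so an unrelated custom type never matches.
fn is_pg_or_generic(dialect: &dyn Dialect) -> bool {
    let id = dialect.dialect_id();
    id == TypeId::of::<PostgreSqlDialect>() || id == TypeId::of::<GenericDialect>()
}

#[derive(Debug, PartialEq)]
enum Token { Pipe, StringConcat, SquareRoot, CubeRoot, Div }

// Tokenizes only the operators relevant here; every other character is skipped.
fn tokenize_ops(input: &str, dialect: &dyn Dialect) -> Vec<Token> {
    let mut chars = input.chars().peekable();
    let mut out = Vec::new();
    while let Some(c) = chars.next() {
        match c {
            '|' => match chars.peek() {
                Some('/') if is_pg_or_generic(dialect) => {
                    chars.next(); // consume the '/'
                    out.push(Token::SquareRoot);
                }
                Some('|') => {
                    chars.next(); // consume the second '|'
                    match chars.peek() {
                        Some('/') if is_pg_or_generic(dialect) => {
                            chars.next(); // consume the '/'
                            out.push(Token::CubeRoot);
                        }
                        _ => out.push(Token::StringConcat),
                    }
                }
                _ => out.push(Token::Pipe),
            },
            '/' => out.push(Token::Div),
            _ => {}
        }
    }
    out
}
```

Under these assumptions, `tokenize_ops("|/ 25", &MyDialect)` produces `[Pipe, Div]` instead of `[SquareRoot]`, which is the behaviour being asked about: a type-based gate silently excludes any dialect it does not name.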

In the older comment you're referring to I described a stop-gap measure that would be used until someone figured out the proper way to deal with $... in the parser.

We could make the new operators parse unconditionally for now if that makes your life easier, but without a clear use-case (and accompanying testcase) this will likely break in the future.

I think it's better to keep the dialect_of checks and find a solution for dealing with $... separately. Based on what I know of the crate, I'd suggest adding Variable(Ident), or something similar, to the ast::Expr enum. It would have to be done in a separate PR, of course. WDYT?

I have @-mentioned you in #265, where this discussion can continue.

The hard part is describing what specifically we're trying to implement and why. Your suggestion, for example, solves the problem of representing $ (but not another leader) followed by an identifier (quoted or not; consisting of characters from an unspecified set) anywhere an arbitrary expression is allowed (but not in more restricted contexts, like OFFSET __ ROWS). I'm not sure if this matches Postgres or the "common denominator" of the usual dialects; answering that requires some research. From the examples I've seen so far, a new Value variant with {leader, ident} fields might be more appropriate.
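The two shapes mentioned in this exchange can be put side by side in a small sketch. Everything below is hypothetical: the reduced `Ident`, `Expr`, and `Value` types, the `Placeholder` variant name, and the `render` helper are made up for illustration and do not reflect the crate's actual AST.

```rust
// Hypothetical, simplified AST fragments for illustration only;
// these are not the crate's actual definitions.
#[derive(Debug, Clone, PartialEq)]
struct Ident {
    value: String,
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(String),
    // Shape 2: a value that records which leader character introduced it,
    // so `$x`, `@x`, or `:x` can all be represented.
    Placeholder { leader: char, ident: Ident },
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    // Shape 1: a dedicated expression variant; the `$` leader is implied.
    Variable(Ident),
    Value(Value),
}

// Renders an expression back to text to show what each shape can express.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Variable(ident) => format!("${}", ident.value),
        Expr::Value(Value::Number(n)) => n.clone(),
        Expr::Value(Value::Placeholder { leader, ident }) => {
            format!("{}{}", leader, ident.value)
        }
    }
}
```

In this sketch the first shape can only ever round-trip a `$` leader, while the second carries the leader as data. Which one is appropriate depends on the research into dialect behaviour described above.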

nickolay

I'm sorry for taking so long!

nickolay · 2020-09-26T22:46:45Z

src/tokenizer.rs

@@ -406,7 +434,18 @@ impl<'a> Tokenizer<'a> {
                '|' => {
                    chars.next(); // consume the '|'
                    match chars.peek() {
-                        Some('|') => self.consume_and_return(chars, Token::StringConcat),
+                        Some('/') if dialect_of!(self is PostgreSqlDialect) => {


I'm sorry, but I don't follow. In the comment you mention I said I didn't think that simply allowing $ to start an identifier matched what the PostgreSQL documentation said about this.

In this PR you make |/ conditionally parse as SquareRoot, which makes sense to me.


nickolay · 2020-09-26T23:35:08Z

src/parser.rs

+            Token::CubeRoot => Ok(Expr::UnaryOp {
+                op: UnaryOperator::PGCbrt,
+                expr: Box::new(self.parse_subexpr(0)?),


This is probably not right, as this parses WHERE ||/27 = 3 as ||/ ( 27=3 ), which is not a valid expression. I don't see precedence mentioned in the documentation you linked to, and I think a better guess would be to default all the unary operators to the PLUS_MINUS_PREC (moving the new ops to the tok @ case below).
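The precedence problem described here can be reproduced with a tiny precedence-climbing parser. This is a sketch under stated assumptions: the `Token`, `Expr`, and `Parser` types and the `parse` and `show` helpers are invented for illustration, and the numeric precedence values are illustrative rather than taken from the crate.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token { Num(i64), CubeRoot, Eq, Plus }

#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Unary(Box<Expr>), // `||/ expr`
    Binary(Box<Expr>, Token, Box<Expr>),
}

// Illustrative precedence values (assumed, not copied from the crate).
const EQ_PREC: u8 = 20;
const PLUS_MINUS_PREC: u8 = 30;

struct Parser { tokens: Vec<Token>, pos: usize, unary_prec: u8 }

impl Parser {
    // Precedence of the upcoming infix operator, or 0 if there is none.
    fn next_prec(&self) -> u8 {
        match self.tokens.get(self.pos) {
            Some(Token::Eq) => EQ_PREC,
            Some(Token::Plus) => PLUS_MINUS_PREC,
            _ => 0,
        }
    }

    fn parse_prefix(&mut self) -> Expr {
        let tok = self.tokens[self.pos].clone();
        self.pos += 1;
        match tok {
            Token::Num(n) => Expr::Num(n),
            Token::CubeRoot => {
                // The precedence passed here decides how much the operand swallows.
                let prec = self.unary_prec;
                Expr::Unary(Box::new(self.parse_subexpr(prec)))
            }
            other => panic!("unexpected token {:?}", other),
        }
    }

    fn parse_subexpr(&mut self, precedence: u8) -> Expr {
        let mut expr = self.parse_prefix();
        while precedence < self.next_prec() {
            let op = self.tokens[self.pos].clone();
            self.pos += 1;
            let op_prec = if op == Token::Eq { EQ_PREC } else { PLUS_MINUS_PREC };
            let rhs = self.parse_subexpr(op_prec);
            expr = Expr::Binary(Box::new(expr), op, Box::new(rhs));
        }
        expr
    }
}

// Renders the tree with explicit parentheses around unary operands.
fn show(e: &Expr) -> String {
    match e {
        Expr::Num(n) => n.to_string(),
        Expr::Unary(inner) => format!("||/({})", show(inner)),
        Expr::Binary(l, op, r) => {
            let sym = if *op == Token::Eq { "=" } else { "+" };
            format!("{} {} {}", show(l), sym, show(r))
        }
    }
}

fn parse(tokens: Vec<Token>, unary_prec: u8) -> String {
    let mut p = Parser { tokens, pos: 0, unary_prec };
    show(&p.parse_subexpr(0))
}
```

With the tokens for `||/ 27 = 3`, calling `parse(tokens, 0)` gives `||/(27 = 3)`, the grouping objected to above, while `parse(tokens, PLUS_MINUS_PREC)` gives `||/(27) = 3`.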


nickolay

I'm ready to merge this, waiting for your decision on removing the dialect-specific parsing.


nickolay · 2020-09-28T00:23:07Z

src/parser.rs

        } else if Token::DoubleColon == tok {
            self.parse_pg_cast(expr)
+        } else if Token::ExclamationMark == tok {
+            // PostgreSQL factorial operation
+            Ok(Expr::UnaryOp {


Ideally we'd consider how postfix factorial interacted with other operators (the postfix ::type, another factorial, and prefix operators), but I don't think this has to block merging this PR.
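As a rough illustration of why these interactions matter, the sketch below treats the postfix `!` like an infix-position operator with its own precedence and shows how that precedence changes the grouping next to a prefix operator. The `Token` enum, the `parse` function, and all precedence values are assumptions made for this example; the sketch does not claim to match PostgreSQL's actual rules or the parser in this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token { Num(i64), CubeRoot, Bang }

// Assumed precedence for prefix operators in this sketch.
const PREFIX_PREC: u8 = 30;

// Parses one expression and renders it with explicit parentheses.
// `factorial_prec` is a parameter so that both choices can be compared.
fn parse(tokens: &[Token], pos: &mut usize, min_prec: u8, factorial_prec: u8) -> String {
    let tok = tokens[*pos];
    *pos += 1;
    let mut expr = match tok {
        Token::Num(n) => n.to_string(),
        Token::CubeRoot => {
            format!("||/({})", parse(tokens, pos, PREFIX_PREC, factorial_prec))
        }
        Token::Bang => panic!("`!` cannot start an expression in this sketch"),
    };
    // Postfix operators are consumed in the same loop an infix operator would use,
    // and only when they bind tighter than the current context.
    while *pos < tokens.len() && tokens[*pos] == Token::Bang && min_prec < factorial_prec {
        *pos += 1;
        expr = format!("({})!", expr);
    }
    expr
}
```

For the tokens of `||/ 3 !`, a factorial precedence above `PREFIX_PREC` yields `||/((3)!)`, whereas one below it yields `(||/(3))!`. Repeated factorials such as `3 ! !` nest left to right as `((3)!)!` in both cases.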

nickolay · 2020-09-30T02:29:42Z

Excellent, thank you!

alex-dukhno added 2 commits September 7, 2020 18:12

added parsing for PostgreSQL operations

c1dd27d

use dialect_of to limit tokenizer to PostgreSql dialect

0df8011

alex-dukhno force-pushed the PG-specific-operations branch from 5e10970 to 0df8011 Compare September 7, 2020 15:25

alex-dukhno commented Sep 21, 2020


nickolay suggested changes Sep 26, 2020


address review comments

1f2b8d3

alex-dukhno force-pushed the PG-specific-operations branch from 094bad4 to 1f2b8d3 Compare September 27, 2020 11:28

alex-dukhno requested a review from nickolay September 27, 2020 11:35

nickolay approved these changes Sep 28, 2020


address dialect_of!(self is PostgreSqlDialect)

045dbf5

nickolay merged commit 926b03a into apache:main Sep 30, 2020

alex-dukhno deleted the PG-specific-operations branch September 30, 2020 06:27
