Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancing Trailing Comma Option #1212

Merged
merged 10 commits into from
Jun 7, 2024
Merged

Enhancing Trailing Comma Option #1212

merged 10 commits into from
Jun 7, 2024

Conversation

MohamedAbdeen21
Copy link
Contributor

@MohamedAbdeen21 MohamedAbdeen21 commented Apr 11, 2024

Fixes

This change addresses #1211 and arrow-datafusion#9949.

Changes included

Added

  1. Add supports_trailing_commas() and supports_projection_trailing_commas() to the Dialect trait, enabling supports_projection_trailing_commas() for BigQuery and SnowFlake, and supports_trailing_commas() for DuckDb link
  2. Use new Dialect trait methods to set trailing_comma parser option when creating new parsers using Parser::new(dialect), instead of defaulting to false for all dialects and waiting for the user to set it manually through ParserOptions.
  3. Error on trailing commas in projection list when dialect doesn't support it (Note that this doesn't break backwards compatibility) *.

Bug fixes

  1. Handling the case mentioned in Error on some GRANTs when trailing commas are enabled #1211, by temporarily disabling trailing_comma option when parsing GRANT statements. *

Breaking changes

  1. Trailing commas were allowed in all create statements, regardless of the dialect. This should not be the case since this behavior is dialect specific. trailing_comma is now considered when parsing create statements.

* I can use some help regarding changes 3 and 4:
3. The keyword is hard-coded, and only works for projection lists.
4. This may cause issues if the user expects the GRANT statement to allow trailing commas.

Please, if you have other alternative approaches or suggestions, be sure to share them.

Are these changes tested?

Yes

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MohamedAbdeen21

The idea sounds good and I left a few small comments

I think the biggest concern I have is now there seems to two ways ways of setting trailing comma support -- globally via ParserOptions and then per dialect

I think if we want to make trailing comma support part of the dialect, we should remove the parser options and just use the dialect trait everywhere

cc @ankrgyl who I think added trailing comma support initially

@@ -157,6 +157,14 @@ pub trait Dialect: Debug + Any {
// return None to fall back to the default behavior
None
}
/// Does the dialect support trailing commas around the query?
fn supports_trailing_commas(&self) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supports_trailing_commas suggests to me that it is supporting trailing commas everywhere, and supports_projection_trailing_commas suggests just for projection

If supports_trailing_commas means just for queries, can you please name it to something more specific?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option should try to allow trailing commas everywhere. I'm not really following what you mean by "just for queries"

Copy link
Contributor Author

@MohamedAbdeen21 MohamedAbdeen21 Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed something while going over the tests, Create table by default allows trailing commas, because it doesn't use the parse_comma_separated function, which was added later.

Edit: made a temporary change to create table to remove default trailing commas. I say temporary because we probably can rewrite it using parse_comma_separated, but that should be in a follow-up

tests/sqlparser_common.rs Outdated Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Apr 12, 2024

Pull Request Test Coverage Report for Build 8665267578

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 45 of 50 (90.0%) changed or added relevant lines in 6 files are covered.
  • 23 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.04%) to 88.107%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/dialect/bigquery.rs 1 2 50.0%
src/dialect/duckdb.rs 1 2 50.0%
src/dialect/mod.rs 3 4 75.0%
src/dialect/snowflake.rs 1 2 50.0%
src/parser/mod.rs 13 14 92.86%
Files with Coverage Reduction New Missed Lines %
src/lib.rs 2 11.78%
src/tokenizer.rs 21 90.37%
Totals Coverage Status
Change from base Build 8622482464: 0.04%
Covered Lines: 21322
Relevant Lines: 24200

💛 - Coveralls

@MohamedAbdeen21
Copy link
Contributor Author

MohamedAbdeen21 commented Apr 12, 2024

I think the biggest concern I have is now there seems to two ways ways of setting trailing comma support -- globally via ParserOptions and then per dialect

I think that's completely ok, given that setting it manually overrides the dialect settings. This can give the users flexibility when choosing a dialect for their projects. Keep in mind that trailing comma option is mainly for users who build their queries programmatically.

I think if we want to make trailing comma support part of the dialect, we should remove the parser options and just use the dialect trait everywhere

While I don't like how we have an entire ParserOptions struct that only has 2 options, this will break backwards compatibility since ParserOptions is a pub struct.

Edit: Adding to the first point, maybe we can extend the flexibility further and add a projection_trailing_comma option? I do have it stashed already, let me know if you think that's a good idea.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for this PR @MohamedAbdeen21 -- I am sorry it took so long to provide a review. I am quite worried about the change in existing behavior for parsing trailing commas. Can you please fix those and I'll give it another look?

@@ -3354,8 +3354,13 @@ fn parse_create_table_clone() {

#[test]
fn parse_create_table_trailing_comma() {
let sql = "CREATE TABLE foo (bar int,)";
all_dialects().one_statement_parses_to(sql, "CREATE TABLE foo (bar INT)");
let dialect = TestedDialects {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am worried about this change -- I think it means that the trailing comma will no longer be supported by all dialects

Copy link
Contributor Author

@MohamedAbdeen21 MohamedAbdeen21 May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that all dialects allowed trailing commas, only in create statements, because create statements are parsed by a special function and not by the parse_comma_separated that actually considers the trailing_commas option. So that was a bug not a feature; however, as mentioned before, that behavior can be restored using the parsingOptions struct.

PS: I pointed that out in our previous conversation and recommended a follow-up to migrate the create statement to use the general parse_comma_separated function.

@@ -292,7 +292,7 @@ impl<'a> Parser<'a> {
index: 0,
dialect,
recursion_counter: RecursionCounter::new(DEFAULT_REMAINING_DEPTH),
options: ParserOptions::default(),
options: ParserOptions::new().with_trailing_commas(dialect.supports_trailing_commas()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like a good change

@@ -8530,6 +8535,9 @@ impl<'a> Parser<'a> {
with_privileges_keyword: self.parse_keyword(Keyword::PRIVILEGES),
}
} else {
let old_value = self.options.trailing_commas;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a change we should discuss in a separate PR as I think it will cause some statements that used to parse to stop working (even if though it is more consistent)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that will break any existing statements. The original behavior was that if trailing commas were enabled and a DCL used 'SELECT', the parsing would fail.

This change temporarily ignores the trailing commas option when parsing DCL.

This is definitely not an ideal solution and should have its own PR (and therefore the corresponding issue should not be closed, even tho this was supposed to be the main target issue), but this has to be pushed with the new dialect trait method to pass the tests.

@MohamedAbdeen21
Copy link
Contributor Author

MohamedAbdeen21 commented May 3, 2024

Thanks for the review @alamb, the only breaking change here is the create statements when trailing_commas are not enabled in the parserOptions, which I think was a bug mistaken for a feature.

The other change with the DCL doesn't affect the current behavior; if anything, statements that used to fail will now parse.

I'm sorry that the PR is addressing multiple points at once, with some of the changes being temporary solutions that require follow-ups. I'll be happy to break this down into 2 different PRs if necessary, but that will be some time as I'm currently working on a different PR.

Sorry for making the review difficult, I'll try to make future PRs more focused to ease the review process.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @MohamedAbdeen21 and @lovasoa

@alamb alamb merged commit 6d4776b into apache:main Jun 7, 2024
@MohamedAbdeen21
Copy link
Contributor Author

Thanks for the review! @alamb @lovasoa

@alamb
Copy link
Contributor

alamb commented Jun 7, 2024

Sorry for the delay -- I am spread quite thin these days -- though we are discussing a few solutions to improve the situation #1294

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants