-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change SQL dialect to PostgreSQL #214
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,7 +21,7 @@ | |
|
||
use sqlparser::{ | ||
ast::{ColumnDef, ColumnOptionDef, Statement as SQLStatement, TableConstraint}, | ||
dialect::{keywords::Keyword, Dialect, GenericDialect}, | ||
dialect::{keywords::Keyword, Dialect, PostgreSqlDialect}, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another approach might be to make the SQL dialect be configurable on I think @andygrove has said in the past using DataFusion to mimic engines such as MySQL is a good usecase too There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's an interesting approach - we'd have to be careful about our sqlparser => expr conversions, but that could be quite useful as a feature 🤔 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thinking about it more over lunch: if we enable a bring-your-own-dialect setup, we'd need a decent testing strategy to support this. In this example, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That does sound fairly complicated.... I personally / project wise don't have a need for a MySQL specific mimic (postgres is good enough for us in IOx) but I think perhaps we should let others weigh in here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is tricky for sure. I would be fine with having Postgres as the default and officially supported (well tested) dialect, while also allowing users to provide a dialect at their own risk (and have this be well documented) but even that might create an undue burden on maintainers. It would definitely be good to try and find out more about our user's requirements here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to be totally clear, I'm definitely not married to the contents of this PR as they stand; if the consensus is that we do want to support multiple dialects, I'm happy to rework it towards that goal 🙂 If that's a route we want to explore in depth, we could perhaps build some sort of dialect config setup that allows for controlling parser overrides in DF, but that feels a bit too heavyweight if we don't have tonnes of use cases right now (and hence no-one to drive or own that work). Here's my immediate idea:
Does that sound vaguely sensible? Full disclosure: my goal here is to get a DataFusion-powered service queryable via BI tooling, and these tools often use the more esoteric features of the Postgres dialect to bootstrap their UIs (think table listings etc). I still have a fair bit of work to do on this, but I can at least sort of see a path to having it working now. I suspect this will be beneficial for lots of other use cases, but admittedly I don't have any evidence for that claim beyond intuition 😅 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think this makes sense
👍
This also sounds good -- if you wrote it up as a ticket, I bet it is a good "first issue" type ticket for someone else in the community to do if they were looking for something to learn DataFusion code a bit There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the continued discussion and progress on this. I think the outcome is good. Just for context, I have built database gateways in the past that mimic MySQL and Hive protocols and dialects (different projects) and although I don't work on any projects these days that need either, I am just trying to keep options open for future users of DataFusion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I definitely wouldn't want to impose unnecessary constraints on people who did want to use other dialects :) If people are happy with the ideas in my last comment, I'll close this, get some new issues logged, and start tackling those. |
||
parser::{Parser, ParserError}, | ||
tokenizer::{Token, Tokenizer}, | ||
}; | ||
|
@@ -78,7 +78,7 @@ pub struct DFParser<'a> { | |
impl<'a> DFParser<'a> { | ||
/// Parse the specified tokens | ||
pub fn new(sql: &str) -> Result<Self, ParserError> { | ||
let dialect = &GenericDialect {}; | ||
let dialect = &PostgreSqlDialect {}; | ||
DFParser::new_with_dialect(sql, dialect) | ||
} | ||
|
||
|
@@ -97,7 +97,7 @@ impl<'a> DFParser<'a> { | |
|
||
/// Parse a SQL statement and produce a set of statements with dialect | ||
pub fn parse_sql(sql: &str) -> Result<Vec<Statement>, ParserError> { | ||
let dialect = &GenericDialect {}; | ||
let dialect = &PostgreSqlDialect {}; | ||
DFParser::parse_sql_with_dialect(sql, dialect) | ||
} | ||
|
||
|
This file was deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these were added in apache/arrow#8135 . Perhaps @wqc200 has some comment about how / if this feature is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should remove support for scalar variables in the logical plan. This seems unrelated to changing the default SQL dialect. Users can use ScalarVariable without using SQL.