-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for parsing comma-separated column name lists #458
base: main
Are you sure you want to change the base?
Conversation
if chars.peek().is_none() { | ||
return Ok((ColumnName::empty(), FieldEnding::InputExhausted)); | ||
} | ||
if chars.next_if_eq(&',').is_some() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we make the delimiter a parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really hope we don't ever need to support configurable delimiters...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hahaha yea fair enough
( | ||
" a , ", | ||
Some(vec![column_name!("a"), ColumnName::empty()]), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm... thinking about this. do we want trailing delimiters to mean an empty column follows? I feel like SQL has annoying behavior like this but I've gotten used to trailing delimiters being optional (e.g. in rust syntax)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but we already allow a whole bunch of similar corner cases already, such as "."
parsing as a column name with two empty field names. If an engine cares doesn't like those, it's welcome to check for and reject them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(if we decide to reject any of the corner cases, we probably need to start rejecting them all, which is a lot of extra code and testing to worry about -- but we can do it as a separate PR if we decide it's really needed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good :)
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #458 +/- ##
==========================================
+ Coverage 78.41% 78.50% +0.08%
==========================================
Files 55 55
Lines 11806 11875 +69
Branches 11806 11875 +69
==========================================
+ Hits 9258 9322 +64
- Misses 2041 2044 +3
- Partials 507 509 +2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! thanks
What changes are proposed in this pull request?
Some Delta table properties are comma-separated lists of column names, so we need a way to reliably parse them. Integrate list parsing into the existing column name parsing, so that commas are handled correctly in the presence of special characters, escaping, etc.
This PR affects the following public APIs
New public API.
How was this change tested?
New unit tests.