-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement missing join types for Python dataframe #503
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Dandandan !
IMO this is doing more than it should.
datafusion/src/logical_plan/plan.rs
Outdated
@@ -50,6 +52,28 @@ pub enum JoinType { | |||
Anti, | |||
} | |||
|
|||
impl FromStr for JoinType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will tie the DataFusions' API to Python datafusion API at the naming level. I am not sure we should do that:
names imo are not really datafusion's: Rust favors enums over strings. They are relevant in Python because in Python strings are prevalent. In this context, shouldn't the API for mapping a string to a join type remain specific to Pythons' implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, will change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it would be better if the python api also used enum (or typed string union) too, at least for autocompletion and optional type checking, but keeping it for now as strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
music to my hears... I ❤️ agree.
datafusion/src/logical_plan/plan.rs
Outdated
"semi" => Ok(JoinType::Semi), | ||
"anti" => Ok(JoinType::Anti), | ||
how => { | ||
return Err(DataFusionError::Internal(format!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not an internal error, as people would be able to write anything from Python.
Codecov Report
@@ Coverage Diff @@
## master #503 +/- ##
==========================================
+ Coverage 75.92% 75.95% +0.02%
==========================================
Files 154 155 +1
Lines 26381 26444 +63
==========================================
+ Hits 20031 20086 +55
- Misses 6350 6358 +8
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot!
Which issue does this PR close?
Closes #495
Rationale for this change
More complete functionality for joins in Python
What changes are included in this PR?
FromStr
Are there any user-facing changes?