-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: separate get_common_type / get_result_type for temporal type #6250
Conversation
// TODO: make get_result_type must after coerce type. | ||
// if type isn't coerced, we need get common type, and then get result type. | ||
let common_type = mathematics_temporal_common_type(lhs_type, rhs_type); | ||
common_type.and_then(|t| mathematics_temporal_result_type(&t, &t)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
/// Returns the result type of applying mathematics operations such as | ||
/// `+` to arguments of `lhs_type` and `rhs_type`. | ||
fn mathematics_temporal_result_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for get result type for temporal_type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this set of rules in a separate function makes a lot of sense to me. Thank you @jackwener
@@ -777,9 +638,17 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool { | |||
|
|||
/// Coercion rules for Temporal columns: the type that both lhs and rhs can be | |||
/// casted to for the purpose of a date computation | |||
fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> { | |||
pub(crate) fn mathematics_temporal_common_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for getting common type for temporal_type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very good and makes a lot of sense.
Given this logic is also used for comparisons of temporal types I think mathematics_temporal_common_type
may be a misleading name I think leaving the name as temporarl_coercion
and updating the docstrings might be the most readable
I also think it would be good to note in the comments that it doesn't handle the interval arithmetic (where the input types are different) only comparisons and arithmetic where the types are the same (like timestamp + timestamp)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually it would be great to support coercion for functions where the types don't have to be the same (e.g. #5652) but this is already much better than what we have today
if is_datetime(lhs_type) | ||
|| is_datetime(rhs_type) | ||
|| is_interval(lhs_type) | ||
|| is_interval(rhs_type) => | ||
if is_interval(lhs_type) && is_interval(rhs_type) => | ||
{ | ||
if is_interval(lhs_type) && is_datetime(rhs_type) && *op == Operator::Minus { | ||
return Err(DataFusionError::Plan( | ||
"interval can't subtract timestamp/date".to_string(), | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Original code in here exist two motivation
- get result type
- get common type
I separate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me -- thank you @jackwener I think this is an improvement
It might be good to look into adding back in the special error message for interval - date/time but we can do that as a follow on PR.
I had some suggestions related to code readability but I don't think any of that should block this PR from merging. It could be addressed as a follow on PR or never
@@ -327,10 +327,10 @@ select '1 month'::interval + '1980-01-01T12:00:00'::timestamp; | |||
|
|||
# Exected error: interval (scalar) - date / timestamp (scalar) | |||
|
|||
query error DataFusion error: type_coercion\ncaused by\nError during planning: interval can't subtract timestamp/date | |||
query error DataFusion error: type_coercion\ncaused by\nError during planning: Interval\(MonthDayNano\) \- Date32 can't be evaluated because there isn't a common type to coerce the types to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of remember this specific error was explicitly requested at some point (and I tried to improve it in #6230)
The error is now misleading -- the error is not because we can't coerce interval and date to a common time, but that subtraction on interval and date is not supported
But on the other hand, the coercion message is pretty confusing to start with, so maybe we could merge this PR and fix it in another 🤔
@@ -777,9 +638,17 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool { | |||
|
|||
/// Coercion rules for Temporal columns: the type that both lhs and rhs can be | |||
/// casted to for the purpose of a date computation | |||
fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> { | |||
pub(crate) fn mathematics_temporal_common_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very good and makes a lot of sense.
Given this logic is also used for comparisons of temporal types I think mathematics_temporal_common_type
may be a misleading name I think leaving the name as temporarl_coercion
and updating the docstrings might be the most readable
I also think it would be good to note in the comments that it doesn't handle the interval arithmetic (where the input types are different) only comparisons and arithmetic where the types are the same (like timestamp + timestamp)
|
||
/// Returns the result type of applying mathematics operations such as | ||
/// `+` to arguments of `lhs_type` and `rhs_type`. | ||
fn mathematics_temporal_result_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this set of rules in a separate function makes a lot of sense to me. Thank you @jackwener
@@ -777,9 +638,17 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool { | |||
|
|||
/// Coercion rules for Temporal columns: the type that both lhs and rhs can be | |||
/// casted to for the purpose of a date computation | |||
fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> { | |||
pub(crate) fn mathematics_temporal_common_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually it would be great to support coercion for functions where the types don't have to be the same (e.g. #5652) but this is already much better than what we have today
8cf1b80
to
f7c840b
Compare
thanks @alamb review carefully. |
Which issue does this PR close?
Closes #.
Rationale for this change
related to #5927
following #6221
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?