Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: separate get_common_type / get_result_type for temporal type #6250

Merged
merged 1 commit into from
May 5, 2023

Conversation

jackwener
Copy link
Member

@jackwener jackwener commented May 5, 2023

Which issue does this PR close?

Closes #.

Rationale for this change

related to #5927

following #6221

What changes are included in this PR?

  • separate result type into new file
  • separate get common type / get result type for temporal type
  • refactor code of handling type for temporal type

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Physical Expressions sqllogictest SQL Logic Tests (.slt) labels May 5, 2023
Comment on lines 142 to 145
// TODO: make get_result_type must after coerce type.
// if type isn't coerced, we need get common type, and then get result type.
let common_type = mathematics_temporal_common_type(lhs_type, rhs_type);
common_type.and_then(|t| mathematics_temporal_result_type(&t, &t))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's related a comment from @alamb in #6223


/// Returns the result type of applying mathematics operations such as
/// `+` to arguments of `lhs_type` and `rhs_type`.
fn mathematics_temporal_result_type(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for get result type for temporal_type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this set of rules in a separate function makes a lot of sense to me. Thank you @jackwener

@@ -777,9 +638,17 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {

/// Coercion rules for Temporal columns: the type that both lhs and rhs can be
/// casted to for the purpose of a date computation
fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
pub(crate) fn mathematics_temporal_common_type(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for getting common type for temporal_type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very good and makes a lot of sense.

Given this logic is also used for comparisons of temporal types I think mathematics_temporal_common_type may be a misleading name I think leaving the name as temporarl_coercion and updating the docstrings might be the most readable

I also think it would be good to note in the comments that it doesn't handle the interval arithmetic (where the input types are different) only comparisons and arithmetic where the types are the same (like timestamp + timestamp)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually it would be great to support coercion for functions where the types don't have to be the same (e.g. #5652) but this is already much better than what we have today

Comment on lines -131 to -140
if is_datetime(lhs_type)
|| is_datetime(rhs_type)
|| is_interval(lhs_type)
|| is_interval(rhs_type) =>
if is_interval(lhs_type) && is_interval(rhs_type) =>
{
if is_interval(lhs_type) && is_datetime(rhs_type) && *op == Operator::Minus {
return Err(DataFusionError::Plan(
"interval can't subtract timestamp/date".to_string(),
));
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original code in here exist two motivation

  • get result type
  • get common type

I separate it.

@jackwener jackwener requested a review from alamb May 5, 2023 12:57
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thank you @jackwener I think this is an improvement

It might be good to look into adding back in the special error message for interval - date/time but we can do that as a follow on PR.

I had some suggestions related to code readability but I don't think any of that should block this PR from merging. It could be addressed as a follow on PR or never

@@ -327,10 +327,10 @@ select '1 month'::interval + '1980-01-01T12:00:00'::timestamp;

# Exected error: interval (scalar) - date / timestamp (scalar)

query error DataFusion error: type_coercion\ncaused by\nError during planning: interval can't subtract timestamp/date
query error DataFusion error: type_coercion\ncaused by\nError during planning: Interval\(MonthDayNano\) \- Date32 can't be evaluated because there isn't a common type to coerce the types to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of remember this specific error was explicitly requested at some point (and I tried to improve it in #6230)

The error is now misleading -- the error is not because we can't coerce interval and date to a common time, but that subtraction on interval and date is not supported

But on the other hand, the coercion message is pretty confusing to start with, so maybe we could merge this PR and fix it in another 🤔

@@ -777,9 +638,17 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {

/// Coercion rules for Temporal columns: the type that both lhs and rhs can be
/// casted to for the purpose of a date computation
fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
pub(crate) fn mathematics_temporal_common_type(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very good and makes a lot of sense.

Given this logic is also used for comparisons of temporal types I think mathematics_temporal_common_type may be a misleading name I think leaving the name as temporarl_coercion and updating the docstrings might be the most readable

I also think it would be good to note in the comments that it doesn't handle the interval arithmetic (where the input types are different) only comparisons and arithmetic where the types are the same (like timestamp + timestamp)

datafusion/expr/src/type_coercion/binary_result_type.rs Outdated Show resolved Hide resolved

/// Returns the result type of applying mathematics operations such as
/// `+` to arguments of `lhs_type` and `rhs_type`.
fn mathematics_temporal_result_type(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this set of rules in a separate function makes a lot of sense to me. Thank you @jackwener

@@ -777,9 +638,17 @@ fn is_time_with_valid_unit(datatype: DataType) -> bool {

/// Coercion rules for Temporal columns: the type that both lhs and rhs can be
/// casted to for the purpose of a date computation
fn temporal_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
pub(crate) fn mathematics_temporal_common_type(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually it would be great to support coercion for functions where the types don't have to be the same (e.g. #5652) but this is already much better than what we have today

@jackwener jackwener force-pushed the separate_file branch 2 times, most recently from 8cf1b80 to f7c840b Compare May 5, 2023 13:30
@github-actions github-actions bot removed the physical-expr Physical Expressions label May 5, 2023
@jackwener jackwener changed the title refactor: separate result type into new file refactor: separate get_common_type / get_result_type for temporal type May 5, 2023
@jackwener
Copy link
Member Author

thanks @alamb review carefully.

@jackwener jackwener merged commit cf233c8 into apache:main May 5, 2023
@jackwener jackwener deleted the separate_file branch May 5, 2023 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants