Supports casting between ANSI interval types and integral types #5353
Conversation
Signed-off-by: Chong Gao <[email protected]>
build
Force-pushed from f570c04 to 98a9e29
Depends on #5352. The Spark change is that the CPU throws an exception for this scenario; the GPU does not need to repeat the check, because the CPU already performs it during the analysis phase.
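For context, a hedged illustration (not taken from this PR) of why the GPU can skip the check: Spark's analyzer rejects disallowed cast combinations before any physical plan is built, so the GPU cast only ever sees combinations the analyzer has admitted. The query and session below are assumptions for illustration:

```scala
// Assumed spark-shell session. Casting an ANSI interval to BOOLEAN is not a
// permitted cast, so this fails with an AnalysisException during analysis,
// before either the CPU or the GPU physical cast ever executes.
spark.sql("SELECT CAST(INTERVAL '1' DAY AS BOOLEAN)").collect()
```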
build
```diff
@@ -561,6 +561,39 @@ object GpuCast extends Arm {
       GpuIntervalUtils.castStringToDayTimeIntervalWithThrow(
         input.asInstanceOf[ColumnVector], dayTime)

     // cast(`day time interval` as integral)
     case (dt: DataType, _: LongType) if GpuTypeShims.isSupportedDayTimeType(dt) =>
       GpuIntervalUtils.dayTimeIntervalToLong(input.asInstanceOf[ColumnVector], dt)
```
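For context: a day-time interval is physically a 64-bit count of microseconds, so converting it to an integral type amounts to dividing by the number of microseconds in the interval's end-field unit. A minimal scalar sketch of that idea (a hypothetical helper, not the plugin's columnar implementation):

```scala
object IntervalCastSketch {
  private val MicrosPerSecond = 1000L * 1000
  private val MicrosPerDay = 24L * 60 * 60 * MicrosPerSecond

  // An INTERVAL DAY value is stored as microseconds; casting it to LONG
  // yields the number of complete days. This is a scalar analogue of what
  // a columnar dayTimeIntervalToLong would compute per row.
  def dayIntervalToLong(micros: Long): Long = micros / MicrosPerDay
}
```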
But `input` is not guaranteed to be a ColumnVector for nested types, like casting an array of DayTimeInterval to an array of longs. These need to be ColumnView, but you should be able to treat it exactly the same as a ColumnVector.
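A minimal sketch of what the adjusted arm could look like under that suggestion, assuming GpuIntervalUtils accepts a ColumnView (the names follow the diff above):

```scala
// cast(`day time interval` as integral), nested-safe: ColumnView covers both
// top-level vectors and views into nested children (e.g. ARRAY<INTERVAL DAY>),
// while still treating the data exactly like a ColumnVector.
case (dt: DataType, _: LongType) if GpuTypeShims.isSupportedDayTimeType(dt) =>
  GpuIntervalUtils.dayTimeIntervalToLong(input.asInstanceOf[ColumnView], dt)
```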
Done
sql-plugin/src/main/330+/scala/com/nvidia/spark/rapids/shims/GpuTypeShims.scala
```python
@pytest.mark.parametrize('integral_type', integral_types)
def test_cast_day_time_interval_to_integral_no_overflow(integral_type):
    assert_gpu_and_cpu_are_equal_collect(
        lambda spark: unary_op_df(spark, DayTimeIntervalGen(start_field='day', end_field='day', min_value=timedelta(seconds=-128 * 86400), max_value=timedelta(seconds=127 * 86400)))
```
nit: If we want to keep these as separate queries, then let's have a separate test for each one so we can parallelize the execution and one test failure does not keep another test from running. If we are okay with them being a single test where one failure can mask another, as it is here, then can we combine them into a single query so it runs faster?
Done, combined into a single query to run faster.
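Illustratively, the single-query shape might look like this in Spark terms (a Scala sketch, not the actual pytest change; `df` is assumed to be a DataFrame whose interval column is named `a`, matching what `unary_op_df` produces):

```scala
// One projection casts the interval column to every integral type at once,
// so a single collect exercises all four cases instead of four queries.
df.selectExpr(
  "CAST(a AS BYTE)", "CAST(a AS SHORT)", "CAST(a AS INT)", "CAST(a AS LONG)")
```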
build
No need to update.
@revans2 Please help review, thanks.
Contributes #5113
Closes #5111
Supports: casting between ANSI interval types and integral types
Signed-off-by: Chong Gao <[email protected]>