Support TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS [databricks] #8650

Merged
thirtiseven merged 11 commits into NVIDIA:branch-23.08 from timestamp_support
Jul 7, 2023

Collaborator

@thirtiseven thirtiseven commented Jul 3, 2023

Closes #308

This PR adds GPU support for functions TIMESTAMP_SECONDS, TIMESTAMP_MILLIS, and TIMESTAMP_MICROS, which are used to convert the number of seconds/milliseconds/microseconds from the Unix epoch to a timestamp.
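
As a rough illustration of the semantics described above (not the plugin's GPU implementation), the three conversions can be sketched in plain Python. The function names below mirror the SQL functions but are hypothetical helpers written for this example, and integer inputs are assumed.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC; every conversion below is an offset from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_seconds(n: int) -> datetime:
    """Interpret n as seconds since the Unix epoch (illustrative helper)."""
    return EPOCH + timedelta(seconds=n)


def timestamp_millis(n: int) -> datetime:
    """Interpret n as milliseconds since the Unix epoch (illustrative helper)."""
    return EPOCH + timedelta(milliseconds=n)


def timestamp_micros(n: int) -> datetime:
    """Interpret n as microseconds since the Unix epoch (illustrative helper)."""
    return EPOCH + timedelta(microseconds=n)
```

Negative inputs map to instants before the epoch, so `timestamp_micros(-1)` is the last microsecond of 1969-12-31 in UTC.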

Related PRs in Spark: 28534 and 28956.

@sameerz sameerz added the feature request New feature or request label Jul 3, 2023
Signed-off-by: Haoyang Li <[email protected]>
@thirtiseven
Collaborator Author

@firestarman Thanks for the review! I think I have addressed all your comments.

@thirtiseven thirtiseven self-assigned this Jul 5, 2023
@thirtiseven thirtiseven marked this pull request as ready for review July 5, 2023 10:50
@thirtiseven
Collaborator Author

build

}
case DoubleType | FloatType =>
  (input: GpuColumnVector) => {
    // basically copied from GpuCast
Collaborator

If this was copied from GpuCast, would it be simpler to just use GpuCast for this?

GpuCast.castTo(input, input.dataType, TimestampType, false, false, false)

Is it because of the shim hasCastFloatTimestampUpcast?

Collaborator Author

@thirtiseven thirtiseven Jul 6, 2023

Is it because of the shim hasCastFloatTimestampUpcast?

Yes, but spark#31831 added the upcast in the same version where hasCastFloatTimestampUpcast switches from false to true, so GpuCast can be called here safely. I hadn't noticed that; updated.

# (-62135510400, 253402214400) is the range of seconds that can be represented by timestamp_seconds
# considering the influence of time zone.
seconds_gens = [LongGen(min_val=-62135510400, max_val=253402214400), IntegerGen(), ShortGen(), ByteGen(),
                DoubleGen(min_exp=0, max_exp=32), ts_float_gen, DecimalGen(16, 6), DecimalGen(13, 3), DecimalGen(10, 0), DecimalGen(6, 6)]
Collaborator

Can we test negative scale decimal values?

Collaborator Author

Done.
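
For context on the bounds used by the LongGen in the test snippet above, here is a small check of what those second counts correspond to in UTC. It is only a sketch using Python's standard library, not part of the PR's tests; the constant names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Bounds quoted in the test comment for timestamp_seconds inputs.
MIN_SECONDS = -62135510400
MAX_SECONDS = 253402214400

lower = EPOCH + timedelta(seconds=MIN_SECONDS)
upper = EPOCH + timedelta(seconds=MAX_SECONDS)

print(lower.isoformat())  # 0001-01-02T00:00:00+00:00
print(upper.isoformat())  # 9999-12-31T00:00:00+00:00
```

Both bounds sit one day inside the 0001-01-01 to 9999-12-31 range, which appears consistent with the comment's note about leaving room for time zone effects.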

}
case dt: DecimalType =>
  (input: GpuColumnVector) => {
    // SecondsToTimestamp only supports decimals with a scale of 6 or less, which can be
Collaborator

This is confusing. It indicates that Spark's SecondsToTimestamp does not support decimals with a scale of 6 or less. But it does. You even test it.

Collaborator Author

SecondsToTimestamp does support decimals with a scale of 6 or less, which can be represented as microseconds. An exception will be thrown if the scale is more than 6. I updated the comment here.
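
The rule described in this reply can be sketched in plain Python as follows. This is a simplified model keyed on the decimal's scale, not the plugin's or Spark's actual code: the helper name, the use of `ValueError`, and the scale check itself are assumptions made for illustration, and the real behavior may depend on the value rather than only the declared scale.

```python
from decimal import Decimal

MICROS_PER_SECOND = 1_000_000
MAX_SCALE = 6  # microseconds carry six fractional digits of a second


def decimal_seconds_to_micros(seconds: Decimal) -> int:
    """Convert decimal seconds to whole microseconds (hypothetical helper).

    Rejects inputs whose scale exceeds 6, since they cannot be
    represented exactly as microseconds. Assumes a finite Decimal.
    """
    scale = -seconds.as_tuple().exponent  # negative scale for values like 1.2E+3
    if scale > MAX_SCALE:
        raise ValueError(f"scale {scale} exceeds {MAX_SCALE}")
    return int(seconds * MICROS_PER_SECOND)


print(decimal_seconds_to_micros(Decimal("1.234567")))  # 1234567
print(decimal_seconds_to_micros(Decimal("1.2E+3")))    # 1200000000
```

The second call uses a negative-scale decimal, the case requested earlier in the review; it converts without loss because multiplying by 10^6 only shifts the exponent.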

Signed-off-by: Haoyang Li <[email protected]>
@thirtiseven
Collaborator Author

build

Signed-off-by: Haoyang Li <[email protected]>
@thirtiseven
Collaborator Author

@firestarman Thanks for the review. All done, please take another look.

@revans2
Collaborator

revans2 commented Jul 6, 2023

build

revans2
revans2 previously approved these changes Jul 6, 2023
@firestarman
Collaborator

Looks good to me, only one nit

@thirtiseven thirtiseven changed the title from "Support TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS" to "Support TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS [Databricks]" Jul 7, 2023
@thirtiseven thirtiseven changed the title from "Support TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS [Databricks]" to "Support TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS [databricks]" Jul 7, 2023
firestarman
firestarman previously approved these changes Jul 7, 2023
@thirtiseven
Collaborator Author

build

Signed-off-by: Haoyang Li <[email protected]>
@thirtiseven
Collaborator Author

build

@thirtiseven thirtiseven merged commit df14638 into NVIDIA:branch-23.08 Jul 7, 2023
@thirtiseven thirtiseven deleted the timestamp_support branch August 18, 2023 02:42
Labels
feature request New feature or request
Development

Successfully merging this pull request may close these issues.

[FEA] Spark 3.1 adding support for TIMESTAMP_SECONDS, TIMESTAMP_MILLIS and TIMESTAMP_MICROS functions
4 participants