-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support rebase checking for nested dates and timestamps #9617
Conversation
Signed-off-by: Nghia Truong <[email protected]>
534e647
to
7343b17
Compare
build |
build |
s"${SparkShimImpl.int96ParquetRebaseReadKey} is EXCEPTION") | ||
} | ||
case "CORRECTED" => // Good | ||
case "EXCEPTION" | "CORRECTED" => // Good |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is EXCEPTION
ok now? Do we have any tests that exercise this path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EXCEPTION
is handled somewhere else, after loading data. This is the planning step, thus we are OK to run on the GPU with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have test for it too. That is one of the two that I modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was confused because I didn't see EXCEPTION
explicitly in the tests, but I checked the docs, and it is the default setting for datetimeRebaseModeInRead
, so it is covered.
…IA#9617)" This reverts commit 401d0d8. Signed-off-by: Nghia Truong <[email protected]>
…" (#9641) This reverts commit 401d0d8. Signed-off-by: Nghia Truong <[email protected]>
…IA#9617)" This reverts commit 401d0d8. Signed-off-by: Nghia Truong <[email protected]> # Conflicts: # sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuParquetScan.scala
…owed timestamp [databricks] (#9736) * Add check for nested types * Add check for nested types * Recursively check for rebasing * Extract common code * Allow nested type in rebase check * Enable nested timestamp in roundtrip test * Fix another test Signed-off-by: Nghia Truong <[email protected]> * Enable `LEGACY` rebase in read * Remove comment * Change function/class signatures * Complete modification * Misc Signed-off-by: Nghia Truong <[email protected]> * Add explicit type Signed-off-by: Nghia Truong <[email protected]> * Rename file and add some stuff in DateTimeRebaseHelpers.scala * Move file and rename class * Adopt new enum type Signed-off-by: Nghia Truong <[email protected]> * Add name for the enum classes * Change exception messages * Does not yet support legacy rebase in read Signed-off-by: Nghia Truong <[email protected]> * Change legacy to corrected mode Signed-off-by: Nghia Truong <[email protected]> * Extract common code Signed-off-by: Nghia Truong <[email protected]> * Rename functions Signed-off-by: Nghia Truong <[email protected]> * Reformat Signed-off-by: Nghia Truong <[email protected]> * Make classes serializable Signed-off-by: Nghia Truong <[email protected]> * Revert "Support rebase checking for nested dates and timestamps (#9617)" This reverts commit 401d0d8. Signed-off-by: Nghia Truong <[email protected]> # Conflicts: # sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuParquetScan.scala * Implement date time rebase * Optimize rebase op * Change comment Signed-off-by: Nghia Truong <[email protected]> * Move tests * Add test for datatime rebase Signed-off-by: Nghia Truong <[email protected]> * Various changes Signed-off-by: Nghia Truong <[email protected]> * Various changes Signed-off-by: Nghia Truong <[email protected]> # Conflicts: # sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuParquetScan.scala * Fix compile errors Signed-off-by: Nghia Truong <[email protected]> * Fix comments Signed-off-by: Nghia Truong <[email protected]> * Fix indentations Signed-off-by: Nghia Truong <[email protected]> * Change comments and indentations Signed-off-by: Nghia Truong <[email protected]> * Allow nested check for rebase * Write different timestamp types in test * Fix conversion if timestamp is not micros * Rename var * Dont have to down cast after up cast Signed-off-by: Nghia Truong <[email protected]> * Change comment Signed-off-by: Nghia Truong <[email protected]> * Still cast timestamp to the old type after rebasing Signed-off-by: Nghia Truong <[email protected]> * Rename test Signed-off-by: Nghia Truong <[email protected]> * Should not transform non-datetime types Signed-off-by: Nghia Truong <[email protected]> * Fix test * Update tests Signed-off-by: Nghia Truong <[email protected]> * Enable int96 rebase in write * Change tests Signed-off-by: Nghia Truong <[email protected]> * Complete tests Signed-off-by: Nghia Truong <[email protected]> * Revert unrelated changes Signed-off-by: Nghia Truong <[email protected]> * Change configs Signed-off-by: Nghia Truong <[email protected]> * Merge tests Signed-off-by: Nghia Truong <[email protected]> * Simplify test data Signed-off-by: Nghia Truong <[email protected]> * Add a new write test Signed-off-by: Nghia Truong <[email protected]> * Add a mixed rebase test Signed-off-by: Nghia Truong <[email protected]> * Change tests Signed-off-by: Nghia Truong <[email protected]> * Fix `seed` in tests Signed-off-by: Nghia Truong <[email protected]> * Rename tests Signed-off-by: Nghia Truong <[email protected]> * Remove seed override * Change TimestampGen Signed-off-by: Nghia Truong <[email protected]> * Remove default seed Signed-off-by: Nghia Truong <[email protected]> * Add default seed Signed-off-by: Nghia Truong <[email protected]> * Remove default seed Signed-off-by: Nghia Truong <[email protected]> --------- Signed-off-by: Nghia Truong <[email protected]>
This implements recursive checking to see if the input column contains nested dates/timestamps that need to be rebased.
Some directly related tests are fixed. Some others need to be cleanup/fixed after having
LEGACY
rebase mode support in parquet reading.Partially contributes to #1126.