Bump hudi to 0.15.0 #70

Merged
merged 1 commit into from
Jul 18, 2024

istreeter
Collaborator

As part of this change we also bump Spark to 3.5.x when using Hudi, and bump Scala to 2.13.x. Previously we were pinned to earlier versions for compatibility with Hudi 0.14.0.

This PR is implemented so that we can still easily support a different version of Spark for the Hudi Docker image. I anticipate we might need this flexibility if Iceberg/Delta add support for Spark 4.x sooner than Hudi does.
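
For illustration, a build could keep the Hudi pairing of Spark and Scala versions separate from the other table formats roughly as follows. This is only a sketch and not the build definition from this PR: the `val` and project names are hypothetical, and the patch versions are assumptions (the PR only states Hudi 0.15.0, Spark 3.5.x and Scala 2.13.x).

```scala
// build.sbt fragment (sketch). All names here are hypothetical.
lazy val hudiVersion      = "0.15.0"  // from this PR
lazy val hudiSparkVersion = "3.5.1"   // assumed patch release of Spark 3.5.x
lazy val hudiScalaVersion = "2.13.14" // assumed patch release of Scala 2.13.x

lazy val hudiDependencies = Seq(
  "org.apache.spark" %% "spark-sql"            % hudiSparkVersion,
  // With Scala 2.13 this resolves to hudi-spark3.5-bundle_2.13
  "org.apache.hudi"  %% "hudi-spark3.5-bundle" % hudiVersion
)

// The Hudi image gets its own project, so its Spark version can move
// independently of the Iceberg/Delta build.
lazy val hudi = project
  .settings(
    scalaVersion := hudiScalaVersion,
    libraryDependencies ++= hudiDependencies
  )
```

Keeping the versions in per-format values like this means a later Spark bump for Iceberg/Delta would not have to touch the Hudi settings.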

@istreeter
Collaborator Author

This PR also disables the HudiSpec. This is an unfortunate necessity: as of Hudi 0.15.0, Spark cannot read partition column values when TimestampBasedKeyGenerator is used. This will be fixed in Hudi 1.0.0; see apache/hudi#11615.

This issue only affects reading Hudi, so it does not affect the ability of this loader to write Hudi using version 0.15.0. Until HudiSpec is re-enabled, we need to use other means to check that this loader can write Hudi correctly.
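
To make the affected setup concrete, here is a rough sketch of Hudi write options that select TimestampBasedKeyGenerator. The option keys are standard Hudi configuration keys, but the table name, the partition column and the option values are assumptions for illustration and are not taken from this loader's code.

```scala
// Sketch only: the table name, column name and values are hypothetical.
val hudiWriteOptions: Map[String, String] = Map(
  "hoodie.table.name" -> "events",
  // Key generator whose partition values Spark cannot read back in 0.15.0
  "hoodie.datasource.write.keygenerator.class" ->
    "org.apache.hudi.keygen.TimestampBasedKeyGenerator",
  // Timestamp column the partition path is derived from (assumed name)
  "hoodie.datasource.write.partitionpath.field" -> "load_tstamp",
  "hoodie.keygen.timebased.timestamp.type" -> "SCALAR",
  "hoodie.keygen.timebased.timestamp.scalar.time.unit" -> "microseconds",
  "hoodie.keygen.timebased.output.dateformat" -> "yyyy-MM-dd"
)
```

Writing with options like these is unaffected by the issue; the problem described above only appears when the resulting table is read back through Spark.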

Contributor

@benjben benjben left a comment

Nice 👌

@istreeter istreeter merged commit fbce127 into develop Jul 18, 2024
2 checks passed
@istreeter istreeter deleted the hudi-0.15.0 branch July 18, 2024 09:35
zhaow-de added a commit to alloy-ch/rcplus-alloy-snowplow-lake-loader that referenced this pull request Oct 4, 2024
…patch-for-alloy

* commit '7ab2edc3fd4d81ffb4d5f3285d02330def7672b1':
  Upgrade common-streams to 0.8.0-M5
  Delete files asynchronously (snowplow-incubator#82)
  Upgrade common-streams 0.8.0-M4 (snowplow-incubator#81)
  Avoid error on duplicate view name (snowplow-incubator#80)
  Add option to exit on missing Iglu schemas (snowplow-incubator#79)
  common-streams 0.8.x with refactored health monitoring (snowplow-incubator#78)
  Create table concurrently with subscribing to stream of events (snowplow-incubator#77)
  Iceberg fail fast if missing permissions on the catalog (snowplow-incubator#76)
  Make alert messages more human-readable (snowplow-incubator#75)
  Hudi loader should fail early if missing permissions on Glue catalog (snowplow-incubator#72)
  Add alert & retry for delta/s3 initialization (snowplow-incubator#74)
  Implement alerting and retrying mechanisms
  Bump aws-hudi to 1.0.0-beta2 (snowplow-incubator#71)
  Bump hudi to 0.15.0 (snowplow-incubator#70)
  Allow disregarding Iglu field's nullability when creating output columns (snowplow-incubator#66)
  Extend health probe to report unhealthy on more error scenarios (snowplow-incubator#69)
  Fix bad rows resizing (snowplow-incubator#68)
oguzhanunlu pushed a commit that referenced this pull request Nov 1, 2024