🐛 Destination S3 and GCS: fix connector bug that prevented writing streams larger than 50GB #5890
…ance destination issues reproducing and debugging
/test connector=connectors/destination-s3
/test connector=connectors/destination-s3
...-gcs/src/test/java/io/airbyte/integrations/destination/gcs/avro/GcsAvroFormatConfigTest.java
...-gcs/src/test/java/io/airbyte/integrations/destination/gcs/avro/GcsAvroFormatConfigTest.java
...src/main/java/io/airbyte/integrations/destination/s3/util/S3StreamTransferManagerHelper.java
LGTM
…-for-gcs-and-s3 # Conflicts: # docs/integrations/destinations/s3.md
Great stuff! I appreciate the screenshots and all the tests you added.
Small feedback from me: it would have been easier to review if the PR had been split into the base S3 changes and a follow-up PR for the GCS changes.
In the future, please also go through the "Updating a connector" checklist so it's easy for me to see what is and is not done.
Thanks!
airbyte-integrations/connectors/destination-gcs/src/main/resources/spec.json
...src/main/java/io/airbyte/integrations/destination/s3/util/S3StreamTransferManagerHelper.java
/publish connector=connectors/destination-s3
/publish connector=connectors/destination-gcs
Released the new S3 connector, but can't release GCS because of another blocking bug: #6134
Merging the latest master with a fix didn't help. It seems we've got new issues in master that prevent merging, so I still can't publish GCS or merge this.
/publish connector=connectors/destination-gcs
What
Currently, we can't write S3 or GCS objects larger than 50GB each.
How
The cause is that the servers allow at most 10,000 blocks per upload, and the block size is currently hardcoded to 5MB, which caps an object at roughly 50GB (10,000 x 5MB). The servers, however, allow blocks of up to 5GB each.
To fix this, this PR adds a new argument that makes the block size configurable. To reach the maximum object size of 5TB, a block size of 525MB is enough (5TB / 10,000 is about 524.3MB, rounded up). The new argument accepts sizes from 5MB (the server minimum) to 525MB.
https://cloud.google.com/storage/quotas
https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html
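The block-size arithmetic in the description can be checked with a small sketch. This is only an illustration, not the code in this PR: the class `PartSizeMath` and its method names are hypothetical, and it assumes binary units (1MB = 1024 x 1024 bytes) together with the 10,000-block, 5MB and 525MB figures quoted above.

```java
public class PartSizeMath {
  // Assumption: binary megabytes; the limits below are the ones quoted in the PR description.
  static final long MB = 1024L * 1024L;
  static final long MAX_PARTS = 10_000L;
  static final long MIN_PART_SIZE_MB = 5L;
  static final long MAX_PART_SIZE_MB = 525L;

  // Largest object, in bytes, that fits into MAX_PARTS blocks of the given size.
  static long maxObjectSizeBytes(long partSizeMb) {
    return partSizeMb * MB * MAX_PARTS;
  }

  // Smallest whole-MB block size that lets an object of the given size fit into MAX_PARTS blocks.
  static long minPartSizeMb(long objectSizeBytes) {
    long partBytes = (objectSizeBytes + MAX_PARTS - 1) / MAX_PARTS; // ceiling division
    return (partBytes + MB - 1) / MB;                               // round up to a whole MB
  }

  // Hypothetical range check mirroring the 5MB to 525MB bounds of the new argument.
  static long validatePartSizeMb(long partSizeMb) {
    if (partSizeMb < MIN_PART_SIZE_MB || partSizeMb > MAX_PART_SIZE_MB) {
      throw new IllegalArgumentException(
          "Part size must be between " + MIN_PART_SIZE_MB + " and " + MAX_PART_SIZE_MB + " MB");
    }
    return partSizeMb;
  }

  public static void main(String[] args) {
    long fiveTb = 5L * 1024L * 1024L * MB;
    // 5MB blocks give 50,000MB in total, which is the ~50GB ceiling described above.
    System.out.println(maxObjectSizeBytes(5) / MB + " MB reachable with 5 MB parts");
    // A 5TB object needs blocks of at least 525MB.
    System.out.println(minPartSizeMb(fiveTb) + " MB parts needed for a 5 TB object");
  }
}
```

With these assumptions, the hardcoded 5MB block size yields the 50GB ceiling, and 525MB is the smallest whole-megabyte size that covers a 5TB object, which matches the upper bound chosen for the new argument.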
Tested locally:
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
README.md
bootstrap.md (see description and examples)
docs/SUMMARY.md
docs/integrations/<source or destination>/<name>.md, including changelog (see changelog example)
docs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing.
/publish command described here
Updating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
README.md
bootstrap.md (see description and examples)
docs/integrations/<source or destination>/<name>.md, including changelog (see changelog example)
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing.
/publish command described here