Connector UX: deprecate PART_SIZE_MB in connectors specs using S3/GCS storage #11389

Closed
ChristopheDuong opened this issue Mar 24, 2022 · 0 comments · Fixed by #13753
Labels
type/enhancement New feature or request

Comments

@ChristopheDuong
Contributor

Tell us about the problem you're trying to solve

Following up on work from #10260, we changed the buffering logic to set some upper limits on the buffer size while buffering a stream before invoking the upload operations using StreamTransferManager.

Therefore, the amount of data being transferred is no longer unknown (it is capped at 200MB), and we don't need to configure an optimal part size as described here:

Server allows up to 10,000 parts to be uploaded for a single object, and each part must be identified by a unique number from 1 to 10,000.
Therefore the maximum amount of data that can be written to a stream is 10000/numStreams * partSize.
The total object size can be at most 5 TB, so there is no reason to set this higher than 525MB.
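
To make the arithmetic in the quoted documentation concrete, here is a rough sketch in Python. The function names are hypothetical and not part of StreamTransferManager or any connector. It assumes 1 TB = 1024 * 1024 MB and a 5MB part size for the illustration, and it takes the 200MB buffer cap from the description above.

```python
import math

MAX_PARTS = 10_000                 # part limit per object, from the quoted docs
MAX_OBJECT_MB = 5 * 1024 * 1024    # 5 TB expressed in MB (assumes binary units)


def max_stream_capacity_mb(part_size_mb: int, num_streams: int = 1) -> int:
    """Maximum data one stream can write: 10000 / numStreams * partSize."""
    return (MAX_PARTS // num_streams) * part_size_mb


def largest_useful_part_size_mb() -> int:
    """Smallest whole-MB part size that already reaches the 5 TB object limit."""
    return math.ceil(MAX_OBJECT_MB / MAX_PARTS)


def parts_needed(buffer_mb: int, part_size_mb: int) -> int:
    """Number of parts required to upload a buffer of the given size."""
    return math.ceil(buffer_mb / part_size_mb)


if __name__ == "__main__":
    # 5 TB / 10,000 parts is about 524.3 MB, hence the 525MB ceiling.
    print(largest_useful_part_size_mb())
    # A 200MB buffer split into 5MB parts (assumed part size) needs few parts.
    print(parts_needed(200, 5))
    # Capacity of a single stream with 5MB parts, in MB.
    print(max_stream_capacity_mb(5))
```

With the buffer bounded at 200MB, even a small part size stays far below the 10,000-part limit, which is why tuning the part size no longer buys anything.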

Describe the solution you’d like

Deprecate and remove PART_SIZE_MB fields from connectors based on StreamTransferManager

4 participants