storage/cloud: replace WriteFile(Reader) with Writer #65057
Conversation
This will need a rebase over #65033, but I wanted to start letting CI chew on it.
Reviewed 15 of 17 files at r1, 10 of 15 files at r2, 7 of 8 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @dt)
pkg/storage/cloud/cloud_io.go, line 256 at r2 (raw file):
}

func BackgroundPipe(ctx context.Context, fn func(ctx context.Context, pr io.Reader) error) WriteCloserWithError {
nit: comment needed for exported function (and all exported methods).
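For context, here is a minimal self-contained sketch of what a background-pipe helper like this typically looks like. The `WriteCloserWithError` interface and the entire body are assumptions inferred from the snippet above, not the PR's actual implementation:

```go
package cloud

import (
	"context"
	"io"
)

// WriteCloserWithError is an io.WriteCloser that can also be closed with an
// error to abort the in-flight upload (assumed interface, per the snippet).
type WriteCloserWithError interface {
	io.WriteCloser
	CloseWithError(error) error
}

type pipeWriter struct {
	*io.PipeWriter
	done chan error
}

// Close finishes the write and waits for the background consumer's result.
func (p *pipeWriter) Close() error {
	_ = p.PipeWriter.Close()
	return <-p.done
}

// CloseWithError aborts the write; the consumer's next Read returns err.
func (p *pipeWriter) CloseWithError(err error) error {
	_ = p.PipeWriter.CloseWithError(err)
	return <-p.done
}

// BackgroundPipe starts fn in a goroutine reading from a pipe and returns
// the write end to the caller.
func BackgroundPipe(
	ctx context.Context, fn func(ctx context.Context, pr io.Reader) error,
) WriteCloserWithError {
	pr, pw := io.Pipe()
	done := make(chan error, 1)
	go func() {
		err := fn(ctx, pr)
		// Unblock any pending Write if fn returned before draining the pipe.
		_ = pr.CloseWithError(err)
		done <- err
	}()
	return &pipeWriter{PipeWriter: pw, done: done}
}
```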
pkg/storage/cloud/cloud_io.go, line 300 at r3 (raw file):
// WriteFile is a helper for writing the content of a Reader to the given path
// of an ExternalStorage.
func WriteFile(ctx context.Context, basename string, src io.Reader, dest ExternalStorage) error {
should the signature be changed (ctx, src, dest, basename) to be more of a "src -> dest" pattern?
pkg/storage/cloud/cloud_io.go, line 307 at r3 (raw file):
if _, err := io.Copy(w, src); err != nil {
	_ = w.CloseWithError(err)
	return err
you could keep both errors if you so choose:
return errors.CombineErrors(err, w.CloseWithError(err))
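Folding that suggestion into the helper might look like this. This is a sketch: the `Writer` method on `ExternalStorage` and its signature are assumptions based on this thread, not the final code:

```go
// WriteFile copies src to basename within dest, combining the copy error
// with any error from aborting the upload.
func WriteFile(ctx context.Context, basename string, src io.Reader, dest ExternalStorage) error {
	w, err := dest.Writer(ctx, basename) // assumed signature
	if err != nil {
		return err
	}
	if _, err := io.Copy(w, src); err != nil {
		return errors.CombineErrors(err, w.CloseWithError(err))
	}
	return w.Close()
}
```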
pkg/storage/cloud/amazon/s3_storage.go, line 293 at r2 (raw file):
	return nil, err
}
uploader := s3manager.NewUploader(sess)
not saying that this is bad... but we're changing the implementation to use upload vs put object. We should probably call this out in the PR description.
s3manager seems to have settings around concurrency (default upload concurrency == 5). What does that mean for us (if anything)? Should that be configurable? Perhaps a TODO is in order to evaluate this.
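For reference, aws-sdk-go's s3manager uploader takes functional options, so those defaults could be pinned explicitly and later wired up to a setting. A sketch extending the snippet above (the values are just the library's documented defaults):

```go
uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
	u.Concurrency = s3manager.DefaultUploadConcurrency // 5; made explicit
	u.PartSize = s3manager.MinUploadPartSize           // 5 << 20 bytes
})
```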
pkg/storage/cloud/azure/azure_storage.go, line 135 at r2 (raw file):
_, err := azblob.UploadStreamToBlockBlob(
	ctx, r, blob, azblob.UploadStreamToBlockBlobOptions{
		BufferSize: 4 << 20,
This seems rather arbitrary?
I'm also curious/worried about the implications of this call, particularly in the face of slowness/unavailability of azure. You could imagine having a bunch of these uploads starting, all buffering 4 MB?
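For what it's worth, the same options struct also exposes a MaxBuffers knob, so the worst-case memory per upload is roughly BufferSize × MaxBuffers. A sketch extending the snippet above (values illustrative):

```go
_, err := azblob.UploadStreamToBlockBlob(
	ctx, r, blob, azblob.UploadStreamToBlockBlobOptions{
		BufferSize: 4 << 20, // bytes staged per block
		MaxBuffers: 1,       // blocks buffered concurrently per upload
	},
)
```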
pkg/storage/cloud/gcp/gcs_storage.go, line 162 at r2 (raw file):
func (g *gcsStorage) WriteFile(ctx context.Context, basename string, content io.ReadSeeker) error {
	const maxAttempts = 3
	err := retry.WithMaxAttempts(ctx, base.DefaultRetryOptions(), maxAttempts, func() error {
Do we know if gcs retries? Is it safe to drop this?
pkg/storage/cloud/httpsink/http_storage.go, line 179 at r3 (raw file):
func (h *httpStorage) WriteFile(ctx context.Context, basename string, content io.ReadSeeker) error {
	return contextutil.RunWithTimeout(ctx, fmt.Sprintf("PUT %s", basename),
I assume we're dropping this because, presumably, the caller could set deadlines themselves...
Do we know if anything is affected by this? Is this a safe change to make?
pkg/storage/cloud/userfile/file_table_storage.go, line 269 at r3 (raw file):
// retry we are not able to seek to the start of `content` and try again,
// resulting in bytes being missed across txn retry attempts.
// See chunkWriter.WriteFile for more information about writing semantics.
Can you explain why it's safe to remove this?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @miretskiy)
pkg/storage/cloud/cloud_io.go, line 256 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
nit: comment needed for exported function (and all exported methods).
Done.
pkg/storage/cloud/cloud_io.go, line 300 at r3 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
should the signature be changed (ctx, src, dest, basename) to be more of a "src -> dest" pattern?
Heh, I thought of this after I'd already changed all the calls, but decided I didn't quite care enough about a temporary shim to go change them all again.
pkg/storage/cloud/cloud_io.go, line 307 at r3 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
you could keep both errors if you so choose:
return errors.CombineErrors(err, w.CloseWithError(err))
nifty.
pkg/storage/cloud/amazon/s3_storage.go, line 293 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
not saying that this is bad... but we're changing the implementation to use upload vs put object. We should probably call this out in the PR description.
s3manager seems to have settings around concurrency (default upload concurrency == 5). What does that mean for us (if anything)? Should that be configurable? Perhaps a TODO is in order to evaluate this.
Oh, yeah, there's a huge implicit TODO around this whole change: we will need to test and tune each cloud afterwards, as well as revisit the callers to make them use a streaming write pattern.
I just wanted to get a commit out as soon as I got the initial sweeping interface change compiling / passing tests, then do targeted changes for those.
pkg/storage/cloud/azure/azure_storage.go, line 135 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
This seems rather arbitrary?
I'm also curious/worried about the implications of this call, particularly in the face of slowness/unavailability of azure. You could imagine having a bunch of these uploads starting, all buffering 4 MB?
We'll need to account for the "bunch of uploads starting" case anyway and reserve off a monitor for them. Indeed, we already needed to do this before, but for the whole file size, vs some fixed buffer like 4MB.
The other SDKs have similar chunk buffers, and AWS has a minimum of 5MB, so there is little point in going much smaller if we're going to have to pick at least that as the "cloud upload overhead" reservation size to grab off a monitor. Indeed, we may want to go higher: each chunk is an API req, and these clouds charge per request as well as per GB, so a too-small chunk size could get expensive in request counts.
pkg/storage/cloud/gcp/gcs_storage.go, line 162 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Do we know if gcs retries? Is it safe to drop this?
I'm not really sure -- we need to experiment with it. I guess I could leave some retries here, but they'd now just be around creating the Writer, not actually doing the writes -- that'd be up to the caller, in how it uses that Writer. I'm assuming it does do internal retries, because if it doesn't we'll want to do our own chunking, retry chunks, then stitch them after, but I have to imagine that's what the SDK is already doing and we'll just want to configure it.
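For reference, the cloud.google.com/go/storage writer exposes this as a knob: per its docs, with a nonzero ChunkSize the upload is resumable and the SDK retries transient per-chunk failures itself, which is the chunk/retry/stitch dance described above. A sketch against the snippet above (the value is illustrative):

```go
w := g.bucket.Object(basename).NewWriter(ctx)
// ChunkSize > 0: resumable upload, data sent in ChunkSize pieces, transient
// per-chunk failures retried by the SDK. ChunkSize = 0: one-shot upload
// with no internal retry.
w.ChunkSize = 8 << 20
```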
pkg/storage/cloud/httpsink/http_storage.go, line 179 at r3 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
I assume we're dropping this because, presumably, the caller could set deadlines themselves...
Do we know if anything is affected by this? Is this a safe change to make?
Right, we now just return a writer, and the caller can decide when to write to it, how long to wait for that write, etc., or e.g. put a timeout like this around an io.Copy call; this thing here should just provide a writer which the caller decides how to use.
The http client itself has an internal timeout.
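For illustration, a caller-side sketch of that pattern. `uploadWithTimeout` and the `Writer` signature are hypothetical; this assumes the packages already used in the snippets above (context, fmt, io, time, CRDB's contextutil) plus cockroachdb/errors:

```go
// uploadWithTimeout imposes the old per-PUT deadline around the caller's own
// copy into the returned writer. Hypothetical helper, not in the PR.
func uploadWithTimeout(
	ctx context.Context,
	store ExternalStorage, // assumed interface with a Writer method
	basename string,
	content io.Reader,
	timeout time.Duration,
) error {
	w, err := store.Writer(ctx, basename) // assumed signature
	if err != nil {
		return err
	}
	err = contextutil.RunWithTimeout(ctx, fmt.Sprintf("PUT %s", basename), timeout,
		func(ctx context.Context) error {
			_, copyErr := io.Copy(w, content)
			return copyErr
		})
	if err != nil {
		return errors.CombineErrors(err, w.CloseWithError(err))
	}
	return w.Close()
}
```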
pkg/storage/cloud/userfile/file_table_storage.go, line 269 at r3 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Can you explain why it's safe to remove this?
Hilariously, it is apparently doing nothing. The internal executor isn't stateful, so the `BEGIN` begins a transaction... which then implicitly just ends. The `COMMIT` one above errors ("no transaction in progress"), but we were throwing away the error. I only realized this when I started doing `COMMIT` (or `ROLLBACK`) in `Close()` (and `CloseWithError`, respectively).
I think if we want the semantics it says it has, we'll need to thread a `kvclient.Txn` around, but that can be its own change.
All of the "scary" changes aside, I think this PR is great. So, LGTM, and we'll shake out whatever problems may arise pretty soon...
Reviewed 1 of 8 files at r3, 1 of 2 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @dt, and @miretskiy)
pkg/storage/cloud/gcp/gcs_storage.go, line 162 at r2 (raw file):
Previously, dt (David Taylor) wrote…
I'm not really sure -- we need to experiment with it. I guess I could leave some retries here, but they'd now just be around creating the Writer, not actually doing the writes -- that'd be up to the caller, in how it uses that Writer. I'm assuming it does do internal retries, because if it doesn't we'll want to do our own chunking, retry chunks, then stitch them after, but I have to imagine that's what the SDK is already doing and we'll just want to configure it.
I'm fine with a TODO. I really like the direction of this PR.
I wonder if we should keep the WriteFile API side-by-side with Writer, so that smaller files can be put in a single API call? The multipart upload APIs often involve a create/init call, then 1 or more add-block calls, and then a finish call, so that could be 2 extra calls per file for small files, compared to the direct PUT method.
OTOH, making all puts use the Writer API actually tests the Writer API. Also, at least in the google case, there's just no difference: we were *already* using the Writer API and just running io.Copy to completion inside the put call. Maybe it should be up to an individual implementation of ExternalStorage.Writer to decide if it wants to buffer and to use a different API to put vs stream. Hmm.
Reviewed 22 of 22 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt and @miretskiy)
pkg/storage/cloud/cloud_io.go, line 276 at r5 (raw file):
}

type backroundPipe struct {
nit: backgroundPipe
Personally, I'm not a fan of too many utility functions like WriteFile. I think writer is totally fine.
To be clear, I wasn't suggesting a utility function; rather, I was musing that this might allow callers who are writing a small file / already holding the whole thing in memory to use the put method, which could do the operation in a single API call for some clouds, vs opening for writing, which is almost always 2+n calls (1 to open, 1 to finish, and n to move the chunks of payload), so 3 where n is small. If we were writing hundreds of thousands of small files, 3 calls each vs 1 call each could be a big deal for request-count billing. (Separately, we should avoid making 100k small files, but that's another issue.) But I think I've convinced myself that maybe the Writer-only API is fine.
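One hypothetical way an individual implementation could make that call, per the musing above: buffer until a threshold, then either one-shot put on Close or spill into the streaming path. All names here are made up for illustration:

```go
package cloud

import (
	"context"
	"io"
)

// putOrStreamWriter buffers up to threshold bytes; if the file stays small,
// it is uploaded with a single put call on Close, otherwise the buffered
// prefix is flushed into the streaming (multipart) path and writing
// continues there.
type putOrStreamWriter struct {
	ctx       context.Context
	buf       []byte
	threshold int
	put       func(ctx context.Context, data []byte) error      // one-call upload
	stream    func(ctx context.Context) (io.WriteCloser, error) // multipart path
	w         io.WriteCloser                                    // non-nil once spilled
}

func (p *putOrStreamWriter) Write(b []byte) (int, error) {
	if p.w == nil && len(p.buf)+len(b) <= p.threshold {
		p.buf = append(p.buf, b...)
		return len(b), nil
	}
	if p.w == nil {
		// Spill: open the streaming path and replay the buffered prefix.
		w, err := p.stream(p.ctx)
		if err != nil {
			return 0, err
		}
		p.w = w
		if _, err := p.w.Write(p.buf); err != nil {
			return 0, err
		}
		p.buf = nil
	}
	return p.w.Write(b)
}

func (p *putOrStreamWriter) Close() error {
	if p.w != nil {
		return p.w.Close()
	}
	return p.put(p.ctx, p.buf)
}
```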
This changes all callers of ExternalStorage.WriteFile() to cloud.WriteFile(ExternalStorage). This shim just calls the old method under the hood, but will allow swapping ExternalStorage's API to a Writer that this shim can then use. Release note: none.
This changes the ExternalStorage API's writing method from WriteFile, which takes an io.Reader and writes its content to the requested destination (or returns an error encountered while doing so), to instead have Writer(), which returns an io.Writer pointing to the destination that can be written to later and then closed to finish the upload (or CloseWithError'ed to cancel it). All existing callers use the shim and are unaffected, but can later choose to change to a push-based model and use the Writer directly. This is left to a follow-up change. Release note: none.
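In other words, the API shape this commit message describes is roughly the following (an assumed sketch, reusing the WriteCloserWithError interface sketched earlier in the thread; the real interface in pkg/storage/cloud may differ in details):

```go
type ExternalStorage interface {
	// Writer opens basename for writing. Closing the returned writer
	// finishes the upload; CloseWithError cancels it.
	Writer(ctx context.Context, basename string) (WriteCloserWithError, error)
	// ReadFile, ListFiles, Delete, etc. are unchanged by this commit.
}
```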
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @dt, and @miretskiy)
pkg/storage/cloud/cloud_io.go, line 300 at r3 (raw file):
Previously, dt (David Taylor) wrote…
Heh, I thought of this after I'd already changed all the calls, but decided I didn't quite care enough about a temporary shim to go change them all again.
turned out I did care enough
Reviewed 1 of 8 files at r3, 1 of 22 files at r5, 1 of 2 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @dt, and @miretskiy)
TFTRs! bors r+
Build succeeded.
This changes the ExternalStorage API's writing method from WriteFile, which takes an io.Reader
and writes its content to the requested destination (or returns an error encountered while doing
so), to instead have Writer(), which returns an io.Writer pointing to the destination that can be
written to later and then closed to finish the upload (or CloseWithError'ed to cancel it).
All existing callers use the shim and are unaffected, but can later choose to change to a
push-based model and use the Writer directly. This is left to a follow-up change.
(note: first commit is just adding a shim for existing callers and switching them)
Release note: none.