storage/cloud: replace WriteFile(Reader) with Writer #65057

Merged
merged 2 commits into cockroachdb:master from cloud-writer on May 13, 2021

Conversation


@dt dt commented May 12, 2021

This changes the ExternalStorage API's writing method from WriteFile, which takes an io.Reader
and writes its content to the requested destination (or returns an error encountered while doing
so), to Writer(), which returns an io.Writer pointing to the destination that can be
written to later and then closed to finish the upload (or CloseWithError'ed to cancel it).

All existing callers use the shim and are unaffected, but can later choose to change to a
push-based model and use the Writer directly. This is left to a follow-up change.

(note: first commit is just adding a shim for existing callers and switching them)

Release note: none.
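For readers skimming the diff, here is a minimal sketch of the shape of the change; the names are paraphrased from this PR's description rather than copied from the code:

```go
package cloud

import (
	"context"
	"io"
)

// Before: pull-based. The implementation drains the caller's Reader and
// performs the whole upload inside the call.
type externalStorageBefore interface {
	WriteFile(ctx context.Context, basename string, content io.ReadSeeker) error
}

// WriteCloserWithError is io.WriteCloser plus cancellation, mirroring
// io.PipeWriter's CloseWithError.
type WriteCloserWithError interface {
	io.WriteCloser
	CloseWithError(err error) error
}

// After: push-based. The caller receives a writer for the destination,
// writes at its own pace, and Closes to finish (or CloseWithError to abort).
type externalStorageAfter interface {
	Writer(ctx context.Context, basename string) (WriteCloserWithError, error)
}
```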

@dt dt requested review from adityamaru, miretskiy and a team May 12, 2021 14:22
@dt dt requested a review from a team as a code owner May 12, 2021 14:22
@cockroach-teamcity

This change is Reviewable


dt commented May 12, 2021

This will need a rebase over #65033 but I wanted to start letting CI chew on it.

@dt dt force-pushed the cloud-writer branch from 7a2fb86 to 3f6625d Compare May 12, 2021 18:29

@miretskiy miretskiy left a comment


Reviewed 15 of 17 files at r1, 10 of 15 files at r2, 7 of 8 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @dt)


pkg/storage/cloud/cloud_io.go, line 256 at r2 (raw file):

}

func BackgroundPipe(ctx context.Context, fn func(ctx context.Context, pr io.Reader) error) WriteCloserWithError {

nit: comment needed for exported function (and all exported methods).
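For context, a helper with this signature is typically built on io.Pipe; a minimal sketch of the pattern (not necessarily this PR's exact implementation):

```go
// BackgroundPipe runs fn in a goroutine, feeding it the read side of a
// pipe, and returns the write side. A sketch of the usual io.Pipe pattern;
// not necessarily this PR's implementation.
func BackgroundPipe(
	ctx context.Context, fn func(ctx context.Context, pr io.Reader) error,
) WriteCloserWithError {
	pr, pw := io.Pipe()
	w := &backgroundPipe{PipeWriter: pw, done: make(chan struct{})}
	go func() {
		defer close(w.done)
		err := fn(ctx, pr)
		// Propagate fn's result to any blocked writer and to Close.
		// CloseWithError(nil) behaves like Close.
		_ = pr.CloseWithError(err)
		w.err = err
	}()
	return w
}

type backgroundPipe struct {
	*io.PipeWriter // provides Write and CloseWithError
	done           chan struct{}
	err            error
}

// Close finishes the stream and waits for the background fn to return,
// surfacing fn's error if the pipe itself closed cleanly.
func (w *backgroundPipe) Close() error {
	err := w.PipeWriter.Close()
	<-w.done
	if err == nil {
		err = w.err
	}
	return err
}
```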


pkg/storage/cloud/cloud_io.go, line 300 at r3 (raw file):

// WriteFile is a helper for writing the content of a Reader to the given path
// of an ExternalStorage.
func WriteFile(ctx context.Context, basename string, src io.Reader, dest ExternalStorage) error {

should the signature be changed (ctx, src, dest, basename) to be more of a "src -> dest" pattern?


pkg/storage/cloud/cloud_io.go, line 307 at r3 (raw file):

	if _, err := io.Copy(w, src); err != nil {
		_ = w.CloseWithError(err)
		return err

you could keep both errors if you so choose: return errors.CombineErrors(err, w.CloseWithError(err))
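For illustration, the helper body with that suggestion applied might read as follows (a sketch assuming github.com/cockroachdb/errors and that w comes from the new Writer method this PR describes):

```go
// Sketch of WriteFile keeping both errors, per the suggestion above.
// Assumes errors = github.com/cockroachdb/errors.
func WriteFile(ctx context.Context, basename string, src io.Reader, dest ExternalStorage) error {
	w, err := dest.Writer(ctx, basename)
	if err != nil {
		return err
	}
	if _, err := io.Copy(w, src); err != nil {
		// Abort the upload and report the copy error alongside any error
		// from the abort itself.
		return errors.CombineErrors(err, w.CloseWithError(err))
	}
	return w.Close()
}
```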


pkg/storage/cloud/amazon/s3_storage.go, line 293 at r2 (raw file):

		return nil, err
	}
	uploader := s3manager.NewUploader(sess)

not saying that this is bad... but we're changing the implementation to use the upload manager vs. a plain put-object call. We should probably note this in the PR description.
s3manager seems to have settings around concurrency (default upload concurrency is 5). What does that mean for us (if anything)? Should that be configurable? Perhaps a TODO is in order to evaluate this.
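For reference, those settings are exposed as options at construction time (a sketch against aws-sdk-go's s3manager; the tunable values here are placeholders, not settings this PR adds):

```go
// Sketch: making the uploader's concurrency and part size explicit instead
// of relying on SDK defaults. uploadConcurrency and partSize are
// hypothetical knobs, not cluster settings added by this PR.
uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
	u.Concurrency = uploadConcurrency // SDK default is 5 concurrent parts
	u.PartSize = partSize             // SDK minimum is 5 MB per part
})
```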


pkg/storage/cloud/azure/azure_storage.go, line 135 at r2 (raw file):

		_, err := azblob.UploadStreamToBlockBlob(
			ctx, r, blob, azblob.UploadStreamToBlockBlobOptions{
				BufferSize: 4 << 20,

This seems rather arbitrary?
I'm also curious/worried about the implications of this call, particularly in the face of slowness/unavailability of Azure. You could imagine having a bunch of these uploads starting, all buffering 4 MB?


pkg/storage/cloud/gcp/gcs_storage.go, line 162 at r2 (raw file):

func (g *gcsStorage) WriteFile(ctx context.Context, basename string, content io.ReadSeeker) error {
	const maxAttempts = 3
	err := retry.WithMaxAttempts(ctx, base.DefaultRetryOptions(), maxAttempts, func() error {

Do we know if gcs retries? Is it safe to drop this?


pkg/storage/cloud/httpsink/http_storage.go, line 179 at r3 (raw file):

func (h *httpStorage) WriteFile(ctx context.Context, basename string, content io.ReadSeeker) error {
	return contextutil.RunWithTimeout(ctx, fmt.Sprintf("PUT %s", basename),

I assume we're dropping this because, presumably, the caller could set deadlines themselves...
Do we know if anything is affected by this? Is this a safe change to make?


pkg/storage/cloud/userfile/file_table_storage.go, line 269 at r3 (raw file):

	// retry we are not able to seek to the start of `content` and try again,
	// resulting in bytes being missed across txn retry attempts.
	// See chunkWriter.WriteFile for more information about writing semantics.

Can you explain why it's safe to remove this?

@dt dt force-pushed the cloud-writer branch from 3f6625d to cd4ab2a Compare May 12, 2021 21:44

@dt dt left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @miretskiy)


pkg/storage/cloud/cloud_io.go, line 256 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

nit: comment needed for exported function (and all exported methods).

Done.


pkg/storage/cloud/cloud_io.go, line 300 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

should the signature be changed (ctx, src, dest, basename) to be more of a "src -> dest" pattern?

Heh, I thought of this after I'd already changed all the calls, but decided I didn't quite care enough about a temporary shim to go change them all again.


pkg/storage/cloud/cloud_io.go, line 307 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

you could keep both errors if you so choose: return errors.CombineErrors(err, w.CloseWithError(err))

nifty.


pkg/storage/cloud/amazon/s3_storage.go, line 293 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

not saying that this is bad... but we're changing the implementation to use the upload manager vs. a plain put-object call. We should probably note this in the PR description.
s3manager seems to have settings around concurrency (default upload concurrency is 5). What does that mean for us (if anything)? Should that be configurable? Perhaps a TODO is in order to evaluate this.

Oh, yeah, there's a huge implicit TODO around this whole change: we will need to test and tune each cloud afterwards, as well as revisit the callers to make them use a streaming write pattern.

I just wanted to get a commit out as soon as I got the initial sweeping interface change compiling / passing tests, then do targeted changes for those.


pkg/storage/cloud/azure/azure_storage.go, line 135 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

This seems rather arbitrary?
I'm also curious/worried about the implications of this call, particularly in the face of slowness/unavailability of Azure. You could imagine having a bunch of these uploads starting, all buffering 4 MB?

We'll need to account for the "a bunch of uploads starting" case anyway and reserve off a monitor for them. Indeed, we already needed to do this before, but for the whole file size, vs. some fixed buffer like 4 MB.

The other SDKs have similar chunk buffers, and AWS has a minimum of 5 MB, so there is little point in going much smaller if we're going to have to pick at least that as the "cloud upload overhead" reservation size to grab off a monitor. Indeed, we may want to go higher: each chunk is an API request, and these clouds charge per request as well as per GB, so a too-small chunk size could run up expensive request counts.
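To make the reservation idea concrete, a sketch of grabbing a fixed per-upload buffer off a memory monitor before opening the writer (assuming pkg/util/mon's BoundAccount; this is not code from the PR):

```go
// Sketch: reserve the per-upload buffer from a memory monitor up front so
// a burst of uploads fails fast rather than overcommitting memory.
// Assumes util/mon's BoundAccount; not code from this PR.
const cloudWriteReservation = 4 << 20 // matches the 4 MB Azure buffer above

func openReservedWriter(
	ctx context.Context, dest ExternalStorage, basename string, acc *mon.BoundAccount,
) (WriteCloserWithError, error) {
	if err := acc.Grow(ctx, cloudWriteReservation); err != nil {
		return nil, err // over budget: fail before starting the upload
	}
	w, err := dest.Writer(ctx, basename)
	if err != nil {
		acc.Shrink(ctx, cloudWriteReservation)
		return nil, err
	}
	// The caller shrinks the account again once the writer is closed.
	return w, nil
}
```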


pkg/storage/cloud/gcp/gcs_storage.go, line 162 at r2 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Do we know if gcs retries? Is it safe to drop this?

I'm not really sure -- we need to experiment with it. I guess I could leave some retries here, but they'd now just be around creating the Writer, not actually doing the writes -- that'd be up to the caller, in how it uses that Writer. I'm assuming it does do internal retries, because if it doesn't, we'll want to do our own chunking, retry chunks, and then stitch them together afterwards, but I have to imagine that's what the SDK is already doing and we'll just want to configure it.
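If some retrying did stay here, it would now wrap only the writer's creation (a sketch reusing the retry helper from the removed code above; openDestWriter is a hypothetical stand-in for whatever the GCS implementation opens under the hood):

```go
// Sketch: any remaining retry loop would cover only opening the
// destination writer, not the writes themselves -- those are now the
// caller's business. openDestWriter is hypothetical.
var w WriteCloserWithError
const maxAttempts = 3
if err := retry.WithMaxAttempts(ctx, base.DefaultRetryOptions(), maxAttempts,
	func() error {
		var err error
		w, err = openDestWriter(ctx, basename)
		return err
	},
); err != nil {
	return nil, err
}
```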


pkg/storage/cloud/httpsink/http_storage.go, line 179 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I assume we're dropping this because, presumably, the caller could set deadlines themselves...
Do we know if anything is affected by this? Is this a safe change to make?

Right, we now just return a writer, and the caller can decide when to write to it, how long to wait for that write, etc., or e.g. put a timeout like this around an io.Copy call; this method should just provide a writer that the caller decides how to use.

The http client itself has an internal timeout.
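For example, a caller that wants the old behavior back could wrap the copy itself (a sketch assuming contextutil's RunWithTimeout as used in the removed code; dest, basename, content, and timeout are the caller's own values):

```go
// Sketch: a caller reimposing a deadline around its own use of the writer,
// using contextutil.RunWithTimeout from the removed code above.
w, err := dest.Writer(ctx, basename)
if err != nil {
	return err
}
if err := contextutil.RunWithTimeout(ctx, fmt.Sprintf("PUT %s", basename), timeout,
	func(ctx context.Context) error {
		_, err := io.Copy(w, content)
		return err
	},
); err != nil {
	_ = w.CloseWithError(err) // abort the upload on timeout or copy error
	return err
}
return w.Close()
```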


pkg/storage/cloud/userfile/file_table_storage.go, line 269 at r3 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Can you explain why it's safe to remove this?

Hilariously, it is apparently doing nothing.

The internal executor isn't stateful, so the BEGIN begins a transaction... which then implicitly just ends. The COMMIT above it errors ("no transaction in progress"), but we were throwing away the error. I only realized this when I started doing COMMIT (or ROLLBACK) in Close() (and CloseWithError, respectively).

I think if we want the semantics it says it has, we'll need to thread a kvclient.Txn around, but that can be its own change.

@miretskiy miretskiy self-requested a review May 12, 2021 23:40

@miretskiy miretskiy left a comment


All of the "scary" changes aside, I think this PR is great. So, :lgtm_strong:, and we'll shake out whatever problems may arise pretty soon...

Reviewed 1 of 8 files at r3, 1 of 2 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @dt, and @miretskiy)


pkg/storage/cloud/gcp/gcs_storage.go, line 162 at r2 (raw file):

Previously, dt (David Taylor) wrote…

I'm not really sure -- we need to experiment with it. I guess I could leave some retries here, but they'd now just be around creating the Writer, not actually doing the writes -- that'd be up to the caller, in how it uses that Writer. I'm assuming it does do internal retries, because if it doesn't, we'll want to do our own chunking, retry chunks, and then stitch them together afterwards, but I have to imagine that's what the SDK is already doing and we'll just want to configure it.

I'm fine with a TODO. I really like the direction of this PR.


dt commented May 13, 2021

I wonder if we should keep the WriteFile API side-by-side with Writer, so that smaller files can be put in a single API call? The multipart upload APIs often involve a create/init call, then 1 or more add-block calls, and then a finish call, so that could be 2 extra calls per file for small files, compared to the direct PUT method.


dt commented May 13, 2021

OTOH, making all puts use the Writer API actually tests the Writer API.

Also, at least in the Google case, there's just no difference: we were already using the Writer API and just running io.Copy to completion inside the put call. Maybe it should be up to an individual implementation of ExternalStorage.Writer to decide whether it wants to buffer and use a different API to put vs. stream. Hmm.

@dt dt force-pushed the cloud-writer branch from cd4ab2a to f20fb9d Compare May 13, 2021 02:19

@adityamaru adityamaru left a comment


Reviewed 22 of 22 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @miretskiy)


pkg/storage/cloud/cloud_io.go, line 276 at r5 (raw file):

}

type backroundPipe struct {

nit: backgroundPipe


miretskiy commented May 13, 2021 via email


dt commented May 13, 2021

To be clear, I was suggesting not a utility function, but rather that the ExternalStorage API would have two methods for writing files: one that puts a file in its entirety (WriteFile, which was here before) and one that opens a file for incremental writing (aka Writer, added here), with potentially wholly separate implementations backed by different endpoints for the respective clouds.

I was musing that this might allow callers who are writing a small file / are already holding the whole thing in memory to use the put method, which could do the operation in a single API call for some clouds, vs. opening for writing, which is almost always 2+n calls (1 to open, 1 to finish, and n to move the chunks of payload), so 3 calls where n is small. If we were writing hundreds of thousands of small files, 3 calls each vs. 1 call each could be a big deal for request-count billing. (Separately, we should avoid making 100k small files, but that's another issue.)

But I think I've convinced myself that maybe the Writer implementation should just deal with this internally (roughly the sketch below): if this is an issue for a given cloud, we could service the initial Write calls by writing to a buffer, and only once it exceeds a certain size open the underlying cloud writer, drain the buffer into it, and send subsequent writes straight through; if the writer is closed before the buffer fills, we instead put it via the other cloud API. I guess there's a slight downside in that it forces us to buffer a second copy of these small puts. Anyway, I think I might be getting lost in premature optimization here.
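A hedged sketch of that internal-buffering idea (all names are hypothetical; putSmall and openStream stand in for a cloud's single-call and multipart upload paths, and this is not code from the PR):

```go
// Sketch: buffer small writes and, on Close, use a single put call if the
// data never exceeded the threshold; otherwise spill to a streaming writer.
// Assumes imports "bytes" and "context".
type bufferingWriter struct {
	ctx        context.Context
	buf        bytes.Buffer
	threshold  int
	stream     WriteCloserWithError // nil until buf exceeds threshold
	putSmall   func(ctx context.Context, data []byte) error
	openStream func(ctx context.Context) (WriteCloserWithError, error)
}

func (b *bufferingWriter) Write(p []byte) (int, error) {
	if b.stream != nil {
		return b.stream.Write(p)
	}
	if b.buf.Len()+len(p) <= b.threshold {
		return b.buf.Write(p)
	}
	// Exceeded the threshold: open the real writer, drain the buffer into
	// it, and send this and all subsequent writes straight through.
	w, err := b.openStream(b.ctx)
	if err != nil {
		return 0, err
	}
	b.stream = w
	if _, err := b.stream.Write(b.buf.Bytes()); err != nil {
		return 0, err
	}
	b.buf.Reset()
	return b.stream.Write(p)
}

func (b *bufferingWriter) Close() error {
	if b.stream != nil {
		return b.stream.Close()
	}
	// Never exceeded the threshold: a single put call suffices.
	return b.putSmall(b.ctx, b.buf.Bytes())
}
```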

This changes all callers of ExternalStorage.WriteFile() to cloud.WriteFile(ExternalStorage).
This shim just calls the old method under the hood, but will allow swapping ExternalStorage's
API to a Writer that this shim can then use.

Release note: none.
@dt dt force-pushed the cloud-writer branch from f20fb9d to ccb55d7 Compare May 13, 2021 13:05
This changes the ExternalStorage API's writing method from WriteFile, which takes an io.Reader
and writes its content to the requested destination (or returns an error encountered while doing
so), to Writer(), which returns an io.Writer pointing to the destination that can be
written to later and then closed to finish the upload (or CloseWithError'ed to cancel it).

All existing callers use the shim and are unaffected, but can later choose to change to a
push-based model and use the Writer directly. This is left to a follow-up change.

Release note: none.
@dt dt force-pushed the cloud-writer branch from ccb55d7 to ac52981 Compare May 13, 2021 13:35

@dt dt left a comment


Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @dt, and @miretskiy)


pkg/storage/cloud/cloud_io.go, line 300 at r3 (raw file):

Previously, dt (David Taylor) wrote…

Heh, I thought of this after I'd already changed all the calls, but decided I didn't quite care enough about a temporary shim to go change them all again.

turned out I did care enough

@miretskiy miretskiy requested review from adityamaru and miretskiy May 13, 2021 13:50

@miretskiy miretskiy left a comment


Reviewed 1 of 8 files at r3, 1 of 22 files at r5, 1 of 2 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @dt, and @miretskiy)


dt commented May 13, 2021

TFTRs!

bors r+


craig bot commented May 13, 2021

Build succeeded:

@craig craig bot merged commit 964a2ad into cockroachdb:master May 13, 2021
@dt dt deleted the cloud-writer branch May 13, 2021 17:41