This repository has been archived by the owner on Nov 8, 2024. It is now read-only.

Experiment: Parallel Multipart Upload #22

Closed · wants to merge 7 commits
Conversation

@nblair (Contributor) commented Jan 25, 2019

This pull request captures an experiment with using parallel multipart upload and Google Cloud Storage Composite Objects.

This change set includes a new client, MultipartUploader, that encapsulates splitting an incoming stream into chunks, uploading those chunks in parallel, and finally issuing a compose request to combine them on close.
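Roughly, the flow looks like the sketch below. This is a simplified illustration against the google-cloud-storage Java client, not the actual MultipartUploader in this change set; the class name, chunk naming scheme, and fixed thread pool size are assumptions for the example.

```java
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MultipartUploaderSketch {
  private final Storage storage = StorageOptions.getDefaultInstance().getService();
  private final ExecutorService pool = Executors.newFixedThreadPool(4); // illustrative pool size
  private final int chunkSize; // the tuning knob discussed below

  public MultipartUploaderSketch(int chunkSize) {
    this.chunkSize = chunkSize;
  }

  public Blob upload(String bucket, String destination, InputStream in) throws IOException {
    List<String> chunkNames = new ArrayList<>();
    List<CompletableFuture<?>> uploads = new ArrayList<>();
    byte[] chunk;
    int index = 0;
    // NOTE: compose accepts at most 32 sources; the real implementation bounds the
    // chunk count (see the final-chunk handling described later in this conversation).
    while ((chunk = readChunk(in)) != null) {
      String chunkName = destination + ".chunk" + index++;
      chunkNames.add(chunkName);
      byte[] payload = chunk;
      // byte[] uploads are retryable; each chunk is uploaded on the thread pool
      uploads.add(CompletableFuture.runAsync(
          () -> storage.create(BlobInfo.newBuilder(bucket, chunkName).build(), payload), pool));
    }
    CompletableFuture.allOf(uploads.toArray(new CompletableFuture[0])).join();

    // stitch the chunks together at the destination, then clean up the temporaries
    Blob composed = storage.compose(Storage.ComposeRequest.newBuilder()
        .addSource(chunkNames)
        .setTarget(BlobInfo.newBuilder(bucket, destination).build())
        .build());
    chunkNames.forEach(name -> storage.delete(BlobId.of(bucket, name)));
    return composed;
  }

  /** Read up to chunkSize bytes from the stream; returns null at end of stream. */
  private byte[] readChunk(InputStream in) throws IOException {
    byte[] buffer = new byte[chunkSize];
    int read = 0;
    while (read < chunkSize) {
      int n = in.read(buffer, read, chunkSize - read);
      if (n < 0) {
        break;
      }
      read += n;
    }
    if (read == 0) {
      return null;
    }
    return read == chunkSize ? buffer : Arrays.copyOf(buffer, read);
  }
}
```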

Some observations:

  • The Java client prefers that you pass a byte[] rather than an InputStream, and for an interesting reason: the byte[] variant can automatically retry on some failure cases and is the recommended path; the InputStream variant cannot retry, since the stream can only be consumed once.
  • The compose request operation has a hard limit of 32 source chunks, which can be a little tricky to accommodate. I chose a configuration parameter, chunkSize, so deployers can tune it per workload (some repository formats have larger assets than others). If chunkSize is too low, you can easily end up with 31 small chunks and one giant final chunk that everything has to wait for; see the sketch after this list.
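To make the tuning concern concrete, here is some illustrative arithmetic (not code from this change set): with compose capped at 32 sources, at most 31 fixed-size chunks can precede the remainder, so a small chunkSize pushes nearly all of a large blob into that single final chunk.

```java
class ChunkMath {
  static final int COMPOSE_SOURCE_LIMIT = 32; // hard limit on GCS compose sources

  /** Size of the final chunk when at most 31 fixed-size chunks precede it. */
  static long finalChunkBytes(long blobSize, long chunkSize) {
    long fullChunks = Math.min(COMPOSE_SOURCE_LIMIT - 1, blobSize / chunkSize);
    return blobSize - fullChunks * chunkSize;
  }

  public static void main(String[] args) {
    // a 10 GiB blob with a 32 MiB chunkSize leaves a ~9 GiB final chunk,
    // which is why chunkSize should be tuned to the workload's asset sizes
    System.out.println(finalChunkBytes(10L << 30, 32L << 20)); // 9697230848 bytes
  }
}
```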

Does it perform better or worse than a single-threaded upload? I set up a single NXRM instance with the plugin on an n1-highmem-4 GCP instance, and used another GCP instance in the same VPC to generate a diverse workload. I ran the same workload against the current master build (which uses single-threaded uploads).

The results? Higher CPU utilization and lower overall network I/O:

[image: gcp-mpu-experiment]

I don't believe this to be a suitable approach for NXRM blobstores at this time.

This was a valuable experiment nonetheless. I will be porting the increase to the connections-per-route configuration and some of the integration tests over to the final product. I do not intend to merge this change set at this time; I'm opening this PR just to capture the results. Relates to #1.

Previously, an MPU would succeed in writing all the chunks and composing them into the desired destination. However, the chunkNames argument included the final destination name (it is used for the "first" chunk), so when deferredCleanup eventually completed, the destination file was deleted as well. This corrects the behavior by excluding the first chunk name from deferredCleanup via a filter on the argument.
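A minimal sketch of the fix, assuming the first chunk is written directly at the destination name (the helper below is hypothetical, not the exact deferredCleanup signature):

```java
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;

import java.util.List;

class DeferredCleanupSketch {
  /** Delete temporary chunk objects, skipping the "first" chunk that lives at the destination name. */
  static void deferredCleanup(Storage storage, String bucket, String destination, List<String> chunkNames) {
    chunkNames.stream()
        .filter(name -> !name.equals(destination)) // the fix: never delete the composed destination
        .forEach(name -> storage.delete(BlobId.of(bucket, name)));
  }
}
```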

This also includes a fix for a possible OutOfMemoryError raised for blobs with a very large final chunk. We avoid the issue by skipping the call to readChunk (and the Arrays.copyOfRange on the chunk) for the final chunk. Instead, we pass the remaining InputStream to the storage client, with the unfortunate side effect of calling a deprecated method. Per the docs, that storage#create overload is deprecated because it is not retryable. That means we may see failed uploads where the first 31 chunks succeed and the last fails without retry; this highlights the importance of the tuning option and of the log messages alerting operators to the issue.
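A sketch of that final-chunk path (names are illustrative; the deprecated overload is Storage#create(BlobInfo, InputStream)):

```java
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;

import java.io.InputStream;

class FinalChunkSketch {
  @SuppressWarnings("deprecation")
  static void uploadFinalChunk(Storage storage, String bucket, String chunkName, InputStream remaining) {
    // the InputStream overload is deprecated because it cannot be retried
    // (the stream is consumed on the first attempt); a failure here fails the whole upload
    storage.create(BlobInfo.newBuilder(bucket, chunkName).build(), remaining);
  }
}
```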

Also includes:
* an integration test to mimic the storage facet's write-temp-and-move behavior
* new tests on the uploader with larger blobs
* increases maxConnectionsPerRoute to equal maxTotalConnections (see the sketch below)
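For reference, the connections-per-route change amounts to something like the following, assuming an Apache HttpClient connection pool underneath (the plugin's actual transport wiring may differ):

```java
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

class ConnectionPoolSketch {
  static CloseableHttpClient newPooledClient(int maxTotalConnections) {
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setMaxTotal(maxTotalConnections);
    // all traffic goes to a single route (the GCS endpoint), so allow that route
    // to use the whole pool instead of the small per-route default
    cm.setDefaultMaxPerRoute(maxTotalConnections);
    return HttpClients.custom().setConnectionManager(cm).build();
  }
}
```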