Improve throughput by 50+% via parallel uploads and disabling gzip compression #53
Previously, a multipart upload (MPU) would succeed in writing all the chunks and composing them into the desired destination. The chunkNames argument, however, included the final destination name (as it is used for the "first" chunk); when deferredCleanup finally completed, the destination file would also be deleted. This corrects the behavior by excluding the first chunk name in deferredCleanup via a filter on the argument.

This also includes a fix for a possible OutOfMemoryError condition raised by blobs with a very large final chunk. We avoid the issue by skipping the call to readChunk (and the Arrays.copyOfRange on the chunk) for the final chunk. Instead, we pass the remaining InputStream to the storage client, with the unfortunate side effect of calling a deprecated method. Per the docs, the storage#create method is deprecated because it is not retryable. That means we may have failed uploads where the first 31 chunks succeed and the last fails without retry; that highlights the importance of the tuning option and the log messaging alerting of the issue.

Also includes:
* an integration test to mimic the storage facet's write-temp-and-move behavior
* new tests on the uploader with larger blobs
* an increase of maxConnectionsPerRoute to equal maxTotalConnections
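A minimal sketch of the cleanup fix described above; the method shape and surrounding names (bucket, destination, chunkNames) are assumptions for illustration, not the plugin's exact internals:

import java.util.List;

import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;

// Hypothetical shape of the corrected cleanup: the first "chunk" is written
// directly under the destination blob name, so it must be excluded before
// deleting the temporary chunk objects.
private void deferredCleanup(final Storage storage, final String bucket,
                             final String destination, final List<String> chunkNames) {
  chunkNames.stream()
      .filter(chunkName -> !chunkName.equals(destination)) // keep the composed blob
      .forEach(chunkName -> storage.delete(BlobId.of(bucket, chunkName)));
}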
…y/nexus-blobstore-google-cloud into multipart-upload
Recent profiling sessions revealed that the google-cloud-storage Java library gzip-compresses streams on write to the bucket. Since the overwhelming majority of content we write through this plugin is already compressed, this has no benefit and results in high CPU utilization and low network throughput. By disabling gzip compression on the fixed-size chunk writes, we observe a 30-50% increase in network throughput.

The edge case that this commit does not address is the final chunk; since we don't know its size and we can't chunk it, we have to call the deprecated storage write method that takes an InputStream. That variant has no option available to disable gzip compression.
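As a hedged illustration of the two write paths (identifiers like bucket, chunkName, chunkBytes, finalChunkName, and remainingStream are placeholders):

import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.Storage.BlobTargetOption;

// Fixed-size chunks held as byte[] can opt out of gzip compression:
Blob chunk = storage.create(
    BlobInfo.newBuilder(bucket, chunkName).build(),
    chunkBytes, BlobTargetOption.disableGzipContent());

// The final chunk of unknown size falls back to the deprecated InputStream
// variant, which exposes no equivalent option (and is not retried by the client):
Blob finalChunk = storage.create(
    BlobInfo.newBuilder(bucket, finalChunkName).build(),
    remainingStream);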
src/main/java/org/sonatype/nexus/blobstore/gcloud/internal/MultipartUploader.java
Looks good to me, just some minor suggestions. +1 for the docs!
chunk = readChunk(current);
}
else {
  log.info("Upload for {} has hit Google Cloud Storage multipart-compose limits; " +
Since composeLimitHit is being kept track of, it would probably be useful to log it here.
Fixed in e394e05
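For reference, a sketch of how the counter could be folded into that message - assuming composeLimitHit is an existing AtomicLong field and destination stands in for the blob path being uploaded; the actual wording in e394e05 may differ:

import java.util.concurrent.atomic.AtomicLong;

private final AtomicLong composeLimitHit = new AtomicLong(0);

// On the fallback path, record the occurrence and include the running count.
long occurrences = composeLimitHit.incrementAndGet();
log.info("Upload for {} has hit Google Cloud Storage multipart-compose limits " +
    "({} occurrences since startup); consider increasing the chunk size setting",
    destination, occurrences);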
private final ListeningExecutorService executorService = MoreExecutors.listeningDecorator(
    Executors.newCachedThreadPool(
        new ThreadFactoryBuilder().setNameFormat("nexus-google-cloud-storage-multipart-upload-%d")
Could use a NexusThreadFactory.
Done in e394e05
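For completeness, the suggested swap might look like this - a sketch only; I'm assuming NexusThreadFactory's (poolId, threadGroupName) constructor from nexus-public, so verify the exact signature against the NXRM version in use:

import java.util.concurrent.Executors;

import com.google.common.util.concurrent.ListeningExecutorService;
import com.google.common.util.concurrent.MoreExecutors;
import org.sonatype.nexus.thread.NexusThreadFactory;

// Same cached pool as before, but with Nexus' own thread factory
// (pool id and group name here are illustrative).
private final ListeningExecutorService executorService = MoreExecutors.listeningDecorator(
    Executors.newCachedThreadPool(
        new NexusThreadFactory("gcs-multipart-upload", "nexus-google-cloud-storage")));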
src/main/java/org/sonatype/nexus/blobstore/gcloud/internal/MultipartUploader.java
@Override
protected void doStop() throws Exception {
  executorService.shutdown();
Suggestion: Logging at info feels a bit spammy. Maybe log at debug and log something at info if awaitTermination times out?
Glad you pointed this out. It felt weird to wait a few minutes on orderly shutdown for any uploading threads to complete. Now that I think about it, there may be other services stopping that may not be available when these uploads complete, making the partial uploads useless.
I was looking at the S3 blobstore, and it looks like it just executes a shutdownNow() without waiting. I'm thinking I should follow suit - wdyt?
In e394e05 I went ahead and followed the S3 Blobstore approach, so the questionable log statement is now gone.
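A sketch of the revised doStop under that approach (the debug message is illustrative, not the actual code):

import java.util.List;

@Override
protected void doStop() throws Exception {
  // Mirror the S3 blobstore: abandon in-flight chunk uploads instead of
  // blocking shutdown; the services needed to finish a blob may already be gone.
  List<Runnable> abandoned = executorService.shutdownNow();
  if (!abandoned.isEmpty()) {
    log.debug("Abandoned {} queued chunk uploads during shutdown", abandoned.size());
  }
}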
BlobInfo blobInfo = BlobInfo.newBuilder(bucket, chunkName).build();
Blob blob = storage.create(blobInfo, chunk, BlobTargetOption.disableGzipContent());
singleChunk = Optional.of(blob);
} else {
Formatting.
Fixed in e394e05
}
catch(Exception e) {
  throw new BlobStoreException("Error uploading blob", e, null);
} finally {
Formatting.
Fixed in e394e05
This change:
* Changes MultipartUploader#doStop to use shutdownNow rather than waiting for parts to finish uploading; there is no guarantee the other services we would need to complete the blob upload are still available, so we might as well stop quickly.
* Uses Dropwizard Metrics' InstrumentedExecutorService to allow us to peek at the thread pool utilization in the MultipartUploader (see the sketch below).
* Corrects formatting
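A sketch of the instrumented pool - assuming a MetricRegistry is available for injection; the metric name is illustrative:

import java.util.concurrent.Executors;

import com.codahale.metrics.InstrumentedExecutorService;
import com.codahale.metrics.MetricRegistry;
import com.google.common.util.concurrent.ListeningExecutorService;
import com.google.common.util.concurrent.MoreExecutors;

// Dropwizard's InstrumentedExecutorService records submitted/running/completed
// task metrics for the wrapped pool under the given name.
private final ListeningExecutorService executorService = MoreExecutors.listeningDecorator(
    new InstrumentedExecutorService(
        Executors.newCachedThreadPool(),
        metricRegistry,
        "nexus-google-cloud-storage-multipart-upload"));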
This pull request is a byproduct of my recent learning on how to use Google Cloud's Stackdriver Profiler.
I set up an NXRM 3.19.1-01 instance with version 0.9.2 of this plugin and fired up a stress test (internal tool) to see what I could see in Profiler.
One interesting stack immediately stood out and surprised me:
Drilling deeper into that wide column:
Why is the Google Cloud Storage client compressing content on the way to the bucket?
The content in NXRM workloads is predominantly already compressed (jars, zips, tar.gz, docker images, etc.). Adding an additional layer of compression would have little benefit, as it wouldn't reduce the content size by much.
Furthermore, the width of these columns in the profiler indicated a significant amount of time was spent on CPU for writes, which isn't ideal.
A little more research and I came across this issue in the google-cloud-java project:
googleapis/google-cloud-java#3822
The comments related to performance issues due to gzip compression being enabled by default immediately clicked.
To date, the released versions of this plugin have used the InputStream variants of the storage write methods. The feature described in #3822 above is not available on the InputStream variants (which are deprecated) - only on the byte[] variants. This led me to think about #22, which leveraged the byte[] variants.

I revived the multipart-upload branch, brought it up to master, and disabled gzip compression. I found right away that this had a positive impact:
Stackdriver Profiler shows here that a) the stack to upload content was shorter (gzip compression gone), and b) the stack was narrower, using overall less CPU time.
The net effect from our internal stress test is a 51% improvement in throughput across the board:
Wait - this PR includes two major changes: disabling gzip compression AND using parallel threads to write to the bucket. Aren't you skipping a step?
Yes. I have another branch available - https://github.com/sonatype-nexus-community/nexus-blobstore-google-cloud/tree/single-thread-buffered-upload. In that branch, I skipped the thread pool and uploaded the chunks on the main request thread, with gzip compression disabled. While the performance was better than the baseline (about 25%), it still did not match the gains of the parallel version. I'll keep that branch around as an alternative if we need to explore it further.
This pull request is fairly large and makes the following changes:
* Content larger than nexus.gcs.multipartupload.chunksize (default is 5MB) will trigger parallel uploads of the content. These uploads are performed in a dedicated thread pool with a size limited to the number of available processors (see the sketch at the end of this description).

It relates to the following issue #s:
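For illustration, a rough outline of the parallel upload-and-compose flow described above. All names here are assumptions rather than the plugin's exact internals, and the oversized-final-chunk fallback is omitted for brevity:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.Storage.BlobTargetOption;

Blob composeChunks(Storage storage, ExecutorService pool, String bucket,
                   String destination, List<byte[]> chunks) throws Exception {
  List<String> chunkNames = new ArrayList<>();
  List<Future<Blob>> uploads = new ArrayList<>();
  for (int i = 0; i < chunks.size(); i++) {
    // The first chunk is written under the destination name itself;
    // GCS compose accepts at most 32 source objects per request.
    String chunkName = (i == 0) ? destination : destination + ".chunk" + i;
    chunkNames.add(chunkName);
    byte[] chunk = chunks.get(i);
    uploads.add(pool.submit(() -> storage.create(
        BlobInfo.newBuilder(bucket, chunkName).build(),
        chunk, BlobTargetOption.disableGzipContent())));
  }
  for (Future<Blob> upload : uploads) {
    upload.get(); // propagate any chunk upload failure
  }
  // Compose the uploaded chunks into the destination object.
  return storage.compose(Storage.ComposeRequest.newBuilder()
      .setTarget(BlobInfo.newBuilder(bucket, destination).build())
      .addSource(chunkNames)
      .build());
}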