Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Enhancement :Enable Passing of requestPath to CRT and monitor the progress using CRTs S3MetaRequestProgress #4379

Merged
merged 12 commits into from
Sep 12, 2023

Conversation

joviegas
Copy link
Contributor

@joviegas joviegas commented Aug 31, 2023

Motivation and Context

Modifications

  • Removed the code where S3CrtRequestBodyStreamAdapter is passed to CRTHttp client
  • Now we just pass null is request Path is passed to the S3MetaRequest.
  • The Progress listener will listen to S3MetaRequestProgress from CRT

Testing

  • No need of new Test since this is internal behaviour test, the existing test cases should cover this.
  • Executed Integ tests.

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

License

  • I confirm that this pull request can be released under the Apache 2 license

@joviegas joviegas requested a review from a team as a code owner August 31, 2023 19:10
@joviegas joviegas changed the title Performance Enhancement :Enable Passing of requestPath to CRT, Monitor Progress Updates for putObject Operations Performance Enhancement :Enable Passing of requestPath to CRT and monitor the progress using CRTs S3MetaRequestProgress Aug 31, 2023
@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch 3 times, most recently from 63065aa to 6c97310 Compare August 31, 2023 21:15
@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from 6c97310 to 0e28441 Compare September 1, 2023 15:30
Copy link
Contributor

@millems millems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Is it possible to test that what we're intending to fix actually got fixed?
  • Is there performance testing to verify that this change didn't hurt performance?
  • Do we already have tests that assert the progress listener changes were backwards compatible?
  • Are there any path-processing nuances that may have changed when we switched to the CRT? E.g. handling of the current working directory or expansion of ~/?

progressUpdater.registerCompletion(returnFuture);

try {
assertNotUnsupportedArn(putObjectRequest.bucket(), "upload");

CompletableFuture<PutObjectResponse> crtFuture =
s3AsyncClient.putObject(putObjectRequest, requestBody);
s3AsyncClient.putObject(putObjectRequest, AsyncRequestBody.fromFile(uploadFileRequest.source()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If CRT is reading directly from disk, does this publisher get used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRT directly gets this from file. The Publisher here is in the S3Client Interface and a non null is required just to get the content length a I have updated the code to throw an exception if we try to read the Java SDKs Publisher in such cases.

Comment on lines 22 to 25
/**
* S3CrtProgressListener delegates events to the underlying delegateListener if defined, avoiding null checks for API when using
* S3MetaRequestProgress for GET calls.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a weird application of polymorphism. Instead of S3CrtProgressListener, can we just have a NoOpPublisherListener that we create if delegateListener is null? That just means we need the null check in the setup code of S3CrtResponseHandlerAdapter instead of per-method here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created NoOpPublisherListener

Comment on lines 46 to 51
public static final S3InternalSdkHttpExecutionAttribute<Path> SOURCE_REQ_PATH =
new S3InternalSdkHttpExecutionAttribute<>(Path.class);


public static final S3InternalSdkHttpExecutionAttribute<PublisherListener> CRT_PROGRESS_LISTENER =
new S3InternalSdkHttpExecutionAttribute<>(PublisherListener.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is marked as @SdkInternalApi, so it can't be used by the transfer manager module, like we're doing above. Do we need to move a subset to a @SdkProtectedApi?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 46 to 47
public static final S3InternalSdkHttpExecutionAttribute<Path> SOURCE_REQ_PATH =
new S3InternalSdkHttpExecutionAttribute<>(Path.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nitpicks, since this actually seems like a protected API.

  • SOURCE - Do we ever expect to use this for something other than a PutObject - like a GetObject, where it's actually the DESTINATION?
  • REQ - We should almost always use the full word for something instead of shortening it, so REQUEST. It looks like a lot of the existing properties are request properties, though, so I don't think REQUEST is necessary.

If it's always the source, what about PUT_OBJECT_PATH? UPLOAD_PATH?
If it's potentially also destination, what about UPLOAD_DOWNLOAD_PATH?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Matt..
Named it as OBJECT_FILE_PATH so that it can be used for any file path for get/put

@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from 58fabbd to e047e6c Compare September 7, 2023 16:42
@joviegas
Copy link
Contributor Author

joviegas commented Sep 7, 2023

  • Is it possible to test that what we're intending to fix actually got fixed?

Thanks Matt for this suggestion , tested this and realized the CRT doesnot actually fix the issue where upload file size is more than RAM size. I have cut a different ticket to CRT. However , this fix is still needed for performance enhancement. Removed the mention of the orginal issue.

  • Is there performance testing to verify that this change didn't hurt performance?

Yes , I saw that the latencies was reduced to half , I have updated the internal doc.

  • Do we already have tests that assert the progress listener changes were backwards compatible?

Yes , we already have test .

  • Are there any path-processing nuances that may have changed when we switched to the CRT? E.g. handling of the current working directory or expansion of ~/?

No.

@@ -46,9 +48,14 @@ public final class S3CrtResponseHandlerAdapter implements S3MetaRequestResponseH
private final SdkHttpResponse.Builder respBuilder = SdkHttpResponse.builder();
private volatile S3MetaRequest metaRequest;

public S3CrtResponseHandlerAdapter(CompletableFuture<Void> executeFuture, SdkAsyncHttpResponseHandler responseHandler) {
private final PublisherListener<S3MetaRequestProgress> progressListener;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to invoke this.progressListener.subscriberOnError(); in onErrorResponseComplete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are already calling this.progressListener.subscriberOnError(exception); from failResponseHandlerAndFuture which is called from errorPayload != null also errorPayload == null for valid errorPayloads

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

failResponseHandlerAndFuture doesn't seem to be invoked from onErrorResponseComplete if it successfully marshalls the error response though. Can we add a test to verify it? This test may not suffice now that we are relying on the real CRT client to report progress and it is using a mock s3 client.

        responsePublisher.send(ByteBuffer.wrap(errorPayload))
                         .thenRun(responsePublisher::complete)
                         .handle((ignore, throwable) -> {
                             if (throwable != null) {
                                 failResponseHandlerAndFuture(throwable);
                                 return null;
                             }
                             completeFutureAndCloseRequest();
                             return null;
                         });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added new mock tests and removed explicit onError

@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from e047e6c to 6af3397 Compare September 7, 2023 23:25
@@ -107,8 +107,7 @@ public void publisherSubscribe(Subscriber<? super S3MetaRequestProgress> subscri

@Override
public void subscriberOnNext(S3MetaRequestProgress s3MetaRequestProgress) {
incrementBytesTransferred(Math.toIntExact(s3MetaRequestProgress.getBytesTransferred()),
s3MetaRequestProgress.getContentLength());
incrementBytesTransferred(Math.toIntExact(s3MetaRequestProgress.getBytesTransferred()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should not cast it to int. Can we change incrementBytesTransferred to take a long instead?

@@ -46,9 +48,14 @@ public final class S3CrtResponseHandlerAdapter implements S3MetaRequestResponseH
private final SdkHttpResponse.Builder respBuilder = SdkHttpResponse.builder();
private volatile S3MetaRequest metaRequest;

public S3CrtResponseHandlerAdapter(CompletableFuture<Void> executeFuture, SdkAsyncHttpResponseHandler responseHandler) {
private final PublisherListener<S3MetaRequestProgress> progressListener;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

failResponseHandlerAndFuture doesn't seem to be invoked from onErrorResponseComplete if it successfully marshalls the error response though. Can we add a test to verify it? This test may not suffice now that we are relying on the real CRT client to report progress and it is using a mock s3 client.

        responsePublisher.send(ByteBuffer.wrap(errorPayload))
                         .thenRun(responsePublisher::complete)
                         .handle((ignore, throwable) -> {
                             if (throwable != null) {
                                 failResponseHandlerAndFuture(throwable);
                                 return null;
                             }
                             completeFutureAndCloseRequest();
                             return null;
                         });

@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from 0791331 to 65ddbca Compare September 8, 2023 16:50
@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from 9247f21 to 137862e Compare September 8, 2023 22:15
@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from 8df43eb to 1cac2de Compare September 10, 2023 00:18
@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from 1cac2de to aaa43a0 Compare September 11, 2023 19:25
@joviegas joviegas force-pushed the joviegas/crt_upload_request_path branch from 5d6b70e to 152d896 Compare September 11, 2023 22:14
@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 9 Code Smells

55.9% 55.9% Coverage
0.0% 0.0% Duplication

idea Catch issues before they fail your Quality Gate with our IDE extension sonarlint SonarLint

@joviegas joviegas merged commit d59470c into master Sep 12, 2023
6 of 7 checks passed
@joviegas joviegas deleted the joviegas/crt_upload_request_path branch May 15, 2024 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants