Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor http execution to compute md5sum when needed #852

Conversation

balamurugana
Copy link
Member

No description provided.

@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch 3 times, most recently from 1d431b3 to 3c12062 Compare February 21, 2020 14:50
@harshavardhana
Copy link
Member

@balamurugana can you list the behavior pattern here under what conditions what is expected now?

@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch from 3c12062 to b081339 Compare February 22, 2020 02:54
@balamurugana
Copy link
Member Author

@balamurugana can you list the behavior pattern here under what conditions what is expected now?

In an essence, PUT/POST requests contain MD5 sum all the time. As putObject() is mostly interesting here, below are the detail.

  1. With PutObjectOptions(), API users have control on memory usage on unknown sized stream e.g. new PutObjectOptions(-1, 5 * MB) limits 5MB memory usage of each part and 50GB of overall stream size. As we have to hold maximum of 5MB in memory, sha256 and md5 sums are computed.
  2. API users have control on memory usage on known sized stream as well e.g. if new PutObjectOptions(3 * GB, 5 * MB) limits 5MB memory usage and sha256 and md5 are computed.
  3. If RandomAccessFile is an input stream, no memory allocation involved, but sha256 and md5 are computed for its parts upload.

Generic behaviour for all PUT/POST requests including putObject().

  1. MD5 sum is calculated and sent to all anonymous requests.
  2. For insecure HTTP requests, SHA256 and MD5 sums are calculated and sent to all user requests.
  3. For secure HTTP requests, MD5 sums is calculated and sent to all user requests.

@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch 2 times, most recently from f774bc4 to 8286ec0 Compare February 25, 2020 05:59
Copy link
Contributor

@nitisht nitisht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@balamurugana I tried the below scenario, I expected to see content-md5 header but don't see it - may be I am missing something, can you take a look

File initialFile = new File("/home/nitish/dev/minio-java-example/minio-6.0.14-DEV-all.jar");
InputStream targetStream = new FileInputStream(initialFile);
PutObjectOptions opts = new PutObjectOptions(-1, 5242880);
		
minioClient.traceOn(System.out);
minioClient.putObject(bucketName,objectName, targetStream, opts);
System.out.println("File is successfully uploaded as minio.jar to "+bucketName+" bucket.");
minioClient.traceOff();

Here is the trace

---------START-HTTP---------
PUT /minio-java-example-1gfvsjp/test/minio.jar?uploadId=1cd44338-7c44-4557-95d7-a393add69e7a&partNumber=1 HTTP/1.1
Content-Encoding: aws-chunked
x-amz-decoded-content-length: 5242880
Host: localhost:9000
User-Agent: MinIO (amd64; amd64) minio-java/dev
x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD
x-amz-date: 20200304T104134Z
Authorization: AWS4-HMAC-SHA256 Credential=*REDACTED*/20200304/us-east-1/s3/aws4_request, SignedHeaders=content-encoding;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length, Signature=*REDACTED*

HTTP/1.1 200
Accept-Ranges: bytes
Content-Length: 0
Content-Security-Policy: block-all-mixed-content
ETag: "367a2829d8deac72e071e77fa167b084-1"
Server: MinIO/DEVELOPMENT.GOGET
Vary: Origin
X-Amz-Request-Id: 15F913C818624B47
X-Xss-Protection: 1; mode=block
Date: Wed, 04 Mar 2020 10:41:34 GMT

----------END-HTTP----------
---------START-HTTP---------
PUT /minio-java-example-1gfvsjp/test/minio.jar?uploadId=1cd44338-7c44-4557-95d7-a393add69e7a&partNumber=2 HTTP/1.1
Content-Encoding: aws-chunked
x-amz-decoded-content-length: 2747014
Host: localhost:9000
User-Agent: MinIO (amd64; amd64) minio-java/dev
x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD
x-amz-date: 20200304T104134Z
Authorization: AWS4-HMAC-SHA256 Credential=*REDACTED*/20200304/us-east-1/s3/aws4_request, SignedHeaders=content-encoding;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length, Signature=*REDACTED*

HTTP/1.1 200
Accept-Ranges: bytes
Content-Length: 0
Content-Security-Policy: block-all-mixed-content
ETag: "6f972f5a4ef3995e08037d4699282f00-1"
Server: MinIO/DEVELOPMENT.GOGET
Vary: Origin
X-Amz-Request-Id: 15F913C8249E26C2
X-Xss-Protection: 1; mode=block
Date: Wed, 04 Mar 2020 10:41:34 GMT

----------END-HTTP----------
---------START-HTTP---------
POST /minio-java-example-1gfvsjp/test/minio.jar?uploadId=1cd44338-7c44-4557-95d7-a393add69e7a HTTP/1.1
Accept-Encoding: identity
Host: localhost:9000
User-Agent: MinIO (amd64; amd64) minio-java/dev
x-amz-content-sha256: e9bef9a4a3758527c9b7ae33746a8690cf64fb5f30a33e6491aa985f3bb371de
x-amz-date: 20200304T104134Z
Authorization: AWS4-HMAC-SHA256 Credential=*REDACTED*/20200304/us-east-1/s3/aws4_request, SignedHeaders=accept-encoding;host;x-amz-content-sha256;x-amz-date, Signature=*REDACTED*

HTTP/1.1 200
Accept-Ranges: bytes
Content-Length: 349
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
ETag: "7558ddc42289cc5e3b43da544a15b494-2"
Server: MinIO/DEVELOPMENT.GOGET
Vary: Origin
X-Amz-Request-Id: 15F913C83136C65F
X-Xss-Protection: 1; mode=block
Date: Wed, 04 Mar 2020 10:41:34 GMT

@balamurugana
Copy link
Member Author

@nitisht

@balamurugana I tried the below scenario, I expected to see content-md5 header but don't see it - may be I am missing something, can you take a look

You are not using this PR. This PR removes STREAMING-AWS4-HMAC-SHA256-PAYLOAD, but your trace shows.

@nitisht
Copy link
Contributor

nitisht commented Mar 4, 2020

@nitisht

@balamurugana I tried the below scenario, I expected to see content-md5 header but don't see it - may be I am missing something, can you take a look

You are not using this PR. This PR removes STREAMING-AWS4-HMAC-SHA256-PAYLOAD, but your trace shows.

I checked out the PR and built the jar, but let me recheck

@nitisht
Copy link
Contributor

nitisht commented Mar 5, 2020

I checked out the PR and built the jar, but let me recheck

Looks like I was using the wrong jar. I see content-md5 with correct jar.

nitisht
nitisht previously approved these changes Mar 5, 2020
Copy link
Contributor

@nitisht nitisht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM & Tested

@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch 5 times, most recently from dd7ab00 to e42fa6b Compare March 13, 2020 03:59
@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch 5 times, most recently from ab0a9e4 to 7ec77b2 Compare March 18, 2020 03:45
@nitisht
Copy link
Contributor

nitisht commented Mar 19, 2020

@balamurugana could you pls resolve the conflicts

kannappanr
kannappanr previously approved these changes Mar 19, 2020
Copy link
Contributor

@kannappanr kannappanr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kannappanr
Copy link
Contributor

@balamurugana Can you please rebase this PR again?

@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch 2 times, most recently from 0708952 to c093b09 Compare March 19, 2020 17:39
kannappanr
kannappanr previously approved these changes Mar 19, 2020
Copy link
Contributor

@kannappanr kannappanr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kannappanr kannappanr requested a review from BigUstad March 19, 2020 18:52
@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch from c093b09 to b297cb7 Compare March 19, 2020 18:57
@balamurugana balamurugana force-pushed the refactor-http-execution-to-compute-md5-when-needed branch from b297cb7 to 98a91fd Compare March 19, 2020 19:13
Copy link
Contributor

@BigUstad BigUstad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kannappanr kannappanr merged commit 9c54333 into minio:master Mar 19, 2020
@balamurugana balamurugana deleted the refactor-http-execution-to-compute-md5-when-needed branch March 20, 2020 01:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants