Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add S3 upload integrity check #26

Merged
merged 8 commits into from
Apr 1, 2020

Conversation

gkolakowski-ias
Copy link
Contributor

No description provided.

# authorization must be aws-v2 or none
s3proxy.authorization=aws-v2
s3proxy.authorization=AWS_V2
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why change this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to easily convert it to AuthenticationType:

AuthenticationType.valueOf(s3ProxyProperties.getProperty(S3ProxyConstants.PROPERTY_AUTHORIZATION));

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why this code doesn't call S3Proxy.Builder.fromProperties?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I'll try to refactor it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really messy. In order to keep this PR clean and simple, I suggest to do tests refactoring separately. In this PR I'll write it a way so that s3proxy.authorization=aws-v2 remains unchanged.

if (partNumber1 == partNumber2) {
return 0;
}
return partNumber1 > partNumber2 ? 1 : -1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace body with:

return Integer.compare(o1.getPartNumber(), o2.getPartNumber())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is available since java 1.7. This project uses 1.6.

@@ -1,6 +1,6 @@
s3proxy.endpoint=https://127.0.0.1:0
s3proxy.endpoint=http://127.0.0.1:0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I leave https I get com.amazonaws.SdkClientException: Unable to execute HTTP request: Unrecognized SSL message, plaintext connection?. What I am doing wrong. Even without my changes it fails with s3proxy:1.7.0.

Comment on lines 390 to 398
// When S3 combines the parts of a multipart upload into the final object, the ETag value is set to the
// hex-encoded MD5 hash of the concatenated binary-encoded (raw bytes) MD5 hashes of each part followed by
// "-" and the number of parts.
MessageDigest md = MessageDigest.getInstance("MD5");
md.reset();
for (PartETag partETag : parts) {
md.update(BinaryUtils.fromHex(partETag.getETag()));
}
// Represent byte array as a 32-digit number hexadecimal format followed by "-<partCount>".
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link to a source for these comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -241,6 +247,23 @@ public StreamTransferManager partSize(long partSize) {
return this;
}

/**
* Sets whether data integrity check should be performed during upload.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some documentation explaining the two ways that integrity is checked, and that the final check is based on undocumented behaviour of S3 that may change, so they may get a false alarm exception. Also just note how this may lead to exceptions in general.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I added IntegrityCheckException to improve error handling. But I've also noticed that when a part upload fails, logs do not contain any information related. Sample logs when I deliberately make the request fail:

12:09:25.340 [main] INFO  o.g.s.org.eclipse.jetty.util.log - Logging initialized @1232ms
12:09:25.384 [main] INFO  o.g.s.o.eclipse.jetty.server.Server - jetty-9.2.z-SNAPSHOT
12:09:25.421 [main] INFO  o.g.s.o.e.j.server.ServerConnector - Started ServerConnector@9e1cb34{HTTP/1.1}{127.0.0.1:53212}
12:09:25.422 [main] INFO  o.g.s.o.eclipse.jetty.server.Server - Started @1316ms
12:09:26.365 [main] INFO  a.m.s3upload.StreamTransferManager - Initiated multipart upload to s3proxy-1209536669/stuff with full ID f8d39bbe-b92e-47ab-bb16-9debbe56ec47
12:09:28.818 [pool-3-thread-2] INFO  a.m.s3upload.MultiPartOutputStream - Called close() on [MultipartOutputStream for parts 5001 - 10000]
12:09:28.823 [pool-3-thread-1] INFO  a.m.s3upload.MultiPartOutputStream - Called close() on [MultipartOutputStream for parts 1 - 5000]
12:09:31.401 [main] INFO  o.g.s.o.e.j.server.ServerConnector - Stopped ServerConnector@9e1cb34{HTTP/1.1}{127.0.0.1:0}

java.lang.NullPointerException
	at alex.mojaki.s3upload.ExecutorServiceResultsHandler$1.next(ExecutorServiceResultsHandler.java:64)
	at alex.mojaki.s3upload.ExecutorServiceResultsHandler.awaitCompletion(ExecutorServiceResultsHandler.java:96)
	at alex.mojaki.s3upload.StreamTransferManager.complete(StreamTransferManager.java:358)
	at alex.mojaki.s3upload.test.StreamTransferManagerTest.testTransferManager(StreamTransferManagerTest.java:211)
	at alex.mojaki.s3upload.test.StreamTransferManagerTest.testTransferManager(StreamTransferManagerTest.java:156)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate? What did you do in this case to make the request fail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.withPartSize(-1) or .setMd5Digest("123");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean I modify UploadPartRequest in alex.mojaki.s3upload.StreamTransferManager#uploadStreamPart

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I reproduced this and ensured that a message shows in d3e4e20

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Thanks!

@alexmojaki
Copy link
Owner

See gkolakowski-ias#2

@alexmojaki alexmojaki merged commit 3ffa3d1 into alexmojaki:master Apr 1, 2020
@alexmojaki
Copy link
Owner

Thanks @gkolakowski-ias for your PR, and @gaul for your reviewing help!

I have released this in 2.2.0, eventually (it can take quite some time) it will appear on https://mvnrepository.com/artifact/com.github.alexmojaki/s3-stream-upload

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants