Bug Fix: S3 source key #1926

Merged
merged 2 commits into from
Oct 19, 2022

Conversation

asifsmohammed
Collaborator

@asifsmohammed asifsmohammed commented Oct 15, 2022

Signed-off-by: Asif Sohail Mohammed [email protected]

Description

This PR fixes the bug in S3 source where keys with spaces are not decoded correctly.
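To illustrate the kind of mismatch described above: S3 event notifications carry the object key in URL-encoded form, so a key containing a space typically arrives with a `+` in its place. The sketch below is not the Data Prepper code; `S3KeyDecodingExample` and `decodeKey` are hypothetical names, and it uses the JDK's `URLDecoder` only as a stand-in for what `getUrlDecodedKey()` is assumed to do.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class S3KeyDecodingExample {

    // Hypothetical helper, not part of Data Prepper: decodes a key as it
    // is assumed to appear in an S3 event notification.
    static String decodeKey(final String encodedKey) {
        // URLDecoder turns '+' into a space and resolves %XX escapes.
        // The Charset overload used here requires Java 10 or later.
        return URLDecoder.decode(encodedKey, StandardCharsets.UTF_8);
    }

    public static void main(final String[] args) {
        // Assumed example: the object is really named "logs/my file.json".
        final String keyFromNotification = "logs/my+file.json";

        // Using the raw value would look up an object that does not exist.
        System.out.println("raw:     " + keyFromNotification);
        System.out.println("decoded: " + decodeKey(keyFromNotification));
    }
}
```

Looking up the object with the raw value would target a different key than the one that was actually written, which matches the symptom reported in the linked issue.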

I tested this locally with and without the code change. I also changed SqsWorkerIT to use a key containing a space, and ran the integration test with and without the fix to verify.

 ./gradlew :data-prepper-plugins:s3-source:integrationTest --tests "org.opensearch.dataprepper.plugins.source.SqsWorkerIT" -Dtests.s3source.region=us-east-1 -Dtests.s3source.bucket=<bucket-name> -Dtests.s3source.queue.url=<queue-url>

Issues Resolved

resolves #1923

Check List

  • New functionality includes testing.
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Asif Sohail Mohammed <[email protected]>
@asifsmohammed asifsmohammed requested a review from a team as a code owner October 15, 2022 19:18

codecov-commenter commented Oct 15, 2022

Codecov Report

Merging #1926 (a8957d8) into main (a66da8e) will decrease coverage by 0.06%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##               main    #1926      +/-   ##
============================================
- Coverage     93.94%   93.88%   -0.07%     
+ Complexity     1536     1535       -1     
============================================
  Files           197      197              
  Lines          4561     4561              
  Branches        367      367              
============================================
- Hits           4285     4282       -3     
- Misses          194      196       +2     
- Partials         82       83       +1     
Impacted Files | Coverage Δ
...rwarder/discovery/AwsCloudMapPeerListProvider.java | 92.68% <0.00%> (-2.44%) ⬇️
...opensearch/dataprepper/pipeline/ProcessWorker.java | 88.46% <0.00%> (-1.93%) ⬇️

Member

@dlvenable dlvenable left a comment

Nice! Thanks!

We should have a unit test for this in SqsWorkerTest. In general bug fixes are often good candidates for unit tests since we often have a concrete example of what to do and not do. It can verify that the object reference uses the getUrlDecodedKey() instead of getKey(). Perhaps you mock them to return different values.

I also think the integration test should be distinct. But, I'm fine accepting the PR without that change.
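The test idea in this comment (return different values from getKey() and getUrlDecodedKey(), then check which one ends up in the object reference) can be sketched without any mocking library. Everything below is hypothetical: `ObjectEntity` stands in for the S3 object entity, and `objectKeyFor` stands in for the code under test. It is a minimal sketch of the approach, not the test that was added in this PR.

```java
public class DecodedKeyTestSketch {

    // Hypothetical stand-in for the S3 object entity of an event notification.
    interface ObjectEntity {
        String getKey();
        String getUrlDecodedKey();
    }

    // Hand-written stub whose two accessors deliberately return different values.
    static ObjectEntity stubEntity(final String rawKey, final String decodedKey) {
        return new ObjectEntity() {
            @Override
            public String getKey() {
                return rawKey;
            }

            @Override
            public String getUrlDecodedKey() {
                return decodedKey;
            }
        };
    }

    // Hypothetical stand-in for the code under test: it should pick the decoded key.
    static String objectKeyFor(final ObjectEntity entity) {
        return entity.getUrlDecodedKey();
    }

    public static void main(final String[] args) {
        final ObjectEntity entity = stubEntity("raw+key", "decoded key");
        final String actual = objectKeyFor(entity);

        // Because the two accessors differ, this check fails if getKey() is used.
        if (!"decoded key".equals(actual)) {
            throw new AssertionError("expected the decoded key but got: " + actual);
        }
        System.out.println("object reference used the decoded key");
    }
}
```

Returning distinct values from the two accessors is what gives the check its value: if both returned the same string, the test would pass regardless of which accessor the code calls.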

Collaborator Author

asifsmohammed commented Oct 15, 2022

I tried adding a unit test here, but the method populateS3Reference is nested within three levels of private methods. I can make this method package-private and test it. Let me know if there's a better approach for unit testing here.

Or I can use reflection to test it.

Signed-off-by: Asif Sohail Mohammed <[email protected]>
@Test
void populateS3Reference_should_interact_with_getUrlDecodedKey() throws NoSuchMethodException, InvocationTargetException, IllegalAccessException {
// Using reflection to unit test a private method as part of bug fix.
final Method method = SqsWorker.class.getDeclaredMethod("populateS3Reference", S3EventNotification.S3EventNotificationRecord.class);
Member

You can also make this method package-private specifically for testing purposes. I do think there is probably some opportunity for refactoring, but this should work for now.
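The two options discussed in this thread can be compared side by side. The sketch below uses a hypothetical `Worker` class with hypothetical method names, not SqsWorker itself: one helper is private and reached through reflection, the other is package-private and called directly.

```java
import java.lang.reflect.Method;

public class PrivateMethodAccessSketch {

    // Hypothetical class under test; both helpers do the same work.
    static class Worker {
        private String privateHelper(final String input) {
            return input.trim();
        }

        // Package-private: no modifier, so tests in the same package can call it.
        String packagePrivateHelper(final String input) {
            return input.trim();
        }
    }

    // Option 1: reach the private method by name through reflection.
    static String callViaReflection(final Worker worker, final String input) {
        try {
            final Method method = Worker.class.getDeclaredMethod("privateHelper", String.class);
            method.setAccessible(true);
            return (String) method.invoke(worker, input);
        } catch (final ReflectiveOperationException e) {
            // A rename of the method only surfaces here, at run time.
            throw new IllegalStateException("reflection lookup failed", e);
        }
    }

    // Option 2: call the package-private method directly; the compiler checks the name.
    static String callDirectly(final Worker worker, final String input) {
        return worker.packagePrivateHelper(input);
    }

    public static void main(final String[] args) {
        final Worker worker = new Worker();
        System.out.println(callViaReflection(worker, "  key  "));
        System.out.println(callDirectly(worker, "  key  "));
    }
}
```

The reflection route leaves the production visibility untouched but ties the test to a method name held in a string, while the package-private route widens visibility slightly in exchange for compile-time checking.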


@Test
void populateS3Reference_should_interact_with_getUrlDecodedKey() throws NoSuchMethodException, InvocationTargetException, IllegalAccessException {
// Using reflection to unit test a private method as part of bug fix.
Contributor

Minor and non-blocking: it would make sense to include a link to the bug here for future reference.

@@ -245,4 +247,30 @@ void processSqsMessages_should_invoke_delete_if_input_is_not_valid_JSON_and_dele
verify(sqsMessagesDeletedCounter).increment(1);
verify(sqsMessagesFailedCounter).increment();
}

@Test
void populateS3Reference_should_interact_with_getUrlDecodedKey() throws NoSuchMethodException, InvocationTargetException, IllegalAccessException {
Contributor

Can we not test an s3Entity with a key that has a space in it? Why do we need reflection?

Collaborator Author

I tried testing it that way, but I didn't find a way to create an instance of S3EventNotificationRecord without using mocks. With mocks, we just return whatever we define in the when-then stubbing.

I used reflection here to test a private method. I'm sure there are ways to improve the logic here.

@@ -219,7 +219,7 @@ private boolean isEventNameCreated(final S3EventNotification.S3EventNotification
private S3ObjectReference populateS3Reference(final S3EventNotification.S3EventNotificationRecord s3EventNotificationRecord) {
final S3EventNotification.S3Entity s3Entity = s3EventNotificationRecord.getS3();
return S3ObjectReference.bucketAndKey(s3Entity.getBucket().getName(),
- s3Entity.getObject().getKey())
+ s3Entity.getObject().getUrlDecodedKey())
Member

This is a static method in the model class. And since #1628 it has been in our code. I think it might be better to continue to use getKey, and move the code for getUrlDecodedKey() into S3Worker. That will make it easier to test.

I'm fine with following on with this later as well.

@asifsmohammed asifsmohammed merged commit 11cd112 into opensearch-project:main Oct 19, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 19, 2022
* Fix: S3 source key bug fix

Signed-off-by: Asif Sohail Mohammed <[email protected]>
(cherry picked from commit 11cd112)
@dlvenable dlvenable added this to the v2.1 milestone Oct 19, 2022
asifsmohammed added a commit that referenced this pull request Oct 19, 2022
* Fix: S3 source key bug fix

Signed-off-by: Asif Sohail Mohammed <[email protected]>
(cherry picked from commit 11cd112)

Co-authored-by: Asif Sohail Mohammed <[email protected]>
@dlvenable dlvenable modified the milestones: v2.1, v2.0.1 Oct 24, 2022
Successfully merging this pull request may close these issues.

[BUG] Unable to read from S3 key with spaces
5 participants