Cherry-pick #18370 to 7.8: [Filebeat] Unescape characters in s3 file names #18412

Merged: 2 commits, May 11, 2020

kaiyan-sheng
Contributor

Cherry-pick of PR #18370 to 7.8 branch. Original message:

What does this PR do?

This PR unescapes percent-encoded (URL-encoded) three-character sequences in S3 object keys back to the original characters. For example: "%3D" becomes "=".

Why is it important?

When a user organizes log files in an S3 bucket using folders, the SQS notification URL-encodes the log file path, as in the message below:

{"Records":[{"eventVersion":"2.1","eventSource":"aws:s3","awsRegion":"us-east-1","eventTime":"2020-05-07T21:02:38.676Z","eventName":"ObjectCreated:Put","userIdentity":{"principalId":"AWS:AIDAWHL7AXDB2IME26NB3"},"requestParameters":{"sourceIPAddress":"174.29.210.150"},"responseElements":{"x-amz-request-id":"B86806D8E85F4A5F","x-amz-id-2":"yw1VtzJIcckqIpEjmp/oeesJj3yplnGR48PDCX0/nS8Qj5VXWGdzCrjSzAv0cRio/oAXihdDwscZE7pm32CHQH8gbjM60Qu6"},"s3":{"s3SchemaVersion":"1.0","configurationId":"ObjectCreated","bucket":{"name":"test-fb-ks","ownerIdentity":{"principalId":"A2EOMKP5A4DS45"},"arn":"arn:aws:s3:::test-fb-ks"},"object":{"key":"year%3D2020/month%3D05/test1.txt","size":6,"eTag":"3e7705498e8be60520841409ebc69bc1","sequencer":"005EB47773C10D8A81"}}}]}

The file in S3 is year=2020/month=05/test1.txt, but in the SQS message the key is converted to year%3D2020/month%3D05/test1.txt.

This conversion needs to be undone when Filebeat reads the S3 file that the SQS message points to. Otherwise, the file will not be found.
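
As an illustration of the idea only, not the code in this PR, the following minimal Go sketch decodes the relevant part of an SQS message body and unescapes each object key with the standard library's url.QueryUnescape. The s3Event type and the objectKeys function are hypothetical names chosen for this example; they are not Filebeat's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// s3Event mirrors only the notification fields needed for this example.
// Hypothetical type: it is not taken from the Filebeat source.
type s3Event struct {
	Records []struct {
		S3 struct {
			Object struct {
				Key string `json:"key"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// objectKeys returns the unescaped object key of every record found
// in an SQS message body (hypothetical helper).
func objectKeys(body string) ([]string, error) {
	var ev s3Event
	if err := json.Unmarshal([]byte(body), &ev); err != nil {
		return nil, fmt.Errorf("decoding SQS message: %w", err)
	}
	keys := make([]string, 0, len(ev.Records))
	for _, r := range ev.Records {
		// Turn sequences such as "%3D" back into "=".
		key, err := url.QueryUnescape(r.S3.Object.Key)
		if err != nil {
			return nil, fmt.Errorf("unescaping key %q: %w", r.S3.Object.Key, err)
		}
		keys = append(keys, key)
	}
	return keys, nil
}

func main() {
	// Trimmed-down version of the notification shown above.
	body := `{"Records":[{"s3":{"object":{"key":"year%3D2020/month%3D05/test1.txt"}}}]}`
	keys, err := objectKeys(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(keys[0]) // year=2020/month=05/test1.txt
}
```

Note that url.QueryUnescape also decodes "+" as a space, while url.PathUnescape leaves "+" untouched. Which of the two matches the encoding used in the notification should be checked against the AWS documentation before relying on either.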

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

https://discuss.elastic.co/t/filebeat-s3-issue/230441

* unescape characters in s3 file names

(cherry picked from commit 1e2ec4e)
@botelastic (bot) added the needs_team label May 11, 2020
@kaiyan-sheng self-assigned this May 11, 2020
@kaiyan-sheng added the Team:Platforms label May 11, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@elasticmachine
Collaborator

elasticmachine commented May 11, 2020

💚 Build Succeeded

Test stats 🧪

Test Results
Failed 0
Passed 1182
Skipped 128
Total 1310

@kaiyan-sheng merged commit b69ea36 into elastic:7.8 May 11, 2020
@kaiyan-sheng deleted the backport_18370_7.8 branch May 11, 2020 15:53
@kaiyan-sheng restored the backport_18370_7.8 branch May 11, 2020 15:53
@kaiyan-sheng deleted the backport_18370_7.8 branch May 11, 2020 15:53
@kaiyan-sheng restored the backport_18370_7.8 branch May 11, 2020 15:53
Labels: backport, needs_team, review, Team:Platforms

3 participants