[Filebeat][aws-s3] "+" character replaced by a space in the s3 key #33998

Closed
pdelormekpler opened this issue Dec 8, 2022 · 3 comments
Labels: bug, Filebeat, Stalled, Team:Cloud-Monitoring

Comments


pdelormekpler commented Dec 8, 2022

Hi,

I'm testing the aws-s3 input in Filebeat to ship logs from S3 to Elasticsearch.

I'm using Filebeat 8.5.2 with this input config:

filebeat.inputs:
  - type: aws-s3
    bucket_arn: arn:aws:s3:::logs-XXX
    bucket_list_prefix: NNN/insight_daily/daily_insight_content/
    access_key_id: ...
    secret_access_key: ...
    number_of_workers: 5
    bucket_list_interval: 60s
    expand_event_list_from_field: Records
    default_region: eu-west-1

Everything starts fine, but then fetching the objects fails:

  1. Filebeat is able to list the contents of the bucket / subfolder.
  2. It then cannot find any of the listed files, because the key contains a "+" character, which gets replaced by a space.

Logs:

"error":{"message":"failed processing S3 event for object key \"NNN/insight_daily/daily_insight_content/2022-04-01T00:00:00 00:00/1.log\" in bucket \"logs-XXX\": failed to get s3 object (elapsed_time_ns=116900493): s3 GetObject failed: operation error S3: GetObject, https response error StatusCode: 404, RequestID: HQ9MQ4S7ZGTZ131S, HostID: G+8aU7zzkfzHIUjjTJM5AdKsJKRHKovA81Gi1wOAx3nmOgE/EFQbkYdl9QVwrgFd2DJQjlIJfk0=, NoSuchKey:

In my S3 bucket, the path is: NNN/insight_daily/daily_insight_content/2022-04-01T00:00:00+00:00/1.log

I saw that a related PR was merged, but it does not seem to work for "+", at least.

Thank you for your help.

@botelastic bot added the needs_team label on Dec 8, 2022
@pdelormekpler changed the title from [Filebeat] "+" character replaced by a space in the s3 key to [Filebeat][aws-s3] "+" character replaced by a space in the s3 key on Dec 8, 2022
@tetianakravchenko added the Team:Cloud-Monitoring label on Jan 24, 2023
@botelastic bot removed the needs_team label on Jan 24, 2023
@tetianakravchenko
Contributor

@elastic/obs-cloud-monitoring fyi


aspacca commented Jan 25, 2023

@pdelormekpler the S3 object key unescaping added in the PR you mentioned is indeed what turns the + sign into a space character.

See this Go playground snippet if you want to understand what happens.
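As a stand-in for that snippet, here is a minimal Go sketch of the effect. It assumes the input unescapes keys with something equivalent to `url.QueryUnescape` from `net/url`, which is not confirmed in this thread, and `unescapeKey` is a hypothetical helper name. The key is the one from the report above.

```go
package main

import (
	"fmt"
	"net/url"
)

// unescapeKey applies query-style unescaping to an S3 object key.
// Assumption: this stands in for what the input does; the name is hypothetical.
func unescapeKey(key string) (string, error) {
	return url.QueryUnescape(key)
}

func main() {
	// Key exactly as returned by the bucket listing (not URL-encoded).
	key := "NNN/insight_daily/daily_insight_content/2022-04-01T00:00:00+00:00/1.log"

	got, err := unescapeKey(key)
	if err != nil {
		fmt.Println("unescape failed:", err)
		return
	}

	// Query-style unescaping treats "+" as an encoded space, so the
	// unescaped key no longer matches the object stored in the bucket.
	fmt.Printf("listed key:    %q\n", key)
	fmt.Printf("unescaped key: %q\n", got)
	fmt.Println("changed:", got != key)
}
```

Unescaping a value that was never escaped is lossy: the "+" cannot be recovered afterwards, which is consistent with the 404 / NoSuchKey in the log above.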

There is indeed a bug in the Filebeat input: we should unescape the S3 object key only when the value comes from an S3-SQS notification. In your case, since you are using the direct S3 listing input, the key is not escaped and we should not unescape it.
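A rough sketch of that rule in Go, not the actual Filebeat code: `keySource`, `fromListing`, `fromSQS`, and `resolveKey` are hypothetical names, both sample keys are illustrative, and the use of `url.QueryUnescape` is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// keySource records where an object key was obtained (hypothetical type).
type keySource int

const (
	fromListing keySource = iota // key returned by listing the bucket: already literal
	fromSQS                      // key taken from an S3-SQS notification: URL-encoded
)

// resolveKey returns the key to request from S3.
// Only keys that arrived URL-encoded are unescaped.
func resolveKey(raw string, src keySource) (string, error) {
	if src != fromSQS {
		return raw, nil
	}
	return url.QueryUnescape(raw)
}

func main() {
	// Illustrative values: the same object seen through both paths.
	listed := "daily/2022-04-01T00:00:00+00:00/1.log"
	notified := "daily/2022-04-01T00%3A00%3A00%2B00%3A00/1.log"

	a, errA := resolveKey(listed, fromListing)
	b, errB := resolveKey(notified, fromSQS)
	if errA != nil || errB != nil {
		fmt.Println("resolve failed:", errA, errB)
		return
	}

	fmt.Printf("from listing: %q\n", a)
	fmt.Printf("from SQS:     %q\n", b)
	fmt.Println("same object key:", a == b)
}
```

Tracking the source explicitly keeps the "+" intact for listed keys while still decoding an encoded "%2B" from a notification.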

@aspacca added the bug and Filebeat labels on Jan 25, 2023

botelastic bot commented Jan 25, 2024

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1.
Thank you for your contribution!
