[BUG] Unable to read from S3 key with spaces #1923

Closed
dlvenable opened this issue Oct 15, 2022 · 0 comments · Fixed by #1926

dlvenable commented Oct 15, 2022

Describe the bug

The Data Prepper S3 source is unable to read from keys with spaces in them.

To Reproduce

Simple pipeline:

log-pipeline:
  source:
    s3:
      notification_type: "sqs"
      codec:
        newline:
      sqs:
        queue_url: "https://sqs.us-east-1.amazonaws.com/123456789012/MyPipeline"
      aws:
        region: "us-east-1"

  processor:
  sink:
    - stdout:

Here is a sample file:

cat "has some spaces.log"
This file has spaces.
Two to be precise.

Upload that file:

aws s3 cp "has some spaces.log" s3://s3-source-manual-test/

Expected behavior

The S3 object should be read just like an object whose key has no spaces.

Actual Outputs

2022-10-15T16:56:08,659 [Thread-1] WARN  org.opensearch.dataprepper.plugins.source.SqsWorker - Unable to process S3Object: s3ObjectReference=[bucketName=***, key=has+some+spaces.log].
software.amazon.awssdk.services.s3.model.NoSuchKeyException: The specified key does not exist. (Service: S3, Status Code: 404, Request ID: ***, Extended Request ID: ***)

Environment (please complete the following information):

Data Prepper Docker build from main.

docker run -p 4900:4900 -p 2021:2021 \
  -v ${PWD}/pipelines.yaml:/usr/share/data-prepper/pipelines/pipelines.yaml \
  -v ${HOME}/.aws:/root/.aws \
  opensearch-data-prepper:2.1.0-SNAPSHOT

Files uploaded via macOS using the AWS CLI.

Additional context

The following values come from the S3 Console:

  • S3 URI: s3://***/has some spaces.log
  • Object URL: https://***.s3.amazonaws.com/has+some+spaces.log
  • ARN: arn:aws:s3:::***/has some spaces.log

SQS Body:

{"Records":[{"eventVersion":"2.1","eventSource":"aws:s3","awsRegion":"us-east-1","eventTime":"2022-10-15T16:55:22.934Z","eventName":"ObjectCreated:Put","userIdentity":{"principalId":"AWS:***:***"},"requestParameters":{"sourceIPAddress":"x.y.x.y"},"responseElements":{"x-amz-request-id":"***","x-amz-id-2":"***"},"s3":{"s3SchemaVersion":"1.0","configurationId":"SQSSourceTest","bucket":{"name":"***","ownerIdentity":{"principalId":"***"},"arn":"arn:aws:s3:::***"},"object":{"key":"has+some+spaces.log","size":41,"eTag":"f7d5180f521d7cc51b6bfa64d72fca3b","sequencer":"***"}}}]}
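
The values above point at a likely cause: S3 event notifications URL-encode the object key (so `has some spaces.log` arrives as `has+some+spaces.log`), while a GetObject request needs the raw key. The following is a minimal sketch of decoding the key before the object is requested. The class and method names (`S3KeyDecoder`, `decodeKey`) are hypothetical and used only for illustration; this is not necessarily how the fix in #1926 was implemented.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Data Prepper.
public class S3KeyDecoder {

    // Decodes a URL-encoded key taken from an S3 event notification.
    // '+' becomes a space and %XX escapes are resolved as UTF-8.
    static String decodeKey(String notificationKey) {
        return URLDecoder.decode(notificationKey, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The key exactly as it appears in the SQS body above.
        String encodedKey = "has+some+spaces.log";
        System.out.println(decodeKey(encodedKey)); // prints "has some spaces.log"
    }
}
```

The decoded value matches the key shown in the S3 URI and ARN from the console, which is the form a GetObject request would need. Note that `URLDecoder.decode(String, Charset)` requires Java 10 or later.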

dlvenable added the bug and untriaged labels Oct 15, 2022
asifsmohammed self-assigned this Oct 15, 2022
dlvenable added this to the v2.1 milestone Oct 15, 2022
dlvenable modified the milestones: v2.1, v2.0.1 Oct 24, 2022