Create a model for binary data #4096

Open
dlvenable opened this issue Feb 8, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@dlvenable
Member

Is your feature request related to a problem? Please describe.

Data Prepper has sources which can pull binary data (mostly in base64 format), and we are adding some new processors which can decompress binary data. It would be good to handle binary data consistently so that we don't have too much encoding-handling code spread across the project, which could result in some processor combinations breaking a pipeline.

I'd like Data Prepper's sources and sinks to know their own encodings as much as possible.

Describe the solution you'd like

Create a new BinaryData model in data-prepper-api. Allow this to be set and retrieved from the Event model. This model can also be designed to avoid unnecessary encoding/decoding.

When a Data Prepper source gets binary data, it wraps it in the BinaryData model. Similarly, a sink uses that same model when writing.

There are some situations where the source cannot know the encoding. For example, JSON could have binary data encoded as base64 or in some other encoding. In such cases, the pipeline author will need to know the encoding and convert it accordingly.

class BinaryData {
  public byte[] getBinaryData() { ... }

  public static BinaryData fromBase64Data(String base64) { ... }
}
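To make the "avoid unnecessary encoding/decoding" idea concrete, here is a minimal sketch of how such a model might behave. Only the `BinaryData` name, `getBinaryData()`, and `fromBase64Data(String)` come from the proposal above; `fromBytes` and `toBase64` are hypothetical additions for illustration, and none of this is an existing data-prepper-api class.

```java
import java.util.Base64;

// Hypothetical sketch of the proposed model, not existing Data Prepper code.
// It keeps whichever form it was given and converts only on demand.
final class BinaryData {
  private byte[] bytes;   // raw form; null until needed when built from base64
  private String base64;  // encoded form; null until needed when built from bytes

  private BinaryData(byte[] bytes, String base64) {
    this.bytes = bytes;
    this.base64 = base64;
  }

  // Hypothetical factory for sources that already hold raw bytes.
  public static BinaryData fromBytes(byte[] bytes) {
    return new BinaryData(bytes.clone(), null);
  }

  // From the proposal: wrap base64 text without decoding it yet.
  public static BinaryData fromBase64Data(String base64) {
    return new BinaryData(null, base64);
  }

  // Decodes at most once; invalid base64 surfaces here as IllegalArgumentException.
  public byte[] getBinaryData() {
    if (bytes == null) {
      bytes = Base64.getDecoder().decode(base64);
    }
    return bytes.clone();
  }

  // Hypothetical accessor: a base64 sink can reuse the original text as-is.
  public String toBase64() {
    if (base64 == null) {
      base64 = Base64.getEncoder().encodeToString(bytes);
    }
    return base64;
  }
}
```

With this shape, a base64 source feeding a base64 sink never decodes at all, while a decompress processor in between would call `getBinaryData()` and pay for one decode. The sketch is not thread-safe; a real implementation would need to decide how to handle concurrent access.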

There may also be a good way to decouple the binary data from the encoding itself.
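One possible way to decouple the two, sketched under assumptions: the encoding becomes its own type, and the data holder only stores raw bytes. The names `BinaryEncoding`, `RawBinaryData`, `decode`, and `encodeAs` are hypothetical and not part of the proposal; the two encodings shown are just the ones `java.util.Base64` provides out of the box.

```java
import java.util.Base64;

// Hypothetical: each encoding knows how to convert between text and raw bytes.
enum BinaryEncoding {
  BASE64 {
    byte[] decode(String text) { return Base64.getDecoder().decode(text); }
    String encode(byte[] bytes) { return Base64.getEncoder().encodeToString(bytes); }
  },
  BASE64_URL {
    byte[] decode(String text) { return Base64.getUrlDecoder().decode(text); }
    String encode(byte[] bytes) { return Base64.getUrlEncoder().encodeToString(bytes); }
  };

  abstract byte[] decode(String text);
  abstract String encode(byte[] bytes);
}

// Hypothetical holder that is encoding-agnostic: it stores only raw bytes.
final class RawBinaryData {
  private final byte[] bytes;

  private RawBinaryData(byte[] bytes) {
    this.bytes = bytes;
  }

  // The caller (a source, or a pipeline author via configuration) names the encoding.
  static RawBinaryData decode(String text, BinaryEncoding encoding) {
    return new RawBinaryData(encoding.decode(text));
  }

  byte[] getBinaryData() {
    return bytes.clone();
  }

  // A sink picks whatever encoding it needs on the way out.
  String encodeAs(BinaryEncoding encoding) {
    return encoding.encode(bytes);
  }
}
```

For example, `RawBinaryData.decode(text, BinaryEncoding.BASE64).encodeAs(BinaryEncoding.BASE64_URL)` re-encodes a value without the holder knowing about either format. The trade-off against caching the encoded form inside the model is that this version always decodes on the way in.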

Describe alternatives you've considered (Optional)

There may be some useful third-party libraries that have a similar solution we could make use of. Even so, I'd still propose we keep our own interface and use such a library only for the internals.

Additional context

Coming from this comment: #4016 (comment)


2 participants