This repository has been archived by the owner on May 25, 2022. It is now read-only.

Allow all operators to batch and pass entries through the pipeline #67

Closed
djaglowski opened this issue Mar 17, 2021 · 1 comment
Labels
needs design Requires a design proposal before implementation

Comments

@djaglowski
Member

See this issue for additional context.

The general idea is that we could save computation later in the pipeline if we are able to keep related logs together.

Most input operators have an implied local context that could be used for batching, but we need to consider carefully whether it really applies in every situation. What is the implied context for a file_input operator that reads all files in a directory? Are those entries necessarily related by resource, or only by directory? If a syslog_input operator receives from multiple systems, how do we handle batching?

Most parsers can likely just iterate over the batch of entries.
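
To illustrate what "iterate over the batch" might look like, here is a minimal sketch. The `Entry` type is a simplified stand-in for `entry.Entry`, and `parseKeyValue` and `parseBatch` are hypothetical names invented for this example, not existing operator APIs.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a simplified stand-in for entry.Entry (assumption: the real
// type carries more fields than shown here).
type Entry struct {
	Record string            // raw log line
	Parsed map[string]string // filled in by the parser
}

// parseKeyValue is a hypothetical per-entry parser that splits
// space-separated "key=value" pairs out of the raw record.
func parseKeyValue(e *Entry) {
	e.Parsed = make(map[string]string)
	for _, field := range strings.Fields(e.Record) {
		if k, v, ok := strings.Cut(field, "="); ok {
			e.Parsed[k] = v
		}
	}
}

// parseBatch applies the per-entry parser to every entry in the batch,
// modifying the entries in place.
func parseBatch(entries []Entry) {
	for i := range entries {
		parseKeyValue(&entries[i])
	}
}

func main() {
	batch := []Entry{
		{Record: "level=info msg=started"},
		{Record: "level=error msg=failed"},
	}
	parseBatch(batch)
	fmt.Println(batch[0].Parsed["level"], batch[1].Parsed["msg"])
}
```

The per-entry logic stays unchanged in this sketch; only the outer loop is new, which is why most parsers could probably adopt batching with little rework.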

This might require formalizing the concept of a resource at a level above entry:

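// batch is a hypothetical name; the proposal leaves the type unnamed.
type batch struct {
  resource map[string]string // attributes shared by every entry in the batch
  entries  []entry.Entry     // entries that belong to that resource
}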
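
As a rough sketch of how an input that receives from multiple sources (such as the syslog case above) might fill a struct like this, the example below groups entries by their resource attributes. It assumes each entry carries its own resource map. The names `Batch`, `resourceKey`, and `groupByResource` are hypothetical, and `Entry` is again a simplified stand-in for `entry.Entry`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Entry is a simplified stand-in for entry.Entry (assumption).
type Entry struct {
	Resource map[string]string
	Record   string
}

// Batch mirrors the struct proposed above, with exported fields.
type Batch struct {
	Resource map[string]string
	Entries  []Entry
}

// resourceKey builds a deterministic string key from a resource map by
// sorting its keys, so equal maps always produce equal keys.
func resourceKey(resource map[string]string) string {
	keys := make([]string, 0, len(resource))
	for k := range resource {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%q=%q;", k, resource[k])
	}
	return b.String()
}

// groupByResource collects entries with identical resources into one
// batch each, preserving the order in which resources were first seen.
func groupByResource(entries []Entry) []Batch {
	index := make(map[string]int) // resource key -> position in batches
	var batches []Batch
	for _, e := range entries {
		key := resourceKey(e.Resource)
		i, ok := index[key]
		if !ok {
			i = len(batches)
			index[key] = i
			batches = append(batches, Batch{Resource: e.Resource})
		}
		batches[i].Entries = append(batches[i].Entries, e)
	}
	return batches
}

func main() {
	entries := []Entry{
		{Resource: map[string]string{"host": "a"}, Record: "one"},
		{Resource: map[string]string{"host": "b"}, Record: "two"},
		{Resource: map[string]string{"host": "a"}, Record: "three"},
	}
	for _, b := range groupByResource(entries) {
		fmt.Println(b.Resource["host"], len(b.Entries))
	}
}
```

Grouping by the full resource map is only one possible choice. An input such as file_input might instead batch per file or per directory, which is exactly the open design question raised above.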
@djaglowski djaglowski added the needs design Requires a design proposal before implementation label Mar 17, 2021
@djaglowski
Member Author

Closing this issue because I think at this point there is a general aspiration to migrate the codebase to work with pdata natively, which would effectively cover this.
