Add support for routing rules or pre-defined pipelines in input packages #566

Open
kaiyan-sheng opened this issue Jul 14, 2023 · 5 comments
Labels
Team:Fleet Label for the Fleet team

Comments

@kaiyan-sheng
Contributor

kaiyan-sheng commented Jul 14, 2023

We want to create a new input package called AWS Firehose that streams logs (and, in the future, metrics) from Amazon Kinesis Data Firehose into the Elastic Stack. This input package needs to reroute incoming logs and metrics to different data streams based on pre-defined rerouting rules or pipelines.

For example, we would like to route a log entry to the aws.cloudtrail dataset if the value of the aws.cloudwatch.log_stream field contains CloudTrail.

processors:
  - reroute:
      if: "ctx['aws.cloudwatch.log_stream'].contains('CloudTrail')"
      dataset: aws.cloudtrail
      namespace: default

Either supporting routing rules directly or supporting pre-defined pipelines will work for our use case.
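
To make the intended behavior concrete, here is a minimal Python sketch of the routing decision described above. It is only an illustration under stated assumptions, not how Fleet or Elasticsearch implements rerouting: the names `RerouteRule` and `target_data_stream` are hypothetical, the fallback dataset `awsfirehose` is a placeholder, events are assumed to be flat dicts keyed by dotted field names, and the first matching rule is assumed to win. The only part taken from Elastic conventions is the `<type>-<dataset>-<namespace>` data stream naming scheme.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class RerouteRule:
    """Hypothetical stand-in for one reroute rule (condition + target)."""
    condition: Callable[[dict], bool]
    dataset: str
    namespace: str = "default"


def target_data_stream(
    doc: dict,
    rules: Sequence[RerouteRule],
    data_type: str = "logs",
    default_dataset: str = "awsfirehose",  # placeholder name, not from the issue
    default_namespace: str = "default",
) -> str:
    """Return the data stream name for doc; the first matching rule wins."""
    for rule in rules:
        if rule.condition(doc):
            return f"{data_type}-{rule.dataset}-{rule.namespace}"
    # No rule matched: the event stays in the package's own data stream.
    return f"{data_type}-{default_dataset}-{default_namespace}"


# The CloudTrail rule from the example above, expressed as a predicate.
RULES = [
    RerouteRule(
        condition=lambda doc: "CloudTrail" in (doc.get("aws.cloudwatch.log_stream") or ""),
        dataset="aws.cloudtrail",
    ),
]

event = {"aws.cloudwatch.log_stream": "123456789012_CloudTrail_us-east-1"}
print(target_data_stream(event, RULES))
```

With this rule set, an event whose log stream name contains CloudTrail resolves to `logs-aws.cloudtrail-default`, and anything else falls through to the placeholder default.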

cc @jsoriano @mrodm @tommyers-elastic

@jlind23
Collaborator

jlind23 commented Jul 18, 2023

Thanks @kaiyan-sheng for creating this. As of today this is not a priority for us, as we delivered the first part related to standard packages. I will add this to our short-term roadmap though. cc @nimarezainia

@joshdover
Contributor

Is this really an input package if it has routing rules that are tied to specific destination data streams? In my mind, input packages should be very lightweight and not bake in any assumptions about how the user wants to use the data or where they want the data to end up. They should only serve as a starting point for building out a custom integration with Elastic Agent as the data shipper.

Why not use an integration package for this use case?

@kaiyan-sheng
Contributor Author

kaiyan-sheng commented Jul 19, 2023

@joshdover Good point! I'm not sure about this. Do we have any documentation or definition of input packages vs. integration packages?
For the package we are creating, the only thing it does is reroute. Firehose will push logs into Elastic, and we use this package with predefined pipelines to route them to the correct destination data streams. If we make it an integration package, it will work for sure. But since, besides rerouting, it only brings in logs (and metrics in the future) from multiple AWS services, I thought it fit as an input package.

@ruflin
Contributor

ruflin commented Jul 31, 2023

> They should only serve as a starting point for building out a custom integration with Elastic Agent as the data shipper.

We might have to come up with some better naming. I see where this assumption comes from. This works well in the context of logfile, but I think it falls apart with firehose and similar. Here no agent policy is needed in the first place.

Putting aside the term input package: what we need here is "something" that sets up a target dataset, has some mappings in place, and potentially has some default routing rules. Users could later add their own. Where this eventually overlaps with input packages is the idea that when data is routed to a specific dataset, this target dataset, for which an integration is set up, should also get the same default mappings. @andresrc has started to do some write-ups on the different "layers" related to this.

It seems for now the best path forward is creating an integration(?).

@jlind23
Collaborator

jlind23 commented Aug 30, 2023

@joshdover As asked above by @ruflin, would the best path forward here be creating a standard integration instead of an input package?
