ECS Ingest processor #181

Open
ruflin opened this issue Nov 19, 2018 · 13 comments
Labels
enhancement New feature or request

Comments

Contributor

ruflin commented Nov 19, 2018

With ECS we know the exact structure of some fields. Based on this some common processing happens. A few examples:

source.ip -> geoip processor to enrich with geo information
user_agent.original -> user_agent enrichment

Since Elasticsearch 6.5 it is possible to have a pipeline that calls another pipeline: https://www.elastic.co/guide/en/elasticsearch/reference/6.5/pipeline-processor.html We could provide an ECS pipeline that does all this default processing. All users would have to do is add it to their ingest pipeline.
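
A minimal sketch of how this composition might look, written as Python dictionaries that mirror the JSON pipeline bodies. The pipeline ID `ecs-defaults` and the `source.geo` target field are assumptions for illustration, not an agreed design; the `geoip`, `user_agent`, and `pipeline` processors are the ones mentioned above.

```python
import json

# Hypothetical ID for the shared pipeline; not an official name.
ECS_DEFAULTS_ID = "ecs-defaults"

# Shared pipeline holding the default processing for well-known ECS fields.
ecs_defaults = {
    "description": "Default enrichment for well-known ECS fields (sketch)",
    "processors": [
        {"geoip": {"field": "source.ip", "target_field": "source.geo"}},
        {"user_agent": {"field": "user_agent.original"}},
    ],
}

# A user's own pipeline opts in by calling the shared one by name.
my_module = {
    "description": "Module-specific parsing, then shared ECS processing",
    "processors": [
        {"pipeline": {"name": ECS_DEFAULTS_ID}},
    ],
}

print(json.dumps(my_module, indent=2))
```

Each dictionary would be sent as the body of a pipeline definition; the module pipeline stays small because the shared processing lives in one place.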

Over time we could add more processing to such a pipeline. For example, if we have a convention that http.request.method should always be upper case, the Uppercase Processor could be applied to this field. Other conventions could be enforced the same way.
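
As a rough illustration of the upper-case convention, here is the processor definition as a Python dictionary, next to a local stand-in that applies the same rule to a nested document. The helper `apply_uppercase_method` is hypothetical and only simulates what the processor would do.

```python
# Processor definition in the shape an ingest pipeline expects.
uppercase_method = {
    "uppercase": {"field": "http.request.method", "ignore_missing": True}
}


def apply_uppercase_method(doc: dict) -> dict:
    """Local stand-in: upper-case http.request.method if it is present."""
    request = doc.get("http", {}).get("request", {})
    method = request.get("method")
    if isinstance(method, str):
        request["method"] = method.upper()
    return doc
```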

Contributor

webmat commented Nov 20, 2018

It just occurred to me that you proposed this on the ECS repo, not Beats.

So you're thinking of reference implementations (more or less), correct? They could be used for Beats, but they would be published here independently of Beats implementation details, and usable by anyone who needs them?

I love the idea.

I think we should experiment concretely with them on Beats first, before building too much "infrastructure" to support this here, however. WDYT?

Contributor Author

ruflin commented Nov 20, 2018

The idea definitely came from the duplicated code we have in the ingest pipelines of Beats, but it was meant to be more general. I don't think it's something we should do right now, but I needed a place to write down the idea and get other thoughts on it.

Contributor

webmat commented Nov 20, 2018

Here are some ideas that have occurred to me while working on the Beats migrations. They're broken up into very small pieces, but we could likely join a few together to create these reference pipelines:

  • Categorize private/public IP addresses
    • Replace the nginx module's implementation, which doesn't support IPv6
  • Extract source IP out of x-forwarded-for
    • See the nginx module implementation, which gets the first public IP in the list, not just the first IP... (apparently there's some subtlety to this?)
  • GeoIP
  • User Agent parsing
    • Perform all necessary field renames
    • Consider populating user_agent.version and user_agent.os.version with full version strings. Right now we only have the numbers broken out into separate fields.
  • Break apart / reconstruct URL?
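
To make the first two ideas concrete, here is a minimal sketch using Python's standard `ipaddress` module, which handles both IPv4 and IPv6. The function names and the two-way "public"/"private" split are assumptions for illustration; this is not the nginx module's actual logic.

```python
import ipaddress
from typing import Optional


def categorize_ip(value: str) -> str:
    """Return "public" for globally routable addresses, else "private".

    Assumption: everything that is not globally routable (RFC 1918,
    loopback, link-local, unique local IPv6, ...) is lumped into "private".
    """
    ip = ipaddress.ip_address(value)
    return "public" if ip.is_global else "private"


def first_public_ip(x_forwarded_for: str) -> Optional[str]:
    """Return the first public IP in an X-Forwarded-For value, if any."""
    for candidate in x_forwarded_for.split(","):
        candidate = candidate.strip()
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            # Skip entries such as "unknown" that are not IP addresses.
            continue
        if ip.is_global:
            return str(ip)
    return None
```

Picking the first public entry rather than the first entry matters because the list can start with private addresses added by internal proxies.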

Contributor

webmat commented Nov 22, 2018

Other ideas

  • populate related.ip automatically based on expected ECS ip fields
  • calculate network.community_id based on expected ECS ip fields
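
A sketch of both ideas, under stated assumptions. The list of IP fields is an illustrative guess at "expected ECS ip fields". The hash follows my reading of version 1 of the Community ID flow hashing scheme for TCP/UDP only (ICMP has extra rules that are not covered here), so it should be checked against a reference implementation before being relied on.

```python
import base64
import hashlib
import ipaddress
import struct

# Assumed set of ECS fields that may carry an IP; adjust as needed.
IP_FIELDS = ("source.ip", "destination.ip", "client.ip", "server.ip", "host.ip")


def get_path(doc, dotted):
    """Look up a dotted field name in a nested dict; None if absent."""
    cur = doc
    for key in dotted.split("."):
        if not isinstance(cur, dict) or key not in cur:
            return None
        cur = cur[key]
    return cur


def related_ips(doc):
    """Collect unique IPs from the known fields, preserving order."""
    seen = []
    for field in IP_FIELDS:
        value = get_path(doc, field)
        values = value if isinstance(value, list) else [value]
        for ip in values:
            if ip is not None and ip not in seen:
                seen.append(ip)
    return seen


def community_id(src_ip, src_port, dst_ip, dst_port, proto, seed=0):
    """Direction-independent flow hash for TCP/UDP (sketch of version 1)."""
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed
    # Order the endpoints so both directions of a flow hash identically.
    if (src, src_port) > (dst, dst_port):
        src, dst = dst, src
        src_port, dst_port = dst_port, src_port
    data = (
        struct.pack("!H", seed)
        + src
        + dst
        + struct.pack("!BB", proto, 0)  # protocol number plus one padding byte
        + struct.pack("!HH", src_port, dst_port)
    )
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")
```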

Contributor

webmat commented Dec 3, 2018

Contributor

webmat commented Dec 4, 2018

  • Extract the subdomain and determine the "registrable domain" based on the domain & the public suffix list. Consider the Painless DomainSplit function

cc @andrewkroh
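
A rough sketch of the domain split, assuming a tiny hard-coded stand-in for the public suffix list. `SAMPLE_SUFFIXES` and `split_domain` are hypothetical names; the real list also has wildcard and exception rules that this ignores, and this is not the Painless DomainSplit implementation.

```python
from typing import Optional, Tuple

# Tiny stand-in for the public suffix list; a real pipeline would load the full list.
SAMPLE_SUFFIXES = frozenset({"com", "org", "uk", "co.uk"})


def split_domain(
    host: str, suffixes=SAMPLE_SUFFIXES
) -> Tuple[Optional[str], Optional[str]]:
    """Return (subdomain, registrable_domain), or (None, None) if unknown."""
    normalized = host.lower().rstrip(".")
    if normalized in suffixes:
        # The host is itself a public suffix, so nothing is registrable.
        return None, None
    labels = normalized.split(".")
    # Try the longest candidate suffix first so "co.uk" wins over "uk".
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in suffixes:
            registrable = ".".join(labels[i - 1:])
            subdomain = ".".join(labels[:i - 1]) or None
            return subdomain, registrable
    return None, None
```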

Contributor

webmat commented Dec 18, 2018

  • Perform the extraction of IP address and domain from .address fields
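
One possible shape for this, sketched in Python: if the `.address` value parses as an IP it goes to `.ip`, otherwise it is treated as a domain. The helper name `split_address` and the assumption that the address is a plain string without a port are mine.

```python
import ipaddress


def split_address(doc: dict, prefix: str) -> dict:
    """Populate <prefix>.ip or <prefix>.domain from <prefix>.address."""
    section = doc.get(prefix)
    if not isinstance(section, dict) or "address" not in section:
        return doc
    address = section["address"]
    try:
        section["ip"] = str(ipaddress.ip_address(address))
    except ValueError:
        # Not an IP literal, so assume it is a host or domain name.
        section["domain"] = address
    return doc
```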

Contributor

webmat commented Jan 7, 2019

  • Save original @timestamp value in event.created, prior to parsing an event's timestamp
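
The ordering matters here: the copy has to happen before the parsed value overwrites `@timestamp`. A small sketch of that sequence, with a hypothetical helper name and the parsed timestamp passed in as an already formatted string:

```python
def apply_parsed_timestamp(doc: dict, parsed_timestamp: str) -> dict:
    """Keep the original @timestamp in event.created, then overwrite it."""
    original = doc.get("@timestamp")
    if original is not None:
        # setdefault avoids clobbering an event.created set earlier.
        doc.setdefault("event", {}).setdefault("created", original)
    doc["@timestamp"] = parsed_timestamp
    return doc
```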

Contributor

webmat commented Jan 9, 2019

  • Fallback pipeline to populate message based on a few key fields

Contributor

webmat commented Jan 28, 2019

Contributor

webmat commented Apr 8, 2019

Contributor

webmat commented Jun 7, 2019

  • Create an ingest pipeline that converts GELF to ECS
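
As a starting point for discussion, a sketch of what such a conversion could do for the core GELF fields. The field mapping below (`host` to `host.name`, `short_message` to `message`, `level` to a syslog severity name in `log.level`) is an assumption, not an agreed mapping, and custom `_`-prefixed GELF fields are left out.

```python
from datetime import datetime, timezone

# Syslog severity names indexed by GELF's numeric level (0-7).
SYSLOG_LEVELS = (
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
)


def gelf_to_ecs(gelf: dict) -> dict:
    """Map core GELF fields onto ECS-style fields (assumed mapping)."""
    ecs = {"message": gelf.get("short_message")}
    if "host" in gelf:
        ecs["host"] = {"name": gelf["host"]}
    if "timestamp" in gelf:
        # GELF timestamps are seconds since the Unix epoch.
        ts = datetime.fromtimestamp(gelf["timestamp"], tz=timezone.utc)
        ecs["@timestamp"] = ts.isoformat()
    level = gelf.get("level")
    if isinstance(level, int) and 0 <= level < len(SYSLOG_LEVELS):
        ecs["log"] = {"level": SYSLOG_LEVELS[level]}
    return ecs
```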

Contributor

webmat commented Aug 27, 2019
