Transition Beats to ECS #8655

Closed
ruflin opened this issue Oct 19, 2018 · 14 comments

@ruflin
Member

ruflin commented Oct 19, 2018

With 7.0, Beats will transition to ECS: https://github.com/elastic/ecs. This meta issue tracks all changes needed in Beats. The list will be extended over time.

Migration Strategy

The overall migration strategy is to add an opt-in alias layer to 7.x that provides backward compatibility with 6.x data if needed. For some of the core fields used in the Infra / Logging UI, aliases are introduced in 6.x for the 7.x data.

6.x (6.6 / 6.7)

7.0

  • Remove old fields which had a 1-1 mapping like beat.hostname
  • Make agent.* overwritable for apm-server: move agent metadata to a processor #9952
  • Make sure all aliases from the migration contain the migrate: * flag
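
The alias layer could look roughly like the following. This is a minimal sketch, not the Beats implementation: it assumes Elasticsearch's field alias mapping type (`{"type": "alias", "path": ...}`), and both the `RENAMES` table and the `alias_mappings` helper are hypothetical names. The `beat.hostname` to `agent.hostname` rename is used only as an illustrative entry.

```python
import json

# Hypothetical rename table: old 6.x field name -> new 7.0 (ECS) field name.
# The single entry is an assumption used for illustration only.
RENAMES = {"beat.hostname": "agent.hostname"}


def alias_mappings(renames):
    """Build an Elasticsearch `properties` tree in which each old field
    name is a field alias pointing at its new path."""
    properties = {}
    for old, new in renames.items():
        node = properties
        parts = old.split(".")
        # Walk (or create) the intermediate object levels, e.g. "beat".
        for part in parts[:-1]:
            node = node.setdefault(part, {"properties": {}})["properties"]
        # The leaf is the alias itself.
        node[parts[-1]] = {"type": "alias", "path": new}
    return properties


if __name__ == "__main__":
    print(json.dumps(alias_mappings(RENAMES), indent=2))
```

With aliases like this in a 7.x index template, a query written against the old field name would resolve to the new field, which is what makes the layer usable as an opt-in compatibility step.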

Field changes

Libbeat adjustments

Beats processors

Auditbeat

Filebeat

Filebeat modules

Filebeat Module migrations

Metricbeat modules

Packetbeat

Journalbeat

Heartbeat

Winlogbeat

Varia

See also all issues tagged "ecs"

Others

Open questions:

  • Should we rename co.elastic.logs/fileset to co.elastic.logs/dataset for autodiscovery? (@exekias)
  • Should we change the metricsets config option in Metricbeat?
  • Proposal by @ruflin: keep it for now, as we are also keeping the fileset and metricset fields around.

Notes

  • The code side is not changed as part of this migration.
  • The Filebeat generated files must often be updated. Use the following two commands: INTEGRATION_TESTS=1 GENERATE=1 nosetests tests/system/test_modules.py -v and, for x-pack, MODULES_PATH=./module INTEGRATION_TESTS=1 GENERATE=1 nosetests tests/system/test_xpack_modules.py -v.
@webmat
Contributor

webmat commented Nov 14, 2018

@ruflin

  • I've created a section listing all modules. We can expand modules to one line per fileset only when needed. I get the feeling we'll end up doing some modules with 1 PR for the whole module, rather than per fileset.
  • I've also checked the tasks that were listed and have been merged (e.g. beat.name). If any of those required follow-up work, please add subtasks.

@exekias
Contributor

exekias commented Nov 16, 2018

@ruflin About fileset -> dataset: this relies on the Filebeat docs, which name these filesets: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-modules-overview.html

In my opinion that makes sense, as we are talking about files. They will generate datasets, and that's correct too, but as long as we name this fileset in the Filebeat docs I think the annotation should keep that nomenclature.

@ruflin
Member Author

ruflin commented Nov 19, 2018

@exekias We must change it in the docs too. Would this solve the issue?

@exekias
Contributor

exekias commented Nov 23, 2018

If we completely rename the thing, I would say yes, annotations must follow

@ruflin added the Team:Integrations label Nov 23, 2018
@webmat
Contributor

webmat commented Nov 26, 2018

@ruflin I've updated the "Beats processors" section. The list and fields to change should be pretty comprehensive now. Please take a look, to confirm I haven't missed something.

cc @roncohen

@webmat
Contributor

webmat commented Dec 20, 2018

@ruflin Just added this to the "Field changes" section. I think this would be best solved by moving ECS docs to asciidoc on the doc website:

  • Some ECS field definitions casually refer to other ECS Readme sections in the Beats docs. We need to address this better

@sophiec20
Contributor

sophiec20 commented Jan 16, 2019

For our UI ML Module automated testing, we do the following:

  • restore a data snapshot
  • call the Kibana API data recogniser
  • call the Kibana API module setup - this creates ML jobs, datafeeds and visualisations and dashboards
  • then run the ML jobs
  • then check ML results are visible
  • then check you can click through to the newly created dashboards (this bit is manual)

We currently use

  • filebeat-* where "fileset.module": "nginx" AND "fileset.name": "access"
  • filebeat-* where "fileset.module": "apache2" AND "fileset.name": "access"
  • auditbeat-* where "event.type": "syscall" for docker containers and hosts (tbc if ECS changes will affect ML in 7.0)

As we start with data snapshots in our existing test framework, is the Beats team able to supply snapshots of indices containing (a) pure new 7.0 ECS data and (b) backward-compatible indices? (a) is the priority.

@ruflin
Member Author

ruflin commented Jan 17, 2019

With 7.0, the above queries can simply rely on event.dataset, for example event.dataset: nginx.access. BTW we also renamed apache2 to apache to be in line with the Metricbeat module.

I assume the data you are looking for is nginx and apache data for the logs. What I could produce is a few lines of example data based on our test suite logs. Would that be enough, or do you need larger logs? If you have larger log files for nginx and apache, I can easily create the data.
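
The move from the two fileset.* fields to a single event.dataset field can be sketched as Elasticsearch query bodies. This is only an illustration under assumptions: the function names `query_6x` and `query_7x` are hypothetical, and it assumes the fields are mapped so that exact `term` matches work.

```python
import json


def query_6x(module, fileset):
    """6.x style: filter on both fileset.module and fileset.name."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"fileset.module": module}},
                    {"term": {"fileset.name": fileset}},
                ]
            }
        }
    }


def query_7x(module, fileset):
    """7.0 style: a single term on event.dataset, e.g. "nginx.access"."""
    return {"query": {"term": {"event.dataset": f"{module}.{fileset}"}}}


if __name__ == "__main__":
    print(json.dumps(query_6x("nginx", "access"), indent=2))
    print(json.dumps(query_7x("nginx", "access"), indent=2))
```

Note that for the apache logs the module part changes as well, since apache2 was renamed to apache.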

@sophiec20
Contributor

@ruflin Can we please start with some example data snapshots?
(We do have larger logs, but I'm not sure if we retained them in their original raw format as they were anonymised. Will need to check and will share with you if we can).

@ruflin
Member Author

ruflin commented Jan 22, 2019

Our test logs can be found here:

I initially thought I would provide you with a snapshot or es_archiver zip file from ES. But I think it's easier if whoever works on these files ingests the data themselves. That way your apache files can also be used, and it no longer has to go through me.

To make the module work with any file path, var.paths must be adjusted in the module config: https://github.com/elastic/beats/blob/master/filebeat/modules.d/apache.yml.disabled
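
As a rough sketch of the var.paths adjustment, the snippet below renders a modules.d-style entry for one fileset. The helper `module_config` and the example path are hypothetical; the layout mirrors the general shape of the module config files, and the linked apache.yml.disabled remains the authoritative reference.

```python
def module_config(module, fileset, paths):
    """Render a modules.d-style YAML snippet that sets var.paths
    for a single fileset of a module."""
    quoted = ", ".join(f'"{p}"' for p in paths)
    return (
        f"- module: {module}\n"
        f"  {fileset}:\n"
        f"    enabled: true\n"
        f"    var.paths: [{quoted}]\n"
    )


if __name__ == "__main__":
    # Hypothetical log location; point this at wherever the test logs live.
    print(module_config("apache", "access", ["/tmp/test-logs/apache/*.log"]))
```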

For testing use the snapshot builds:

@ruflin
Member Author

ruflin commented Jan 24, 2019

@webmat Above, I checked off the checkbox about normalising http.request.method. I suggest we skip this for now.

@webmat
Contributor

webmat commented Jan 24, 2019

@ruflin Understood. If I can get around to it in time would you have any objections, though?

Not 100% sure I can (e.g. if we don't have what we need in field generation), but I'd like to get it done if possible.

@ruflin
Member Author

ruflin commented Jan 24, 2019

@webmat No objections :-)

@ruflin
Member Author

ruflin commented Feb 5, 2019

Closing this issue as all the checkboxes have been done except the following 3:

  • http method lower case: We will figure out something at a later stage
  • osquery module: According to @webmat conversion to ECS does not make sense
  • Breaking changes files: PR is open and will be merged soonish.

A big thank you to everyone that contributed to getting this massive effort done.


4 participants