[Packetbeat] Refactor packetbeat for use with Elastic Agent #22134

andrewstucki · 2020-10-23T20:51:27Z

What does this PR do?

This PR refactors packetbeat to support agent-based configuration. To facilitate control by agent, it does a few things:

Gets rid of any global variables that are initialized based on the configuration state
Adds handling that overrides any configuration if packetbeat.inputs are provided
Makes the index that packetbeat pushes events to overridable on each input (flows or protocols)
Uses centralized configuration management reloading to restart the sniffer and all worker goroutines when configuration changes

Follow ups

We'll likely want to do a few things in the future, I can create issues if we want:

Right now a single sniffer gets initialized and then delegates all of the protocol parsing to internally registered protocol handlers. As a result, on each configuration reload the sniffer needs to be stopped and started if even a single input changes, so effectively all inputs are tied together. Not sure whether it would make sense to move to some sort of worker routine per protocol, but it would get rid of some of the custom reload logic I had to add.
Once again, due to the single sniffer, the sniffing interface can really only be configured once. Right now I just choose the "default" interface based off of the OS, but ideally this would be customizable. That said, what it might mean is that we need to make all packetbeat integrations a single package or change packetbeat so we run a sniffer per input.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Related: [Packetbeat] Create x-pack magefile #21979
Related: [Packetbeat] Split out elastic-agent config changes #22145

elasticmachine · 2020-10-23T20:51:29Z

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

elasticmachine · 2020-10-23T20:51:32Z

Pinging @elastic/ingest-management (Team:Ingest Management)

elasticmachine · 2020-10-23T21:07:51Z

💚 Build Succeeded

Build stats

Build Cause: [Pull request #22134 updated]
Start Time: 2020-10-29T15:29:18.030+0000
Duration: 76 min 42 sec

Test stats 🧪

Test	Results
Failed	0
Passed	16382
Skipped	1344
Total	17726

ruflin

We must decide if packetbeat should be packaged by default or not. @ph @mostlyjason

packetbeat/beater/packetbeat.go

packetbeat/publish/publish.go

ruflin · 2020-10-26T15:57:28Z

Would it be possible to support inputs at the top level instead of under the packetbeat prefix? That way Agent could just forward them.

In general, I think this PR is up to the team to review and merge, as it does not tackle Agent yet. Thanks for taking it out.

andrewstucki · 2020-10-26T16:27:20Z

@ruflin quick question about that--right now the agent configuration files do a lot more than just moving some keys around. That includes:

injecting agent metadata
adding stream dataset info into events
flattening streams input hashes into each individual input itself
and injecting per-index configuration for each input

having the inputs be at the top level doesn't mitigate any of the above, so agent still wouldn't be able to forward configuration from Kibana unmodified. We'd either still need to rely heavily on the current spec transpilation logic or add some processors into each integration package and update the configuration deserialization code in each beat to support deserializing (and using) some shared notion of a Streams configuration structure.

I stripped out any additions to the transpiler that would have been required to massage this into standard packetbeat configuration, but I would think that any of the above would likely be beyond the scope of getting packetbeat functioning with agent.

So, just to clarify: you want me to un-nest inputs and keep the additional configuration we'd need in the spec file in place, correct?

mostlyjason · 2020-10-26T19:49:44Z

> We must decide if packetbeat should be packaged by default or not. @ph @mostlyjason

Just checked our telemetry and packetbeat is used by a small percentage of clusters. Would save a lot of bandwidth if it was downloaded on demand. This requires an internet connection or proxy though.

CHANGELOG.next.asciidoc

packetbeat/beater/processor.go

packetbeat/beater/reloader.go

packetbeat/config/agent.go

ruflin · 2020-10-28T09:01:00Z

For the config and inputs: my favorite solution would be for Elastic Agent to just forward the policy to packetbeat and have packetbeat add all the processors and data needed. The processors are there to add important fields, but packetbeat knows exactly which fields have to be added without Agent sending them. It would be nice if you could check on your end how big the effort would be to go directly to this step. This is what we do with apm-server and plan to also do for filebeat / metricbeat. If we don't take this step directly, we will need to migrate to it at some stage.

Perhaps you can provide some examples here of what config is put into the agent and what the config looks like, as YAML, when it is sent to packetbeat. This could simplify the discussion and make sure we are talking about exactly the same thing.

andrewstucki · 2020-10-29T01:40:47Z

Ok, so I wound up going and swapping the configuration to what I mentioned here #22227 (comment)

The changes involve some additional normalization code to inject processors into each stream. On the plus side, the reloading code becomes much simpler since it just wraps cfgfile.RunnerList. For the configuration, I currently enforce ~~a single input~~ up to 100 inputs of type packet to be passed in (corresponding to 100 sniffers/integrations). If packetbeat receives any additional inputs, it fails re-configuration. ~~This is to punt for the time being on:~~

~~1. figuring out the nitty-gritty details of running multiple packet sniffers on a system and~~
~~2. keep the current goroutine synchronization functional (the error channels will need some work if we move to multiple sniffers)~~

A demo package I've been using to test all of this stuff is here:

andrewstucki/integrations@28455c5

@ruflin this approach pretty much completely removes all transpiler rules in the packetbeat spec apart from filtering and injecting agent version information (see updated #22145). We'll still need to figure out how to inject additional configuration at the input level in the configuration if we want to use that to control the sniffer behavior, but I figure that can wait. Let me know what you think.

UPDATE:

Added support for up to 100 sniffers/integrations, appears to work like a charm with minimal configuration. The thought would be then that we could use the above format for integrating packetbeat with whatever package we want. For example, MySQL could have a packet input with a stream type set for mysql while another package could add icmp or flow support--each instance of a package (integration) would then create its own sniffer and just specify the streams (protocols or flow) it wanted to capture.
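
To make the one-sniffer-per-input idea concrete, here is a rough sketch of how a list of packet inputs might map to sniffers. All type and function names (`stream`, `packetInput`, `snifferPlan`, `planSniffers`) and the input IDs are made up for illustration; they are not the structures used in packetbeat.

```go
package main

import "fmt"

// stream is a hypothetical stream entry; Type names a protocol or "flow".
type stream struct {
	Type string
}

// packetInput is a hypothetical input of type packet, one per integration.
type packetInput struct {
	ID      string
	Streams []stream
}

// snifferPlan lists what a single sniffer would be asked to capture.
type snifferPlan struct {
	InputID string
	Capture []string
}

// planSniffers creates exactly one plan (one sniffer) per input and copies
// over the stream types that the input asked for.
func planSniffers(inputs []packetInput) []snifferPlan {
	plans := make([]snifferPlan, 0, len(inputs))
	for _, in := range inputs {
		p := snifferPlan{InputID: in.ID}
		for _, s := range in.Streams {
			p.Capture = append(p.Capture, s.Type)
		}
		plans = append(plans, p)
	}
	return plans
}

func main() {
	inputs := []packetInput{
		{ID: "mysql-package", Streams: []stream{{Type: "mysql"}}},
		{ID: "network-package", Streams: []stream{{Type: "icmp"}, {Type: "flow"}}},
	}
	for _, p := range planSniffers(inputs) {
		fmt.Printf("%s captures %v\n", p.InputID, p.Capture)
	}
}
```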

elasticmachine · 2020-10-29T03:16:37Z

💚 Flaky test report

Tests succeeded.

Test stats 🧪

Test	Results
Failed	0
Passed	16382
Skipped	1344
Total	17726

ruflin · 2020-10-30T09:54:56Z

@andrewstucki Read your comment above last after going through all the other comments 🤦‍♂️ We should probably have all the config discussions in a single place :-) Great to hear you solved the sniffer limitation issue.

How do we get all the conversations into a single place and drive them to a conclusion there?

andrewstucki · 2020-10-30T13:37:24Z

@ruflin I'm fine with doing the rest of the config discussion here and using the added unit test file packetbeat/config/agent_test.go to finish driving the discussion (no matter where we discuss, any spec file changes result in changes to the normalization code here, so talking about it here keeps us from having to context-switch). That test pretty much shows the agent spec file configuration format of a single input (after it's been injected with agent metadata).

The issue I created and linked to (#22227) is still valid though and can be a follow up to the initial implementation, as there's still work to be done to figure out how to configure shared, sniffer-specific options.

michalpristas

seems to work as expected together with #22145

andrewkroh

LGTM

andrewkroh · 2020-11-03T17:59:46Z

packetbeat/beater/processor.go

+		return nil, err
+	}
+
+	watcher := procs.ProcessesWatcher{}


Would the process watcher be something that benefits from being shared across each processor?

andrewstucki · 2020-11-03

So the process watcher has traditionally been a global. I more or less kept it that way--making it a singleton for the entirety of the agent-level "input" (meaning 1 sniffer + 1 process watcher). I would imagine eventually we could potentially scope this based on a protocol-by-protocol basis (moving from "input" level, to "stream" level), but we'd probably need to see if that had a performance impact.

In general I tried to keep the traditionally global configuration options scoped to a single input run (1 sniffer, etc.) to make things a bit more flexible for the future and avoid introducing three different levels of configuration (global, input-level, stream-level) when configured via agent. Instead we just have input-level and stream-level, so nothing is shared across multiple inputs.

Hope that answers the question/makes sense.
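
The scoping described in this reply (one process watcher per input, shared by that input's streams, and nothing shared across inputs) could look roughly like the sketch below. The types here are hypothetical stand-ins: `processWatcher` is not `procs.ProcessesWatcher`, and `inputRunner` / `streamRunner` are illustrative names only.

```go
package main

import "fmt"

// processWatcher is a hypothetical stand-in for the real process watcher.
type processWatcher struct {
	inputID string
}

// streamRunner is a hypothetical per-protocol worker that borrows the
// watcher owned by its input.
type streamRunner struct {
	protocol string
	watcher  *processWatcher
}

// inputRunner owns everything that used to be global: here, one watcher.
type inputRunner struct {
	watcher *processWatcher
	streams []streamRunner
}

// newInputRunner builds one watcher for the input and hands the same
// pointer to every stream, so sharing stops at the input boundary.
func newInputRunner(inputID string, protocols []string) *inputRunner {
	w := &processWatcher{inputID: inputID}
	r := &inputRunner{watcher: w}
	for _, p := range protocols {
		r.streams = append(r.streams, streamRunner{protocol: p, watcher: w})
	}
	return r
}

func main() {
	a := newInputRunner("input-a", []string{"http", "dns"})
	b := newInputRunner("input-b", []string{"mysql"})
	fmt.Println("shared within input:", a.streams[0].watcher == a.streams[1].watcher)
	fmt.Println("shared across inputs:", a.watcher == b.watcher)
}
```

Moving the watcher down to stream level later would only change where `w` is created, which is why keeping two levels instead of three leaves room for that experiment.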

…22134) * Refactor packetbeat to support agent-based configuration * Add documentation changes and a Changelog entry * Update reference template * Fix funny merge * Incorporate feedback * use streams instead of inputs * support multiple sniffers * fix shutdown_timeout behavior (cherry picked from commit 8c05a41)

…ith Elastic Agent (#22546) * [Packetbeat] Refactor packetbeat for use with Elastic Agent (#22134) * Refactor packetbeat to support agent-based configuration * Add documentation changes and a Changelog entry * Update reference template * Fix funny merge * Incorporate feedback * use streams instead of inputs * support multiple sniffers * fix shutdown_timeout behavior (cherry picked from commit 8c05a41) * Fix up changelog

andrewstucki added enhancement Packetbeat Team:Security-External Integrations labels Oct 23, 2020

andrewstucki requested a review from a team as a code owner October 23, 2020 20:51

botelastic bot added needs_team Team:Ingest Management labels Oct 23, 2020

botelastic bot removed the needs_team label Oct 23, 2020

andrewstucki requested a review from a team October 23, 2020 22:18

ruflin requested review from michalpristas and blakerouse October 26, 2020 09:47

ruflin reviewed Oct 26, 2020

packetbeat/beater/packetbeat.go

Refactor packetbeat to support agent-based configuration

c18fed9

andrewstucki force-pushed the packetbeat-config-rebase branch from 67fdce1 to c18fed9 Compare October 26, 2020 13:17

andrewstucki mentioned this pull request Oct 26, 2020

[Packetbeat] Split out elastic-agent config changes #22145

Merged

5 tasks

andrewstucki changed the title ~~[Packetbeat] Add packetbeat support to Elastic Agent~~ [Packetbeat] Refactor packetbeat for use with Elastic Agent Oct 26, 2020

andrewkroh reviewed Oct 26, 2020

packetbeat/publish/publish.go

Andrew Stucki added 2 commits October 26, 2020 09:58

Add documentation changes and a Changelog entry

cce8f7b

Update reference template

a242bdb

Andrew Stucki added 2 commits October 27, 2020 10:50

Merge branch 'master' of github.com:elastic/beats into packetbeat-config-rebase

7496703

Fix funny merge

43e719c

andrewkroh reviewed Oct 28, 2020

Andrew Stucki added 2 commits October 28, 2020 13:51

Incorporate feedback

3d10bdc

use streams instead of inputs

9539f4a

Andrew Stucki added 2 commits October 29, 2020 09:29

support multiple sniffers

9932ed6

fix shutdown_timeout behavior

95c721d

michalpristas approved these changes Nov 2, 2020

andrewstucki requested a review from a team November 2, 2020 20:03

andrewkroh approved these changes Nov 3, 2020

andrewstucki merged commit 8c05a41 into elastic:master Nov 3, 2020

andrewstucki deleted the packetbeat-config-rebase branch November 3, 2020 18:07

andrewstucki mentioned this pull request Nov 11, 2020

Cherry-pick #22134 to 7.x: [Packetbeat] Refactor packetbeat for use with Elastic Agent #22546

Merged

6 tasks

andrewstucki added the v7.11.0 label Nov 11, 2020

andrewstucki mentioned this pull request Nov 16, 2020

Cherry-pick #22145 to 7.x: [Packetbeat] Split out elastic-agent config changes #22600

Merged

5 tasks
