Remove container_name from Metricbeat autodiscover eventID #14787

Conversation

ChrsMark
Member

@ChrsMark ChrsMark commented Nov 26, 2019

In Metricbeat we don't want to have different eventIDs for
containers within the same Pod, since these containers
share the same IP and handling them separately can lead to
launching hint-based modules for all of the containers.

Signed-off-by: chrismark [email protected]

What this PR solves

This PR tackles the problem of a module being launched twice when a Pod has two containers with annotations. See #12011.

This patch will make

if a.configs[eventID][hash] != nil {

work as expected in cases like:

2019-11-26T09:48:44.093Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:210	eventID: 878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.prometheus-container
2019-11-26T09:48:44.093Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:211	hash: 2490033313956307158

2019-11-26T09:48:44.107Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:210	eventID: 878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.redis-container
2019-11-26T09:48:44.107Z	DEBUG	[autodiscover]	autodiscover/autodiscover.go:211	hash: 2490033313956307158

closes: #12011
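
To illustrate the duplicate launch shown in the log output above, here is a minimal, self-contained sketch of an eventID-keyed dedup registry. It is not the actual libbeat code; the registry and store names are hypothetical, and only the a.configs[eventID][hash] lookup quoted above is taken from the source.

package main

import "fmt"

type config struct{ module, hosts string }

type registry struct {
	// configs maps eventID -> config hash -> config, mirroring the
	// a.configs[eventID][hash] check quoted above.
	configs map[string]map[uint64]*config
}

// store returns true if the config was registered, false if an identical
// config (same eventID and hash) is already running.
func (r *registry) store(eventID string, hash uint64, c *config) bool {
	if r.configs[eventID] == nil {
		r.configs[eventID] = map[uint64]*config{}
	}
	if r.configs[eventID][hash] != nil {
		return false // duplicate: skip launching the module again
	}
	r.configs[eventID][hash] = c
	return true
}

func main() {
	c := &config{module: "prometheus", hosts: "10.0.0.5:8080"}

	// Per-container eventIDs: the same hash is stored under two different
	// keys, so the module is launched twice.
	perContainer := &registry{configs: map[string]map[uint64]*config{}}
	fmt.Println(perContainer.store("pod-uid.prometheus-container", 2490033313956307158, c)) // true
	fmt.Println(perContainer.store("pod-uid.redis-container", 2490033313956307158, c))      // true (duplicate launch)

	// Pod-level eventID (this PR's change): the second event is detected
	// as a duplicate and ignored.
	perPod := &registry{configs: map[string]map[uint64]*config{}}
	fmt.Println(perPod.store("pod-uid", 2490033313956307158, c)) // true
	fmt.Println(perPod.store("pod-uid", 2490033313956307158, c)) // false (deduplicated)
}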

@ChrsMark ChrsMark added the bug, Metricbeat, containers, [zube]: In Review, Team:Integrations, and autodiscovery labels Nov 26, 2019
@ChrsMark ChrsMark requested a review from exekias November 26, 2019 11:13
@ChrsMark ChrsMark self-assigned this Nov 26, 2019
@ChrsMark ChrsMark force-pushed the unify_eventid_autodiscover_metricbeat branch from 0effaad to c79a817 Compare November 26, 2019 11:14
@ChrsMark ChrsMark changed the title from Remove container_name from Metricbeat autodiscover eventsID to Remove container_name from Metricbeat autodiscover eventID Nov 26, 2019
@exekias
Contributor

exekias commented Nov 27, 2019

This is a bit of a hack but it could solve #12011. I see the problem comes from #6727: if one of the containers is not exposing ports, we will end up monitoring it too.

Would like to hear more opinions, @vjsamuel @jsoriano WDYT?

Member

@jsoriano jsoriano left a comment

Great analysis of the issue in #12011!

Regarding the fix, I wonder if it will work if we want to monitor ports of multiple containers in the same pod.

I also don't like adding conditional logic for a single beat in libbeat, but I could agree with this if it fixes the issue for now without breaking anything else, and if we plan to continue looking for other solutions.

Regarding longer-term possible solutions: different beats can be interested in different things (filebeat needs to know about all the containers to get their logs, but it doesn't care about the ports; metricbeat is interested in the ports exposed by the pod, but not so much in each container). Maybe we should emit different kinds of events, with different information, and each beat, in its hints builder, would decide which events to attend to or ignore.

For example, we could emit:

  • An event per pod with the network information (all containers in a pod share the same network), including a list of all the ports exposed by its containers, and maybe the list of containers to match container-specific hints. This event could be used by beats interested in network endpoints, like metricbeat or heartbeat.
  • An event per container, without network information. This event could be used by filebeat to get the logs of each container.

I am not sure what would be the best approach to differentiate these events; it could be by the presence or absence of certain fields, or we could add an extra field to distinguish them.
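
A hypothetical illustration of the two event kinds proposed above. The field names ("kind", "ports", and so on) are assumptions for the sketch, not the actual libbeat bus schema; the point is only that each hints builder could filter on the kind it cares about.

package main

import "fmt"

func main() {
	// Pod-level event: carries the shared network information, one per pod.
	// Beats interested in network endpoints (metricbeat, heartbeat) would use it.
	podEvent := map[string]interface{}{
		"kind":       "pod",
		"host":       "10.0.0.5",
		"ports":      []int{8080, 6379},
		"containers": []string{"prometheus-container", "redis-container"},
	}

	// Container-level event: no network information, one per container.
	// Filebeat would use it to collect that container's logs.
	containerEvent := map[string]interface{}{
		"kind":      "container",
		"container": "redis-container",
		"pod":       "two-containers-prometheus",
	}

	// Each hints builder attends only to the event kind it cares about.
	for _, e := range []map[string]interface{}{podEvent, containerEvent} {
		if e["kind"] == "pod" {
			fmt.Println("metricbeat/heartbeat builder handles:", e)
		} else {
			fmt.Println("filebeat builder handles:", e)
		}
	}
}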

@@ -217,7 +217,10 @@ func (p *Provider) emitEvents(pod *kubernetes.Pod, flag string, containers []kub
 // This must be an id that doesn't depend on the state of the container
 // so it works also on `stop` if containers have been already deleted.
-eventID := fmt.Sprintf("%s.%s", pod.GetObjectMeta().GetUID(), c.Name)
+eventID := fmt.Sprintf("%s", pod.GetObjectMeta().GetUID())
Member

@jsoriano jsoriano Nov 28, 2019

Would this still work if we want to monitor more than one container per pod? For example, if a pod has more than one container, each exposing a port and each with a dedicated annotation. Something like this:

apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics.prometheus-container/module: prometheus
    co.elastic.metrics.prometheus-container/hosts: ${data.host}:8080
    co.elastic.metrics.redis-container/module: redis
    co.elastic.metrics.redis-container/hosts: ${data.host}:6379
spec:
  restartPolicy: Never
  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
      - containerPort: 8080
  - name: redis-container
    image: redis
    ports:
      - containerPort: 6379

Contributor

We have many use cases that use this feature. Also, it would be necessary to know what container is logging the metric. We run filebeat and metricbeat in the same daemonset; since most of the metrics are the same, we need to be able to know what container the app is reporting from.

@@ -217,7 +217,10 @@ func (p *Provider) emitEvents(pod *kubernetes.Pod, flag string, containers []kub
 // This must be an id that doesn't depend on the state of the container
 // so it works also on `stop` if containers have been already deleted.
-eventID := fmt.Sprintf("%s.%s", pod.GetObjectMeta().GetUID(), c.Name)
+eventID := fmt.Sprintf("%s", pod.GetObjectMeta().GetUID())
+if p.bus.GetName() != "metricbeat" {
Member

I would avoid this kind of if: if at some point in the future we want to use autodiscover to monitor network endpoints in another beat (for example heartbeat), we can go crazy trying to find out why it behaves differently in metricbeat.

If we decide to go on with this solution as a quick fix for #12011, please create a follow-up issue to find a longer-term solution.

Contributor

@vjsamuel vjsamuel Nov 28, 2019

+1 on avoiding beat-specific checks in libbeat.

@ChrsMark
Member Author

Thanks for the comments, everyone! Closing this for now; we can consider possible solutions in #12011.
