Metricbeat hints builder discovers same hosts multiple times #12011
@exekias any thoughts on this? My suggestion would be to force containers to expose ports explicitly in order to be able to monitor them.

The part I don't understand is: shouldn't autodiscover detect that the final configuration is the same and launch just one instance of the module?

It doesn't seem to work that way. We should investigate that, then.

It wouldn't be the same, as the meta would be different, right?

That sounds correct. Hey @ChrsMark, I think you have played with autodiscover recently; any chance you could confirm this behavior?

Hey, yep, I could check it soonish!
Hey, I was able to reproduce this behaviour.

2 runners launched. Pod config:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics/hosts: ${data.host}:8080
spec:
  restartPolicy: Never
  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
    - containerPort: 8080
  - name: redis-container
    image: redis
```

Metricbeat logs:

```
2019-11-26T08:55:07.348Z DEBUG [autodiscover] autodiscover/autodiscover.go:166 Got a start event: map[config:[0xc000ccb290] host:172.17.0.8 id:582f6f2e-677a-4691-88a4-f64b61fc65e3.prometheus-container kubernetes:{"annotations":{"co":{"elastic":{"metrics/hosts":"${data.host}:8080","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"d5f46003c6cc5960a22e77ff3877487ae585f1d9db90fcc03e90aa79c5aaee33","image":"prom/prometheus","name":"prometheus-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}} meta:{"kubernetes":{"container":{"image":"prom/prometheus","name":"prometheus-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}}} port:8080 provider:e6c5fbe1-560f-439a-96ea-4bdf830d2cac start:true]
2019-11-26T08:55:07.348Z DEBUG [autodiscover] autodiscover/autodiscover.go:191 Generated config: map[enabled:true hosts:[172.17.0.8:8080] metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T08:55:07.348Z DEBUG [autodiscover] autodiscover/autodiscover.go:253 Got a meta field in the event
2019-11-26T08:55:07.348Z DEBUG [autodiscover] cfgfile/list.go:62 Starting reload procedure, current runners: 0
2019-11-26T08:55:07.348Z DEBUG [autodiscover] cfgfile/list.go:80 Start list: 1, Stop list: 0
2019-11-26T08:55:07.348Z DEBUG [autodiscover] cfgfile/list.go:101 Starting runner: prometheus [metricsets=1]
2019-11-26T08:55:07.349Z DEBUG [autodiscover] autodiscover/autodiscover.go:166 Got a start event: map[config:[0xc00092e9f0] host:172.17.0.8 id:582f6f2e-677a-4691-88a4-f64b61fc65e3.redis-container kubernetes:{"annotations":{"co":{"elastic":{"metrics/hosts":"${data.host}:8080","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"7632494bbf7fd078a3551c7d6c3847b13d2a7f3c1092e925a1e6fce3b9f226d5","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"582f6f2e-677a-4691-88a4-f64b61fc65e3"}}} provider:e6c5fbe1-560f-439a-96ea-4bdf830d2cac start:true]
2019-11-26T08:55:07.349Z DEBUG [autodiscover] autodiscover/autodiscover.go:191 Generated config: map[enabled:true hosts:[172.17.0.8:8080] metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T08:55:07.349Z DEBUG [autodiscover] autodiscover/autodiscover.go:253 Got a meta field in the event
2019-11-26T08:55:07.349Z DEBUG [autodiscover] cfgfile/list.go:62 Starting reload procedure, current runners: 1
2019-11-26T08:55:07.350Z DEBUG [autodiscover] cfgfile/list.go:80 Start list: 1, Stop list: 0
2019-11-26T08:55:07.350Z DEBUG [autodiscover] cfgfile/list.go:101 Starting runner: prometheus [metricsets=1]
```

Pin the annotation to one container. Pod config:
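The pod spec for this case was lost in extraction; the following is a reconstruction from the last-applied-configuration embedded in the log below, which shows the hosts hint scoped to prometheus-container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics.prometheus-container/hosts: ${data.host}:8080
spec:
  restartPolicy: Never
  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
    - containerPort: 8080
  - name: redis-container
    image: redis
```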
Metricbeat launches only one runner:

```
Generated config: map[enabled:true hosts:[172.17.0.5:8080] metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T09:05:41.450Z DEBUG [autodiscover] autodiscover/autodiscover.go:253 Got a meta field in the event
2019-11-26T09:05:41.450Z DEBUG [autodiscover] cfgfile/list.go:62 Starting reload procedure, current runners: 0
2019-11-26T09:05:41.451Z DEBUG [autodiscover] cfgfile/list.go:80 Start list: 1, Stop list: 0
2019-11-26T09:05:41.456Z DEBUG [autodiscover] cfgfile/list.go:101 Starting runner: prometheus [metricsets=1]
2019-11-26T09:05:41.456Z DEBUG [autodiscover] autodiscover/autodiscover.go:166 Got a start event: map[config:[0xc000deae40] host:172.17.0.5 id:dfacd6e3-0b4d-47a1-b1f5-d694c2a0592c.redis-container kubernetes:{"annotations":{"co":{"elastic":{"metrics":{"prometheus-container/hosts":"${data.host}:8080"},"metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics.prometheus-container/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"9d5221ad274adde255ee22cf27a71abdfcdfa85c711f7fa23272e9d0324ae5b8","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"dfacd6e3-0b4d-47a1-b1f5-d694c2a0592c"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"dfacd6e3-0b4d-47a1-b1f5-d694c2a0592c"}}} provider:7481c04b-7042-402a-9be6-683580dface8 start:true]
2019-11-26T09:05:41.456Z DEBUG [autodiscover] autodiscover/autodiscover.go:191 Generated config: map[enabled:true metricsets:[collector] module:prometheus period:1m timeout:3s]
2019-11-26T09:05:41.456Z DEBUG [autodiscover] autodiscover/autodiscover.go:253 Got a meta field in the event
2019-11-26T09:05:41.456Z ERROR [autodiscover] autodiscover/autodiscover.go:205 Auto discover config check failed for config &{{<nil> } <nil> 0xc000b59b40}, won't start runner: 1 error: host parsing failed for prometheus-collector: error parsing URL: empty host
```
I think this is a bug; we should not be launching the same configuration twice for the same Pod. Something must be failing, as I was under the impression that this was checked here: `libbeat/autodiscover/autodiscover.go`, line 211 (at d8cd8c4).
In this case that check cannot work, since we get different events for the two different containers of the same Pod:

```
2019-11-26T09:48:44.093Z DEBUG [autodiscover] autodiscover/autodiscover.go:210 eventID: 878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.prometheus-container
2019-11-26T09:48:44.093Z DEBUG [autodiscover] autodiscover/autodiscover.go:211 hash: 2490033313956307158
2019-11-26T09:48:44.107Z DEBUG [autodiscover] autodiscover/autodiscover.go:210 eventID: 878f170c-c892-4735-8b4f-60bf707fd01d:266e2380-d3b6-423b-a257-0cb428235fc6.redis-container
2019-11-26T09:48:44.107Z DEBUG [autodiscover] autodiscover/autodiscover.go:211 hash: 2490033313956307158
```

This per-container behaviour is something we want in some cases, like Filebeat, where we want to handle each container explicitly. In Metricbeat we can skip it, since all containers in the same Pod share the same IP. So this issue can be resolved if we create the eventID at the Pod level for Metricbeat events. What we need to make sure of is whether unifying the eventID in Metricbeat events will have any side effects on metadata etc. I will open a PR with this approach and discuss it there.
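To make the mechanics concrete, here is a minimal, self-contained sketch (assuming, as the debug lines above suggest, that dynamic configs are tracked per eventID; the types and names are illustrative, not the actual beats ones):

```go
package main

import "fmt"

// cfgEntry pairs the hash of the rendered module config with the
// per-container metadata attached to the resulting runner.
type cfgEntry struct {
	hash uint64
	meta string
}

func main() {
	// Dynamic configs keyed by eventID first, then by config hash.
	registry := map[string]map[uint64]cfgEntry{}

	events := []struct {
		eventID string
		cfg     cfgEntry
	}{
		{"pod-uid.prometheus-container", cfgEntry{2490033313956307158, "container=prometheus-container"}},
		{"pod-uid.redis-container", cfgEntry{2490033313956307158, "container=redis-container"}},
	}

	for _, ev := range events {
		if registry[ev.eventID] == nil {
			registry[ev.eventID] = map[uint64]cfgEntry{}
		}
		// Deduplication happens only within a single eventID, so two events
		// carrying configs with identical hashes still yield two runners.
		registry[ev.eventID][ev.cfg.hash] = ev.cfg
	}

	fmt.Println("runners launched:", len(registry)) // 2
	// With a Pod-level eventID ("pod-uid" for both containers), the two
	// identical configs would collapse into one entry: a single runner.
}
```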
Hey @exekias, @vjsamuel, @jsoriano! I think this might be resolved now by #18564. I tried to verify the fix and it seems to have worked for me. Deploying:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics/hosts: ${data.host}:8080
spec:
  restartPolicy: Never
  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
    - containerPort: 8080
  - name: redis-container
    image: redis
```

Here is what I get:

```
2020-05-21T12:30:08.672Z DEBUG [autodiscover] autodiscover/autodiscover.go:196 Generated config: {
"enabled": true,
"hosts": [
"xxxxx"
],
"metricsets": [
"collector"
],
"module": "prometheus",
"period": "1m",
"timeout": "3s"
}
2020-05-21T12:30:08.672Z DEBUG [autodiscover] autodiscover/autodiscover.go:258 Got a meta field in the event
2020-05-21T12:30:08.672Z DEBUG [autodiscover] cfgfile/list.go:62 Starting reload procedure, current runners: 0
2020-05-21T12:30:08.673Z DEBUG [autodiscover] cfgfile/list.go:80 Start list: 1, Stop list: 0
2020-05-21T12:30:08.673Z DEBUG [autodiscover] cfgfile/list.go:101 Starting runner: RunnerGroup{prometheus [metricsets=1]}
2020-05-21T12:30:08.674Z DEBUG [module] module/wrapper.go:127 Starting Wrapper[name=prometheus, len(metricSetWrappers)=1]
2020-05-21T12:30:08.674Z DEBUG [autodiscover] autodiscover/autodiscover.go:174 Got a start event: map[config:[0xc00102f3e0] host:172.17.0.7 id:cf08c19e-c973-4f82-8461-a61e106fa0a0.redis-container keystore:0xc00004da00 kubernetes:{"annotations":{"co":{"elastic":{"metrics/hosts":"${data.host}:8080","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics/hosts\":\"${data.host}:8080\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":8080}]},{\"image\":\"redis\",\"name\":\"redis-container\"}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"8696be13ef36f460c48f74942955c3213c2040bf4235c4d3ca5f0b80ebce4677","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"cf08c19e-c973-4f82-8461-a61e106fa0a0"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"cf08c19e-c973-4f82-8461-a61e106fa0a0"}}} provider:877348a2-a905-4e4d-ae1f-10589fb0eea2 start:true]
2020-05-21T12:30:08.674Z DEBUG [autodiscover] autodiscover/autodiscover.go:196 Generated config: {
"enabled": true,
"hosts": [
"xxxxx"
],
"metricsets": [
"collector"
],
"module": "prometheus",
"period": "1m",
"timeout": "3s"
}
2020-05-21T12:30:08.674Z DEBUG [autodiscover] autodiscover/autodiscover.go:258 Got a meta field in the event
2020-05-21T12:30:08.674Z DEBUG [autodiscover] cfgfile/list.go:62 Starting reload procedure, current runners: 1
2020-05-21T12:30:08.675Z DEBUG [autodiscover] cfgfile/list.go:80 Start list: 0, Stop list: 0
2020-05-21T12:30:08.674Z DEBUG [module] module/wrapper.go:181 prometheus/collector will start after 2.336432641s
2020-05-21T12:30:11.012Z DEBUG [module] module/wrapper.go:189 Starting metricSetWrapper[module=prometheus, name=collector, host=172.17.0.7:8080]
```

Moreover, testing with 2 target containers:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers-prometheus
  annotations:
    co.elastic.metrics/module: prometheus
    co.elastic.metrics/hosts: ${data.host}:9090
    co.elastic.metrics.redis-container/module: redis
    co.elastic.metrics.redis-container/hosts: ${data.host}:6379
spec:
  restartPolicy: Never
  containers:
  - name: prometheus-container
    image: prom/prometheus
    ports:
    - containerPort: 9090
  - name: redis-container
    image: redis
    ports:
    - containerPort: 6379
```

Here is the result:

```
2020-05-21T12:44:44.103Z DEBUG [autodiscover] autodiscover/autodiscover.go:196 Generated config: {
"enabled": true,
"hosts": [
"xxxxx"
],
"metricsets": [
"collector"
],
"module": "prometheus",
"period": "1m",
"timeout": "3s"
}
2020-05-21T12:44:44.103Z DEBUG [autodiscover] autodiscover/autodiscover.go:258 Got a meta field in the event
2020-05-21T12:44:44.103Z DEBUG [autodiscover] cfgfile/list.go:62 Starting reload procedure, current runners: 0
2020-05-21T12:44:44.104Z DEBUG [autodiscover] cfgfile/list.go:80 Start list: 1, Stop list: 0
2020-05-21T12:44:44.104Z DEBUG [autodiscover] cfgfile/list.go:101 Starting runner: RunnerGroup{prometheus [metricsets=1]}
2020-05-21T12:44:44.104Z DEBUG [module] module/wrapper.go:127 Starting Wrapper[name=prometheus, len(metricSetWrappers)=1]
2020-05-21T12:44:44.104Z DEBUG [autodiscover] autodiscover/autodiscover.go:174 Got a start event: map[config:[0xc0003a04e0] host:172.17.0.7 id:b290597a-7950-4fa2-a85a-3f64b1637b41.redis-container keystore:0xc00004da00 kubernetes:{"annotations":{"co":{"elastic":{"metrics":{"redis-container/hosts":"${data.host}:6379","redis-container/module":"redis"},"metrics/hosts":"${data.host}:9090","metrics/module":"prometheus"}},"kubectl":{"kubernetes":{"io/last-applied-configuration":"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{\"co.elastic.metrics.redis-container/hosts\":\"${data.host}:6379\",\"co.elastic.metrics.redis-container/module\":\"redis\",\"co.elastic.metrics/hosts\":\"${data.host}:9090\",\"co.elastic.metrics/module\":\"prometheus\"},\"name\":\"two-containers-prometheus\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"prom/prometheus\",\"name\":\"prometheus-container\",\"ports\":[{\"containerPort\":9090}]},{\"image\":\"redis\",\"name\":\"redis-container\",\"ports\":[{\"containerPort\":6379}]}],\"restartPolicy\":\"Never\"}}\n"}}},"container":{"id":"c4513734dbd609928b16b0483c3f7b0c443d38af19aad8815e894f97d00e0542","image":"redis","name":"redis-container","runtime":"docker"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"b290597a-7950-4fa2-a85a-3f64b1637b41"}} meta:{"kubernetes":{"container":{"image":"redis","name":"redis-container"},"namespace":"default","node":{"name":"minikube"},"pod":{"name":"two-containers-prometheus","uid":"b290597a-7950-4fa2-a85a-3f64b1637b41"}}} port:6379 provider:877348a2-a905-4e4d-ae1f-10589fb0eea2 start:true]
2020-05-21T12:44:44.104Z DEBUG [autodiscover] autodiscover/autodiscover.go:196 Generated config: {
"enabled": true,
"hosts": [
"xxxxx"
],
"metricsets": [
"info",
"keyspace"
],
"module": "redis",
"period": "1m",
"timeout": "3s"
}
2020-05-21T12:44:44.104Z DEBUG [autodiscover] autodiscover/autodiscover.go:258 Got a meta field in the event
2020-05-21T12:44:44.104Z DEBUG [module] module/wrapper.go:181 prometheus/collector will start after 9.856799733s
2020-05-21T12:44:44.104Z DEBUG [autodiscover] cfgfile/list.go:62 Starting reload procedure, current runners: 1
2020-05-21T12:44:44.104Z DEBUG [autodiscover] cfgfile/list.go:80 Start list: 1, Stop list: 0
2020-05-21T12:44:44.104Z DEBUG [autodiscover] cfgfile/list.go:101 Starting runner: RunnerGroup{redis [metricsets=1], redis [metricsets=1]}
2020-05-21T12:44:44.105Z DEBUG [module] module/wrapper.go:127 Starting Wrapper[name=redis, len(metricSetWrappers)=1]
2020-05-21T12:44:44.105Z DEBUG [module] module/wrapper.go:127 Starting Wrapper[name=redis, len(metricSetWrappers)=1]
2020-05-21T12:44:44.105Z DEBUG [module] module/wrapper.go:181 redis/info will start after 3.463596093s
2020-05-21T12:44:44.105Z DEBUG [module] module/wrapper.go:181 redis/keyspace will start after 6.610549433s
2020-05-21T12:44:47.569Z DEBUG [module] module/wrapper.go:189 Starting metricSetWrapper[module=redis, name=info, host=172.17.0.7:6379]
2020-05-21T12:44:50.718Z DEBUG [module] module/wrapper.go:189 Starting metricSetWrapper[module=redis, name=keyspace, host=172.17.0.7:6379]
2020-05-21T12:44:53.961Z DEBUG [module] module/wrapper.go:189 Starting metricSetWrapper[module=prometheus, name=collector, host=172.17.0.7:9090]
```

Let me know what you think, folks, about closing this one :).
I think it can be closed, yes. Nice fix!
Actually, this issue isn't completely fixed by this change: we will randomly pick which container name to put in the metadata, even though there is only one configuration running. We need a minor change so that, if the port is not exposed, the runner uses metadata that doesn't have the container name.
I was able to reproduce this on HEAD, and the underlying problem is caused by this check:
https://github.com/elastic/beats/blob/master/metricbeat/autodiscover/builder/hints/metrics.go#L173

It can be reproduced as follows:

1. Come up with a pod spec that has two containers, one exposing port 8080 and one exposing no ports at all.
2. Have annotations like the ones shown below.
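The original annotation snippet was lost in extraction; judging from the reproduction earlier in the thread, the hints in question are presumably:

```yaml
annotations:
  co.elastic.metrics/module: prometheus
  co.elastic.metrics/hosts: ${data.host}:8080
```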
This causes the same endpoint to be polled twice: the host is discovered once because of the explicit port definition on the container spec of container 1, and a second time because container 2 has no port definition while a port is defined in the annotation. A sketch of this logic follows.
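A hypothetical sketch of that check's effect (not the actual beats code; `containerEvent` and `resolveHost` are invented for illustration): a hosts hint with a hard-coded port survives for a container with no exposed ports, because there is no port to match it against.

```go
package main

import "fmt"

// containerEvent is a simplified stand-in for an autodiscover event.
type containerEvent struct {
	name string
	host string
	port int // 0 when the container spec exposes no ports
}

// resolveHost mimics resolving the hint "${data.host}:8080" against one
// container event.
func resolveHost(ev containerEvent, hintPort int) (string, bool) {
	if ev.port != 0 && ev.port != hintPort {
		// The container exposes a port and it doesn't match the hint: skip.
		return "", false
	}
	// Either the exposed port matches, or the container exposes no ports
	// at all, so the check has nothing to reject: the host survives.
	return fmt.Sprintf("%s:%d", ev.host, hintPort), true
}

func main() {
	events := []containerEvent{
		{name: "prometheus-container", host: "172.17.0.8", port: 8080},
		{name: "redis-container", host: "172.17.0.8", port: 0},
	}
	for _, ev := range events {
		if host, ok := resolveHost(ev, 8080); ok {
			fmt.Printf("%s -> hosts: [%s]\n", ev.name, host)
		}
	}
	// Both containers yield 172.17.0.8:8080: the endpoint is polled twice.
}
```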
A temporary workaround is to pin the annotation to a single container, as in the example below.
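For reference, pinning means scoping the hosts hint to the container that actually exposes the port, as in the experiment earlier in the thread:

```yaml
annotations:
  co.elastic.metrics/module: prometheus
  # Only prometheus-container is used to resolve the hosts hint:
  co.elastic.metrics.prometheus-container/hosts: ${data.host}:8080
```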