state_service metricset fails on headless services #17447

Closed
anyasabo opened this issue Apr 2, 2020 · 13 comments · Fixed by #20571
Labels
bug, Team:Platforms

Comments

@anyasabo
Contributor

anyasabo commented Apr 2, 2020

The kubernetes module's state_service metricset fails to index headless Kubernetes services, which explicitly set ClusterIP to None. This is easy to reproduce with Metricbeat and ECK, which uses headless services by default. It's not clear to me what the correct behavior should be here, though.

This is the meat of the error being thrown:

{"type":"mapper_parsing_exception","reason":"failed to parse field [kubernetes.service.cluster_ip] of type [ip] in document with id 'yf_HN3EBEveQZJtR4sNf'. Preview of field's value: 'None'","caused_by":{"type":"illegal_argument_exception","reason":"'None' is not an IP string literal."}}

And the full error:

{"level":"warn","timestamp":"2020-04-01T22:06:18.853Z","caller":"elasticsearch/client.go:517","message":"Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbf9961f66a396c38, ext:118425703920, loc:(*time.Location)(0x7eb3060)}, Meta:null, Fields:{\"agent\":{\"ephemeral_id\":\"655841f1-0966-41c1-822a-c6ea9af7fdc8\",\"hostname\":\"kube-elastic-metricbeat-7dd4c74c74-jz25z\",\"id\":\"7b3c4cbc-103d-4684-8ec6-4255fa2cd7c5\",\"type\":\"metricbeat\",\"version\":\"7.6.1\"},\"cloud\":{\"availability_zone\":\"europe-west1-d\",\"instance\":{\"id\":\"1037830539447785865\",\"name\":\"gke-sabo-dev-cluster-default-pool-617f5774-gl30\"},\"machine\":{\"type\":\"n1-standard-8\"},\"project\":{\"id\":\"elastic-cloud-dev\"},\"provider\":\"gcp\"},\"ecs\":{\"version\":\"1.4.0\"},\"event\":{\"dataset\":\"kubernetes.service\",\"duration\":29916059,\"module\":\"kubernetes\"},\"host\":{\"name\":\"kube-elastic-metricbeat-7dd4c74c74-jz25z\"},\"kubernetes\":{\"labels\":{\"common_k8s_elastic_co_type\":\"elasticsearch\",\"elasticsearch_k8s_elastic_co_cluster_name\":\"kube-elastic-monitor\",\"elasticsearch_k8s_elastic_co_statefulset_name\":\"kube-elastic-monitor-es-default\"},\"namespace\":\"default\",\"service\":{\"cluster_ip\":\"None\",\"created\":\"2020-04-01T15:22:51.000Z\",\"name\":\"kube-elastic-monitor-es-default\",\"type\":\"ClusterIP\"}},\"metricset\":{\"name\":\"state_service\",\"period\":10000},\"service\":{\"address\":\"kube-elastic-kube-state-metrics.default:8080\",\"type\":\"kubernetes\"}}, Private:interface {}(nil), TimeSeries:true}, Flags:0x0, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {\"type\":\"mapper_parsing_exception\",\"reason\":\"failed to parse field [kubernetes.service.cluster_ip] of type [ip] in document with id 'yf_HN3EBEveQZJtR4sNf'. Preview of field's value: 'None'\",\"caused_by\":{\"type\":\"illegal_argument_exception\",\"reason\":\"'None' is not an IP string literal.\"}}"}

And a service yaml that causes the error:

17:09 $ kubectl get svc kube-elastic-monitor-es-default -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2020-04-01T15:22:51Z"
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: kube-elastic-monitor
    elasticsearch.k8s.elastic.co/statefulset-name: kube-elastic-monitor-es-default
  name: kube-elastic-monitor-es-default
  namespace: default
  ownerReferences:
  - apiVersion: elasticsearch.k8s.elastic.co/v1
    blockOwnerDeletion: true
    controller: true
    kind: Elasticsearch
    name: kube-elastic-monitor
    uid: 47ba3d61-9c4d-479b-9689-4bb184aa7541
  resourceVersion: "2612356"
  selfLink: /api/v1/namespaces/default/services/kube-elastic-monitor-es-default
  uid: ff1c86e3-f00a-423b-9bbb-dfc8df613ee1
spec:
  clusterIP: None
  selector:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: kube-elastic-monitor
    elasticsearch.k8s.elastic.co/statefulset-name: kube-elastic-monitor-es-default
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
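
The failure mode can be illustrated outside Elasticsearch with a minimal Go sketch. It assumes Go's `net.ParseIP` as a stand-in for the `ip` field parser (Elasticsearch's own parser is not shown in this thread), and `10.0.0.1` is a made-up example address. The point is only that the literal string `None`, which Kubernetes uses for headless services, is not an IP address.

```go
package main

import (
	"fmt"
	"net"
)

// isIPLiteral reports whether s parses as an IPv4 or IPv6 address.
// net.ParseIP is used here as an approximation; it is not the parser
// Elasticsearch applies to fields mapped as ip.
func isIPLiteral(s string) bool {
	return net.ParseIP(s) != nil
}

func main() {
	// "None" is the clusterIP value of the headless service above;
	// "10.0.0.1" is a hypothetical regular cluster IP.
	for _, v := range []string{"10.0.0.1", "None"} {
		fmt.Printf("%q is an IP literal: %v\n", v, isIPLiteral(v))
	}
}
```

Any document whose `kubernetes.service.cluster_ip` value fails a check of this kind is rejected by an index that maps the field as `ip`.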


@ChrsMark added the bug and Team:Platforms labels Apr 3, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@tuandn8

tuandn8 commented Jun 9, 2020

I am also hitting this error.

@milkfinch

milkfinch commented Aug 11, 2020

Yep, same here. My quick fix is:

processors:
  - drop_fields:
      when:
        equals:
          kubernetes.service.cluster_ip: "None"
      fields: ["kubernetes.service.cluster_ip"]
      ignore_missing: true

@ChrsMark
Member

Maybe changing this field from ip to keyword at

will be the way to go here.

I wonder why other fields of this metricset, like external_ip, are already of type keyword instead of ip, and why only cluster_ip is of type ip.

On the other hand, changing this field from ip may lead to losing some IP-related functionality in Kibana.

Any thoughts, @jsoriano?

@jsoriano
Member

> Any thoughts, @jsoriano?

Yes, changing the field type to keyword would solve this issue, and it would be consistent with the other IP fields on this metricset.
I guess that the rest of them are also keyword for similar reasons. We could go with this for now.

As you mention, we would be losing functionality available to IP fields.

An alternative would be to handle non-IP values of these fields differently. For example, we could drop the field on None values, as suggested in a previous comment, but we might lose information (cluster_ip: None has a meaning).
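
The alternative described above could look roughly like the following Go sketch. It is not Metricbeat code and not the change that closed this issue: `splitClusterIP`, `serviceFields`, and the `headless` key are hypothetical names chosen for illustration, and the sketch assumes `None` is the only non-IP value worth preserving. It omits `cluster_ip` when there is no IP to index and records the headless state separately, so the meaning of `cluster_ip: None` is not lost.

```go
package main

import (
	"fmt"
	"net"
)

// splitClusterIP is a hypothetical helper. It returns the value that is
// safe to index as an ip field (empty when raw is not an IP literal) and
// whether raw marks a headless service.
func splitClusterIP(raw string) (ip string, headless bool) {
	if raw == "None" {
		return "", true
	}
	if net.ParseIP(raw) == nil {
		// Unknown non-IP value: drop it rather than fail indexing.
		return "", false
	}
	return raw, false
}

// serviceFields builds a service sub-document. The "headless" key is an
// assumed field name, not part of the Metricbeat schema.
func serviceFields(name, rawClusterIP string) map[string]interface{} {
	fields := map[string]interface{}{"name": name}
	ip, headless := splitClusterIP(rawClusterIP)
	if ip != "" {
		fields["cluster_ip"] = ip
	}
	fields["headless"] = headless
	return fields
}

func main() {
	fmt.Println(serviceFields("kube-elastic-monitor-es-default", "None"))
	fmt.Println(serviceFields("example-service", "10.0.0.1"))
}
```

The trade-off is an extra field in the schema in exchange for keeping `cluster_ip` mapped as `ip`.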

@mariusmotea

> Yep, same here. My quick fix is:
>
> processors:
>   - drop_fields:
>       when:
>         equals:
>           kubernetes.service.cluster_ip: "None"
>       fields: ["kubernetes.service.cluster_ip"]
>       ignore_missing: true

did not work for me.

@mariusmotea

> Maybe changing this field from ip to keyword at
>
> will be the way to go here.
> I wonder why other fields of this metricset, like external_ip, are already of type keyword instead of ip, and why only cluster_ip is of type ip.
>
> On the other hand, changing this field from ip may lead to losing some IP-related functionality in Kibana.
>
> Any thoughts, @jsoriano?

I edited /usr/share/metricbeat/fields.yml in the Metricbeat Docker container (v7.8.1), setting the type of the cluster_ip key to keyword, and restarted the container. I still receive the error:

[elasticsearch] elasticsearch/client.go:407 Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbfc5161fcae2c166, ext:110789299798, loc:(*time.Location)(0x7cc8560)}, Meta:null, Fields:{"agent":{"ephemeral_id":"229973a9-3171-4b30-b365-7423d815e16f","hostname":"PZN-BU-APP-01","id":"b83ba018-c8da-48f6-b8f6-9d106b7acd04","name":"PZN-BU-APP-01","type":"metricbeat","version":"7.8.1"},"ecs":{"version":"1.5.0"},"event":{"dataset":"kubernetes.service","duration":40216350,"module":"kubernetes"},"host":{"name":"PZN-BU-APP-01"},"kubernetes":{"labels":{"app_kubernetes_io_name":"kube-state-metrics","app_kubernetes_io_version":"1.9.7"},"namespace":"kube-system","service":{"cluster_ip":"None","created":"2020-08-10T10:59:57.000Z","name":"kube-state-metrics","type":"ClusterIP"}},"metricset":{"name":"state_service","period":10000},"service":{"address":"kube-state-metrics:8080","type":"kubernetes"}}, Private:interface {}(nil), TimeSeries:true}, Flags:0x0, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse field [kubernetes.service.cluster_ip] of type [ip] in document with id '4V6m4nMB0UCBbcjO6Pzh'. Preview of field's value: 'None'","caused_by":{"type":"illegal_argument_exception","reason":"'None' is not an IP string literal."}}

@ChrsMark
Member

> > Any thoughts, @jsoriano?
>
> Yes, changing the field type to keyword would solve this issue, and it would be consistent with the other IP fields on this metricset.
> I guess that the rest of them are also keyword for similar reasons. We could go with this for now.
>
> As you mention, we would be losing functionality available to IP fields.
>
> An alternative would be to handle non-IP values of these fields differently. For example, we could drop the field on None values, as suggested in a previous comment, but we might lose information (cluster_ip: None has a meaning).

I think using keyword will be fine, since cluster IPs are usually internal to a k8s cluster and Kibana's IP functionality is of little interest in this case.

@ChrsMark
Member

ChrsMark commented Aug 12, 2020

> > Maybe changing this field from ip to keyword at
> >
> > will be the way to go here.
> > I wonder why other fields of this metricset, like external_ip, are already of type keyword instead of ip, and why only cluster_ip is of type ip.
> > On the other hand, changing this field from ip may lead to losing some IP-related functionality in Kibana.
> > Any thoughts, @jsoriano?
>
> I edited /usr/share/metricbeat/fields.yml in the Metricbeat Docker container (v7.8.1), setting the type of the cluster_ip key to keyword, and restarted the container. I still receive the error:
>
> [elasticsearch] elasticsearch/client.go:407 Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbfc5161fcae2c166, ext:110789299798, loc:(*time.Location)(0x7cc8560)}, Meta:null, Fields:{"agent":{"ephemeral_id":"229973a9-3171-4b30-b365-7423d815e16f","hostname":"PZN-BU-APP-01","id":"b83ba018-c8da-48f6-b8f6-9d106b7acd04","name":"PZN-BU-APP-01","type":"metricbeat","version":"7.8.1"},"ecs":{"version":"1.5.0"},"event":{"dataset":"kubernetes.service","duration":40216350,"module":"kubernetes"},"host":{"name":"PZN-BU-APP-01"},"kubernetes":{"labels":{"app_kubernetes_io_name":"kube-state-metrics","app_kubernetes_io_version":"1.9.7"},"namespace":"kube-system","service":{"cluster_ip":"None","created":"2020-08-10T10:59:57.000Z","name":"kube-state-metrics","type":"ClusterIP"}},"metricset":{"name":"state_service","period":10000},"service":{"address":"kube-state-metrics:8080","type":"kubernetes"}}, Private:interface {}(nil), TimeSeries:true}, Flags:0x0, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse field [kubernetes.service.cluster_ip] of type [ip] in document with id '4V6m4nMB0UCBbcjO6Pzh'. Preview of field's value: 'None'","caused_by":{"type":"illegal_argument_exception","reason":"'None' is not an IP string literal."}}

Maybe you missed the step of re-loading the index template? See https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-template.html

hint: ./metricbeat setup --index-management -E setup.template.overwrite=true

You can always delete the mapping/template from your ES and try again. This should fix your issue.

Btw I will push an upstream patch for it soonish.

@mariusmotea

I will wait for the fixed Docker image. For the moment I have removed the Kubernetes services from monitoring, because the error was caused by the kube-state-metrics service.

Thanks.

@mariusmotea

Has a new Docker image with the fix been published?

Thanks.

@ChrsMark
Member

The change will be reflected in master (docker.elastic.co/beats/metricbeat:8.0.0-SNAPSHOT) and 7.x (docker.elastic.co/beats/metricbeat:7.10.0-SNAPSHOT) once the daily builds are completed tomorrow.

@wajika

wajika commented Aug 15, 2020

> The change will be reflected in master (docker.elastic.co/beats/metricbeat:8.0.0-SNAPSHOT) and 7.x (docker.elastic.co/beats/metricbeat:7.10.0-SNAPSHOT) once the daily builds are completed tomorrow.

What's the difference between 7.10 and 8.0?

@zube zube bot removed the [zube]: Done label Nov 12, 2020