Cherry-pick #20281 to 7.x: Add leader election for autodiscover #20510

Merged
merged 1 commit into from
Aug 11, 2020


ChrsMark
Member

@ChrsMark ChrsMark commented Aug 10, 2020

Cherry-pick of PR #20281 to 7.x branch. Original message:

What does this PR do?

Implements leader election as part of the Kubernetes autodiscover provider. With this, when a Beat is elected as leader it starts the configured cluster-scope metricsets.

Under the hood, all leader candidates try to acquire the lock on a Lease object (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#lease-v1-coordination-k8s-io). Once a candidate holds the lock, the configured cluster-wide metricsets are started on that instance.

Why is it important?

To remove the need for a separate Deployment that runs a singleton Metricbeat instance, which is otherwise required for the cluster-scope metricsets.
Sample configuration that starts the state_node metricset when leadership is gained:

metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      node: ${NODE_NAME}
      unique: true
      identifier: leaderelectionmetricbeat
      templates:
        - config:
            - module: kubernetes
              hosts: ["kube-state-metrics:8080"]
              period: 10s
              add_metadata: true
              metricsets:
                - state_node

Then we can monitor the lease to check which provider holds the lock:

kubectl describe lease beats-cluster-leader
Name:         beats-cluster-leader
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  Creation Timestamp:  2020-07-30T14:48:16Z
  Resource Version:    357033
  Self Link:           /apis/coordination.k8s.io/v1/namespaces/default/leases/beats-cluster-leader
  UID:                 ef255352-3d30-42f9-b27a-3a3e5dcb80dd
Spec:
  Acquire Time:            2020-07-30T14:51:00.592028Z
  Holder Identity:         beats-leader-leaderelectionmetricbeat
  Lease Duration Seconds:  15
  Lease Transitions:       2
  Renew Time:              2020-07-30T14:51:00.597852Z
Events:                    <none>

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  1. Prepare a multi-node k8s cluster (for instance on GKE)
  2. Deploy Metricbeat as a DaemonSet on the cluster with the leader election config:
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      scope: cluster
      node: ${NODE_NAME}
      unique: true
      leader_lease: leaderelectionmetricbeat
      templates:
        - config:
            - module: kubernetes
              hosts: ["kube-state-metrics:8080"]
              period: 10s
              add_metadata: true
              metricsets:
                - state_node
  3. Make sure that events for state_node are collected from only one Metricbeat instance. You can check the logs to verify that only one Metricbeat has gained the lock so far.

  4. Monitor the Lease object to keep track of the lock holder: watch kubectl describe lease beats-cluster-leader

  5. Stop the Metricbeat instance that currently holds the lock and make sure that another one takes over and that we keep getting events from state_node. Make sure that the lease changed holder (the logs will also show that a new metricset is being started).

  6. Also test without setting identifier and check that it takes the default value properly, like metricbeat-cluster-leader.

  7. Check the old autodiscover modes (template-based and hints-based) for possible regressions.
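
The "only one instance" check in the steps above can also be done mechanically on a sample of collected events. A minimal Go sketch, assuming each event has been reduced to the sending instance's name and the metricset name. The `Event` type, the `emitters` function, and the instance names are hypothetical and not part of Beats.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a hypothetical, minimal view of a Metricbeat event: which
// instance sent it and which metricset produced it.
type Event struct {
	AgentName string
	Metricset string
}

// emitters returns the sorted, distinct instance names that produced
// events for the given metricset.
func emitters(events []Event, metricset string) []string {
	seen := map[string]bool{}
	for _, e := range events {
		if e.Metricset == metricset {
			seen[e.AgentName] = true
		}
	}
	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	events := []Event{
		{"metricbeat-x", "state_node"},
		{"metricbeat-x", "state_node"},
		{"metricbeat-y", "node"},
	}
	got := emitters(events, "state_node")
	// Exactly one emitter is expected while a single leader holds the lock.
	fmt.Println(len(got) == 1, got) // true [metricbeat-x]
}
```

After a failover, the same check over a later time window should again return a single name, this time the new lock holder.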

Related issues

@ChrsMark ChrsMark added the [zube]: In Review, backport, and Team:Platforms labels Aug 10, 2020
@botelastic botelastic bot added the needs_team label Aug 10, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-platforms (Team:Platforms)

@botelastic botelastic bot removed the needs_team label Aug 10, 2020
@elasticmachine
Collaborator

❕ Build Aborted

Either there was a build timeout or someone aborted the build.


Build stats

  • Build Cause: [Pull request #20510 opened]

  • Start Time: 2020-08-10T11:40:01.195+0000

  • Duration: 123 min 42 sec

Test stats 🧪

Test Results
Failed 0
Passed 17634
Skipped 1838
Total 19472

Log output


[2020-08-10T13:42:07.096Z] [INFO] system-tests=''. If no empty then let's create a tarball
[2020-08-10T13:42:08.779Z] Failed in branch Libbeat x-pack
[2020-08-10T13:42:10.499Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats
[2020-08-10T13:42:10.845Z] + find . -type f -name TEST*.xml -path */build/* -delete
[2020-08-10T13:42:10.873Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Lint
[2020-08-10T13:42:11.030Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Elastic-Agent-Mac-OS-X
[2020-08-10T13:42:11.185Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Auditbeat-oss-Mac-OS-X
[2020-08-10T13:42:11.343Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Winlogbeat-oss
[2020-08-10T13:42:11.497Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Auditbeat-crosscompile
[2020-08-10T13:42:11.653Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Elastic-Agent-x-pack
[2020-08-10T13:42:11.814Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Filebeat-x-pack-Mac-OS-X
[2020-08-10T13:42:11.970Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Journalbeat-oss
[2020-08-10T13:42:12.127Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Dockerlogbeat
[2020-08-10T13:42:12.355Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Generators-Metricbeat-Linux
[2020-08-10T13:42:12.513Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Auditbeat-x-pack-Mac-OS-X
[2020-08-10T13:42:12.669Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Functionbeat-x-pack
[2020-08-10T13:42:12.824Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Filebeat-Mac-OS-X
[2020-08-10T13:42:12.979Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Elastic-Agent-x-pack-Windows
[2020-08-10T13:42:13.146Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack-Mac-OS-X
[2020-08-10T13:42:13.314Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Packetbeat-Linux
[2020-08-10T13:42:13.474Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-OSS-Unit-tests
[2020-08-10T13:42:13.634Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Heartbeat-oss
[2020-08-10T13:42:13.792Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Auditbeat-oss-Windows
[2020-08-10T13:42:13.951Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Auditbeat-x-pack-Windows
[2020-08-10T13:42:14.108Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-crosscompile
[2020-08-10T13:42:14.282Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Auditbeat-x-pack
[2020-08-10T13:42:14.452Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Functionbeat-Mac-OS-X-x-pack
[2020-08-10T13:42:14.611Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Winlogbeat-Windows-x-pack
[2020-08-10T13:42:14.784Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Filebeat-Windows
[2020-08-10T13:42:14.944Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Heartbeat-Mac-OS-X
[2020-08-10T13:42:15.103Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Packetbeat-Mac-OS-X
[2020-08-10T13:42:15.267Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Filebeat-x-pack-Windows
[2020-08-10T13:42:15.430Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Auditbeat-oss-Linux
[2020-08-10T13:42:15.596Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack-Windows
[2020-08-10T13:42:15.754Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Winlogbeat-Windows
[2020-08-10T13:42:15.909Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-Windows
[2020-08-10T13:42:16.064Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Generators-Beat-Linux
[2020-08-10T13:42:16.222Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Heartbeat-Windows
[2020-08-10T13:42:16.381Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Functionbeat-Windows
[2020-08-10T13:42:16.537Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Packetbeat-Windows
[2020-08-10T13:42:16.697Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Filebeat-x-pack
[2020-08-10T13:42:16.854Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Filebeat-oss
[2020-08-10T13:42:17.013Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Libbeat-oss
[2020-08-10T13:42:17.169Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-OSS-Python-Integration-tests
[2020-08-10T13:42:17.333Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-OSS-Go-Integration-tests
[2020-08-10T13:42:17.491Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Libbeat-crosscompile
[2020-08-10T13:42:17.654Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Libbeat-stress-tests
[2020-08-10T13:42:17.814Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Generators-Metricbeat-Mac-OS-X
[2020-08-10T13:42:17.972Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack
[2020-08-10T13:42:18.134Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Generators-Beat-Mac-OS-X
[2020-08-10T13:42:18.293Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-Mac-OS-X
[2020-08-10T13:42:18.456Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Libbeat-x-pack
[2020-08-10T13:42:18.913Z] + cat
[2020-08-10T13:42:18.913Z] + /usr/local/bin/runbld ./runbld-script
[2020-08-10T13:42:18.913Z] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
[2020-08-10T13:42:25.519Z] runbld>>> runbld started
[2020-08-10T13:42:25.519Z] runbld>>> 1.6.12/f45d832f2ba0aa2722ab4ec1fda8ad140f027f8b
[2020-08-10T13:42:28.079Z] runbld>>> The following profiles matched the job 'Beats/beats/PR-20510' in order of occurrence in the config (last value wins).
[2020-08-10T13:42:29.025Z] runbld>>> Debug logging enabled.
[2020-08-10T13:42:29.025Z] runbld>>> Storing result
[2020-08-10T13:42:29.025Z] runbld>>> Store result: created {:total 2, :successful 2, :failed 0} 1
[2020-08-10T13:42:29.025Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200810134228-2BE4DB9B
[2020-08-10T13:42:29.287Z] runbld>>> Adding system facts.
[2020-08-10T13:42:30.257Z] runbld>>> Adding vcs info for the latest commit:  eae2d48c635ea0ad030f53db261d5516685f914b
[2020-08-10T13:42:30.257Z] runbld>>> >>>>>>>>>>>> SCRIPT EXECUTION BEGIN >>>>>>>>>>>>
[2020-08-10T13:42:30.522Z] runbld>>> Adding /usr/lib/jvm/java-8-openjdk-amd64/bin to the path.
[2020-08-10T13:42:30.522Z] Processing JUnit reports with runbld...
[2020-08-10T13:42:30.522Z] + echo 'Processing JUnit reports with runbld...'
[2020-08-10T13:42:30.786Z] runbld>>> <<<<<<<<<<<< SCRIPT EXECUTION END <<<<<<<<<<<<
[2020-08-10T13:42:30.786Z] runbld>>> DURATION: 33ms
[2020-08-10T13:42:30.786Z] runbld>>> STDOUT: 40 bytes
[2020-08-10T13:42:30.786Z] runbld>>> STDERR: 49 bytes
[2020-08-10T13:42:30.786Z] runbld>>> WRAPPED PROCESS: SUCCESS (0)
[2020-08-10T13:42:30.786Z] runbld>>> Searching for build metadata in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats
[2020-08-10T13:42:32.181Z] runbld>>> Storing build metadata: 
[2020-08-10T13:42:32.181Z] runbld>>> Adding test report.
[2020-08-10T13:42:32.181Z] runbld>>> Searching for junit test output files with the pattern: TEST-.*\.xml$ in: /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats
[2020-08-10T13:42:32.755Z] runbld>>> Found 134 test output files
[2020-08-10T13:42:33.708Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-openmetrics.xml
[2020-08-10T13:42:33.708Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-iis.xml
[2020-08-10T13:42:33.708Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-istio.xml
[2020-08-10T13:42:33.969Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-tomcat.xml
[2020-08-10T13:42:33.969Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-x-pack/x-pack/metricbeat/build/TEST-go-integration-activemq.xml
[2020-08-10T13:42:33.969Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-OSS-Go-Integration-tests/metricbeat/build/TEST-go-integration-graphite.xml
[2020-08-10T13:42:33.969Z] runbld>>> No testsuite node found in /var/lib/jenkins/workspace/Beats_beats_PR-20510/src/github.com/elastic/beats/Metricbeat-OSS-Go-Integration-tests/metricbeat/build/TEST-go-integration-windows.xml
[2020-08-10T13:42:35.885Z] runbld>>> Test output logs contained: Errors: 0 Failures: 0 Tests: 19326 Skipped: 1560
[2020-08-10T13:42:36.147Z] runbld>>> Storing result
[2020-08-10T13:42:36.147Z] runbld>>> FAILURES: 0
[2020-08-10T13:42:36.147Z] runbld>>> Store result: updated {:total 2, :successful 2, :failed 0} 2
[2020-08-10T13:42:36.147Z] runbld>>> BUILD: https://c150076387b5421f9154dfbf536e5c60.us-west1.gcp.cloud.es.io:9243/build-1587637540455/t/20200810134228-2BE4DB9B
[2020-08-10T13:42:36.409Z] runbld>>> Email notification disabled by environment variable.
[2020-08-10T13:42:36.409Z] runbld>>> Slack notification disabled by environment variable.
[2020-08-10T13:42:42.023Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-20510
[2020-08-10T13:42:42.285Z] [INFO] getVaultSecret: Getting secrets
[2020-08-10T13:42:42.381Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-08-10T13:42:43.550Z] + chmod 755 generate-build-data.sh
[2020-08-10T13:42:43.550Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20510/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20510/runs/1 ABORTED 7362082
[2020-08-10T13:42:43.550Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20510/runs/1/steps/?limit=10000 -o steps-info.json
[2020-08-10T13:42:46.034Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20510/runs/1/tests/?status=FAILED -o tests-errors.json
[2020-08-10T13:42:46.034Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-20510/runs/1/log/ -o pipeline-log.txt

@ChrsMark ChrsMark merged commit 9dff318 into elastic:7.x Aug 11, 2020
@zube zube bot removed the [zube]: Done label Nov 9, 2020
Labels
backport, Team:Platforms

3 participants