[Kubernetes Module] Add missing metrics for replicaset and pod owners #36746

Closed
wants to merge 12 commits into from

gizas
Contributor

@gizas gizas commented Oct 4, 2023

  • Enhancement

Proposed commit message

  • WHAT: We add information about the owner of pod and replicaset assets that was missing from the Kube State Metrics endpoint.
  • WHY: We disable the metadata enrichment for deployments and replicasets by default, as per "Disable Deployment and Replicaset enrichment by default" (elastic-agent-autodiscover#62). This means that our pods (whether from deployments or cronjobs) won't have the needed information about the resource that creates them. We plan to fix this by introducing relevant owner fields in the state_pod and state_replicaset metricsets.
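
A minimal sketch of the shape of the proposed owner fields, assuming they are read from the labels of the kube-state-metrics `kube_pod_owner` series (`owner_kind`, `owner_name`, `owner_is_controller`). The helper name `owner_fields`, the pod name, and the namespace are hypothetical and are not taken from this PR's code.

```python
# Sketch only: maps the owner_* labels of one kube-state-metrics sample
# to a nested "owner" object. Not the metricset implementation.

# Assumed label-to-field mapping (label names as exposed by kube_pod_owner).
LABEL_TO_FIELD = {
    "owner_kind": "kind",
    "owner_name": "name",
    "owner_is_controller": "is_controller",
}


def owner_fields(labels):
    """Return {"owner": {...}} built from the owner_* labels, or {} if none exist."""
    owner = {
        field: labels[label]
        for label, field in LABEL_TO_FIELD.items()
        if label in labels
    }
    return {"owner": owner} if owner else {}


# Illustrative sample; only the ReplicaSet name appears in this PR.
sample = {
    "namespace": "kube-system",
    "pod": "coredns-787d4945fb-abcde",
    "owner_kind": "ReplicaSet",
    "owner_name": "coredns-787d4945fb",
    "owner_is_controller": "true",
}

print(owner_fields(sample))
```

Labels that are not owner-related (`namespace`, `pod`) are ignored, so the result carries only the first-level owner.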

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  1. Build metricbeat as per instructions here
    This will copy your new metricbeat executable into the kubernetes cluster you want to observe
  2. Log in to your Elastic Stack (where the metricbeat from the previous step sends data) and create a new data view for the metricbeat-* indices.
  3. Observe the new kubernetes.replicaset.replicas.owner.* and kubernetes.pod.owner.* fields

Related issues

Screenshots

For a Pod whose owner is a Node:
Screenshot 2023-10-05 at 11 45 41 AM

For a Pod whose owner is a ReplicaSet:

Screenshot 2023-10-05 at 11 46 16 AM

For a ReplicaSet whose owner is a Deployment:
Screenshot 2023-10-05 at 11 46 41 AM

@gizas gizas requested a review from a team as a code owner October 4, 2023 15:04
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Oct 4, 2023
@gizas gizas requested a review from tommyers-elastic October 4, 2023 15:05
@mergify mergify bot assigned gizas Oct 4, 2023
@mergify
Contributor

mergify bot commented Oct 4, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @gizas? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Collaborator

elasticmachine commented Oct 4, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-10-06T12:05:05.642+0000

  • Duration: 52 min 42 sec

Test stats 🧪

Test Results
Failed 0
Passed 4436
Skipped 902
Total 5338

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@constanca-m
Contributor

Based on the fact that there was no change in the asciidoc file, I think you forgot to run mage update on the metricbeat folder @gizas

@gizas
Contributor Author

gizas commented Oct 5, 2023

> Based on the fact that there was no change in the asciidoc file, I think you forgot to run mage update on the metricbeat folder @gizas

Yes, sorry for that, Constanca. I opened the PR more as a draft in order to showcase the changes needed. I will push the latest changes and also write the tests today.

@mergify
Contributor

mergify bot commented Oct 5, 2023

This pull request now has conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b metadataowners upstream/metadataowners
git merge upstream/main
git push upstream metadataowners

@gizas gizas added the Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team label Oct 5, 2023
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Oct 5, 2023
@gizas gizas changed the title Adding missing info for replicaset and pod owners Adding missing metrics for replicaset and pod owners Oct 5, 2023
@gizas gizas changed the title Adding missing metrics for replicaset and pod owners Add missing metrics for replicaset and pod owners Oct 5, 2023
@gizas gizas changed the title Add missing metrics for replicaset and pod owners [Kubernetes Module] Add missing metrics for replicaset and pod owners Oct 5, 2023
@gizas gizas added backport-v8.9.0 Automated backport with mergify backport-v8.10.0 Automated backport with mergify backport-v8.11.0 Automated backport with mergify labels Oct 5, 2023
@gizas gizas removed the backport-v8.9.0 Automated backport with mergify label Oct 5, 2023
Member

@ChrsMark ChrsMark left a comment

@gizas would this PR add the 2nd-layer controller/owner's information?
The 1st-layer owner's information is already provided through the Metadata Enricher at https://github.com/elastic/elastic-agent-autodiscover/blob/2cc3dcb075fe437467387d36096f554c13c6d144/kubernetes/metadata/resource.go#L138-L152.
So if this PR only covers the 1st-layer owner's metadata, then it seems it's already supported and the PR is redundant?

"owner": {
"is_controller": "true",
"kind": "ReplicaSet",
"name": "coredns-787d4945fb"
Member

How would this replace the disabled addition of the kubernetes.deployment field? This only adds the first-level owner, so in the case of Deployment->ReplicaSet->Pod it won't add the Deployment information, right?

Also, this is already supported through the Metadata Enricher -> https://github.com/elastic/elastic-agent-autodiscover/blob/2cc3dcb075fe437467387d36096f554c13c6d144/kubernetes/metadata/resource.go#L138-L152

Contributor Author

It does not add the deployment name, correct; it adds only the 1st level. You are right.

I was thinking that we can justify that this is enough for users to figure out the reference. Of course, we need to document this. Otherwise, even if we try in code or with ingest pipelines to trim the strings and produce ancestors, we might make mistakes, because we cannot be sure that the ancestor exists.

Also, if we integrate elastic/elastic-agent-autodiscover#62, this means that the enrichers will also lose this metadata, doesn't it?
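
To illustrate the risk of trimming strings mentioned above, here is a hedged sketch of such a heuristic. It assumes a ReplicaSet created by a Deployment is named `<deployment>-<hash>`; the regular expression (5 to 10 lowercase alphanumeric characters) and the name `guess_deployment_name` are illustrative assumptions, not anything implemented in Beats.

```python
import re

# Assumed shape of the trailing hash segment; the real alphabet and length
# may differ, which is part of why this approach is fragile.
_HASH_SUFFIX = re.compile(r"-[a-z0-9]{5,10}$")


def guess_deployment_name(replicaset_name):
    """Guess the parent Deployment name by trimming a hash-like suffix.

    Returns None when the name has no hash-like suffix. A non-None result
    is only a guess: nothing here checks that such a Deployment exists.
    """
    match = _HASH_SUFFIX.search(replicaset_name)
    if match is None:
        return None
    return replicaset_name[: match.start()]


# Name taken from the test data in this PR: the guess looks plausible.
print(guess_deployment_name("coredns-787d4945fb"))

# A standalone ReplicaSet (hypothetical name) with no Deployment at all:
# the heuristic still returns a name, which would be wrong.
print(guess_deployment_name("frontend-canary"))
```

The second call shows the failure mode: a suffix that merely looks like a hash produces an ancestor that may not exist.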

Member

Through the Enricher, every metrics document for Pods, like pod.cpu.usage, will be enriched with the Pod's metadata. In this metadata the Enricher adds the owner name, like kubernetes.cronjob.name, kubernetes.replicaset.name, etc. So this is not related to the settings you change at elastic/elastic-agent-autodiscover#62. That's a different feature that takes place at https://github.com/elastic/elastic-agent-autodiscover/blob/2cc3dcb075fe437467387d36096f554c13c6d144/kubernetes/metadata/pod.go#L96-L122.

So this patch still seems to be redundant.
Could you try to run Metricbeat with the k8s module enabled and see that the Pod's metrics are enriched with the owner name (disable the deployment and cronjob metadata as usual to isolate the functionality)? This will help you understand how the feature works and that this patch actually overlaps with already existing information.
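
As a rough model of the behavior described in this comment (not the Enricher's actual code), the first-level owner name can be derived from a Pod's `ownerReferences`. The function name `owner_metadata` and the choice to consider only references with `controller: true` are assumptions made for this sketch; the pod data is illustrative.

```python
def owner_metadata(owner_references):
    """Return {"kubernetes.<kind>.name": name} for each controller owner.

    Simplified model: only the direct (1st-level) owners listed on the
    object itself are used; no lookup of the owner's own owner is made.
    """
    fields = {}
    for ref in owner_references:
        if ref.get("controller"):
            fields["kubernetes.%s.name" % ref["kind"].lower()] = ref["name"]
    return fields


# ownerReferences of a Pod created through Deployment -> ReplicaSet -> Pod.
pod_owner_references = [
    {
        "apiVersion": "apps/v1",
        "kind": "ReplicaSet",
        "name": "coredns-787d4945fb",
        "controller": True,
    }
]

print(owner_metadata(pod_owner_references))
```

Under these assumptions the Pod's documents get `kubernetes.replicaset.name` but no Deployment name, which matches the point that only the 1st-level owner is covered.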

@gizas
Contributor Author

gizas commented Oct 10, 2023

> So still this patch seems to be redundant.

@ChrsMark you are correct. I have repeated the tests and the 1st-level citizens are there, so there is no need for this. I am going to close it as a duplicate.

We will try to figure out how to add the 2nd-level citizen creator in a separate story.

Labels
backport-v8.10.0 Automated backport with mergify backport-v8.11.0 Automated backport with mergify Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
4 participants