Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle K8s events with invalid count or invalid Timestamps #31126

Open
gsantoro opened this issue Apr 4, 2022 · 7 comments
Open

Handle K8s events with invalid count or invalid Timestamps #31126

gsantoro opened this issue Apr 4, 2022 · 7 comments
Assignees
Labels
enhancement good first issue Indicates a good issue for first-time contributors Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team

Comments

@gsantoro
Copy link
Contributor

gsantoro commented Apr 4, 2022

Describe the enhancement:

  • Handle use cases when k8s event timestamps (FirstTimestamp and LastTimestamp) are null
  • Handle use cases when the k8s event count is zero

Describe a specific use case for the enhancement or feature:

  • We want to fall back to a valid timestamp if the event timestamps are null
  • We want to provide a valid count if the API reports a zero count

Notes

  • discussion on proposed changes at fix: adapt kubernetes event #29591
    • we should create a new PR from main as suggested in this comment
    • if event.Count == 0 and event.Series != null => default to event.Series.Count or 1 otherwise
      • I suggest to keep the original value for event.Count (either 0 or null as it comes)
      • add new fields for event.Series.Count. This increase the dimensionality of our metrics but it removes the unpredictability of what event.Count is filled with
      • the fact that event.Count is 0 or null might be a signal by itself. something else is wrong in the k8s setup
    • if event.FirstTimestamp = (null or 0) => default to event.LastTimestamp
      • a better suggestion I found is to default FirstTimestamp to event.EventTime like suggested at comment
    • if event.LastTimestamp = (null or 0) => default to event.Series.LastObservedTime or event.EventTime (if not null) or event.CreationTimestamp
      • about default to event.Series.LastObservedTime, I have the same opinion as for event.Series.Count. If new fields from event.Series.LastObservedTime is necessary, we should expose it as a new field instead of using its value as default value when some other field is null
      • as suggested at comment event.CreationTimestamp is not part of the kubernetes supported release. Not sure where it is coming from
@gsantoro gsantoro added enhancement good first issue Indicates a good issue for first-time contributors Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team labels Apr 4, 2022
@gsantoro gsantoro self-assigned this Apr 4, 2022
@ChrsMark
Copy link
Member

ChrsMark commented Apr 5, 2022

* add new fields for event.Series.Count. This increase the dimensionality of our metrics but it removes the unpredictability of what event.Count is filled with
* the fact that event.Count is 0 or null might be a signal by itself. something else is wrong in the k8s setup

I like this idea. To me is more clear and will help us maintain the module with the time being if k8s change/deprecate fields. So +1 for this.

I would also suggest checking the API's compatibility between the different versions of k8s we support so as to ensure the that all these fields are exposed. The versions of k8s we test against can be found at https://github.com/elastic/beats/blob/main/deploy/kubernetes/Jenkinsfile.yml#L21.

@gsantoro
Copy link
Contributor Author

gsantoro commented Apr 5, 2022

Checking new fields compatiblity

@ChrsMark
Copy link
Member

ChrsMark commented Apr 6, 2022

What about event.EventTime, is this also supports across these versions?

I think it is fine if we rely on major.minor versions for compatibility, I wouldn't expect a field removal/addition in a patch release ;).

Also keep on mind that a visualisation about Events is probably coming at #31021 so maybe a coordination would help here so as to check if new fields would be more appropriate to be used in the visualisaton.

@gsantoro
Copy link
Contributor Author

gsantoro commented Apr 7, 2022

Yes I agree that event.EventTime should be added as well. Just checked, it is supported across all versions in that list.

I also noticed that https://github.com/elastic/beats/blob/main/metricbeat/module/kubernetes/event/_meta/data.json is missing the field event.source. I am not sure what that file is used for, if it is just for documentation or something else but I think it should reflect the code changes.

About the coordination with #31021 I am not sure how that could play out. Maybe @MichaelKatsoulis could help there.

@MichaelKatsoulis
Copy link
Contributor

About the coordination with #31021 I am not sure how that could play out. Maybe @MichaelKatsoulis could help there.

@gsantoro check #31021 (comment), where there is a screenshot of how the Kubernetes overview dashboard could look like, including the events. I am showing some fields there but we could discuss what else cold be shown that would make sense.

@gsantoro
Copy link
Contributor Author

gsantoro commented Apr 7, 2022

@MichaelKatsoulis I have checked your latest screenshot. I am wondering if we should show the kubernetes.event.reason next to kubernetes.event.message column.

about the new fields that I suggested. I think they are probably too low level to show them in this dashboard but maybe they could be used in a drill-down of events per pod.

@MichaelKatsoulis
Copy link
Contributor

@gsantoro sounds very good idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement good first issue Indicates a good issue for first-time contributors Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
Projects
None yet
Development

No branches or pull requests

3 participants