Kubernetes autodiscover doesn't discover short living jobs (and pods?) #22718
Comments
Pinging @elastic/integrations-platforms (Team:Platforms)
I can reproduce this on 7.10.1. Logs from pods that fail quickly on startup and enter a crashloop are not autodiscovered, resulting in filebeat not reading the reason for the crash from
Added to the description an events sequence observed by @ChrsMark when investigating similar issues with init containers.
Hi, Indeed the same problem occurs with init containers. Below is a dumb "bash counter" to reproduce the problem with the init container. The logs of the Kind regards,
Reported in #11834 (comment). I could reproduce it in 7.10 with the reference documentation and a simple cronjob. Filebeat with autodiscover doesn't collect logs of short-living jobs.
What I could see is that short-living containers don't generate any start event. Sometimes they generate a stop event.
With kubectl (-w) I could see that pods from short-living cronjobs don't generate an event with the running state:
With longer-living pods, this is the sequence of events seen, and logs are collected:
I have also seen that logs from pods that print something and fail fast are not collected. Events for these cases look like this:
In these cases, having the logs is important to help investigate what is happening.
If there are init containers, there can be cases where the logs for the init containers are not collected. In these cases, event sequences like these are seen:
For Metricbeat it can be OK not to start modules for short-living processes, but filebeat should collect logs of containers from the moment they start, because these logs are important for investigating issues.
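The symptom described above (a stop event but no start event) can be pictured with a small simulation. This is only a sketch under a loud assumption: it models a hypothetical watcher that emits "start" only when it observes a pod in the Running phase, which may or may not match what autodiscover actually does. The function name `emitted_events` and the phase sequences are invented for illustration and are not taken from Filebeat or from the logs in this issue.

```python
from typing import Iterable, List

# Hypothetical model, NOT Filebeat's actual code: "start" is emitted only
# when a Running phase is observed; "stop" when a terminal phase is observed.
TERMINAL_PHASES = {"Succeeded", "Failed"}


def emitted_events(observed_phases: Iterable[str]) -> List[str]:
    """Return the start/stop events this toy watcher would emit."""
    events: List[str] = []
    started = False
    for phase in observed_phases:
        if phase == "Running" and not started:
            events.append("start")
            started = True
        elif phase in TERMINAL_PHASES:
            events.append("stop")
            break  # nothing more to emit once the pod has terminated
    return events


# Illustrative sequences (assumed, not captured from a real cluster).
long_lived = ["Pending", "Running", "Running", "Succeeded"]
short_lived = ["Pending", "Succeeded"]  # Running was never observed

print(emitted_events(long_lived))   # ['start', 'stop']
print(emitted_events(short_lived))  # ['stop']
```

Under this assumption, a pod whose Running phase is never observed produces only a stop event, so nothing would ever begin reading its logs. A watcher that also treated a terminal phase without a prior start as a trigger to collect the logs would avoid that gap in this toy model.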
With debug logs for autodiscover this is seen for some jobs: some errors regarding the lack of container.id, and some stop events, but no start event: