
Make k8s-log logging driver compatible with logging forwarding tools #20199

Closed
outcoldman opened this issue Sep 29, 2023 · 8 comments

@outcoldman

Feature request description

The current issue with podman and conmon is that they create just one log file per container, and when it reaches the max size the file gets truncated and writing starts again from the beginning. (I'm not 100% sure whether it truncates or replaces the file; I just saw an issue where someone confirmed that it truncates.)
Now think about what happens if fluentd or a similar tool is constantly reading this file. Forwarding tools (readers) can always fall behind, simply because some customers write 100MB/s for 10 minutes at a time and their log aggregation tools lag in accepting that much data, especially with patterns where data bursts for an hour and then calms down so the forwarders can catch up.
In the truncation case this can cause a significant issue: the reader can be close to the end of the file, and when the file is truncated it simply loses data.
Replacing the file is a slightly better option, but still does not help much. Think about upgrading the fluentd container (or just a simple restart): if the new file is created before the fluentd restart and fluentd does not finish reading the old file before the restart, the data in that file is lost again.

Suggest potential solution

Similar to k8s/OpenShift/Docker, there should be max-size and max-file options, so you can tell the container runtime to keep, say, 5 rotated files of at most 100MB each. In that case log forwarding tools can keep track of which file is which based on inode/device numbers and continue forwarding from the saved position in the specific file.
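
For illustration, a hypothetical invocation with both options might look like the sketch below. Podman's k8s-file driver already accepts a max-size log option; max-file is the addition proposed here (modeled on Docker's json-file driver), and the image name is a placeholder.

```sh
# Hypothetical: keep at most 5 rotated files of up to 100MB each for this container.
# --log-opt max-size exists today for the k8s-file driver; --log-opt max-file is
# the proposed option, mirroring Docker's json-file log driver.
podman run -d \
  --log-driver k8s-file \
  --log-opt max-size=100mb \
  --log-opt max-file=5 \
  my-app-image
```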

Have you considered any alternatives?

journald is definitely an option, but in my experience with high log volumes (10MB/s) we have seen with some customers that journald can get corrupted. I have not seen that often, but I have seen it enough.

Additional context

Let me know if you have any additional questions. I feel like this could be a very common request. Some of our customers have decided to switch to podman in production instead of docker, and they don't want to move to Kubernetes/OpenShift, so this seems like a feature Podman customers would benefit from.

outcoldman added the kind/feature label on Sep 29, 2023
@giuseppe
Member

the file is renamed, so if you still have the old one opened you can keep reading it until the end.

When it is rotated a new file is created and used to append the new data.

Could you just read the file until the end and use inotify to be notified when a new log file is created? In that case you'd open the new file, but first make sure to read the content of the previously opened file completely.
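
For illustration, a rough sketch of that approach using inotifywait from inotify-tools; the log path is an assumption about where the k8s-file log lives, so adjust it for your setup.

```sh
# Sketch of "read to EOF, then follow rotations via inotify".
# LOGFILE is an assumed path; the actual location depends on your storage setup.
LOGFILE=/var/lib/containers/storage/overlay-containers/<ctr-id>/userdata/ctr.log
LOGDIR=$(dirname "$LOGFILE")

# Drain whatever is currently in the file...
cat "$LOGFILE"

# ...then block until something is created or renamed in the log directory.
# A real forwarder would first finish reading the previously opened fd before
# switching to the newly created file.
inotifywait -m -e create -e moved_to --format '%e %w%f' "$LOGDIR" |
while read -r event path; do
    echo "log rotation event: $event $path"
done
```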

@outcoldman
Author

@giuseppe good to know that a new file is created, but please take a look at my other concern from the description, about having only a single active log file in general:

Replacing the file is a slightly better option, but still does not help much. Think about upgrading the fluentd container (or just a simple restart): if the new file is created before the fluentd restart and fluentd does not finish reading the old file before the restart, the data in that file is lost again.

The big issue would be with upgrades/restarts of the log forwarding tools (like fluentd). Of course a forwarder can keep the fd of the rotated file (that was deleted) open to finish forwarding its logs before switching to the new file. But what if you need to upgrade or restart fluentd? When you restart fluentd it releases the fd to that deleted file, and the data is gone.

Add to that 10-100 containers running on the host: each container rotates files at different times, so you cannot find a good time to restart fluentd, because there is always a chance that during the restart one of the containers gets a new log file.
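
As an aside, a quick way to see the deleted-but-still-open-file behavior described above (the process name and fd number are placeholders):

```sh
# While the forwarder still holds the fd, a deleted rotated file stays readable
# through /proc; once the process exits, that last handle (and the data) is gone.
FLUENTD_PID=$(pidof fluentd)              # placeholder; use your forwarder's PID
ls -l /proc/"$FLUENTD_PID"/fd | grep '(deleted)'
# The contents can still be read via the fd number shown above, e.g.:
# cat /proc/$FLUENTD_PID/fd/42 > rescued.log
```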

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@outcoldman
Author

@giuseppe curious if you have thoughts about my last comment?

@giuseppe
Member

if the issue is with rebooting, could the tool keep track of the old file (i.e. create a hard link before rebooting)? The advantage of this solution is that it works with the default configuration, without requiring containers to be recreated with a different logging configuration.

If that doesn't work, then I guess we can teach conmon how to rotate log files
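
For what it's worth, a minimal sketch of the hard-link workaround mentioned above; every path here is an assumption, and the restart command is a placeholder for however the forwarder is actually managed.

```sh
# Before restarting the forwarder, pin the current log file with a hard link so
# the inode (and its data) survives rotation or deletion during the restart.
# Note: the pinned directory must be on the same filesystem as the log file.
ln /var/lib/containers/storage/overlay-containers/<ctr-id>/userdata/ctr.log \
   /var/lib/log-forwarder/pinned/ctr.log

systemctl restart fluentd   # placeholder for however the forwarder is restarted

# Once the forwarder has caught up, drop the pin:
rm /var/lib/log-forwarder/pinned/ctr.log
```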

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@outcoldman
Author

@giuseppe the issue is not with rebooting the machine itself, but with, for example, an upgrade of your log forwarding container. When it is replaced with a new image you usually shut down the old container and start the new one, so there is nothing left to keep the hard link to the old file. And from my experience, hard links are a complicated matter for log forwarding tools: if a forwarder keeps hard links to too many files and its indexer (Elasticsearch or Splunk) cannot handle that load, the drives can fill up with deleted log files that the forwarder still holds hard links to.

@Luap99
Member

Luap99 commented Apr 4, 2024

Given this would need to be implemented in conmon, it seems rather unlikely given its maintenance status. Maybe we can consider log rotation with conmon-rs once that is integrated.

Luap99 closed this as not planned on Apr 4, 2024
Luap99 added the jetsam label on Apr 4, 2024
stale-locking-app bot added the locked - please file new issue/PR label on Jul 4, 2024
stale-locking-app bot locked as resolved and limited conversation to collaborators on Jul 4, 2024