Vector keeps old Inodes, clogging up disk #11742
Comments
Hi @MaxRink! Vector will hold onto deleted files until it has fully read them. Can you tell if that's what is happening here?
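One way to check, assuming Vector runs on Linux as a process named `vector` (the descriptor number `N` below is a placeholder, not from the issue):

```sh
# Find Vector's PID and list any descriptors it still holds on deleted files.
VECTOR_PID="$(pgrep -xo vector)"
ls -l "/proc/${VECTOR_PID}/fd" | grep '(deleted)'

# For a descriptor of interest (N is a placeholder), compare the current read
# offset against the file's size; if "pos" keeps advancing toward the size,
# Vector is still draining the deleted file and should release it at EOF.
cat "/proc/${VECTOR_PID}/fdinfo/N"                    # shows "pos: <offset>"
stat -L --format='size: %s bytes' "/proc/${VECTOR_PID}/fd/N"
```

If the offsets stop advancing while deleted descriptors remain open, that would point at a bug rather than the expected read-to-EOF behavior.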
They got rotated (and, after a few iterations, deleted) by logrotate or by the k8s internal apiserver logrotate.
@MaxRink makes sense, but do you know if Vector is still reading from them? Attaching
I don't think it is; I'm not seeing any output. Edit: forgot to do it recursively, as all the I/O seems to happen in child processes.
We're also seeing this issue -- maybe it's because we're not reaching EOF for some reason? Not sure why that would be happening, though.
Also faced this issue last night: Vector keeps file descriptors for rotated logs. Any known workaround for that?
#17603 may fix this.
👍 We have this issue reported for several large customers in https://issues.redhat.com/browse/LOG-3949. Fluentd implementations have a similar issue, but Fluentd also has a config option that allows it to avoid reading rotated files entirely by introducing a delay parameter; a similar feature may be useful instead of always trying to read to EOF.
The current implementation of the Having a similar parameter to
#17603 (comment) steps through how this logic of reading deleted files until EOF works in Vector.
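For background, the underlying filesystem behavior can be reproduced with a few shell commands; this is a generic Linux demonstration, not Vector-specific code:

```sh
# A deleted file's disk blocks are only reclaimed once the last open
# descriptor on it is closed, so a reader can still finish the file.
tmp="$(mktemp)"
exec 3<"$tmp"               # open the file for reading on descriptor 3
echo "a log line" >"$tmp"   # write some data
rm "$tmp"                   # unlink it, as logrotate eventually does
cat <&3                     # the data is still readable through the open fd,
                            # and the space is not freed until the fd closes
exec 3<&-                   # closing the descriptor releases the inode
```

This is why a tailer that has not yet reached EOF keeps the inode, and the disk space, pinned until it finishes reading and drops the handle.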
@jszwedko we are using the kubernetes source. Does that matter?
It shouldn't. The behavior is the same since they share the same underlying file tailing implementation.
@jszwedko I would like to revisit this, as I realize the ramification of adding such a parameter is log loss; we see it now with fluentd. It's naive to think, however, that collector and host resources are infinite and that the collector is always able to keep up with the log volume. We routinely push the collector on OpenShift clusters to the point where Vector is unable to keep up with the load. As an admin, is it better for me to "never lose a log" or to continue to collect? A configuration point would be an "opt-in" choice where admins know the trade-offs. I suspect it could be instrumented as well, with a metric to identify when reading didn't reach EOF, or something similar against which to write an alert.
#11742 (comment) is the expectation. In your case, it means that Vector is still reading from those deleted files. If you believe that not to be the case, we would definitely appreciate a reproduction case, as we've tried a few times but haven't been able to reproduce behavior where Vector doesn't release the file handle once it gets to EOF for a deleted file.
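A rough reproduction sketch, assuming a Vector file source is already watching a directory such as /var/log/demo (the path, file name, and sizes here are illustrative, not taken from the issue):

```sh
# Generate a file large enough that Vector takes a while to read it.
LOG=/var/log/demo/app.log
seq 1 1000000 > "$LOG"

# Rotate and delete it, roughly what logrotate or the kubelet would do.
sleep 5
mv "$LOG" "$LOG.1"
rm "$LOG.1"

# Watch whether the deleted handle disappears once Vector reaches EOF;
# if it lingers indefinitely after reading stops, that would be the bug.
watch -n1 "ls -l /proc/$(pgrep -xo vector)/fd | grep deleted"
```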
I see, but in #11742 (comment) you have mentioned the
Or is it related to the fix of #18088 for
Ah, yes, they use the same underlying mechanisms for file reading, so the behavior, in this respect, should be the same. Thanks for clarifying!
After discussing this one with the team, we came to the following conclusion:
cc @wanjunlei |
A note for the community
Problem
We ran into the issue that Vector is keeping old inodes open, thus filling up the disk.
Inodes: https://gist.github.com/MaxRink/ee056e27a4b11a7b710e437e1f892984
After pkilling Vector, disk usage returned to normal.
Version
0.20.0
Debug Output
No response
Example Data
No response
Additional Context
No response
References
No response