fix(file-server): while using vector with log rotation source, disk space is not freed #17603
Hi @maorbr, and thanks for digging into this! It's quite a complex part of Vector so we appreciate the effort. It's not entirely clear to me why you're seeing that behavior in the first place, which I'd like to sort out before we agree on a fix. I'll walk through your example scenario and explain what should be happening:
Seems like we're good to this point. If
When we scan for file updates, the first thing we do is mark every entry in the (see vector/lib/file-source/src/file_server.rs, lines 174 to 176 in ab1169b)
Immediately afterwards, we iterate through the paths of files we're configured to watch (which will only be
You are correct that we do not remove the old entry as part of this process, but we do change it by setting findable to false. Next we iterate through all entries in (see vector/lib/file-source/src/file_server.rs, lines 235 to 242 in ab1169b)
Within (see vector/lib/file-source/src/file_watcher/mod.rs, lines 222 to 224 in ab1169b)
Once the watcher is set to dead, we drop it from
To summarize, the intended behavior is that we release the file handle once two conditions are met:
That should be perfectly compatible with your scenario, so I would really like to understand why you're seeing different behavior. Your fix seems to take a different approach, where it detects if a file has been renamed, and marks it dead if so. This has a few potential issues:
These are all things we can discuss and address, but I think the priority right now should be finding where the existing implementation is failing to behave as we expect. |
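The lifecycle described in this comment can be modeled roughly as follows. This is a Python sketch of the idea rather than Vector's Rust code. It assumes the two conditions are that the file is no longer findable at a watched path and that the reader has reached end of file; `Watcher`, `scan_once`, `ids_on_disk`, and `read_to_eof` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Watcher:
    findable: bool = True   # was the file seen at a watched path this scan?
    dead: bool = False      # set once the watcher should be dropped


def scan_once(fp_map, ids_on_disk, read_to_eof):
    """One scan pass over a map of file id -> Watcher (hypothetical model).

    ids_on_disk: ids of the files currently found at the watched paths.
    read_to_eof: callable(file_id) -> True if the reader hit end of file.
    """
    # Step 1: assume no file is findable until the path walk proves otherwise.
    for watcher in fp_map.values():
        watcher.findable = False

    # Step 2: walk the watched paths; known files become findable again,
    # unknown files get a new watcher.
    for fid in ids_on_disk:
        fp_map.setdefault(fid, Watcher()).findable = True

    # Step 3: read from every watcher. Assumed rule: a watcher that reaches
    # end of file while unfindable is marked dead.
    for fid, watcher in fp_map.items():
        if read_to_eof(fid) and not watcher.findable:
            watcher.dead = True

    # Step 4: drop dead watchers, which is where the handle would be released.
    for fid in [f for f, w in fp_map.items() if w.dead]:
        del fp_map[fid]
    return fp_map
```

Under this model a rotated file keeps its watcher until its remaining lines have been read, and only then is the handle released.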
We're going to close this for now. Feel free to re-open to pick back up the discussion. |
I found that Vector does not release file handles even after the files are deleted.
Specifically, the use case is that Vector is watching a file/folder that is being written to by a logger with log rotation.
This is easily reproducible on my local machine: I can see that a lot of files are being deleted, but the disk space is not freed because Vector has a file handle open for reading and does not release it.
Over time, this fills up the disk and services start to crash.
The command: $ lsof +L1 | grep /home/maor/Projects/logger_app/vector_logs/ lists every file in the directory that has been deleted but is still held open by some process (Vector in our case).
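The condition that lsof +L1 reports (an open file whose link count has dropped to zero) can be reproduced in a few lines. This is a minimal sketch assuming a POSIX filesystem; the helper names `deleted_but_open` and `demo` are made up for illustration.

```python
import os
import tempfile


def deleted_but_open(path):
    """Open `path`, unlink it, and report what the still-open handle sees."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.unlink(path)        # remove the only directory entry
        info = os.fstat(fd)    # the inode is still reachable through fd
        return {
            "path_exists": os.path.exists(path),
            "link_count": info.st_nlink,
            "size": info.st_size,
        }
    finally:
        os.close(fd)           # only after this can the space be reclaimed


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log.log")
        with open(path, "wb") as f:
            f.write(b"x" * 4096)
        return deleted_but_open(path)


if __name__ == "__main__":
    print(demo())
```

The file is gone from the directory, yet its size is still accounted for until the descriptor is closed, which is why the disk fills up while a reader keeps the handle.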
The scenario, assuming a Python RotatingFileHandler logger, is the following:
a. The logger writes the file log.log to a folder and is only allowed to rotate over 2 files.
b. Vector is configured to read from this file only (log.log).
a. Vector identifies the new log.log file and starts reading from it, but does not remove or change log.log1.
a. Vector identifies and adds the new log.log file, but does not remove or change log.log1 (there are 2 of these now).
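The scenario above can be approximated without Vector. The sketch below assumes a POSIX filesystem and uses a plain open file object as a stand-in for a watcher that never lets go; `rotate_twice_while_reading` and `demo` are hypothetical names. Note that Python's RotatingFileHandler names the backup log.log.1, and rollovers are forced with doRollover() to keep the run deterministic.

```python
import logging
import logging.handlers
import os
import tempfile


def rotate_twice_while_reading(directory):
    path = os.path.join(directory, "log.log")
    # backupCount=1 means only log.log and log.log.1 may exist (2 files).
    # maxBytes=0 disables size-based rollover; we trigger it by hand below.
    handler = logging.handlers.RotatingFileHandler(path, maxBytes=0, backupCount=1)
    logger = logging.getLogger("rotation_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.info("first generation")
        reader = open(path, "rb")            # stand-in for a file watcher
        first_inode = os.fstat(reader.fileno()).st_ino

        handler.doRollover()                 # log.log is renamed to log.log.1
        logger.info("second generation")
        links_after_first = os.fstat(reader.fileno()).st_nlink

        handler.doRollover()                 # the old log.log.1 is deleted
        logger.info("third generation")
        links_after_second = os.fstat(reader.fileno()).st_nlink

        current_inode = os.stat(path).st_ino
        reader.close()
        return {
            "links_after_first": links_after_first,
            "links_after_second": links_after_second,
            "same_file": current_inode == first_inode,
        }
    finally:
        logger.removeHandler(handler)
        handler.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        return rotate_twice_while_reading(tmp)


if __name__ == "__main__":
    print(demo())
```

After the second rollover the first file has no directory entry left, but the reader's handle still pins it, which matches the state that lsof +L1 reports.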
In summary, the bug occurs because, with log rotation, two files end up sharing the same path as far as the Vector watcher is concerned. The fix should allow Vector to release the watcher associated with the deleted file and begin tracking the new file that shares the same path.
To accomplish this, I added an additional check at the start of the second loop (the scan through the map of file watchers in file_server.rs) that opens the path again and compares its 'file_id' with the 'file_id' of the current watcher. If they do not match, the watcher is marked as 'dead' and is later removed from the 'fp_map', freeing up disk space.
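The check described above could look roughly like this. It is a Python sketch of the idea, not the Rust change in this PR: it assumes a (device, inode) pair as a stand-in for Vector's 'file_id', and `Watcher`, `file_id`, `scan`, and `demo` are hypothetical names.

```python
import os
import tempfile
from dataclasses import dataclass


def file_id(path):
    """(device, inode) of whatever currently lives at `path`, or None."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino)


@dataclass
class Watcher:
    path: str
    handle: object
    dead: bool = False


def scan(fp_map):
    """Mark watchers whose path now resolves to another file, then drop them."""
    for fid, watcher in fp_map.items():
        # Re-resolve the path and compare with the id this watcher tracks.
        if file_id(watcher.path) != fid:
            watcher.dead = True
    for fid in [f for f, w in fp_map.items() if w.dead]:
        fp_map.pop(fid).handle.close()   # closing releases the old file


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log.log")
        open(path, "w").close()
        old = Watcher(path, open(path, "rb"))
        fp_map = {file_id(path): old}

        os.rename(path, path + ".1")     # rotation: the old file moves aside
        open(path, "w").close()          # a new file appears at the same path
        new = Watcher(path, open(path, "rb"))
        fp_map[file_id(path)] = new

        scan(fp_map)
        result = (old.handle.closed, new.handle.closed, len(fp_map))
        new.handle.close()
        return result


if __name__ == "__main__":
    print(demo())
```

One consequence of this design is that closing on a mismatch discards any lines not yet read from the rotated file, so a real implementation would likely need to drain the old file first.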
Closes: #11742