Enable load shedding for the `kubernetes_logs` source
#18784
Labels
- source: kubernetes_logs — anything `kubernetes_logs` source related
- type: feature — a value-adding code addition that introduces new functionality
Use Cases
When running Vector as a Kubernetes DaemonSet, a single pod that writes large quantities of logs can degrade log collection for the other pods on the node, and can eventually lead to pods being evicted from the node due to disk pressure.
Vector holds on to file descriptors for log files that it hasn't finished processing. If a pod generates more logs per second than Vector can parse, Vector will keep holding those file descriptors over time, preventing rotated log files from being deleted. This can eventually exhaust the disk space on the node and cause pods to be evicted.
To prevent this, Vector needs some way to shed load, ideally in an equitable way that sheds load from noisy pods first.
Attempted Solutions
No response
Proposal
One way to address this issue is to add a new `max_open_rotated_files_per_pod` configuration option to the `kubernetes_logs` source. This would allow users to define the maximum number of files Vector can track for a given pod.

Example:

Given:
- `max_open_rotated_files_per_pod = 2` and `oldest_first = true`
- pod `foo` outputs logs faster than Vector can process them

Once Vector is tracking 3 files for pod `foo`, but `max_open_rotated_files_per_pod` is set to 2, Vector will stop tracking the oldest file, which will allow the system to remove it.

Caveats

This setting will lead to log loss, which should be called out in the documentation. If added, a corresponding metric should also be added so that users can tell how many log files are being left unread.
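If this option were added, the configuration might look like the sketch below. `max_open_rotated_files_per_pod` is the option proposed in this issue and does not exist in Vector today; the source name `k8s` and the value shown are illustrative.

```toml
# Hypothetical sketch: max_open_rotated_files_per_pod is the proposed
# option from this issue, not an existing Vector setting.
[sources.k8s]
type = "kubernetes_logs"
# Read files in rotation order so the oldest data is processed first.
oldest_first = true
# Proposed: cap the files tracked per pod; once exceeded, Vector would
# stop tracking the oldest file so the system can delete it.
max_open_rotated_files_per_pod = 2
```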
References
Version
vector 0.33.0