emit log entries when more than 0 after exclusion #618

ganga1980
Contributor

If a namespace is excluded for both the stdout and stderr streams, there is no issue, because we don’t tail the logs of containers in that namespace at all.

But if the namespace is excluded for only one of stdout or stderr, we drop the messages of the excluded stream after receiving the log lines.

This depends on timing, but if a batch received from the Fluent Bit tail plugin consists entirely of log lines from the excluded stream, then instead of dropping them, the agent (because of a bug in ciprod06112021) tries to ingest them to an incorrect path. That request fails with 403 (which is expected), but those log lines keep being retried.

This is not an issue when the number of dropped messages is small. But if the number of dropped messages is very high, the number of log lines being retried is also very high, which impacts the memory usage of td-agent-bit (aka Fluent Bit), and the agent will eventually OOM if memory usage keeps growing because of the retries.

In this customer case (Azure/AKS#2457), the namespace “baapi2-test” is excluded for stdout but not stderr. The containers in this namespace generate a very high number of stdout logs, which keep being retried against the incorrect path instead of being dropped.
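
For reviewers, here is a minimal sketch of the guard described in the PR title: filter the batch per stream, then emit only when more than 0 entries remain. This is not the agent's actual code. `LogRecord`, `filterExcluded`, `flush`, and the `send` callback are hypothetical names used only for illustration, and the exclusion settings are assumed to be available as a per-stream set of namespaces.

```go
package main

import "fmt"

// LogRecord is a hypothetical, simplified stand-in for one container log line.
type LogRecord struct {
	Namespace string
	Stream    string // "stdout" or "stderr"
	Message   string
}

// filterExcluded drops records whose namespace is excluded for their stream.
// excluded maps a stream name to the set of namespaces excluded for it (assumed shape).
func filterExcluded(batch []LogRecord, excluded map[string]map[string]bool) []LogRecord {
	kept := make([]LogRecord, 0, len(batch))
	for _, rec := range batch {
		// Indexing a missing (nil) inner map yields false, so unlisted streams keep everything.
		if excluded[rec.Stream][rec.Namespace] {
			continue
		}
		kept = append(kept, rec)
	}
	return kept
}

// flush filters a batch and calls send only when at least one record survives.
// It returns the number of records handed to send.
func flush(batch []LogRecord, excluded map[string]map[string]bool, send func([]LogRecord) error) (int, error) {
	kept := filterExcluded(batch, excluded)
	if len(kept) == 0 {
		// Whole batch was excluded: drop it here so nothing is posted and nothing is retried.
		return 0, nil
	}
	if err := send(kept); err != nil {
		return 0, err
	}
	return len(kept), nil
}

func main() {
	// Mirrors the customer case: namespace excluded for stdout but not stderr.
	excluded := map[string]map[string]bool{"stdout": {"baapi2-test": true}}
	send := func(recs []LogRecord) error {
		fmt.Printf("sending %d record(s)\n", len(recs))
		return nil
	}

	onlyStdout := []LogRecord{{"baapi2-test", "stdout", "a"}, {"baapi2-test", "stdout", "b"}}
	n, _ := flush(onlyStdout, excluded, send)
	fmt.Println("emitted:", n)
}
```

With a batch that contains only excluded-stream lines, `flush` returns before `send` is called, so there is no failed request to retry.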

@ganga1980 ganga1980 requested a review from a team August 2, 2021 15:49
@ganga1980
Contributor Author

Closing this PR since the changes here are covered by the cherry-pick from the prod PR.

@ganga1980 ganga1980 closed this Aug 7, 2021
2 participants