Rewrite_Tag is causing significant memory spike when upstream connectivity is lost #4049
Comments
@edsiper we are really stuck on this. Any help is appreciated; please let me know if you need more details.
I have also captured some details by running Massif with Valgrind.
The output shows that heap memory keeps growing, which is unexpected because I have set Mem_Buf_Limit to 15MB. My input configuration
have parsers
Currently in_emitter has 2 threads.
I think
In the previous implementation, in_emitter has 2 buffers:
1. rewrite_tag -> in_emitter_add_record -> msgpack_sbuffer_write
2. (timer thread, every 0.5 sec) cb_queue_chunks -> flb_input_chunk_append_raw
'mem_buf_limit' is for the flb_input_chunk API, so thread 1 doesn't have limits.
The patch modifies the writing sequence:
rewrite_tag -> in_emitter_add_record -> flb_input_chunk_append_raw
Signed-off-by: Takahiro Yamashita <[email protected]>
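The effect described in the commit message can be illustrated with a small toy model. This is only a sketch: every `toy_*` name is hypothetical and none of it is Fluent Bit code. It assumes that the limit is checked only in the chunk-append step, and that chunks never drain because the upstream is unreachable.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not Fluent Bit APIs. */
struct toy_input {
    size_t staged_bytes;   /* private staging buffer, never limit-checked */
    size_t chunk_bytes;    /* bytes accounted for by the chunk layer */
    size_t mem_buf_limit;  /* limit enforced only by the chunk layer */
    int paused;            /* set once the chunk layer hits the limit */
};

/* Chunk-layer append: the only place where the limit is enforced. */
static int toy_chunk_append(struct toy_input *in, size_t len)
{
    assert(in != NULL);
    if (in->paused || in->chunk_bytes + len > in->mem_buf_limit) {
        in->paused = 1;
        return -1;
    }
    in->chunk_bytes += len;
    return 0;
}

/* Old sequence, step 1: the filter stages data with no limit check. */
static void toy_old_add_record(struct toy_input *in, size_t len)
{
    in->staged_bytes += len;
}

/* Old sequence, step 2: a timer moves staged data into chunks if allowed. */
static void toy_old_timer_flush(struct toy_input *in)
{
    if (in->staged_bytes > 0 &&
        toy_chunk_append(in, in->staged_bytes) == 0) {
        in->staged_bytes = 0;
    }
}

/* New sequence: the filter appends through the chunk layer directly. */
static int toy_new_add_record(struct toy_input *in, size_t len)
{
    return toy_chunk_append(in, len);
}

/* Total memory held by the toy input. */
static size_t toy_total_bytes(const struct toy_input *in)
{
    return in->staged_bytes + in->chunk_bytes;
}
```

With a 5000-byte limit and one hundred 1000-byte records, the old sequence in this model ends up holding all 100000 bytes in the staging buffer, while the new sequence stops at 5000 bytes and reports the input as paused.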
Hi @nokute78, thanks for providing the fix. I have tested fluent-bit with the changes in your PR.
This fixes our major concern, where the pods were getting OOM killed. I will try to explain in steps:
My configuration:
Observation: it looks like when the emitter plugin is paused, the condition below becomes true:
So until the tail plugin is paused, data gets written to the original chunk. Thanks
@utkmishr Thank you for testing.
@utkmishr I updated #4128.
diff --git a/plugins/filter_rewrite_tag/rewrite_tag.c b/plugins/filter_rewrite_tag/rewrite_tag.c
index 28fff0ba..de0703d3 100644
--- a/plugins/filter_rewrite_tag/rewrite_tag.c
+++ b/plugins/filter_rewrite_tag/rewrite_tag.c
@@ -398,7 +398,7 @@ static int cb_rewrite_tag_filter(const void *data, size_t bytes,
* - record with new tag was emitted and the rule says it must be preserved
* - record was not emitted
*/
- if ((ret == FLB_TRUE && keep == FLB_TRUE) || ret == FLB_FALSE) {
+ if (keep == FLB_TRUE) {
msgpack_sbuffer_write(&mp_sbuf, (char *) data + pre, off - pre);
}
@utkmishr e2b566e is meant to prevent emitting while in_emitter is paused.
I think lines 33 to 125 will not be emitted with the above patch.
Hi @nokute78, you are right: the tail plugin also gets paused with this fix. I missed the condition at https://github.com/nokute78/fluent-bit/blob/e2b566e9221922ccc0398957649a8ced0cdb3fa4/src/flb_filter.c#L138 .
However, I have done a few tests and it looks like some records go missing due to change e2b566e. In my test setup, mem_buf_limit is 32kb, each emitted record is 1 kb long, and there was no upstream connection when I started the test. After 30 records (30kb) are emitted, the emitter plugin gets paused and then the tail plugin gets paused.
In the output I observed that sequence numbers 1 to 30 get tagged by the rewrite filter, and sequence numbers from 60 onwards get the original tag. Sequence numbers 30 to 60 appear to be missing from both tags.
As per my observation, if we have 60 records in a chunk, the first 30 get tagged by rewrite tag by the function - so at this stage the value of the emitter count https://github.com/nokute78/fluent-bit/blob/e2b566e9221922ccc0398957649a8ced0cdb3fa4/plugins/filter_rewrite_tag/rewrite_tag.c#L392 is greater than 0. After that, no records get written to the buffer https://github.com/nokute78/fluent-bit/blob/e2b566e9221922ccc0398957649a8ced0cdb3fa4/plugins/filter_rewrite_tag/rewrite_tag.c#L402 . Finally, the code block at https://github.com/nokute78/fluent-bit/blob/e2b566e9221922ccc0398957649a8ced0cdb3fa4/src/flb_filter.c#L140 is executed and we lose that data.
My input file is -
And the output -
Can you please test commit e2b566e with these inputs and let me know if I missed anything?
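The reported loss can be reproduced in a toy model that compares the two keep conditions from the diff in this thread. This is a sketch under stated assumptions, not Fluent Bit code: the `toy_*` names are hypothetical, the emitter is assumed to accept exactly `emitter_capacity` records before pausing, and a record that is neither emitted nor kept in the original chunk is counted as lost.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TRUE  1
#define TOY_FALSE 0

/* Hypothetical counters for one filtered chunk. */
struct toy_result {
    int emitted;  /* records re-emitted under the new tag */
    int kept;     /* records left in the original chunk */
    int lost;     /* records that ended up in neither place */
};

/* Condition before the change: keep the record if emitting failed. */
static int toy_keep_original(int ret, int keep)
{
    return (ret == TOY_TRUE && keep == TOY_TRUE) || ret == TOY_FALSE;
}

/* Condition after the change: only the rule's keep flag matters. */
static int toy_keep_patched(int ret, int keep)
{
    (void) ret;
    return keep == TOY_TRUE;
}

/* Run every record of one chunk through the chosen keep condition. */
static struct toy_result toy_filter_chunk(int records, int emitter_capacity,
                                          int keep,
                                          int (*should_keep)(int, int))
{
    struct toy_result r = {0, 0, 0};

    for (int i = 0; i < records; i++) {
        /* Emitting succeeds only while the toy emitter has room. */
        int ret = (r.emitted < emitter_capacity) ? TOY_TRUE : TOY_FALSE;

        if (ret == TOY_TRUE) {
            r.emitted++;
        }
        if (should_keep(ret, keep)) {
            r.kept++;
        }
        else if (ret == TOY_FALSE) {
            r.lost++;
        }
    }
    /* A record is never both emitted and lost. */
    assert(r.emitted + r.lost <= records);
    return r;
}
```

For 60 records, an emitter capacity of 30 and a rule with keep set to false, the original condition in this model yields 30 emitted, 30 kept and 0 lost, while the patched condition yields 30 emitted, 0 kept and 30 lost, which matches the size of the gap described above.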
@utkmishr You are right. Could you test filesystem buffering? Your issue is a backpressure issue.
However, filesystem buffering doesn't work. #4221
Bug Report
Describe the bug
When using Rewrite_Tag in my configuration, I am seeing a significant memory spike. I observed this when remote upstream connectivity is lost and Retry_Limit is set to False. I have set Mem_Buf_Limit to 5mb.
To Reproduce
Example log -
git_rewrite_tag_issue_log.txt
Steps to reproduce the problem:
I hit the issue while using the configuration below -
conf_rewrite_tag_mem_leak.txt
Expected behavior
On reaching Mem_Buf_Limit (5mb), the plugin should pause and should not cause a memory spike.
Screenshots
Your Environment
Additional context
Issue exists in both 1.7.0 and 1.8.4