Memory leak in in_emitter mem_buf when appending metric #9189
Comments
@drbugfinder-work maybe valgrind could find this issue, but surprisingly no one has reported it until now.
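A typical way to run such a check would be something like the following (assuming a local build; the binary path and the config file name are placeholders, not taken from this thread):

```
valgrind --leak-check=full --show-leak-kinds=all \
    ./bin/fluent-bit -c fb-sink.conf
```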
@drbugfinder-work thanks for reporting this. Would you please help us with an easily reproducible test case?
@leonardo-albertovich @edsiper I have now created a reproducer. I've also tested multiple versions, even back to v2.2.3, but the behavior is the same. You need two Fluent Bit instances, one forwarding (Forward (msgpack) or HTTP (json) makes NO difference) to the other instance. The second (receiving) instance then creates the metric. We are still investigating this, and this is just the current version of our reproducer. We cannot tell for sure (yet) if this happens only when using two instances, but so far we have only seen it happen when using two Fluent Bit instances. fb-source.conf:
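(The original file contents were not captured above; the following is only a minimal sketch of what the source side of such a reproducer could look like. The dummy payload, rate, and port are assumptions.)

```
[SERVICE]
    flush     1
    log_level info

[INPUT]
    name   dummy
    tag    dummy.log
    dummy  {"message": "hello", "pod_name": "pod-1"}
    rate   200

[OUTPUT]
    name   forward
    match  *
    host   127.0.0.1
    port   24224
```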
fb-sink.conf:
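(Again only a sketch under the same assumptions: the receiving instance ingests the forwarded records, turns them into a counter via the log_to_metrics filter, which internally uses the in_emitter plugin discussed in this issue, and exposes the result through prometheus_exporter. The metric name and label field are illustrative.)

```
[INPUT]
    name    forward
    listen  0.0.0.0
    port    24224

[FILTER]
    name                log_to_metrics
    match               *
    tag                 log_metrics
    metric_mode         counter
    metric_name         dummy_log_count
    metric_description  Number of received dummy logs
    label_field         pod_name

[OUTPUT]
    name   prometheus_exporter
    match  log_metrics
    host   0.0.0.0
    port   2021
```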
You can play around with the numbers. We think there's a tipping point somewhere between ~75-200 logs/second. From there it cannot catch up with processing and the mem size rises. Of course the number of chunks/buckets/... will increase as the cardinality increases. The important question is why the mem size of the emitter grows to hundreds of megabytes while the input mem size stays between 0 and 1 MB and the real amount of logs is also only a couple hundred kilobytes. There seems to be absolutely no issue when logs are ingested at a lower rate (with the same cardinality, but spread over a longer time period).
Hi @drbugfinder-work, I'm having a bit of an issue understanding your idea. However, while looking at it with the address sanitizer I noticed that there are two obvious memory leaks in in_forward. One thing I noticed in my tests was that I had lots of flush tasks when trying to terminate the process, which is odd because prometheus_exporter flushes should be very short lived (they should encode the metrics, store them in the local context and return); I still haven't gotten to the root of that. Anyway, I wanted to share the leak findings to clear the picture a bit so we can make some progress together.
We are also seeing this large number of flush tasks.
As I wrote in my example config, the issue also happens when we use HTTP instead of forward, so in_forward should not be the culprit here.
What I was saying is that there seem to be two independent issues. So far I've found that the prometheus exporter flush callback is taking way too long, which means that with sustained high-volume input it's not able to catch up. I'm still manually tracing to find the bottleneck, but you just caught me in the middle of the test. I'll report back in a few minutes. The one thing I don't think I'd fixate on is
I thought there could be a bottleneck in the code, but in the end the only noticeable delay I found was in the metrics context encoding function, and it was not significant by itself. The problem I see with this test is that each chunk contains about 6500 to 7900 individual metric contexts, all of which have to be decoded, processed (i.e. labels appended) and encoded individually, which seems to be way too much. I think at this point I'd need another opinion because I'm not sure what to think of it.
Just opened PR #9199 to address the leak in in_forward.
Thanks for #9199, which is already in v3.1.6 now. I think this is the reason why the memory rise was even faster when we used forward in contrast to HTTP. We have not compared that with your fix yet.
You're right. This is an artificial test with very high load (multiple times higher than in our real environments); however, we constantly run into the same issue in real scenarios (within minutes/hours from start, just with a flatter curve). I'm really interested in why so much memory is being allocated - multiple times more than the actual logs. Is it because of the metadata, the metric context, ...? Or is there a wrong calculation during realloc or chunk selection? I could not find anything yet. If the actual logs are only a few MB in size and the emitter buffer has already reached several gigabytes, there is something wrong.
@leonardo-albertovich I cannot see any difference; the memory consumption is several gigabytes (growing beyond 4GB within about 20 seconds of runtime at 5000 logs/s = ~100,000 dummy logs) at the emitter input.
I am also facing this issue. Details: Now that I have transitioned to Fluent Bit version 3.1.4, the highest memory consumption I have seen is 373MB, so the pod doesn't get killed due to an OOM error, but this comes with the following drawbacks:
Could anyone please help me with which version or configuration changes I should use so that Fluent Bit doesn't consume more than 6GB of memory, the append error goes away, and ingestion of the incoming logs happens smoothly? The version used is 3.1.4 and the demo configuration:
Hi @1999ankitgoyal, thank you for providing a second sample of this issue. The reduction in memory consumption from version 3.0.4 to 3.1.4 was due to PR #8659, where I introduced the option for a configurable memory buffer limit and set the default to 10MB. Before that change, the memory buffer was unlimited, which could result in the consumption of many gigabytes of memory. At the time, I introduced the limit to align with other plugins, not realizing that this change would expose an existing memory issue in the pipeline. However, this limitation of the memory buffer leads to the error log you're seeing.

@leonardo-albertovich @edsiper I strongly suspect that any other metric-generating process using the flb_input_metrics_append function to insert metrics into the pipeline could potentially encounter the same issue, especially with similar volumes (e.g., >100~500 updates/second). For example, see
When I set the limit to 0 (unlimited) and the buffer reached several gigabytes, I also observed multiple instances where no logs were forwarded through the pipeline at all after a certain memory buffer size was reached. I am still trying to understand the content of the chunks to determine what is using this large amount of memory.
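For anyone following along: the limit discussed here is the one exposed on the log_to_metrics filter as emitter_mem_buf_limit (introduced in PR #8659 with a 10MB default; 0 means unlimited, as used in the tests above). A hypothetical snippet raising it might look roughly like this; the metric settings are illustrative:

```
[FILTER]
    name                   log_to_metrics
    match                  *
    tag                    log_metrics
    metric_mode            counter
    metric_name            dummy_log_count
    metric_description     Number of received dummy logs
    # 10M is the default since PR #8659; 0 disables the limit, which is
    # what reproduces the multi-gigabyte growth described in this issue
    emitter_mem_buf_limit  50M
```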
Hi @edsiper @leonardo-albertovich Instead of modifying a single metric value, the entire metrics context - including the complete list of label tuples - appears to be copied into the input buffer as a new record for every single update. This also explains why the cardinality is relevant at the input buffer stage, which I had not initially expected. The proposed solution would be to avoid copying the entire context on every append. I am still investigating whether this behavior is intentional or if it is a bug; I have not fully reviewed the code yet.

For reference, please see the example of two records in an input buffer chunk here: https://gist.github.com/drbugfinder-work/131c11e67a12753e45bccdce478bc145#file-gistfile1-txt And here is the diff between these two entries, where one value has changed but the entire list of label tuples remains the same and is a copy of the previous one:
@edsiper @leonardo-albertovich Additionally, multi-threading could potentially be a concern.
If I understood your previous message correctly, then what you found is what I pointed out before (thousands of individual metric contexts being produced, which in turn caused the prometheus exporter plugin to be delayed while processing them). I have a feeling that making those types of changes deeper in the pipeline would be riskier; unless there's something I'm missing, the issue should be addressed in the log_to_metrics plugin itself. IDK, I could be wrong, I haven't looked at this since last week and maybe I'm just completely missing the mark.
You're right - I initially thought you were referring to the cmetrics contexts within the Prometheus output plugin. I now understand your point better. I'll focus on your suggested approach and explore a solution inside the log_to_metrics plugin context to periodically inject the metrics into the pipeline. Thanks for the insight!
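To make that direction concrete, here is a minimal, self-contained sketch of the aggregation idea in plain C (deliberately not using Fluent Bit internals): counter updates are accumulated in the filter's own state, and one snapshot is emitted per interval instead of appending a full metrics context for every record. All names, the flush interval, and the label layout are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_SERIES         1024
#define FLUSH_INTERVAL_SEC 5     /* hypothetical flush period */

struct series {
    char labels[128];            /* flattened label tuple, e.g. pod="a" */
    unsigned long value;         /* aggregated counter value */
};

static struct series table[MAX_SERIES];
static int series_count = 0;

/* Update the local aggregate for one log record; nothing is appended
 * to the pipeline here. */
static void update_counter(const char *labels)
{
    for (int i = 0; i < series_count; i++) {
        if (strcmp(table[i].labels, labels) == 0) {
            table[i].value++;
            return;
        }
    }
    if (series_count < MAX_SERIES) {
        snprintf(table[series_count].labels,
                 sizeof(table[series_count].labels), "%s", labels);
        table[series_count].value = 1;
        series_count++;
    }
}

/* Periodic flush: one snapshot of all series per interval, not one full
 * copy of every series per incoming record. In a real plugin this would
 * be a timer callback that injects the metrics context. */
static void flush_snapshot(void)
{
    printf("-- snapshot (%d series) --\n", series_count);
    for (int i = 0; i < series_count; i++) {
        printf("log_metric_counter{%s} %lu\n",
               table[i].labels, table[i].value);
    }
}

int main(void)
{
    time_t last_flush = time(NULL);

    /* Simulate a stream of incoming log records. */
    for (int i = 0; i < 100000; i++) {
        update_counter(i % 2 ? "pod=\"a\"" : "pod=\"b\"");

        if (time(NULL) - last_flush >= FLUSH_INTERVAL_SEC) {
            flush_snapshot();
            last_flush = time(NULL);
        }
    }
    flush_snapshot();
    return 0;
}
```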
Bug Report
Describe the bug
By adding the mem_buf_limit option to the log_to_metrics filter (4178405), we noticed that the mem_buf_limit was reached very often, leading to many
could not append metrics
logs. This happens because the newly introduced buffer limit is 10MB (before that change it was unlimited). The bug happens especially in high-load situations. When setting the mem_buf_limit to 0 (unlimited), we can see memory usage rising up to several gigabytes until the process eventually gets killed. However, this increase in memory is NOT steady but follows a sawtooth-like pattern, with values fluctuating and increasing over time.
We see values in a pattern like:
0B -> 1.2MB -> 0B -> 284KB -> 5MB -> 0B -> 20MB ... -> 2.5GB -> 0B -> 282MB -> 2.9GB
This behavior appears to indicate some kind of memory leak. Unfortunately, we have not yet been able to identify the content of the input chunks. Any hints on how to do this?
Please see the screenshots below:
Your Environment