could not enqueue records into the ring buffer #7071
Comments
It seems like your filter stack is causing a bottleneck, which means both inputs combined are ingesting data at a faster rate than it can move through the system. I think we need to address this, but I'm curious about the kubernetes filter; I'd like to know if it's the culprit (because it makes HTTP requests in synchronous mode), but I'm not entirely sure how to get more information about it in this case.
Hello, I'm having the very same issue with a much simpler configuration - 4 tails and 1 output to Splunk. The logs are not particularly talkative (usually a batch of a few lines appears every 10 seconds), and still, once Fluent Bit reaches the "could not enqueue records into the ring buffer" error state, it is not able to recover from it - no new records appear in Splunk until Fluent Bit is restarted. See my config below. I used to have all logs in a single tail (using a star *), but I tried to split it as below and the issue is still present. Edit: Fluent-bit 2.0.9 on Windows.

[SERVICE]
flush 5
daemon Off
log_level debug
log_file C:\ProgramData\fluent-bit.log
parsers_file parsers.conf
plugins_file plugins.conf
http_server Off
http_listen 0.0.0.0
http_port 2020
storage.metrics on
[INPUT]
name tail
parser json
Path C:\Products\A.log
threaded on
[INPUT]
name tail
parser json
Path C:\Products\B.log
threaded on
[INPUT]
name tail
parser json
Path C:\Products\C.log
threaded on
[INPUT]
name tail
parser json
Path C:\Products\D.log
threaded on
[OUTPUT]
name splunk
match *
Host http-inputs.splunkcloud.com
Port 443
Tls On
Splunk_Token ******
event_index ******
event_host ******
# LEGACY DNS resolver due to memory leak since v2.0.0
# https://github.com/fluent/fluent-bit/issues/6525
net.dns.resolver LEGACY
@leonardo-albertovich I am having the same issue after using the threading feature for the tail input. Where is the ring buffer used? I have set the storage to filesystem for the tail input. Why does it not flush to the filesystem? Where is the bottleneck?

[2023/05/05 16:16:31] [error] [input:tail:tail.2] could not enqueue records into the ring buffer
Hi @amolbms, the ring buffer is used to move ingested records from the input plugin threads to the main pipeline thread, where they are filtered, persisted and routed. I'd need to know a bit more about your setup to make a proper assessment, but if you are running fluent-bit 2.1 (or are able to upgrade) you'll find that moving your filters to the processor stack of the input plugin (which requires you to use YAML as your configuration file format) will probably eliminate this issue. Something else you can do is ensure your output plugins are running in threaded mode. If you share a bit more information about your setup I might be able to give you better feedback.
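For the threaded-output part of this suggestion, a minimal YAML sketch (assuming the output's workers property is the mechanism used to enable threaded mode, with a dummy input and a stdout output purely as placeholders) could look like this:

pipeline:
  inputs:
    - name: dummy          # illustrative input; any input works the same way
  outputs:
    - name: stdout         # placeholder; swap in the real output plugin
      match: '*'
      workers: 2           # run flush callbacks in dedicated worker threads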
@leonardo-albertovich Thanks for the quick feedback. We are using a Lua script in the filter. Here is my config for the input which is having issues. Sure, I will try to see how I can move the filter to the processor.

[INPUT]
Name tail
Path /varlog/<service>/access.log
DB /varlog/<service>/access.db.pos
DB.Sync Normal
DB.locking true
Buffer_Chunk_Size 15M
Buffer_Max_Size 500MB
Mem_Buf_Limit 500MB
read_from_head true
Refresh_Interval 5
Rotate_Wait 20
threaded on
Tag mdsd.xxx
[FILTER]
Name lua
Match mdsd.xxx
script fluentbit_filter.lua
call modify_record_for_xxx
Does this config look correct?

pipeline:
  inputs:
    - name: tail
      Path /varlog/<service>/access.log
      DB /varlog/<service>/access.db.pos
      DB.Sync Normal
      DB.locking true
      Buffer_Chunk_Size 15M
      Buffer_Max_Size 500MB
      Mem_Buf_Limit 500MB
      read_from_head true
      Refresh_Interval 5
      Rotate_Wait 20
      threaded on
      Tag mdsd.xxx
      processors:
        logs:
          - name: lua
            call: modify_record_for_xxx
            script: fluentbit_filter.lua
The processor block looks correct, and the rest does as well save for the indentation issue; I assume the path is a placeholder. The one thing there that's counterproductive in my opinion is the choice of buffer and memory limit values. Here's an example; you'll have to replace the output component, but other than that it should be compliant with your configuration:
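A minimal sketch of that kind of configuration, based on a trimmed-down version of the tail input and Lua script from the config above and substituting a stdout output as a stand-in for the real destination, could look like this:

pipeline:
  inputs:
    - name: tail
      path: /varlog/<service>/access.log
      db: /varlog/<service>/access.db.pos
      read_from_head: true
      refresh_interval: 5
      rotate_wait: 20
      threaded: true
      tag: mdsd.xxx
      processors:
        logs:
          - name: lua                  # runs inside the tail input's thread
            script: fluentbit_filter.lua
            call: modify_record_for_xxx
  outputs:
    - name: stdout                     # stand-in; replace with the real output
      match: mdsd.xxx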
In that example I shared, the filter step is performed in the input thread before the records are inserted into the ring buffer, which should eliminate the bottleneck.
Thanks @leonardo-albertovich. I tried the processor and the error went away. After adding rewrite_tag inside the processor I am getting an exception. Is it not supported? My configuration is:

processors:
  logs:
    - name: lua
      match: mdsd.fdaceesslogs
      script: fluentbit_filter.lua
      call: modify_record_for_fdaccesslogs
    - name: rewrite_tag
      match: mdsd.fdaceesslogs
      rule: ${MDSD_REWRITE_TAG_FILTER_KEY} ^.*$ mdsd.azuremonitorlogs true
    - name: lua
      match: mdsd.azuremonitorlogs
      script: fluentbit_filter.lua
      call: modify_record_for_azuremonitorlogs
That's not expected at all. Could you please share more context with me? A complete and properly escaped config file would probably be enough for me to find the root of the issue and fix it. If you can't or don't want to share the configuration file publicly, feel free to send me a private message in Slack. Thank you!
Sure, let me share the config on Slack. Thank you!
Hi @amolbms, it seems like you couldn't find me on Slack. I'd really appreciate it if you shared that configuration so we can fix this issue. Thank you!
We have almost exactly the same problem. This config doesn't crash, but reports the ring buffer error.
There are multiple filters downstream. The first is a
Then Fluent Bit immediately segfaults on boot:
Yes, that's an issue we're actively working on and expect to fix within the week.
Quick update: we have already solved the initialization issue and are in the process of improving how the
Hi @leonardo-albertovich, with version 2.0.9 I am also seeing a ton of "could not enqueue records into the ring buffer" errors, and fluent-bit doesn't recover from them. Is there any workaround? Is the bottleneck here the record_modifier filter plugin? Can you please advise? Here is the config: https://github.com/microsoft/Docker-Provider/blob/ci_prod/build/linux/installer/conf/fluent-bit-geneva.conf Here are the full logs:

[2023/07/14 14:18:48] [ info] [fluent bit] version=2.0.9, commit=, pid=174
The issue is how resume and pause are handled by input threads: flb_input_pause sends a signal to the input thread's event loop, while flb_input_resume is called from the main thread, which causes a race condition. The proper fix is to have flb_input_resume also send a signal to the input thread, so pause and resume don't happen out of order or stomp on each other. I will have a PR ready soon; I just want the fix to run in my env for a couple of hours to see if the race condition is truly fixed.
Bug Report
Describe the bug
Error Log:
fluent-bit config
template:
  metadata:
    annotations:
      fluentbit.io/parser: custom-nginx