Frequent crash when in_tail multiline on and file rotation #4177
Comments
@wubn2000 Can you open an issue here to track this one on the AWS side: https://github.com/aws/aws-for-fluent-bit We'll try to get some info on the root cause using the valgrind tool: https://www.valgrind.org/
Thanks Wesley, just opened one there.
I've been unable to reproduce this using the steps provided. I did get a crash on startup though, and it's telling me the regex is invalid:
Valgrind gives this:
I think this is a separate issue where an invalid parser causes it to crash. Changing the parser so that it no longer triggers the invalid-regex warning got rid of the crash.
Valgrind logs for crash: aws/aws-for-fluent-bit#255 (comment)
Bug Report
Describe the bug
When the multiline parser is turned on (the one I used is
[PARSER]
Name multiline_stats
Format regex
Regex /^(?\d{10,}.*)/
), Fluent Bit always crashes when the monitored log file is rotated (specifically, after the rotated file is deleted), with output like:
*** Error in `./fluent-bit': double free or corruption (out): 0x00007f66c2f65e80 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x7f66fea7e329]
./fluent-bit[0x4b73ea]
./fluent-bit(flb_tail_file_remove+0x13e)[0x4ba469]
./fluent-bit(flb_tail_file_purge+0x293)[0x4bc068]
./fluent-bit(flb_input_collector_fd+0x3b2)[0x4614bd]
./fluent-bit(flb_engine_start+0x707)[0x472660]
./fluent-bit[0x456cf5]
Without the multiline parser turned on, it does not crash. With it on, it always crashes when the rotated file is deleted.
To Reproduce
Here is one message, which contains a multiline field:
1633801021538,"customer2788",13,2788,1623239523,"java.lang.RuntimeException: Something has gone wrong, aborting!
at com.myproject.module.MyProject.badMethod(MyProject.java:22)
at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
at com.myproject.module.MyProject.someMethod(MyProject.java:10)
at com.myproject.module.MyProject.main(MyProject.java:6)","Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam. Ut enim ad minim ven."
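As a rough illustration of how a first-line pattern would split this input into records, the sketch below applies a "ten or more digits at the start of the line" pattern to a shortened copy of the sample message. It uses Python's `re` module, not the regex engine Fluent Bit uses, and the group name `log` is a hypothetical placeholder because the group name in the parser above is not shown. Treat it as an approximation of the grouping logic only.

```python
import re

# Assumption: the group name "log" is hypothetical. Python spells named
# groups as (?P<name>...); this is not Fluent Bit's own regex syntax.
FIRSTLINE = re.compile(r"^(?P<log>\d{10,}.*)")

# Shortened copy of the sample message (long text field truncated).
sample = [
    '1633801021538,"customer2788",13,2788,1623239523,"java.lang.RuntimeException: Something has gone wrong, aborting!',
    'at com.myproject.module.MyProject.badMethod(MyProject.java:22)',
    'at com.myproject.module.MyProject.main(MyProject.java:6)","Lorem ipsum dolor sit amet."',
]


def group_records(lines):
    """Start a new record on each first-line match; append other lines to the current record."""
    records = []
    for line in lines:
        if FIRSTLINE.match(line) or not records:
            records.append([line])
        else:
            records[-1].append(line)
    return records


# Two copies of the message, as if it had been appended twice.
records = group_records(sample + sample)
print(len(records))  # number of records found
```

Only the line that begins with the epoch-millisecond timestamp opens a new record; the stack-trace lines are folded into the record that precedes them.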
Fluent Bit is monitoring only stats-0.csv. I keep appending this message to stats-0.csv. When the file size reaches a threshold (for example 500k), I rotate it with "mv stats-0.csv stats-0-timestamp1.csv", then recreate an empty stats-0.csv and keep appending the message. After a while, the oldest rotated file (for example stats-0-timestamp1.csv) is deleted from the directory. The crash always happens when the old file stats-0-timestamp1.csv is deleted.
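The append, rotate, and delete cycle described above could be scripted roughly as follows. This is a minimal sketch, not the script used for the original report: the function name `append_with_rotation` and the parameters `max_bytes` and `keep` are hypothetical, and the nanosecond timestamp in the rotated file name stands in for whatever timestamp format was actually used.

```python
import os
import time


def append_with_rotation(directory, message, max_bytes=500_000, keep=3,
                         active="stats-0.csv"):
    """Append one message to the active file, then rotate and prune if needed.

    Returns the path of the rotated file if a rotation happened, else None.
    All names and defaults here are illustrative assumptions.
    """
    path = os.path.join(directory, active)
    with open(path, "a", encoding="utf-8") as f:
        f.write(message + "\n")

    if os.path.getsize(path) < max_bytes:
        return None

    # Rotate by rename, like `mv stats-0.csv stats-0-<timestamp>.csv`.
    rotated = os.path.join(directory, f"stats-0-{time.time_ns()}.csv")
    os.rename(path, rotated)
    # Recreate an empty active file so appending can continue.
    open(path, "w", encoding="utf-8").close()

    # Delete the oldest rotated files so that at most `keep` remain.
    prefix, suffix = "stats-0-", ".csv"
    old = sorted(
        (n for n in os.listdir(directory)
         if n.startswith(prefix) and n.endswith(suffix)),
        key=lambda n: int(n[len(prefix):-len(suffix)]),
    )
    excess = max(len(old) - keep, 0)
    for name in old[:excess]:
        os.remove(os.path.join(directory, name))
    return rotated
```

Calling this in a loop against the monitored directory (/data/test in the configuration below), with a short sleep between calls, should produce the same sequence of events: the active file grows, is renamed, and the oldest rotated file eventually disappears while Fluent Bit may still hold it open.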
Expected behavior
Fluent Bit should not crash.
Screenshots
Your Environment
[INPUT]
Name tail
Path /data/test/stats-0.csv
Path_Key file_name
Tag stats
Buffer_Chunk_Size 32KB
Buffer_Max_Size 128KB
Skip_Long_Lines On
Multiline On
Parser_Firstline multiline_stats
Multiline_Flush 5
Rotate_Wait 35
Refresh_Interval 1
Read_from_head true
Mem_Buf_Limit 500MB
storage.type filesystem
DB /data/tail-0.db
DB.locking true
[SERVICE]
Flush 1
Daemon off
Log_Level info
Parsers_File parser.example.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_PORT 2020
storage.path /data/storage
storage.backlog.mem_limit 500M
storage.max_chunks_up 200
@include inputs.example.conf
[OUTPUT]
Name stdout
Additional context