Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiline filters stopped working in 1.9.3 #5524

Closed
alexku7 opened this issue Jun 1, 2022 · 13 comments
Closed

Multiline filters stopped working in 1.9.3 #5524

alexku7 opened this issue Jun 1, 2022 · 13 comments

Comments

@alexku7
Copy link

alexku7 commented Jun 1, 2022

Bug Report

Describe the bug
Hello
After upgrading to 1.9.3 multiline filter stopped working at all. The multi lines are split.
Reverting to 1.9.2 solved the problem. The config files are the same without changes.

What could be a problem?
Tried to play wioth Emitter_Mem_Buf_Limit and buffer on/off but without any positive result :(
The config files:

[SERVICE]
    Flush 1
    Daemon Off
    Log_Level info
    Parsers_File parsers.conf
    Parsers_File custom_parsers.conf
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port 2020
[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Parser docker
    Tag        kube.*
    Mem_Buf_Limit 30MB
    Buffer_Max_Size  10MB
    Read_from_head true
    DB               /var/run/flbkube5.db
    Skip_Long_Lines On
    Skip_Empty_Lines On
    Refresh_Interval 5
    Buffer_Chunk_Size 256k

[FILTER]
    name                  multiline
    match                 kube.*
    multiline.key_content log
    # Emitter_Mem_Buf_Limit 100M
    # buffer                 on
    multiline.parser      analyzer_multi_line
[FILTER]
    Name kubernetes
    Match kube.*
    Merge_Log Off
    Keep_Log On
    K8S-Logging.Parser On
    K8S-Logging.Exclude On
    Annotations Off
    Kube_Meta_Cache_TTL 120
    Labels Off

[OUTPUT]
    Name es
    Match kube.*
    Host             ${ELASTIC_HOST}
    Port             ${ELASTIC_PORT}
    HTTP_User        ${ELASTIC_USER}
    HTTP_Passwd      ${ELASTIC_PASSWORD}
    tls           On
    tls.verify       ${ELASTIC_VERIFY_TLS}
    Logstash_Format On
    logstash_prefix_key  $kubernetes['namespace_name']
    Retry_Limit False
    Generate_ID On
    Buffer_Size 100M

[MULTILINE_PARSER]
    name          analyzer_multi_line
    type          regex
    flush_timeout 1000
  
    # rules   |   state name   | regex pattern                 | next state name
    # --------|----------------|-----------------------------------------------------------------------
    rule         "start_state"   "/(\d{4}-\d{1,2}-\d{1,2}T\d{1,2}:\d{1,2}:\d{1,2}.\d{1,15}Z)(.*)/"  "cont"
    rule         "cont"          "/^\D+/"  "cont"
@shenningz
Copy link

We found the same problem while testing the fix for #4325 and eventually understood that this new issue does not depend on the rewrite_tag filter. We think it may be related to this change: 57ecb32

We think that with the above commit only one tag stream can be handled by a multiline filter section If more than one stream is processed by one multiline filter, then it breaks.
So a workaround could be to use multiple multiline filter sections with specific tags that handle only a single stream:

[INPUT]
    Name tail
    Path /var/log/containers/some_x_files*.log
    Parser docker
    Tag        kube.x

[INPUT]
    Name tail
    Path /var/log/containers/other_y_files*.log
    Parser docker
    Tag        kube.y

[FILTER]
    name                  multiline
    match                 kube.x
    multiline.key_content log
    multiline.parser      analyzer_multi_line

[FILTER]
    name                  multiline
    match                 kube.y
    multiline.key_content log
    multiline.parser      analyzer_multi_line

@alexku7
Copy link
Author

alexku7 commented Jun 1, 2022

We found the same problem while testing the fix for #4325 and eventually understood that this new issue does not depend on the rewrite_tag filter. We think it may be related to this change: 57ecb32

We think that with the above commit only one tag stream can be handled by a multiline filter section If more than one stream is processed by one multiline filter, then it breaks. So a workaround could be to use multiple multiline filter sections with specific tags that handle only a single stream:

[INPUT]
    Name tail
    Path /var/log/containers/some_x_files*.log
    Parser docker
    Tag        kube.x

[INPUT]
    Name tail
    Path /var/log/containers/other_y_files*.log
    Parser docker
    Tag        kube.y

[FILTER]
    name                  multiline
    match                 kube.x
    multiline.key_content log
    multiline.parser      analyzer_multi_line

[FILTER]
    name                  multiline
    match                 kube.y
    multiline.key_content log
    multiline.parser      analyzer_multi_line

Could be complicated when you have a cluster with many different pods

@nokute78
Copy link
Collaborator

nokute78 commented Jun 3, 2022

Thank you for reporting.
This issue can be reproduce using below logs and config.
I tested on current master

Output: java.2 is broken if buffer of filter_multiline is true.

$ ../bin/fluent-bit -c a.conf 
Fluent Bit v1.9.4
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/06/04 06:59:33] [ info] [fluent bit] version=1.9.4, commit=7b1c19a399, pid=5499
[2022/06/04 06:59:33] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/06/04 06:59:33] [ info] [cmetrics] version=0.3.1
[2022/06/04 06:59:33] [ info] [filter:multiline:multiline.0] created emitter: emitter_for_multiline.0
[2022/06/04 06:59:33] [ info] [sp] stream processor started
[2022/06/04 06:59:33] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.0_java.1
[2022/06/04 06:59:33] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.1_java.2
[2022/06/04 06:59:33] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1451131 watch_fd=1 name=java1.log
[2022/06/04 06:59:33] [ info] [input:tail:tail.1] inotify_fs_add(): inode=1448777 watch_fd=1 name=java2.log
[2022/06/04 06:59:33] [ info] [output:stdout:stdout.0] worker #0 started
[0] java.1: [1654293573.035478074, {"log"=>"Exception in thread "main" java.lang.IllegalStateException: ..null property
     at com.example.myproject.Author.getBookIds(xx.java:38)
     at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
Caused by: java.lang.NullPointerException
     at com.example.myproject.Book.getId(Book.java:22)
     at com.example.myproject.Author.getBookIds(Author.java:35)
     ... 1 more"}]
[1] java.1: [1654293573.035484346, {"log"=>"single line"}]
[0] java.2: [1654293573.035590517, {"log"=>"single line..."}]
[1] java.2: [1654293573.035591258, {"log"=>"Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!"}]
[2] java.2: [1654293573.035591629, {"log"=>"    at com.myproject.module.MyProject.badMethod(MyProject.java:22)"}]
[3] java.2: [1654293573.035591929, {"log"=>"    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)"}]
[4] java.2: [1654293573.035592200, {"log"=>"    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)"}]
[5] java.2: [1654293573.035592491, {"log"=>"    at com.myproject.module.MyProject.someMethod(MyProject.java:10)"}]
[6] java.2: [1654293573.035592741, {"log"=>"    at com.myproject.module.MyProject.main(MyProject.java:6)"}]
[7] java.2: [1654293573.035592971, {"log"=>"another line..."}]
^C[2022/06/04 06:59:34] [engine] caught signal (SIGINT)

java1.log:

Exception in thread "main" java.lang.IllegalStateException: ..null property
     at com.example.myproject.Author.getBookIds(xx.java:38)
     at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
Caused by: java.lang.NullPointerException
     at com.example.myproject.Book.getId(Book.java:22)
     at com.example.myproject.Author.getBookIds(Author.java:35)
     ... 1 more
single line

java2.log:

single line...
Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: Something has gone wrong, aborting!
    at com.myproject.module.MyProject.badMethod(MyProject.java:22)
    at com.myproject.module.MyProject.oneMoreMethod(MyProject.java:18)
    at com.myproject.module.MyProject.anotherMethod(MyProject.java:14)
    at com.myproject.module.MyProject.someMethod(MyProject.java:10)
    at com.myproject.module.MyProject.main(MyProject.java:6)
another line...

a.conf:

[SERVICE]
    Log_Level info

[INPUT]
    Name tail
    Read_From_Head on
    Tag java.1
    Path java1.log

[INPUT]
    Name tail
    Read_From_Head on
    Tag java.2
    Path java2.log

[FILTER]
    Name multiline
    Match java.*
    multiline.key_content log
    multiline.parser java
    buffer true
#    debug_flush true

[OUTPUT]
    Name stdout
    Match java.*

@nokute78
Copy link
Collaborator

nokute78 commented Jun 4, 2022

@PettitWesley @edsiper
I think we need to modify flb_ml_append_object.

The new buffer mode of filter_multiline will create stream_id from tag name.
If we use 2 tags, filter_multiline creates 2 stream_ids.

In my testing, the last recently used parser lru_parser seems not to be updated.
Then last_stream_id always points first stream_id that is created from java.1.
It causes broken java.2 logs.

@PettitWesley
Copy link
Contributor

@nokute78 agree, this seems like a bug. You or @edsiper will own the fix inside the multiline lib?

@afk42
Copy link

afk42 commented Jun 8, 2022

Hello,
Is there an estimate for this fix to be rolled out?

We are also having issues with the multiline using the tail input ( reading log files from docker), where some logs ( usually bigger than 16k ) aren't being successfully joined by fuentbit.

And we think this is not related with an specific language since we are testing with simple big strings and we're getting split lines on the output.

@nokute78
Copy link
Collaborator

nokute78 commented Jun 9, 2022

@PettitWesley No I don't.

@nokute78
Copy link
Collaborator

Below diff may fix the issue.

diff --git a/src/multiline/flb_ml.c b/src/multiline/flb_ml.c
index 5521f4f11..e9d235845 100644
--- a/src/multiline/flb_ml.c
+++ b/src/multiline/flb_ml.c
@@ -822,6 +822,7 @@ int flb_ml_append_object(struct flb_ml *ml, uint64_t stream_id,
             flb_ml_flush_parser_instance(ml,
                                          lru_parser,
                                          lru_parser->last_stream_id);
+            lru_parser = NULL;
         }
     }
 

nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
@nokute78
Copy link
Collaborator

I sent a patch #5564

nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 12, 2022
@nunosousa-dev
Copy link

Hello, is there a way to review this PR? We are facing this issue too and we are thinking of building from scratch including these changes. Thank you.

nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 17, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 17, 2022
nokute78 added a commit to nokute78/fluent-bit that referenced this issue Jun 17, 2022
@dschaaff
Copy link

Related issues #5452 #5323

edsiper pushed a commit that referenced this issue Jul 6, 2022
edsiper pushed a commit that referenced this issue Jul 6, 2022
@drbugfinder-work
Copy link
Contributor

drbugfinder-work commented Jul 7, 2022

@nokute78 I have tested your patch #5564 with v1.9.5 and it looks very promising. All our tests are green again 👍

@alexku7
Copy link
Author

alexku7 commented Aug 9, 2022

I can confirm that the issue is resolved in 1.9.6
Thank you all :)

@alexku7 alexku7 closed this as completed Aug 9, 2022
mgeriesa pushed a commit to mgeriesa/fluent-bit that referenced this issue Oct 25, 2022
mgeriesa pushed a commit to mgeriesa/fluent-bit that referenced this issue Oct 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants