
multiline parser stops working after Pod restarts #5256

Closed
irizzant opened this issue Apr 7, 2022 · 14 comments
Labels
waiting-for-user Waiting for more information, tests or requested changes

Comments

@irizzant

irizzant commented Apr 7, 2022

Bug Report

Describe the bug
I have Fluent Bit deployed in my Kubernetes cluster, configured to join Java stack traces using the multiline parser.

Here is the configuration I use:

[SERVICE]
    Daemon Off
    Flush 1
    Log_Level info
    Parsers_File parsers.conf
    Parsers_File custom_parsers.conf
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port 2020
    Health_Check On
    storage.path /tmp/flb-storage/
    storage.backlog.mem_limit 500M

[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Exclude_Path /var/log/containers/*fluent*.log
    multiline.parser cri
    Tag kube.*
    Mem_Buf_Limit 50MB
    Buffer_Chunk_Size 100MB
    Buffer_Max_Size 200MB
    Skip_Long_Lines On
    storage.type filesystem
[INPUT]
    Name systemd
    Tag host.*
    Systemd_Filter _SYSTEMD_UNIT=kubelet.service
    Read_From_Tail On

[FILTER]
    name                  multiline
    match                 kube.*
    buffer                on
    multiline.key_content log
    multiline.parser      multiline-java
    emitter_storage.type  filesystem
    emitter_mem_buf_limit 200MB
[FILTER]
    Name kubernetes
    Match kube.*
    Merge_Log On
    Keep_Log Off
    K8S-Logging.Parser On
    K8S-Logging.Exclude On
[FILTER]
    Name nest
    Match kube.*
    Operation lift
    Wildcard pod_name*
    Wildcard namespace_name*
    Wildcard host*
    Nested_under kubernetes
[FILTER]
    Name record_modifier
    Match kube.*
    Allowlist_key log
    Allowlist_key pod_name
    Allowlist_key namespace_name
    Allowlist_key host    

[OUTPUT]
    Name stdout
    Match *

Parser configuration:

[PARSER]
    Name docker_no_time
    Format json
    Time_Keep Off
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
[MULTILINE_PARSER]
    name          multiline-java
    type          regex
    flush_timeout 1000
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    #  - first state always has the name: start_state
    #  - every field in the rule must be inside double quotes
    #
    # rules |   state name  | regex pattern                  | next state
    # ------|---------------|--------------------------------------------
    rule      "start_state"   "/(\d{8}_\d{6}_\d{3})(.*)/"      "cont"
    rule      "cont"          "/^[^\d].*/"                      "cont"

When the Pod first starts, everything seems to be working fine:

[58] kube.var.log.containers.wms-5d8dfff84-94rkr_wms_wms-94af0bc11387e5980942ee6da5fe67f4d8c05a4d296e5a0b3a687b436299d6a4.log: [1649318345.423328754, {"log"=>"20220407_095905_418 ERROR #[[ServerService Thread Pool -- 622@srv=wms-5d8dfff84-94rkr]]# #[[it.sdb.jee.init.modules.ModuleRTC]]# Impossibile avviare integrazione con RTC: java.lang.RuntimeException: Unable to lookup service
        at it.sdb.apps.crmtrk.eng.rtc.RTCServiceFactories.getRtcConnection(RTCServiceFactories.java:33) [IntegrationService-RTC-API-7.0.1-SNAPSHOT.jar:]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_322]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_322]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_322]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_322]
... 

But if I restart the Pod, the multiline parser stops working:

[55] kube.var.log.containers.wms-859b94b47-vpbqm_wms_wms-1ae71c0dd10ed85802c7a3a18223b27c4b87db90ca46444e55d9fe7c9dc01cf3.log: [1649318612.231703657, {"log"=>"20220407_100332_226 ERROR #[[ServerService Thread Pool -- 288@srv=wms-859b94b47-vpbqm]]# #[[it.sdb.jee.init.modules.ModuleRTC]]# Impossibile avviare integrazione con RTC: java.lang.RuntimeException: Unable to lookup service", "pod_name"=>"wms-859b94b47-vpbqm", "namespace_name"=>"wms", "host"=>"k3d-test-server-0"}]
[56] kube.var.log.containers.wms-859b94b47-vpbqm_wms_wms-1ae71c0dd10ed85802c7a3a18223b27c4b87db90ca46444e55d9fe7c9dc01cf3.log: [1649318612.231735453, {"log"=>" at it.sdb.apps.crmtrk.eng.rtc.RTCServiceFactories.getRtcConnection(RTCServiceFactories.java:33) [IntegrationService-RTC-API-7.0.1-SNAPSHOT.jar:]", "pod_name"=>"wms-859b94b47-vpbqm", "namespace_name"=>"wms", "host"=>"k3d-test-server-0"}]
[57] kube.var.log.containers.wms-859b94b47-vpbqm_wms_wms-1ae71c0dd10ed85802c7a3a18223b27c4b87db90ca46444e55d9fe7c9dc01cf3.log: [1649318612.231739087, {"log"=>" at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_322]", "pod_name"=>"wms-859b94b47-vpbqm", "namespace_name"=>"wms", "host"=>"k3d-test-server-0"}]

To Reproduce

Steps to reproduce the problem:

  • deploy Fluent Bit with the Helm chart, using this values file: fluentbit.yaml

Expected behavior

The multiline parser should keep working after a Pod restart.

Your Environment

  • Version used: 1.8.14
  • Configuration: see above
  • Environment name and version: k3d v5.2.1, k3s v1.21.7-k3s1 (default)
@chenlingmin

chenlingmin commented Apr 8, 2022

#5245
I have the same problem.

You can also try scaling up the number of replicas of the Pod instead of restarting it; I expect you would get the same result as above.

@irizzant
Author

Scaling the replicas doesn't change the end result.

@trallnag

I have the same issue and I can reliably reproduce the problem.

  1. Pod x is running
  2. Fluent Bit is deployed
  3. Logs of pod x are merged
  4. Pod x is deleted
  5. The Deployment starts a new pod x
  6. Logs of pod x are no longer merged

Your Environment

  • Chart version 0.19.23
  • Image aws-for-fluent-bit version 2.23.3 (basically 1.8.15)

@lecaros
Contributor

lecaros commented Apr 22, 2022

Hi,
Do you have reproduction steps that we can simply run to replicate this?
Have you used the built-in java parser?
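
For reference, a minimal sketch of that variant (assuming the same tail input, tag, and log key as in your config; java is the name of the multiline parser that ships built in with Fluent Bit):

[FILTER]
    name                  multiline
    match                 kube.*
    multiline.key_content log
    multiline.parser      java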

@lecaros lecaros added waiting-for-user Waiting for more information, tests or requested changes and removed status: waiting-for-triage labels Apr 22, 2022
@irizzant
Author

@lecaros I've already added the detailed configuration and example inputs here; what else do you need as a reproducer?

@trallnag

@lecaros, yes, I have used the built-in java parser.

@dalcouffe

I have this exact same issue running Fluent Bit version 1.9.3.

@ehazan

ehazan commented May 8, 2022

Same issue for me on 1.9.3. Fluent Bit starts up fine, but when I scale the Deployment replicas to 0 and then back to 1, it breaks: the multiline parsing stops functioning.

@milen-simeonov

Same issue for me on 1.9.3 and 1.9.4.

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Sep 29, 2022
@irizzant
Author

irizzant commented Sep 29, 2022

Not stale!

@github-actions github-actions bot removed the Stale label Sep 30, 2022
@kwkgaya

kwkgaya commented Oct 7, 2022

Can you re-test with 1.9.6? It seems the issue might be fixed.

@milen-simeonov

Yes, I can confirm: the issue has been fixed since 1.9.6.

@irizzant
Author

irizzant commented Oct 7, 2022

Closing the issue since it's fixed in 1.9.6.

@irizzant irizzant closed this as completed Oct 7, 2022