Invalid JSON as a result of using the multiline feature of the tail input plugin #2394

Closed
iyuvalk opened this issue Jul 27, 2020 · 4 comments

iyuvalk commented Jul 27, 2020

Bug Report

Describe the bug
When Fluent Bit collects log events from log files and joins them with the multiline feature of the tail input plugin, the output JSON contains duplicate keys.

To Reproduce

  • Steps to reproduce the problem:
    Under the current folder, create the following four folders:
    • config
    • source
    • dest
    • taildb

Save the following content in a file named fluent-bit.conf under ./config:

[SERVICE]
    Flush           10
    Daemon          Off
    Log_Level       info
    HTTP_Server     On
    HTTP_Listen     0.0.0.0
    HTTP_Port       2020
    Parsers_File    fluent-bit-parsers.conf

[INPUT]
    Name             tail
    Path             /source/*.log
    Tag              kube.*
    Refresh_Interval 5
    Mem_Buf_Limit    5MB
#    Skip_Long_Lines  On
    DB               /tail-db/tail-containers-state.db
    DB.Sync          Normal
    Multiline        On
    Parser_Firstline stack_trace_start
    Parser_1         stack_trace_middle

[OUTPUT]
    Name          file
    Match         *
    Path          /dest
    File          output.log
    Format        plain

Save the following content in a file named fluent-bit-parsers.conf under ./config:

[PARSER]
    Name stack_trace_start
    Format regex
    Regex ^(?<log>Exception.*)$

[PARSER]
    Name stack_trace_middle
    Format regex
    Regex ^(?<log>[\s<]+.*)$
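
To see how these two parsers are meant to split the sample log, the following Python sketch applies the same patterns line by line. It is only an approximation: Fluent Bit uses its own regex engine, so the named group is rewritten here from `(?<log>...)` to Python's `(?P<log>...)`, and the `classify` helper is hypothetical, not part of Fluent Bit.

```python
import re

# The two patterns from fluent-bit-parsers.conf, with the named group
# rewritten in Python syntax: (?<log>...) becomes (?P<log>...).
STACK_TRACE_START = re.compile(r'^(?P<log>Exception.*)$')
STACK_TRACE_MIDDLE = re.compile(r'^(?P<log>[\s<]+.*)$')


def classify(line: str) -> str:
    """Hypothetical helper: report which parser, if any, matches a line."""
    if STACK_TRACE_START.match(line):
        return "start"
    if STACK_TRACE_MIDDLE.match(line):
        return "middle"
    return "none"


samples = [
    'Exception in thread "main" java.lang.Error: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">',
    "<html><head>",
    "\tat scala.App.main(App.scala:80)",
]
for line in samples:
    print(classify(line), repr(line))
```

Both patterns capture into a group named `log`, which is the same key that shows up repeatedly in the output reported in this issue.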

Save the following content in a file named sample-log.log under ./source:

Exception in thread "main" java.lang.Error: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /revisions/f1.data_153060 was not found on this server.</p>
</body></html>

	at com.acme.otx.f1.managers.HttpManager$.parseCsvRows(HttpManager.scala:82)
	at com.acme.otx.f1.managers.HttpManager$.revData(HttpManager.scala:49)
	at com.acme.otx.f1.OtxF1$.updateData(OtxF1.scala:34)
	at com.acme.otx.f1.OtxF1$.delayedEndpoint$com$acme$otx$f1$OtxF1$1(OtxF1.scala:13)
	at com.acme.otx.f1.OtxF1$delayedInit$body.apply(OtxF1.scala:10)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1$adapted(App.scala:80)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.App.main(App.scala:80)

Run the following command:

docker run -d \
  --mount type=bind,source="$(pwd)"/config,target=/fluent-bit/etc \
  --mount type=bind,source="$(pwd)"/source,target=/source \
  --mount type=bind,source="$(pwd)"/dest,target=/dest \
  --mount type=bind,source="$(pwd)"/taildb,target=/tail-db \
  fluent/fluent-bit:1.5.2

In the folder ./dest you will find a file named output.log with the following invalid JSON content (note the duplicate "log" keys):

{"log":"Exception in thread \"main\" java.lang.Error: <!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">","log":"<html><head>","log":"<title>404 Not Found</title>","log":"</head><body>","log":"<h1>Not Found</h1>","log":"<p>The requested URL /revisions/f1.data_153060 was not found on this server.</p>","log":"</body></html>","log":"\tat com.acme.otx.f1.managers.HttpManager$.parseCsvRows(HttpManager.scala:82)","log":"\tat com.acme.otx.f1.managers.HttpManager$.revData(HttpManager.scala:49)","log":"\tat com.acme.otx.f1.OtxF1$.updateData(OtxF1.scala:34)","log":"\tat com.acme.otx.f1.OtxF1$.delayedEndpoint$com$acme$otx$f1$OtxF1$1(OtxF1.scala:13)","log":"\tat com.acme.otx.f1.OtxF1$delayedInit$body.apply(OtxF1.scala:10)","log":"\tat scala.Function0.apply$mcV$sp(Function0.scala:39)","log":"\tat scala.Function0.apply$mcV$sp$(Function0.scala:39)","log":"\tat scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)","log":"\tat scala.App.$anonfun$main$1$adapted(App.scala:80)","log":"\tat scala.collection.immutable.List.foreach(List.scala:392)","log":"\tat scala.App.main(App.scala:80)"}
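
A quick way to confirm the duplication, and to see why it matters downstream, is to parse a record with Python's standard `json` module. The sketch below is not part of the original report: `find_duplicate_keys` is a hypothetical helper, and `record` is a shortened stand-in for the line above. It relies on the `object_pairs_hook` argument of `json.loads`, which receives every key/value pair before they are collapsed into a dict.

```python
import json
from collections import Counter


def find_duplicate_keys(text: str) -> dict:
    """Return {key: count} for keys repeated within an object of the document."""
    duplicates = {}

    def hook(pairs):
        # pairs holds every (key, value) of one object, duplicates included.
        counts = Counter(key for key, _ in pairs)
        duplicates.update({k: n for k, n in counts.items() if n > 1})
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return duplicates


# Shortened stand-in for the record above (three of the "log" entries).
record = '{"log":"Exception in thread","log":"<html><head>","log":"</body></html>"}'

print(find_duplicate_keys(record))  # {'log': 3}
print(json.loads(record))           # {'log': '</body></html>'}
```

The second call shows the practical problem: a default parse in Python keeps only the last value for a repeated key, so every line of the stack trace except the final one is silently lost.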

Expected behavior
Under no circumstances should Fluent Bit generate JSON that contains duplicate keys, right? (-:
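
The report does not spell out what the record should look like instead. One plausible shape, assumed here purely for illustration, is a single `log` key whose value is the joined lines. The sketch below builds that shape on the consumer side; it is a workaround for reading the existing output, not a fix in Fluent Bit, and `merge_duplicate_keys` and the newline separator are hypothetical choices.

```python
import json


def merge_duplicate_keys(text: str, sep: str = "\n") -> dict:
    """Parse one JSON object, joining the values of repeated keys with `sep`."""

    def hook(pairs):
        merged = {}
        for key, value in pairs:
            if key in merged:
                # Repeated key: append to the value collected so far.
                merged[key] = f"{merged[key]}{sep}{value}"
            else:
                merged[key] = value
        return merged

    return json.loads(text, object_pairs_hook=hook)


# Shortened stand-in for the reported record; values are assumed to be strings.
record = '{"log":"Exception in thread","log":"<html><head>","log":"\\tat scala.App.main(App.scala:80)"}'
print(json.dumps(merge_duplicate_keys(record)))
```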

Screenshots
N/A


iyuvalk commented Aug 9, 2020

Anyone???


github-actions bot commented May 5, 2021

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label May 5, 2021

This issue was closed because it has been stalled for 5 days with no activity.


edsiper commented Jul 20, 2021

Multiline Update

As part of Fluent Bit v1.8, we have released new core Multiline functionality. This major feature lets you configure new [MULTILINE_PARSER]s that support multiple formats and auto-detection, adds a new multiline mode to the Tail plugin, and, in v1.8.2 (to be released on July 20th, 2021), adds a new Multiline Filter.

For now, you can take a look at the following documentation resources:

Documentation pages now point to complete config examples that are available on our repository.

Thanks everyone for supporting this!


2 participants