Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

in_tail: create stream_id by file inode(#4190) #4197

Merged
merged 2 commits into from
Oct 26, 2021

Conversation

nokute78
Copy link
Collaborator

Fixes #4190

Currently, stream_id of multiline is created by filename.
It means stream_id of rotated file will be same id since the filename is same.

This patch is to create stream_id from inode string.
It will create unique stream_id even if the file is rotated.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

  • [N/A] Documentation required for this feature

Configuration

#4190 (comment)

a.conf

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    HTTP_Monitor Off
    HTTP_Port    2020
    storage.path              /tmp/fluent-bit-data/
    storage.sync              normal
    storage.checksum          off
    storage.max_chunks_up     128
    storage.backlog.mem_limit 512M
    storage.metrics           on

[INPUT]
    Name              tail
    Path              /tmp/logs/*.log
    multiline.parser  cri
    Tag               kube.*
    Refresh_Interval  1
    Mem_Buf_Limit     50MB
    Buffer_Chunk_Size 1MB
    Buffer_Max_Size   2MB
    Skip_Empty_Lines  On
    Skip_Long_Lines   On
    Rotate_Wait       5
    DB                /tmp/tail-containers-state.db
    DB.Sync           Normal
    storage.type      filesystem

[OUTPUT]
    Name stdout
    Match *

fluent-bit.logrotate:

/tmp/logs/*.log {
	rotate 5
	monthly
	rotate 12
	compress
	delaycompress
	missingok
	notifempty
	create 644 root root
}
  1. mkdir /tmp/logs
  2. echo "$(date -u +'%Y-%m-%dT%H:%M:%S.%NZ') stdout F testing..." >> /tmp/logs/x.log
  3. fluent-bit -c a.conf
  4. logrotate --force fluent-bit.logrotate
  5. echo "$(date -u +'%Y-%m-%dT%H:%M:%S.%NZ') stdout F testing..." >> /tmp/logs/x.log

Debug output

There is no stream_id error like this.
[2021/10/15 13:32:53] [error] [multiline] invalid stream_id 11907403896869813768, could not append content to multiline context

$ ../bin/fluent-bit -c a.conf 
Fluent Bit v1.9.0
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/10/17 12:25:58] [ info] [engine] started (pid=26733)
[2021/10/17 12:25:58] [ info] [storage] version=1.1.4, initializing...
[2021/10/17 12:25:58] [ info] [storage] root path '/tmp/fluent-bit-data/'
[2021/10/17 12:25:58] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/10/17 12:25:58] [ info] [storage] backlog input plugin: storage_backlog.1
[2021/10/17 12:25:58] [ info] [cmetrics] version=0.2.2
[2021/10/17 12:25:58] [ info] [input:tail:tail.0] multiline core started
[2021/10/17 12:25:58] [ info] [input:storage_backlog:storage_backlog.1] queue memory limit: 488.3M
[2021/10/17 12:25:58] [ info] [sp] stream processor started
[2021/10/17 12:25:58] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1703973 watch_fd=1 name=/tmp/logs/x.log
[0] kube.tmp.logs.x.log: [1634441162.278157692, {"time"=>"2021-10-17T03:26:02.278157692Z", "stream"=>"stdout", "_p"=>"F", "log"=>"testing..."}]
[2021/10/17 12:26:05] [ info] [input:tail:tail.0] inode=1703973 handle rotation(): /tmp/logs/x.log => /tmp/logs/x.log.1
[2021/10/17 12:26:05] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1703973 watch_fd=1
[2021/10/17 12:26:05] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1703973 watch_fd=2 name=/tmp/logs/x.log.1
[2021/10/17 12:26:05] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1703983 watch_fd=3 name=/tmp/logs/x.log
[0] kube.tmp.logs.x.log: [1634441170.868503898, {"time"=>"2021-10-17T03:26:10.868503898Z", "stream"=>"stdout", "_p"=>"F", "log"=>"testing..."}]
[0] kube.tmp.logs.x.log: [1634441172.198605785, {"time"=>"2021-10-17T03:26:12.198605785Z", "stream"=>"stdout", "_p"=>"F", "log"=>"testing..."}]
[2021/10/17 12:26:13] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1703973 watch_fd=2
^C[2021/10/17 12:26:13] [engine] caught signal (SIGINT)
[2021/10/17 12:26:13] [ info] [input] pausing tail.0
[2021/10/17 12:26:13] [ info] [input] pausing storage_backlog.1
[2021/10/17 12:26:13] [ warn] [engine] service will stop in 5 seconds
[2021/10/17 12:26:18] [ info] [engine] service stopped
[2021/10/17 12:26:18] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1703983 watch_fd=3

Valgrind output

$ valgrind --leak-check=full ../bin/fluent-bit -c a.conf 
==26743== Memcheck, a memory error detector
==26743== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==26743== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==26743== Command: ../bin/fluent-bit -c a.conf
==26743== 
Fluent Bit v1.9.0
* Copyright (C) 2019-2021 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/10/17 12:27:31] [ info] [engine] started (pid=26743)
[2021/10/17 12:27:31] [ info] [storage] version=1.1.4, initializing...
[2021/10/17 12:27:31] [ info] [storage] root path '/tmp/fluent-bit-data/'
[2021/10/17 12:27:31] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/10/17 12:27:31] [ info] [storage] backlog input plugin: storage_backlog.1
[2021/10/17 12:27:31] [ info] [cmetrics] version=0.2.2
[2021/10/17 12:27:32] [ info] [input:tail:tail.0] multiline core started
[2021/10/17 12:27:32] [ info] [input:storage_backlog:storage_backlog.1] queue memory limit: 488.3M
[2021/10/17 12:27:32] [ info] [sp] stream processor started
[2021/10/17 12:27:32] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1703983 watch_fd=1 name=/tmp/logs/x.log
==26743== Warning: client switching stacks?  SP change: 0x57e59c8 --> 0x4db4ac0
==26743==          to suppress, use: --max-stackframe=10686216 or greater
==26743== Warning: client switching stacks?  SP change: 0x4db4a38 --> 0x57e59c8
==26743==          to suppress, use: --max-stackframe=10686352 or greater
==26743== Warning: client switching stacks?  SP change: 0x57e59c8 --> 0x4db4a38
==26743==          to suppress, use: --max-stackframe=10686352 or greater
==26743==          further instances of this message will not be shown.
[0] kube.tmp.logs.x.log: [1634441254.498786373, {"time"=>"2021-10-17T03:27:34.498786373Z", "stream"=>"stdout", "_p"=>"F", "log"=>"testing..."}]
[2021/10/17 12:27:38] [ info] [input:tail:tail.0] inode=1703983 handle rotation(): /tmp/logs/x.log => /tmp/logs/x.log.1
[2021/10/17 12:27:38] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1703983 watch_fd=1
[2021/10/17 12:27:38] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1703983 watch_fd=2 name=/tmp/logs/x.log.1
[2021/10/17 12:27:38] [ info] [input:tail:tail.0] inotify_fs_add(): inode=1703973 watch_fd=3 name=/tmp/logs/x.log
[0] kube.tmp.logs.x.log: [1634441262.603936634, {"time"=>"2021-10-17T03:27:42.603936634Z", "stream"=>"stdout", "_p"=>"F", "log"=>"testing..."}]
[0] kube.tmp.logs.x.log: [1634441264.109924829, {"time"=>"2021-10-17T03:27:44.109924829Z", "stream"=>"stdout", "_p"=>"F", "log"=>"testing..."}]
^C[2021/10/17 12:27:46] [2021/10/17 12:27:46] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1703983 watch_fd=2
[engine] caught signal (SIGINT)
[2021/10/17 12:27:46] [ info] [input] pausing tail.0
[2021/10/17 12:27:46] [ info] [input] pausing storage_backlog.1
[2021/10/17 12:27:46] [ warn] [engine] service will stop in 5 seconds
[2021/10/17 12:27:51] [ info] [engine] service stopped
[2021/10/17 12:27:51] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=1703973 watch_fd=3
==26743== 
==26743== HEAP SUMMARY:
==26743==     in use at exit: 0 bytes in 0 blocks
==26743==   total heap usage: 1,782 allocs, 1,782 frees, 4,030,797 bytes allocated
==26743== 
==26743== All heap blocks were freed -- no leaks are possible
==26743== 
==26743== For lists of detected and suppressed errors, rerun with: -s
==26743== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Copy link

@ggermis ggermis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the referenced fluent-bit config file conf/multiline_rotation.conf is not checked in

ggermis
ggermis previously approved these changes Oct 17, 2021
Copy link

@ggermis ggermis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested these changes locally, and they indeed solve the issue we were having. (logs stopped getting processed and fluent-bit logs flooding with the message invalid stream_id x, could not append content to multiline context)

Awesome job @nokute78 !!

If stream_id is created by filename, rotated file id will be same.
It causes releasing new multiline instance after file rotation.

Signed-off-by: Takahiro Yamashita <[email protected]>
@nokute78
Copy link
Collaborator Author

@ggermis Thank you for comment and testing. I pushed missing config file.
Your information to reproduce the issue is very helpful for me.

@edsiper edsiper merged commit 4725fb2 into fluent:master Oct 26, 2021
@edsiper
Copy link
Member

edsiper commented Oct 26, 2021

thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport to v1.8.x Used to tag items that must be backported to such version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

invalid stream_id x, could not append content to multiline context
4 participants