fluent-bit keeps opening new fd for a file when a large number of files are watched #1843

Closed
wtan825 opened this issue Dec 27, 2019 · 7 comments

Comments


wtan825 commented Dec 27, 2019

Bug Report

Describe the bug
fluent-bit keeps opening new file descriptors for the same file when a large number of files are watched.

To Reproduce

  • Example log message if applicable:
100.109.195.91 - - [17/Feb/2017:00:08:11 +0800] "GET /data/upload/shop/common/loading.gif HTTP/1.0" 200 134 "http://www.mall121.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Trident/4.0; Windows NT 6.1; SLCC2 2.5.5231; .NET CLR 2.0.50727; .NET CLR 4.1.23457; .NET CLR 4.0.23457; Media Center PC 6.0; MS-WK 8)" "140.205.201.12" echo 2 times
100.109.195.78 - - [17/Feb/2017:00:08:11 +0800] "GET /shop/templates/default/images/u-safe.png HTTP/1.0" 404 3675 "http://www.mall121.com/" "Mozilla/4.0 (compatible; MSIE 8.0; Trident/4.0; Windows NT 6.1; SLCC2 2.5.5231; .NET CLR 2.0.50727; .NET CLR 4.1.23457; .NET CLR 4.0.23457; Media Center PC 6.0; MS-WK 8)" "140.205.201.12" echo 3 times
...
  • Steps to reproduce the problem:
    Tail a directory that contains about 10000 files, then send those logs to the es output plugin.

Expected behavior
fluent-bit should not keep opening new file descriptors for the same file, and memory usage should stay normal.

Screenshots

ps aux | grep fluent
root       148 55.6 34.2 39217020 2800348 ?    Sl   06:02  57:57 /fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf

ls /logtest/var/applog/ | wc -l
10928

cd /logtest/var/applog/
du * -sh

4.0K	log.8609
4.0K	log.861
4.0K	log.8610
4.0K	log.8611
4.0K	log.8612
4.0K	log.8613
4.0K	log.8614
4.0K	log.8615
4.0K	log.8616
4.0K	log.8617
4.0K	log.8618
4.0K	log.8619
4.0K	log.862
4.0K	log.8620
4.0K	log.8621
4.0K	log.8622
4.0K	log.8623
4.0K	log.8624
4.0K	log.8625
4.0K	log.8626
4.0K	log.8627
4.0K	log.8628
4.0K	log.8629
4.0K	log.863
4.0K	log.8630
4.0K	log.8631
4.0K	log.8632
4.0K	log.8633
4.0K	log.8634
...

ls -l /proc/148/fd |grep log.808 |sort -n
lr-x------ 1 root root 64 Dec 27 06:06 11665 -> /logtest/var/applog/log.808
lr-x------ 1 root root 64 Dec 27 06:06 11666 -> /logtest/var/applog/log.8080
lr-x------ 1 root root 64 Dec 27 06:06 11667 -> /logtest/var/applog/log.8081
lr-x------ 1 root root 64 Dec 27 06:06 11668 -> /logtest/var/applog/log.8082
lr-x------ 1 root root 64 Dec 27 06:06 11669 -> /logtest/var/applog/log.8083
lr-x------ 1 root root 64 Dec 27 06:06 11670 -> /logtest/var/applog/log.8084
lr-x------ 1 root root 64 Dec 27 06:06 11671 -> /logtest/var/applog/log.8085
lr-x------ 1 root root 64 Dec 27 06:06 11672 -> /logtest/var/applog/log.8086
lr-x------ 1 root root 64 Dec 27 06:06 11673 -> /logtest/var/applog/log.8087
lr-x------ 1 root root 64 Dec 27 06:06 11674 -> /logtest/var/applog/log.8088
lr-x------ 1 root root 64 Dec 27 06:06 11675 -> /logtest/var/applog/log.8089
lr-x------ 1 root root 64 Dec 27 06:06 14588 -> /logtest/var/applog/log.808
lr-x------ 1 root root 64 Dec 27 06:06 14589 -> /logtest/var/applog/log.8080
lr-x------ 1 root root 64 Dec 27 06:06 14590 -> /logtest/var/applog/log.8081
lr-x------ 1 root root 64 Dec 27 06:06 14591 -> /logtest/var/applog/log.8082
lr-x------ 1 root root 64 Dec 27 06:06 14592 -> /logtest/var/applog/log.8083
lr-x------ 1 root root 64 Dec 27 06:06 14593 -> /logtest/var/applog/log.8084
lr-x------ 1 root root 64 Dec 27 06:06 14594 -> /logtest/var/applog/log.8085
lr-x------ 1 root root 64 Dec 27 06:06 14595 -> /logtest/var/applog/log.8086
lr-x------ 1 root root 64 Dec 27 06:06 14596 -> /logtest/var/applog/log.8087
lr-x------ 1 root root 64 Dec 27 06:06 14597 -> /logtest/var/applog/log.8088
lr-x------ 1 root root 64 Dec 27 06:06 14598 -> /logtest/var/applog/log.8089
lr-x------ 1 root root 64 Dec 27 06:06 17514 -> /logtest/var/applog/log.808
lr-x------ 1 root root 64 Dec 27 06:06 17515 -> /logtest/var/applog/log.8080
lr-x------ 1 root root 64 Dec 27 06:06 17516 -> /logtest/var/applog/log.8081
lr-x------ 1 root root 64 Dec 27 06:06 17517 -> /logtest/var/applog/log.8082
lr-x------ 1 root root 64 Dec 27 06:06 17518 -> /logtest/var/applog/log.8083
lr-x------ 1 root root 64 Dec 27 06:06 17519 -> /logtest/var/applog/log.8084
lr-x------ 1 root root 64 Dec 27 06:06 17520 -> /logtest/var/applog/log.8085
lr-x------ 1 root root 64 Dec 27 06:06 17521 -> /logtest/var/applog/log.8086
lr-x------ 1 root root 64 Dec 27 06:06 17522 -> /logtest/var/applog/log.8087
lr-x------ 1 root root 64 Dec 27 06:06 17523 -> /logtest/var/applog/log.8088
lr-x------ 1 root root 64 Dec 27 06:06 17524 -> /logtest/var/applog/log.8089
lr-x------ 1 root root 64 Dec 27 06:06 20441 -> /logtest/var/applog/log.808
lr-x------ 1 root root 64 Dec 27 06:06 20442 -> /logtest/var/applog/log.8080
lr-x------ 1 root root 64 Dec 27 06:06 20443 -> /logtest/var/applog/log.8081
lr-x------ 1 root root 64 Dec 27 06:06 20444 -> /logtest/var/applog/log.8082
lr-x------ 1 root root 64 Dec 27 06:06 20445 -> /logtest/var/applog/log.8083
lr-x------ 1 root root 64 Dec 27 06:06 20446 -> /logtest/var/applog/log.8084
lr-x------ 1 root root 64 Dec 27 06:06 20447 -> /logtest/var/applog/log.8085
lr-x------ 1 root root 64 Dec 27 06:06 20448 -> /logtest/var/applog/log.8086
lr-x------ 1 root root 64 Dec 27 06:06 20449 -> /logtest/var/applog/log.8087
lr-x------ 1 root root 64 Dec 27 06:06 20450 -> /logtest/var/applog/log.8088
lr-x------ 1 root root 64 Dec 27 06:06 20451 -> /logtest/var/applog/log.8089
lr-x------ 1 root root 64 Dec 27 06:06 23365 -> /logtest/var/applog/log.808
...
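
The listing above shows the same path under several descriptor numbers. A minimal sketch for summarising that, assuming a Linux `/proc/<pid>/fd` directory (or any directory of symlinks); the function name `fds_per_path` is hypothetical and not part of fluent-bit:

```shell
#!/bin/sh
# Count how many descriptors in an fd directory resolve to each target path.
# Usage: fds_per_path /proc/<pid>/fd
fds_per_path() {
    fd_dir=$1
    for fd in "$fd_dir"/*; do
        # Each entry is a symlink to the opened file; skip entries that cannot be read.
        readlink "$fd" 2>/dev/null
    done | sort | uniq -c | sort -rn
}
```

For the process in this report, `fds_per_path /proc/148/fd | head` would list the paths with the most open descriptors first.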

memory usage:
about 3GB

[image: 1666056a384ec5b1daa9a1fa568ee2a1]

Your Environment

  • Version used:
    docker image: fluent-bit:0.14.9

  • Configuration:

apiVersion: v1
data:
  fluent-bit.conf: |-
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
    [INPUT]
        Name              tail
        Tag              tag
        Path              /logtest/var/applog/*
        Path_Key          filename
        DB                 /var/log/logtest/flb_app.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

    [OUTPUT]
        Name   es
        Match  *
        Host   elasticsearch-es1
        Port   9200
        Index moredir
kind: ConfigMap
metadata:
  creationTimestamp: 2019-12-26T11:01:38Z
  labels:
    deploymentName: lama-moredir-cjp73e66
  name: cnap-log-lama-moredir-cjp73e66-1577358098
  namespace: cnap-tanwei-test-prod-ywmfmqxx

  • Kubernetes version: 1.13

  • Server type and version:
  • Operating System and version: Linux version 3.10.0-1062.4.1.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ) #1 SMP Fri Oct 18 17:15:30 UTC 2019
Member

edsiper commented Dec 27, 2019

@wtan825

thanks for opening this ticket. There are two things that need to be addressed:

  1. Upgrade to Fluent Bit v1.3.5 and make sure to read the upgrade notes, since some minor configuration changes are required for your Kubernetes filter.
  2. The high file descriptor usage is expected. I understand this is not ideal, but if we don't monitor a file and it gets rotated, we might lose the chance to read its data. To address this, we are working on a hybrid mechanism based on inotify + stat to monitor files.

As a workaround please increase the number of file descriptors allowed for your container or environment.
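
Before raising the limit, it can help to see how close the process is to it. A rough sketch, assuming the Linux `/proc/<pid>/limits` layout in which the soft limit is the fourth field of the `Max open files` row; the function names are hypothetical:

```shell
#!/bin/sh
# Print the soft "Max open files" limit from a file in /proc/<pid>/limits format.
soft_nofile_limit() {
    awk '/^Max open files/ {print $4}' "$1"
}

# Print "<open descriptors> <soft limit>" for a process.
# Usage: fd_usage /proc/<pid>/fd /proc/<pid>/limits
fd_usage() {
    open_count=$(ls "$1" | wc -l | tr -d ' ')
    printf '%s %s\n' "$open_count" "$(soft_nofile_limit "$2")"
}
```

For example, `fd_usage /proc/148/fd /proc/148/limits` would print the two numbers side by side for the process in this report.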

Author

wtan825 commented Dec 31, 2019

@edsiper
Thanks for replying. It is reasonable to open a file descriptor for each monitored file. In my case, however, fluent-bit keeps opening new file descriptors for the same monitored file. Is that expected?
By the way, is it possible to stop watching an inactive file? When a directory is watched, some rotated files in it do not need to be watched.

Author

wtan825 commented Dec 31, 2019

[image: 7e8b1106ca7828f9acd261bc11745b85]

this image can address the problem.

Member

edsiper commented Jan 7, 2020

@wtan825

thanks for the info, 2 questions:

  1. Do you have a way to check whether the open file descriptors point to a rotated file? Can you check whether all open file descriptors belong to the same inode?

  2. what is lama ?
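
One possible way to run the inode check from question 1, sketched under the assumption of a Linux `/proc/<pid>/fd` directory; the function names `fd_inodes` and `distinct_inodes` are hypothetical:

```shell
#!/bin/sh
# Print "<fd> <inode>" for every descriptor in an fd directory whose link
# target is exactly the given path.
# Usage: fd_inodes /proc/<pid>/fd /logtest/var/applog/log.808
fd_inodes() {
    fd_dir=$1
    target=$2
    for fd in "$fd_dir"/*; do
        [ "$(readlink "$fd" 2>/dev/null)" = "$target" ] || continue
        # -L follows the link, so the inode is that of the file actually opened.
        inode=$(ls -Lid "$fd" | awk '{print $1}')
        printf '%s %s\n' "${fd##*/}" "$inode"
    done
}

# Count the distinct inodes behind those descriptors.
distinct_inodes() {
    fd_inodes "$1" "$2" | awk '{print $2}' | sort -u | wc -l | tr -d ' '
}
```

A result of 1 from `distinct_inodes /proc/148/fd /logtest/var/applog/log.808` would mean every descriptor for that path refers to the same file; a larger number would mean they do not.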

Author

wtan825 commented Jan 8, 2020

@edsiper

  1. The open file descriptors do not point to rotated files. The open file descriptors for the same monitored file do not belong to the same inode.
  2. lama is used to generate logs. We use fluent-bit as a sidecar to collect the logs.
    We use the following script to generate logs in lama:
#!/bin/sh
# Usage: ./echo.sh <last_file_index> <last_line_index>
# For each j in 0..$1, appends lines 0..$2 to /var/applog/log.$j, pausing 2 s between files.
j=0
while [ "$j" -le "$1" ]; do
    i=0
    while [ "$i" -le "$2" ]; do
        echo "100.109.195.78 - - [17/Feb/2017:00:08:11 +0800] \"GET /shop/templates/default/images/u-safe.png HTTP/1.0\" 404 3675 \"http://www.mall121.com/\" \"Mozilla/4.0 (compatible; MSIE 8.0; Trident/4.0; Windows NT 6.1; SLCC2 2.5.5231; .NET CLR 2.0.50727; .NET CLR 4.1.23457; .NET CLR 4.0.23457; Media Center PC 6.0; MS-WK 8)\" \"140.205.201.12\" echo $j file $i times" >> "/var/applog/log.$j"
        i=$(( i + 1 ))
    done
    #echo "$(date +%y%m%dT%H:%M:%S) INFO $i"
    sleep 2
    j=$(( j + 1 ))
done

./echo.sh 10000 1000

Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions bot added the Stale label Feb 16, 2022
Contributor

This issue was closed because it has been stalled for 5 days with no activity.
