scheduler corruption on high number of retries #1956

Closed
edsiper opened this issue Feb 14, 2020 · 15 comments


edsiper commented Feb 14, 2020

Originally reported in #1950 by @rmacian.

[2020/02/14 15:07:30] [debug] [storage] [cio file] alloc_size from 36864 to 102400
[2020/02/14 15:07:30] [debug] [storage] [cio file] synced at: tail.4/1-1581692850.200312113.flb
[2020/02/14 15:07:30] [debug] [in_tail] file=/var/log/containers/purchase-event-manager-5-vmmb9_gvp_purchase-event-manager-02060e929d3e460da6ac351689cd2f419d6ba192cced026a458110b095d93812.log read=32689 lines=71
[2020/02/14 15:07:30] [debug] [task] destroy task=0x7f14075faee0 (task_id=33)
[2020/02/14 15:07:30] [debug] [storage] tail.0:1-1581692755.636380503.flb mapped OK
[2020/02/14 15:07:30] [debug] [task] created task=0x7f14075faee0 id=33 OK
[2020/02/14 15:07:30] [debug] [out_forward] request 143790 bytes to flush
[2020/02/14 15:07:30] [debug] [out_fw] 154 entries tag='kube.var.log.containers.message-manager-north-5-f4cvk_tid_message-manager-north-4f3d19b958f53e2dc6e59555ba887569d0153f63a399ef1ba8d80b61f873e316.log' tag_len=148
[2020/02/14 15:07:30] [debug] [task] created task=0x7f140743ef40 id=126 OK
[2020/02/14 15:07:30] [debug] [out_forward] request 1239 bytes to flush
[2020/02/14 15:07:30] [debug] [out_fw] 1 entries tag='kube.var.log.containers.epg-agent-wait-2-46fc8_gvp_epg-agent-wait-e41122f15d432269e0156784949f781883d92669b0c5592300be5123a6060544.log' tag_len=134
[2020/02/14 15:07:30] [ warn] [task] retry for task 7 could not be re-scheduled
[2020/02/14 15:07:30] [debug] [retry] task retry=0x7f140740bae0, invalidated from the scheduler
[2020/02/14 15:07:30] [debug] [task] destroy task=0x7f14074400c0 (task_id=7)
[engine] caught signal (SIGSEGV)
#0  0x5653af67bebb      in  __mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:87
#1  0x5653af67bef2      in  mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:93
#2  0x5653af67c779      in  flb_sched_request_destroy() at src/flb_scheduler.c:314
#3  0x5653af67c8c7      in  flb_sched_event_handler() at src/flb_scheduler.c:375
#4  0x5653af67a17a      in  flb_engine_start() at src/flb_engine.c:548
#5  0x5653af5e8cc6      in  main() at src/fluent-bit.c:854
#6  0x7f14088b72e0      in  ???() at ???:0
#7  0x5653af5e71a9      in  ???() at ???:0
#8  0xffffffffffffffff  in  ???() at ???:0
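
For context on frame #0: __mk_list_del() is the unlink step of the mk_core doubly linked list, called here from flb_sched_request_destroy(). One plausible reading of the log sequence above (the retry is invalidated from the scheduler, then the task is destroyed) is that the scheduler request node gets unlinked once when the retry is invalidated and then unlinked again on destroy. The following self-contained sketch only approximates the mk_core list helpers (it is not the exact Fluent Bit source) and shows why removing an already-removed node dereferences NULL:

    /* Illustrative approximation of the mk_core doubly linked list
     * (lib/monkey/include/monkey/mk_core/mk_list.h); not the exact source. */
    #include <stddef.h>

    struct mk_list {
        struct mk_list *prev;
        struct mk_list *next;
    };

    static void mk_list_init(struct mk_list *list)
    {
        list->prev = list;
        list->next = list;
    }

    static void mk_list_add(struct mk_list *entry, struct mk_list *head)
    {
        entry->prev = head->prev;
        entry->next = head;
        head->prev->next = entry;
        head->prev = entry;
    }

    static void mk_list_del(struct mk_list *entry)
    {
        entry->prev->next = entry->next;   /* SIGSEGV here when prev is NULL */
        entry->next->prev = entry->prev;
        entry->prev = NULL;                /* poison the node after unlink   */
        entry->next = NULL;
    }

    int main(void)
    {
        struct mk_list head, node;

        mk_list_init(&head);
        mk_list_add(&node, &head);

        mk_list_del(&node);   /* first removal: fine, node is poisoned     */
        mk_list_del(&node);   /* second removal: NULL dereference, SIGSEGV */
        return 0;
    }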

edsiper commented Feb 14, 2020

@rmacian

Please provide a full copy of your configuration, so I can try to reproduce the problem.

edsiper added the waiting-for-user label Feb 14, 2020

rmacian commented Feb 15, 2020

I am using Ansible to deploy; that's why you will see a loop that defines an input for each namespace.

fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     ${FLB_LOG_LVL}
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020
        Streams_File stream_processor.conf
        storage.path              /var/lib/fluentbit
        storage.sync              normal
        storage.checksum          off
        storage.backlog.mem_limit 8M
    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-stdout.conf
input-kubernetes.conf: |
{% for namespace in namespaces %}
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*_{{namespace}}_*.log
        Parser            docker
        DB                /var/log/flb_kube_{{namespace}}.db
        Mem_Buf_Limit     8MB
        storage.type      filesystem
        Skip_Long_Lines   On
        Refresh_Interval  10
{% endfor %}
output-stdout.conf: |
    [OUTPUT]
        Name          forward
        Match         kube.*
        Upstream      upstream.conf
filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
parsers.conf: |
    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
        # Command      |  Decoder | Field | Optional Action
        # =============|==================|=================
        # Decode_Field_As   escaped    log

I noticed that the CPU on the node that crashed spiked up to the Kubernetes limit of 800 millicores, while the other 3 pods were around 20 mc. I checked the fluentd from OpenShift on the same node and it was only taking 60 mc. The difference is that my Fluent Bit uses the tail input, while fluentd on OpenShift uses fluent-plugin-systemd.

Is it more efficient to get the logs directly from the journal instead of the filesystem? I cannot see how to get the pods' output from any unit in systemctl.


edsiper commented Feb 15, 2020

What if you try the Fluent Bit systemd input plugin?
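
For reference, a minimal sketch of what a systemd input section could look like. This is only illustrative and not taken from your setup: the tag, the journal Path, and the Systemd_Filter value are assumptions that depend on how the host journal is mounted into the pod:

    [INPUT]
        Name            systemd
        Tag             host.*
        # Assumed journal location on the host/pod
        Path            /run/log/journal
        # Example filter; adjust to the unit(s) that carry your container logs
        Systemd_Filter  _SYSTEMD_UNIT=docker.service
        Read_From_Tail  On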


rmacian commented Feb 17, 2020

I can't find where the logs from my containers are exposed within systemd on OpenShift.

I thought perhaps the error was because /var/lib was not properly mounted, but the crash has happened again:

[2020/02/17 15:55:35] [debug] [out_forward] request 97351 bytes to flush
[2020/02/17 15:55:35] [debug] [out_fw] 80 entries tag='kube.var.log.containers.alexa-2-f7jng_gvc_alexa-a5eda341f781e92acd64a3f6b5004b7e5c754d7e937f9c3e924fbe427dc08c16.log' tag_len=116
[2020/02/17 15:55:35] [ warn] [task] retry for task 14 could not be re-scheduled
[2020/02/17 15:55:35] [debug] [retry] task retry=0x7faf04e0bc20, invalidated from the scheduler
[2020/02/17 15:55:35] [debug] [task] destroy task=0x7faf03deba00 (task_id=14)
[engine] caught signal (SIGSEGV)
#0  0x557f455bfebb      in  __mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:87
#1  0x557f455bfef2      in  mk_list_del() at lib/monkey/include/monkey/mk_core/mk_list.h:93
#2  0x557f455c0779      in  flb_sched_request_destroy() at src/flb_scheduler.c:314
#3  0x557f455c08c7      in  flb_sched_event_handler() at src/flb_scheduler.c:375
#4  0x557f455be17a      in  flb_engine_start() at src/flb_engine.c:548
#5  0x557f4552ccc6      in  main() at src/fluent-bit.c:854
#6  0x7faf063772e0      in  ???() at ???:0
#7  0x557f4552b1a9      in  ???() at ???:0
#8  0xffffffffffffffff  in  ???() at ???:0

It is now 11:27 AM.

I also see:

du -hs *
160M    tail.0
0       tail.1
0       tail.2
478M    tail.3
3.1G    tail.4
0       tail.5

It looks like a performance issue; I cannot send as fast as I read.


rmacian commented Feb 19, 2020

Things are getting worse.

root@vepboanvllbo002:/var/lib/fluentbit # du -hs *
668M tail.0
0 tail.1
0 tail.2
503M tail.3
5.7G tail.4
0 tail.5

and I am also seeing errors like:

[2020/02/19 05:46:38] [error] [out_fw] error writing content body
[2020/02/19 07:57:24] [debug] [out_forward] request 3899746 bytes to flush
[2020/02/19 07:57:24] [debug] [out_fw] 4545 entries tag='kube.var.log.containers.gal-connector-29-zsm8g_tit_gal-connector-de8c4177657b67334b677421d9b34b45bbe41eeff126130212359d4e2b564760.log' tag_len=133
[2020/02/19 08:18:43] [debug] [out_forward] request 3973685 bytes to flush
[2020/02/19 08:18:43] [debug] [out_fw] 4661 entries tag='kube.var.log.containers.gal-connector-29-zsm8g_tid_gal-connector-de8c4177657b67334b677421d9b34b45bbe41eeff126130212359d4e2b564760.log' tag_len=133
[2020/02/19 09:12:40] [debug] [out_forward] request 3976173 bytes to flush
[2020/02/19 09:12:40] [debug] [out_fw] 4665 entries tag='kube.var.log.containers.gal-connector-29-zsm8g_tid_gal-connector-de8c4177657b67334b677421d9b34b45bbe41eeff126130212359d4e2b564760.log' tag_len=133
[2020/02/19 09:14:00] [error] [out_fw] error writing content body
[2020/02/19 09:14:00] [error] [out_fw] error writing content body
[2020/02/19 09:15:00] [debug] [out_forward] request 3986994 bytes to flush

I set the image to 1.3.7-debug and set the log level to trace, and it crashes:

[2020/02/19 10:14:43] [debug] [storage] [cio stream] new stream registered: tail.1
[2020/02/19 10:14:43] [debug] [storage] [cio scan] opening stream tail.1
[2020/02/19 10:14:43] [debug] [storage] [cio stream] new stream registered: tail.2
[2020/02/19 10:14:43] [debug] [storage] [cio scan] opening stream tail.2
[2020/02/19 10:14:43] [debug] [storage] [cio stream] new stream registered: tail.3
[2020/02/19 10:14:43] [debug] [storage] [cio scan] opening stream tail.3
[2020/02/19 10:14:43] [debug] [storage] [cio stream] new stream registered: tail.4
[2020/02/19 10:14:43] [debug] [storage] [cio scan] opening stream tail.4

[engine] caught signal (SIGTERM)
[engine] caught signal (SIGSEGV)
[2020/02/19 10:15:08] [ info] [input] pausing tail.0
#0 0x56324dd96b59 in in_tail_pause() at plugins/in_tail/tail.c:329
#1 0x56324dd6627a in flb_input_pause_all() at src/flb_input.c:696
#2 0x56324dd6f653 in flb_engine_exit() at src/flb_engine.c:628
#3 0x56324dcdce0d in flb_signal_handler() at src/fluent-bit.c:242
#4 0x7f30a0ff405f in ???() at ???:0
#5 0x56324dfd2108 in cio_file_is_up() at lib/chunkio/src/cio_file.c:1063
#6 0x56324dfd0cfd in open_and_up() at lib/chunkio/src/cio_file.c:431
#7 0x56324dfd0f8f in cio_file_open() at lib/chunkio/src/cio_file.c:509
#8 0x56324dfceba8 in cio_chunk_open() at lib/chunkio/src/cio_chunk.c:76
#9 0x56324dfd296c in cio_scan_stream_files() at lib/chunkio/src/cio_scan.c:77
#10 0x56324dfd2a99 in cio_scan_streams() at lib/chunkio/src/cio_scan.c:115
#11 0x56324dfcfee2 in cio_load() at lib/chunkio/src/chunkio.c:113
#12 0x56324dd72c38 in flb_storage_create() at src/flb_storage.c:240
#13 0x56324dd6edca in flb_engine_start() at src/flb_engine.c:405
#14 0x56324dcddfa6 in main() at src/fluent-bit.c:854
#15 0x7f30a0fe12e0 in ???() at ???:0
#16 0x56324dcdc489 in ???() at ???:0
#17 0xffffffffffffffff in ???() at ???:0


edsiper commented Feb 19, 2020

There are two errors:

  1. Your Fluent Bit is hitting network issues while writing content to the output destination; you can see that in the error message "[out_fw] error writing content body". Since that keeps failing, your filesystem queue fills up quickly, which is expected. You have to troubleshoot why the remote endpoint is dropping the connection, or, if you have a load balancer in front, check what is going on there (see the sketch right after this list for one way to cap the buffer in the meantime).

  2. The second problem is clearly a bug in Fluent Bit that is only triggered under the conditions of problem 1 above. I am investigating the code to see how it can be fixed.
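
On point 1, while the network side is being sorted out, here is a sketch of one way to put a ceiling on how much the filesystem buffer can grow for this output. It assumes the storage.total_limit_size output property is available in the version you are running (check the Buffering & Storage docs for your release); the 2G value is only a placeholder:

    [OUTPUT]
        Name                      forward
        Match                     kube.*
        Upstream                  upstream.conf
        # Assumed option: caps total chunk storage for this output;
        # oldest chunks are discarded once the limit is reached
        storage.total_limit_size  2G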

edsiper removed the waiting-for-user label Feb 19, 2020

rmacian commented Feb 19, 2020

Last week I had to open an issue with Red Hat because the fluentd queues in OpenShift were stopped by the error "invalid byte sequence in UTF-8" and were not sending logs to Elasticsearch at all. We have an app which causes this error and we had to set ENABLE_UTF8_FILTER in fluentd:

https://bugzilla.redhat.com/show_bug.cgi?id=1562004

Could this be affecting Fluent Bit too?


edsiper commented Feb 20, 2020

@rmacian

Fluent Bit v1.3.8 has been officially released; please upgrade and send us some feedback:

https://fluentbit.io/announcements/v1.3.8/

Docker image: fluent/fluent-bit:1.3.8

edsiper added the waiting-for-user label Feb 20, 2020

rmacian commented Feb 20, 2020

I was on 1.3.8-next 1 in one of the pods yesterday.

  1. This morning the last log was from an hour ago. I saw lots of
[2020/02/20 06:53:50] [error] [out_fw] error writing content body
[2020/02/20 07:15:22] [trace] [out_fw] ended write()=81743 bytes
[2020/02/20 07:15:22] [error] [out_fw] error writing content body

and finally a

[2020/02/20 07:15:22] [error] [out_fw] no upstream connections available

Before upgrading to 1.3.8 I checked that the upstreams were available, so it looks like it was not trying to reconnect.

After upgrading to 1.3.8 it started to send logs; after a while it stopped again with some "error writing content body" messages, and finally I stopped it when I got:

/var/lib/fluentbit # du -hs *
46M     tail.0
0       tail.1
0       tail.2
163M    tail.3
1002M   tail.4
0       tail.5

After a restart it looks like it is checking something (it does not crash as before), but after 30 minutes the pod is still not ready and these are the last lines I see:

424f167a4e38c93e3b7c36.log promote to TAIL_EVENT
[2020/02/20 08:23:29] [debug] [storage] tail.3:1-1582187009.235291050.flb mapped OK
[2020/02/20 08:23:29] [debug] [filter_kube] API Server (ns=orquestador-dev, pod=p3x-redis-ui-15-q9fpz) http_do=0, HTTP Status: 200
[2020/02/20 08:23:29] [debug] [in_tail] file=/var/log/containers/p3x-redis-ui-15-q9fpz_orquestador-dev_p3x-redis-ui-97c079ac843e403012e1a251884ef73f8921b71e7488d33d0e5532f98956b630.log read=174 lines=1
[2020/02/20 08:23:29] [debug] [storage] tail.3:1-1582187009.249927116.flb mapped OK
[2020/02/20 08:23:29] [debug] [filter_kube] API Server (ns=orquestador-dev, pod=redis-probe-test-34-f8qz7) http_do=0, HTTP Status: 200
[2020/02/20 08:23:29] [debug] [in_tail] file=/var/log/containers/redis-probe-test-34-f8qz7_orquestador-dev_redis-probe-test-79b1b04b0e7e5c223cf13a04ba3e1c3665540b3a85486b3b9ce85162cc7a3142.log read=3976 lines=26
[2020/02/20 08:23:29] [debug] [storage] tail.4:1-1582187009.265055451.flb mapped OK
[2020/02/20 08:23:29] [debug] [in_tail] file=/var/log/containers/redis-probe-test-34-f8qz7_orquestador-dev_redis-probe-test-79b1b04b0e7e5c223cf13a04ba3e1c3665540b3a85486b3b9ce85162cc7a3142.log read=3976 lines=26

The next line was written after 32 minutes:

[2020/02/20 08:55:48] [debug] [filter_kube] API Server (ns=gvp, pod=athena-29-hwxg2) http_do=0, HTTP Status: 200

and now it is sending some logs. What has it been doing during these 32 minutes?


edsiper commented Feb 20, 2020

32 minutes is a lot of time, but I can see you are facing network issues: while writing data, the socket gets disconnected from the remote endpoint.

Would you please try the new keepalive option in your forward output section, e.g.:

    [OUTPUT]
        Name       forward
        Match      kube.*
        Host       abc
        Port       abc
        Keepalive  on

Note that I've removed your Upstream entry.


rmacian commented Feb 20, 2020

I've enabled the keepalive option and restarted everything. What I see is that the more files there are on disk, the longer the containers take to initialize. Note this:

/var/lib/fluentbit # du -hs *
12M     tail.0
0       tail.1
0       tail.2
24M     tail.3
38M     tail.4
0       tail.5

There is a delay of 30 seconds between reading the storage and beginning to process logs:

[2020/02/20 15:42:17] [debug] [storage] tail.4:1-1582213337.803295972.flb mapped OK
[2020/02/20 15:42:47] [debug] [filter_kube] API Server (ns=gvp, pod=athena-29-7j642) http_do=0, HTTP Status: 200

and the first flush comes 1 second after:

[2020/02/20 15:42:48] [debug] [out_forward] request 30634 bytes to flush

Now check this other pod:

# du -hs *
184M    tail.0
0       tail.1
0       tail.2
29M     tail.3
57M     tail.4
0       tail.5

There is a 22-minute delay between reading the storage and processing the first log:

[2020/02/20 15:42:22] [debug] [storage] tail.4:1-1582213342.431496854.flb mapped OK
[2020/02/20 15:54:29] [debug] [filter_kube] API Server (ns=gvp, pod=athena-29-hwxg2) http_do=0, HTTP Status: 200

but when it begins to process, it adds more files, so there is more waiting and no flush going on. Why is that?

[2020/02/20 15:42:22] [debug] [in_tail] file=/var/log/containers/go-bff-43-2w2fn_orquestador-dev_go-bff-2091678d1facaa3940219daf4981144372b33d109e5a7d0f28ac45a61e7f3b36.log read=32767 lines=6
[2020/02/20 15:42:22] [debug] [in_tail] file=/var/log/containers/orquestador-redis-3-z4gl7_orquestador-dev_redis-f593666cdde7704cce7c1a62ffb78c36f30bba12a7424f167a4e38c93e3b7c36.log promote to TAIL_EVENT
[2020/02/20 15:42:22] [debug] [in_tail] file=/var/log/containers/p3x-redis-ui-15-q9fpz_orquestador-dev_p3x-redis-ui-97c079ac843e403012e1a251884ef73f8921b71e7488d33d0e5532f98956b630.log promote to TAIL_EVENT
[2020/02/20 15:42:22] [debug] [storage] tail.3:1-1582213342.415953266.flb mapped OK
[2020/02/20 15:42:22] [debug] [filter_kube] API Server (ns=orquestador-dev, pod=redis-probe-test-34-f8qz7) http_do=0, HTTP Status: 200
[2020/02/20 15:42:22] [debug] [in_tail] file=/var/log/containers/redis-probe-test-34-f8qz7_orquestador-dev_redis-probe-test-79b1b04b0e7e5c223cf13a04ba3e1c3665540b3a85486b3b9ce85162cc7a3142.log read=610 lines=4
[2020/02/20 15:42:22] [debug] [storage] tail.4:1-1582213342.431496854.flb mapped OK
[2020/02/20 15:54:29] [debug] [filter_kube] API Server (ns=gvp, pod=athena-29-hwxg2) http_do=0, HTTP Status: 200
[2020/02/20 15:54:29] [debug] [in_tail] file=/var/log/containers/athena-29-hwxg2_gvp_athena-0c0de3a7282bed3254deb1fcff5efc06312a6f3688fc362161a8339340b06260.log read=32767 lines=45
[2020/02/20 15:54:29] [debug] [storage] tail.4:1-1582214069.651781785.flb mapped OK
[2020/02/20 15:54:29] [debug] [filter_kube] API Server (ns=gvp, pod=epg-agent-default-30-s54fc) http_do=0, HTTP Status: 200
[2020/02/20 15:54:29] [debug] [in_tail] file=/var/log/containers/epg-agent-default-30-s54fc_gvp_epg-agent-default-14258f4ab67bd71f223b10751f872d599d58cc7ed033f25136aa3d617c69b060.log read=14831 lines=56
[2020/02/20 15:54:29] [debug] [in_tail] file=/var/log/containers/epg-agent-default-30-s54fc_gvp_epg-agent-default-cb071ed5468d2a4403217e498f7b2b4f6e51410e600c1742ebd62f2dcc118222.log promote to TAIL_EVENT
[2020/02/20 15:54:29] [debug] [storage] tail.4:1-1582214069.668116248.flb mapped OK
[2020/02/20 15:54:29] [debug] [filter_kube] API Server (ns=gvp, pod=epg-agent-import-15-cngjk) http_do=0, HTTP Status: 200
[2020/02/20 15:54:29] [debug] [in_tail] file=/var/log/containers/epg-agent-import-15-cngjk_gvp_epg-agent-import-3468bc0dfe5fd05653b193a3341cfe2c56cd77516de15e09f804bc3519307251.log read=14298 lines=54
[2020/02/20 15:54:29] [debug] [in_tail] file=/var/log/containers/epg-agent-import-15-cngjk_gvp_epg-agent-import-42383eada102aa7d5fe56ca3c6c92f1654a4ada5f2bc6067049067d80d111c4c.log promote to TAIL_EVENT
[2020/02/20 15:54:29] [debug] [storage] tail.4:1-1582214069.683940243.flb mapped OK

Now it is stuck here.

Here is the config of the td-agent listening on the other side:

<buffer>
@type memory
flush_thread_count 16
flush_mode interval
flush_interval 1s
#chunk_limit_size 64MB
chunk_limit_records 10000
</buffer>


edsiper commented Feb 20, 2020

What's the average memory usage of the Fluent Bit pod?


rmacian commented Feb 20, 2020

oc adm top pods
NAME                   CPU(cores)   MEMORY(bytes)
fluent-bit-c9wt2       1m           2Mi
fluent-bit-mb252       919m         25Mi
fluent-bit-s6lvr       0m           3Mi
fluent-bit-zn9hw       985m         4Mi
td-agent-bit-6-qzg9m   0m           10Mi


rmacian commented Feb 24, 2020

No crashes once the network issues were fixed, but I am still getting pauses of 1 hour and high CPU usage.


edsiper commented Feb 24, 2020

I had a Zoom meeting with Raul; this is no longer an issue. It was fixed after upgrading to v1.3.8.

edsiper added the bug and fixed labels and removed the troubleshooting and waiting-for-user labels Feb 24, 2020
edsiper closed this as completed Feb 24, 2020