
Fluent Bit crashes for tail & opentelemetry with logs_uri #6457

Closed

wtobis opened this issue Nov 25, 2022 · 8 comments


wtobis commented Nov 25, 2022

Bug Report

Describe the bug
Fluent Bit crashes with the following error

[2022/11/10 20:02:37] [ info] [input:tail:tail.0] inotify_fs_add(): inode=2533274793273371 watch_fd=1 name=/var/log/mylogs.log
[2022/11/10 20:02:38] [engine] caught signal (SIGSEGV)
#0  0x55b173e87101      in  mk_list_size() at lib/monkey/include/monkey/mk_core/mk_list.h:165
#1  0x55b173e8b368      in  handle_output_event() at src/flb_engine.c:289
#2  0x55b173e8cb85      in  flb_engine_start() at src/flb_engine.c:971
#3  0x55b173e34ab4      in  flb_lib_worker() at src/flb_lib.c:629
#4  0x7fa2ca09dea6      in  ???() at ???:0
#5  0x7fa2c996da2e      in  ???() at ???:0
#6  0xffffffffffffffff  in  ???() at ???:0

when using the tail input and opentelemetry output (with logs_uri) plugins.

Important: with other input plugins (e.g. dummy) it works.
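
For context, the faulting frame, mk_list_size(), is Monkey's circular doubly linked list walk. A minimal sketch of what it does (paraphrased from lib/monkey/include/monkey/mk_core/mk_list.h; the exact upstream code may differ slightly):

/* Circular doubly linked list node, as used throughout Monkey/Fluent Bit. */
struct mk_list {
    struct mk_list *prev;
    struct mk_list *next;
};

/* Walks next pointers until the iterator comes back around to the head.
 * If the list head passed in by handle_output_event() points at freed or
 * uninitialized memory, dereferencing it->next faults, which matches the
 * SIGSEGV in frame #0 above. */
static inline int mk_list_size(struct mk_list *list)
{
    int size = 0;
    struct mk_list *it;

    for (it = list->next; it != list; it = it->next) {
        size++;
    }
    return size;
}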

To Reproduce

  • Start Fluent Bit with the provided configuration (see the example command below)
  • Move a test.log file with some log lines into the /var/log/ directory
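
For reference, one way to run the reproduction with the official image (a sketch: the host paths are illustrative; the in-container config path is the image's default):

docker run --rm \
    -v "$(pwd)/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf" \
    -v "$(pwd)/logs:/var/log" \
    fluent/fluent-bit:2.0.4

Copying test.log into the mounted logs directory should then trigger the crash within the 10-second Refresh_Interval.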

Screenshots
The complete log output from startup to the failure:

[2022/11/10 20:02:24] [ info] [fluent bit] version=2.0.4, commit=abb65a1f31, pid=1
[2022/11/10 20:02:24] [ info] [storage] ver=1.3.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2022/11/10 20:02:24] [ info] [cmetrics] version=0.5.6
[2022/11/10 20:02:24] [ info] [ctraces ] version=0.2.5
[2022/11/10 20:02:24] [ info] [input:tail:tail.0] initializing
[2022/11/10 20:02:24] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2022/11/10 20:02:24] [ info] [sp] stream processor started
[2022/11/10 20:02:24] [ info] [output:stdout:stdout.0] worker #0 started
[2022/11/10 20:02:37] [ info] [input:tail:tail.0] inotify_fs_add(): inode=2533274793273371 watch_fd=1 name=/var/log/mylogs.log
[2022/11/10 20:02:38] [engine] caught signal (SIGSEGV)
#0  0x55b173e87101      in  mk_list_size() at lib/monkey/include/monkey/mk_core/mk_list.h:165
#1  0x55b173e8b368      in  handle_output_event() at src/flb_engine.c:289
#2  0x55b173e8cb85      in  flb_engine_start() at src/flb_engine.c:971
#3  0x55b173e34ab4      in  flb_lib_worker() at src/flb_lib.c:629
#4  0x7fa2ca09dea6      in  ???() at ???:0
#5  0x7fa2c996da2e      in  ???() at ???:0
#6  0xffffffffffffffff  in  ???() at ???:0

Your Environment

  • Version used: Docker image: fluent/fluent-bit:2.0.4
  • Configuration: fluent-bit.conf:
[INPUT]
	Name tail
	Path /var/log/*.log
	Refresh_Interval 10 

[FILTER]
	Name modify
	Match *
	Add testattribute attrvalue

[OUTPUT]
	Name stdout
	Match *

[OUTPUT]
	Name  opentelemetry
	Match *
	Host  hostname
	Port  443
	logs_uri /v1/logs
	add_label testlabel labelvalue
  • Operating System and version: Windows 10 + Docker Desktop
  • Filters and plugins: tail, modify, opentelemetry and stdout (just for testing)

Additional context
It might be related to this issue, since this configuration did not fail before 2.0.4.


BertelBB commented Dec 1, 2022

Also encountering this in v2.0.5, although I am not using the opentelemetry output plugin; I'm only using the elasticsearch output plugin.

Syn3rman (Contributor) commented Dec 1, 2022

@wtobis I wasn't able to reproduce this in Docker on a Mac. I'm wondering if this might be related to the tail plugin on Windows.


BertelBB commented Dec 1, 2022

> @wtobis I wasn't able to reproduce this in Docker on a Mac. I'm wondering if this might be related to the tail plugin on Windows.

I don't think it is isolated to Windows, since I am running on Linux.

Syn3rman (Contributor) commented Dec 1, 2022

Could you please send your config too?


BertelBB commented Dec 1, 2022

> Could you please send your config too?

Environment: AKS with Linux nodes
Fluent Bit version: 2.0.5

[SERVICE]
    flush 1
    daemon off
    log_level warning
    parsers_file custom_parsers.conf
    parsers_file parsers.conf
    http_server on
    http_listen 0.0.0.0
    http_port 2020
    storage.path /fluent-bit/etc/data

[INPUT]
    name tail
    alias kube
    path /var/log/containers/*.log
    path_key log_file_path
    db /fluent-bit/etc/db/kube.db
    parser cri-custom
    tag kube.*
    buffer_chunk_size 32k
    buffer_max_size 256k
    mem_buf_limit 5m
    read_from_head true
    refresh_interval 10
    skip_empty_lines on
    skip_long_lines off
    storage.type filesystem

[FILTER]
    name kubernetes
    match *
    kube_url https://kubernetes.default.svc.cluster.local:443
    kube_ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    kube_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
    kube_tag_prefix kube.var.log.containers.
    buffer_size 256k
    merge_log on
    merge_log_trim on
    keep_log off
    k8s-logging.parser on
    k8s-logging.exclude on
    annotations off
    labels on

[OUTPUT]
    name es
    alias elasticsearch
    match *
    host ${FLUENT_ES_HOST}
    port ${FLUENT_ES_PORT}
    http_user ${FLUENT_ES_USER}
    http_passwd ${FLUENT_ES_PW}
    buffer_size 1m
    index kubernetes
    generate_id on
    logstash_format on
    logstash_prefix ${FLUENT_ES_LOGSTASH_PREFIX}
    replace_dots on
    retry_limit 5
    tls on
    tls.verify off
    trace_error off

agup006 (Member) commented Dec 7, 2022

@BertelBB I think this is missing the opentelemetry output config. Also, while I don't expect it to be fixed, could you also sanity-check 2.0.6?

Syn3rman (Contributor) commented

Similar to what I reported in the linked issue (#6512), reducing the batch size of log records to flush seems to fix this issue. A short-term fix would be to expose it as a config parameter defaulting to a smaller number (64 or 128), but we might have to look into the retry logic for this.
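
A rough illustration of that proposed short-term fix, flushing records in bounded batches; every name below (otel_flush_ctx, send_payload, the batch_size field) is hypothetical for this sketch, not Fluent Bit's actual internals:

#include <stddef.h>

/* Hypothetical plugin context; batch_size would be read from a config
 * key and default to a small value (e.g. 128), as proposed above. */
struct otel_flush_ctx {
    size_t batch_size;
};

struct log_record {
    const char *body;               /* placeholder payload */
};

/* Stub standing in for serializing a batch and POSTing it to logs_uri. */
static int send_payload(struct otel_flush_ctx *ctx,
                        struct log_record *recs, size_t n)
{
    (void) ctx; (void) recs; (void) n;
    return 0;
}

/* Flushes in chunks of at most batch_size records instead of one
 * oversized request, so a failure only retries one bounded chunk. */
static int flush_in_batches(struct otel_flush_ctx *ctx,
                            struct log_record *recs, size_t count)
{
    size_t step = ctx->batch_size ? ctx->batch_size : 128;

    for (size_t off = 0; off < count; off += step) {
        size_t n = (count - off < step) ? count - off : step;
        if (send_payload(ctx, &recs[off], n) != 0) {
            return -1;
        }
    }
    return 0;
}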

edsiper (Member) commented Jan 27, 2023

Fixed by #6559 and #6583.

edsiper closed this as completed Jan 27, 2023