got unrecoverable error in primary and no secondary error_class=ArgumentError error="wrong number of arguments (given 4, expected 3)" #1716

kefiras · 2024-04-05T11:31:03Z

Describe the bug:
Error when using syslog output

Expected behaviour:
Logs should be sent to defined syslog cluster output

Steps to reproduce the bug:
Configure the resources below:

apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: syslog
  namespace: logging
spec:
  syslog:
    buffer:
      timekey: 30s
      timekey_wait: 0s
    host: syslog.example.net
    insecure: true
    port: 20444
    transport: tls

apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: hosttailer-flow
  namespace: logging
spec:
  filters:
  - tag_normaliser: {}
  globalOutputRefs:
  - syslog
  match:
  - select:
      labels:
        app.kubernetes.io/name: host-tailer

Additional context:
Fluentd throws errors:

2024-04-05 11:29:00 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] got unrecoverable error in primary and no secondary error_class=ArgumentError error="wrong number of arguments (given 4, expected 3)"
  2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/socket.rb:41:in `socket_create'
  2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:65:in `find_or_create_socket'
  2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:39:in `write'
  2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1225:in `try_flush'
  2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
  2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
  2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2024-04-05 11:29:00 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] bad chunk is moved to /buffers/backup/worker0/clusterflow_logging_hosttailer-flow_clusteroutput_logging_syslog/61557c1e8b4b20b9380467be5ff0a45b.log
2024-04-05 11:29:01 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] got unrecoverable error in primary and no secondary error_class=ArgumentError error="wrong number of arguments (given 4, expected 3)"
  2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/socket.rb:41:in `socket_create'
  2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:65:in `find_or_create_socket'
  2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:39:in `write'
  2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1225:in `try_flush'
  2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
  2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
  2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2024-04-05 11:29:01 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] bad chunk is moved to /buffers/backup/worker0/clusterflow_logging_hosttailer-flow_clusteroutput_logging_syslog/61557c20915175b74f5d02915b7386cb.log

Environment details:

Kubernetes version 1.27
Cloud-provider/provisioner : AKS
logging-operator version : 4.6.0
Install method (e.g. helm or static manifests): helm
Logs from the misbehaving component (and any other relevant logs):
Resource definition (possibly in YAML format) that caused the issue, without sensitive data:

/kind bug

pepov · 2024-04-08T13:57:25Z

@kefiras this error message alone doesn't tell much about the original problem

have you tried looking at the referred bad chunk?

2024-04-05 11:29:01 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] bad chunk is moved to /buffers/backup/worker0/clusterflow_logging_hosttailer-flow_clusteroutput_logging_syslog/61557c20915175b74f5d02915b7386cb.log

have you tried raising the log level? (logLevel: debug in fluentd spec)
have you/can you check the error/warning messages on the receiving side if there were any?

kefiras · 2024-04-09T10:24:39Z

Debug is already enabled

Bad chunk contents:

??f?.FsN??time?2024-04-09T10:18:22.776368974Z?message?:Apr  9 10:18:22 aks-prometheus-18130450-vmss000000 kernel: [498058.497065] calico-packet: IN=azve56f4c00502 OUT=azva623c2d61aa MAC=aa:aa:aa:aa:aa:aa:6a:73:f2:79:14:75:08:00 SRC=10.244.3.144 DST=10.244.3.135 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=35709 DF PROTO=TCP SPT=36200 DPT=2020 WINDOW=64240 RES=0x00 SYN URGP=0 ?app?host-tailer?container_image?Lrepo-aks.qa.example.net/example/linux/exm/exm/vendor/fluent/fluent-bit:2.1.8?clustername?aks1kexm1?datacenter?eastus2?env?nonprod?family?logging?mnemonic?exm?hostname?"aks-prometheus-18130450-vmss000000?namespace?logging?pod_id?$2461060d-4eb9-41ec-8fe2-eefcf4bad090?pod_name?filetail-host-tailer-phq7s?service?syslog/ $

I haven't checked the receiving side, but I doubt anything is being sent.

stale · 2024-06-08T11:13:00Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions!

liz-86 · 2024-07-04T08:00:01Z

We encountered the same error. Is it possible to open this issue again?

pepov · 2024-07-04T09:17:25Z

@liz-86 can you add some details to this? do you see this error with the latest image versions as well?

liz-86 · 2024-07-04T09:48:35Z

Yes, we tested our configuration (much the same as the one above, but with tcp transport instead of tls) with the latest fluentd image (kube-logging/fluentd-images:v1.16-full).
Our ClusterOutput:

apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: syslog
  namespace: logging
spec:
  syslog:
    buffer:
      flush_thread_count: 16
      timekey: 1m
      timekey_use_utc: true
      timekey_wait: 30s
    format:
      type: json
    host: syslog.example.net
    insecure: true
    port: 5056
    transport: tcp

The generated fluentd.conf is the following (from the k8s secret loggging-operator-logging-fluentd-app):

  <match **>
    @type syslog_rfc5424
    @id clusterflow:logging:syslog-flow:clusteroutput:logging:syslog-output
    host syslog.example.net
    insecure true
    port 5056
    transport tcp
    <buffer tag,time>
      @type file
      chunk_limit_size 8MB
      flush_thread_count 16
      path /buffers/clusterflow:logging:syslog-flow:clusteroutput:logging:syslog-output.*.buffer
      retry_forever true
      timekey 1m
      timekey_use_utc true
      timekey_wait 30s
    </buffer>
    <format>
      @type json
    </format>
  </match>

TimWelter · 2024-07-04T14:18:41Z

Same issue here.

Provider: RKE2
Kubernetes Version: v1.27.12+rke2r1
Chart: Logging (103.1.1+up4.4.0)

pepov · 2024-07-05T12:00:30Z

What are your fluentd and fluentbit image versions?

pepov · 2024-07-05T20:03:55Z

It seems I totally misunderstood the issue originally. I've looked at it once again, and it seems that the ruby3 upgrade broke the syslog plugin: Ruby 2.7 deprecated, and Ruby 3.0 removed, the automatic conversion of a trailing hash argument into keyword arguments (see https://blog.saeloun.com/2019/10/07/ruby-2-7-keyword-arguments-redesign/).

I've made a change here: pepov/fluent-plugin-syslog_rfc5424@6404b61

Then applied on my fork of the fluentd image here: kube-logging/fluentd-images@main...pepov:fluentd-images:main

I didn't have the time to test it with a syslog receiver, could you please give it a try with ghcr.io/pepov/fluentd:v1.16-full?
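
For anyone who wants to see the failure mode in isolation, here is a minimal sketch. It is not the plugin's or fluentd's actual code: the `socket_create` below is a hypothetical stand-in that only mirrors the shape suggested by the stack trace (three positional arguments plus keyword options), and the two `call_with_*` helpers are made up for illustration.

```ruby
# Hypothetical stand-in for a helper that takes three positional arguments
# plus keyword options. This is NOT fluentd's real implementation; it only
# reproduces the argument shape implied by the error message.
def socket_create(proto, host, port, **kwargs)
  { proto: proto, host: host, port: port, options: kwargs }
end

# Options collected in a plain Hash, as a plugin might build them.
socket_options = { insecure: true }

# Old style: pass the Hash as a fourth positional argument.
# Returns the error message, or nil if the call succeeded.
def call_with_positional_hash(options)
  socket_create(:tls, "syslog.example.net", 20444, options)
  nil
rescue ArgumentError => e
  e.message
end

# Ruby 3 compatible style: splat the Hash into keyword arguments.
def call_with_double_splat(options)
  socket_create(:tls, "syslog.example.net", 20444, **options)
end

puts "positional hash: #{call_with_positional_hash(socket_options).inspect}"
puts "double splat:    #{call_with_double_splat(socket_options).inspect}"
```

On Ruby 3.x the positional call should raise the same kind of ArgumentError as in the log above, while Ruby 2.7 still accepts it with a deprecation warning. The double-splat call works on both, which is the general direction of the fix.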

liz-86 · 2024-07-08T04:57:44Z

Thanks for looking into the issue. I can confirm that with the new image there are no more errors in fluentd. I need to talk to another team to see if they are getting the desired logs, but it looks good at the moment.

Thanks again!

EDIT: All seems to be working perfectly. The other team is getting logs. :)

pepov · 2024-07-08T08:06:33Z

thx for the confirmation, I'm making the PRs to have the fix released asap

pepov · 2024-07-08T19:00:33Z

The images have been updated with the fix as of build 148:
v1.16-full-build.148
v1.16-full

For logging operator 4.8:
v1.16-4.8-full-build.148
v1.16-4.8-full

kefiras added the bug label Apr 5, 2024

stale bot added the wontfix label Jun 8, 2024

stale bot closed this as completed Jun 15, 2024

pepov reopened this Jul 4, 2024

stale bot removed the wontfix label Jul 4, 2024

pepov mentioned this issue Jul 8, 2024

fix syslog plugin ruby3 compatibility kube-logging/fluentd-images#140

Merged

pepov closed this as completed in kube-logging/fluentd-images#140 Jul 8, 2024
