[aws-for-fluent-bit] Connection Refused in Liveness Probe #983

Open
jcarvalho opened this issue Aug 18, 2023 · 2 comments · May be fixed by #1168
Labels
bug Something isn't working

Comments

@jcarvalho

Describe the bug
When upgrading the aws-for-fluent-bit chart from version 0.1.27 to 0.1.28, our Fluent Bit pods enter a CrashLoopBackOff state because the newly introduced liveness probe fails.

Pod events show the following message:

Liveness probe failed: Get "http://[2600:1f18:REDACTED::2]:2000/api/v1/health": dial tcp [2600:1f18:REDACTED::2]:2000: connect: connection refused

I believe this happens because the default HTTP_Listen is set to 0.0.0.0, so Fluent Bit's HTTP server binds only to IPv4 and cannot answer IPv6 probes. I confirmed this by shelling into the container and running curl -6 localhost:2020, which failed.

Changing the chart values to set HTTP_Listen to [::] fixes the issue. With that setting the HTTP server appears to listen on both IPv4 and IPv6 addresses, but I don't have an IPv4 EKS cluster to test this.
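
For reference, a minimal sketch of what such a values override might look like. It assumes the chart exposes the Fluent Bit [SERVICE] settings through the service.extraService key (as in the values snippet quoted later in this thread) and that overriding the key replaces the chart's default block, so the HTTP server and health check settings are restated. The port 2020 is an assumption based on the curl command above; adjust it to whatever the probe in your deployment targets.

```yaml
service:
  # Assumption: extraService is passed through verbatim into the [SERVICE] section.
  extraService: |
    HTTP_Server  On
    # Bind to the IPv6 wildcard address instead of 0.0.0.0 so IPv6 probes are answered.
    HTTP_Listen  [::]
    # Assumption: 2020 is the port the liveness probe should reach.
    HTTP_Port    2020
    # The liveness probe hits /api/v1/health, which requires Health_Check to be On.
    Health_Check On
```

Whether [::] also accepts IPv4 connections depends on the dual-stack behavior of the node and container network, which has not been verified here.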

Steps to reproduce
Spin up an IPv6 EKS cluster and install version 0.1.28 of the aws-for-fluent-bit chart. The pods will enter CrashLoopBackOff.

Expected outcome
The new Liveness Probe works correctly with the default Chart configuration.

Environment

  • Chart name: aws-for-fluent-bit
  • Chart version: 0.1.28
  • Kubernetes version: 1.26
  • Using EKS (yes/no), if so version? Yes, v1.26.7-eks-2d98532

Additional Context:
The EKS Cluster is configured for IPv6 addressing.

@jcarvalho jcarvalho added the bug Something isn't working label Aug 18, 2023
@jatinmehrotra

I think this error also exists with chart version 0.1.29 on EKS version 1.25, even when the cluster is configured for IPv4 addressing.

jatinmehrotra commented Aug 31, 2023

@jcarvalho
I was able to reproduce the same error message when I turned the health check off in my Helm chart values. You may want to check whether you have a similar setting. If you do, removing it altogether may instead give you the error described in issue #995.

service:
  ## Allow the service to be exposed for monitoring
  ## For liveness check to work, Health_Check must be set to On
  ## https://docs.fluentbit.io/manual/administration/monitoring
  extraService: |
    Health_Check Off
