LGTM pending ci, noticed one super tiny formatting tweak
Co-Authored-By: Chris Koehnke <[email protected]>
jenkins test this please
jenkins test this please
jenkins test this please
Filebeat tests are failing with the following error:
|
OK, the issue is related to the Filebeat HTTP endpoint, which is activated by default in our Helm templates. Strangely, when running the default and oss tests on CI, Filebeat pods fail because the HTTP endpoint port (5066) is already in use inside the pod. I could confirm that this port is already in use even when removing
|
A quick overview of the state of our Helm CI tests:
With Filebeat < 7.5.0 we had no issues: Filebeat pods could coexist on the same node and in the same namespace with the HTTP endpoint enabled, without port conflicts, because of pod port isolation (the HTTP port is not exposed outside of the pod in our configuration). This no longer seems to work with Filebeat 7.5.0: one Filebeat pod is able to see port 5066 opened by another Filebeat pod on the same node / same namespace, so it fails with a "port already used" error. I could confirm that when removing the
The same issue can be reproduced with different namespaces:
|
After digging with @odacremolbap's help, we found that the issue is related to #321, which set
By enabling host networking, Filebeat pods can directly see the network interfaces of the host machine where the pod was started. The Filebeat pod is also accessible on all network interfaces of the host machine. The side effect is that two Filebeat pods requiring the same port cannot run on the same node, since that leads to port conflicts. On top of that, creating a pod with
Why did our Filebeat tests work with
Quoting @odacremolbap's message on Slack
Conclusion
Given this, I think we should disable host networking by default and only allow enabling it by overriding Helm values. In addition, the point to enable
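The port conflict described above can be reproduced outside Kubernetes. Here is a minimal Python sketch, assuming that two pods on the same node with host networking behave like two processes sharing one network namespace. The helper names `bind_listener` and `second_bind_conflicts` are hypothetical, and an OS-assigned free port stands in for 5066 so the snippet can run anywhere.

```python
import errno
import socket


def bind_listener(port: int = 0) -> socket.socket:
    """Open a TCP listener on localhost; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def second_bind_conflicts() -> bool:
    """Return True if a second listener on the same port fails with EADDRINUSE."""
    # The first listener plays the role of the first Filebeat HTTP endpoint.
    first = bind_listener(0)
    port = first.getsockname()[1]
    try:
        # The second listener plays the role of a second pod on the same node.
        second = bind_listener(port)
    except OSError as exc:
        return exc.errno == errno.EADDRINUSE
    else:
        second.close()
        return False
    finally:
        first.close()


if __name__ == "__main__":
    print("second bind conflicts:", second_bind_conflicts())
```

Without host networking each pod gets its own network namespace, so the two listeners would never meet and both binds would succeed.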
LGTM
Just one comment on the HostNetwork section phrasing.
LGTM
jenkins test this please
Logstash can take longer than 60s to fully start, so we can sometimes reach the point where Logstash finishes starting a few seconds after the liveness probe has sent the kill signal.
LGTM!
filebeat/README.md
Outdated
@@ -12,6 +12,7 @@ This helm chart is a lightweight way to configure and run our official [Filebeat
## Usage notes and getting started
* The default Filebeat configuration file for this chart is configured to use an Elasticsearch endpoint. Without any additional changes, Filebeat will send documents to the service URL that the Elasticsearch helm chart sets up by default. You may either set the `ELASTICSEARCH_HOSTS` environment variable in `extraEnvs` to override this endpoint or modify the default `filebeatConfig` to change this behavior.
* The default Filebeat configuration file is also configured to capture container logs and enrich them with Kubernetes metadata by default. This will capture all container logs in the cluster.
* This chart disables the [HostNetwork](https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces) setting by default for compatibility reasons with the majority of kubernetes providers and scenarios. Some kubernetes providers may not allow enabling `hostNetwork` and deploying multiple Filebeat pods on the same node isn't possible with `hostNetwork`. However Filebeat does recommend activating it. If your kubernetes provider is compatible with `hostNetwork` and you don't need to run multiple Filebeat daemonsets, you can activate it [here](./values.yaml#L36).
/values.yaml#L36
This line number won't be correct anymore if any values are added above it. I think it would be better to just leave the link out to avoid confusion.
logstash/values.yaml
Outdated
@@ -150,7 +150,7 @@ livenessProbe:
httpGet:
path: /
port: http
- initialDelaySeconds: 60
+ initialDelaySeconds: 90
I think this is still cutting it pretty close and could make it hard for Logstash processes that are a bit slower to start up. This should be high enough that we can be very sure that it isn't going to prevent a container from starting up. Maybe around 5 minutes would be a nicer value.
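To make the margin concrete, here is a rough Python sketch of when a liveness probe can first trigger a restart. It assumes a simplified model in which the first probe runs at `initialDelaySeconds`, probes repeat every `periodSeconds`, each failure is detected immediately, and the container is restarted after `failureThreshold` consecutive failures. The `period=10`, `failure_threshold=3`, and 95s startup values are illustrative assumptions, not values taken from this chart, and the function names are hypothetical.

```python
def earliest_restart_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Earliest time (seconds after container start) at which failing probes cause a restart.

    Simplified model: ignores timeoutSeconds and any scheduling jitter.
    """
    return initial_delay + (failure_threshold - 1) * period


def survives_startup(startup_seconds: int, initial_delay: int,
                     period: int = 10, failure_threshold: int = 3) -> bool:
    """True if the process is up before the probe could first trigger a restart."""
    return startup_seconds < earliest_restart_seconds(initial_delay, period, failure_threshold)


if __name__ == "__main__":
    # Hypothetical Logstash instance that needs 95s to start.
    for delay in (60, 90, 300):
        limit = earliest_restart_seconds(delay, period=10, failure_threshold=3)
        print(f"initialDelaySeconds={delay}: restart possible from {limit}s, "
              f"95s startup survives: {survives_startup(95, delay)}")
```

Under these assumptions a 95s startup is killed with a 60s delay and survives a 90s delay by only 15 seconds, which is the narrow margin the comment above is pointing at.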
LGTM!
${CHART}/tests/*.py
${CHART}/examples/*/test/goss.yaml