-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Cilium v1.16.4 #493
Conversation
WalkthroughThe pull request includes updates to the Cilium Helm chart, primarily focusing on version increments from Changes
Possibly related PRs
Suggested labels
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.json (1)
381-387
: Consider making max_active_downstream_connections configurableThe overload manager configuration adds protection against resource exhaustion by limiting active downstream connections. However, the hard-coded limit of 50,000 connections might not be suitable for all deployments.
Consider making this value configurable through Helm values:
- "max_active_downstream_connections": "50000" + "max_active_downstream_connections": "{{ .Values.envoy.overloadManager.maxActiveDownstreamConnections }}"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (10)
packages/system/cilium/charts/cilium/Chart.yaml
(2 hunks)packages/system/cilium/charts/cilium/README.md
(9 hunks)packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.json
(3 hunks)packages/system/cilium/charts/cilium/templates/cilium-configmap.yaml
(3 hunks)packages/system/cilium/charts/cilium/templates/hubble-relay/serviceaccount.yaml
(1 hunks)packages/system/cilium/charts/cilium/templates/validate.yaml
(1 hunks)packages/system/cilium/charts/cilium/values.schema.json
(1 hunks)packages/system/cilium/charts/cilium/values.yaml
(9 hunks)packages/system/cilium/charts/cilium/values.yaml.tmpl
(3 hunks)packages/system/cilium/images/cilium/Dockerfile
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- packages/system/cilium/images/cilium/Dockerfile
- packages/system/cilium/charts/cilium/README.md
🔇 Additional comments (12)
packages/system/cilium/charts/cilium/templates/hubble-relay/serviceaccount.yaml (1)
16-16
: LGTM! Security enhancement for service account token mounting
The addition of automountServiceAccountToken
follows Kubernetes security best practices by allowing explicit control over service account token mounting for the Hubble relay pods. This helps implement the principle of least privilege.
packages/system/cilium/charts/cilium/Chart.yaml (1)
82-82
: LGTM! Version update is consistent
The version update from 1.16.3 to 1.16.4 is consistently applied to both appVersion
and version
fields.
Let's verify version consistency across other files in the repository:
Also applies to: 98-98
packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.json (1)
341-341
: LGTM: Initial fetch timeout configuration added
Adding explicit timeouts for initial xDS (LDS/CDS) fetch operations is a good practice to prevent indefinite hangs during Envoy startup.
Also applies to: 357-357
packages/system/cilium/charts/cilium/templates/cilium-configmap.yaml (3)
1278-1278
: LGTM: Initial fetch timeout configuration
The proxy-initial-fetch-timeout configuration is properly added and aligns with the Envoy bootstrap configuration.
721-723
: LGTM: Improved socket LB termination configuration
The change provides better backward compatibility by falling back to socketLB.enabled when terminatePodConnections is not set.
724-725
: LGTM: Socket LB tracing configuration added
New tracing capability for socket LB is properly integrated into the configuration.
packages/system/cilium/charts/cilium/values.schema.json (1)
1956-1958
: LGTM: Schema updates for new configurations
The schema properly defines:
- initialFetchTimeoutSeconds as integer type
- envoy.enabled to accept null values for more flexible configuration
packages/system/cilium/charts/cilium/values.yaml.tmpl (4)
Line range hint 156-159
: Version updates look consistent!
The version updates from v1.16.3 to v1.16.4 and corresponding digest updates have been applied consistently across all components:
- Cilium agent
- Hubble relay
- Cilium operator (all variants)
- Preflight check
Also applies to: 1317-1319, 2483-2491, 2765-2767
1009-1010
: New observability feature: Socket LB tracing
A new configuration option has been added to enable tracing for socket-based load balancing, which will help with debugging and observability.
2162-2163
: New reliability feature: Envoy initial fetch timeout
Added configuration for the initial fetch timeout on xDS streams, which helps control how long Envoy will wait for the initial configuration fetch.
1280-1283
: Security improvement: Reduced certificate validity duration
The certificate validity duration has been reduced from 1095 days (3 years) to 365 days (1 year) to improve security and ensure compatibility with MacOS, which does not accept self-signed certificates with validity periods exceeding 825 days.
packages/system/cilium/charts/cilium/values.yaml (1)
Line range hint 1-3
: Generated file matches template
This is an auto-generated file from values.yaml.tmpl. All changes are consistent with the template file reviewed above.
{{ if and | ||
(or (and (ge (int .Values.cluster.id) 128) (le (int .Values.cluster.id) 255)) (and (ge (int .Values.cluster.id) 384) (le (int .Values.cluster.id) 511))) | ||
(or .Values.eni.enabled .Values.alibabacloud.enabled (eq .Values.cni.chainingMode "aws-cni")) -}} | ||
{{ fail "Cilium is currently affected by a bug that causes traffic matched by network policies to be incorrectly dropped when running in either ENI mode (both AWS and AlibabaCloud) or AWS VPC CNI chaining mode, if the cluster ID is 128-255 (and 384-511 when maxConnectedClusters=511). Please refer to https://github.com/cilium/cilium/issues/21330 for additional details." }} | ||
{{- end }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Important validation needs to be documented in upgrade notes
The validation for cluster ID ranges (128-255, 384-511) when using ENI mode or AWS VPC CNI chaining mode is not documented in any of the chart's documentation files. This validation is critical as it prevents network policy traffic from being incorrectly dropped.
- Add documentation about this limitation in the chart's README.md or values.yaml under the relevant sections (cluster.id, eni.enabled, or cni.chainingMode)
- Consider adding a note in UPGRADE.md (if it exists) to warn users about this limitation when upgrading to v1.16.4
🔗 Analysis chain
LGTM! Important validation for preventing traffic drops.
The new validation block correctly prevents using problematic cluster ID ranges (128-255, 384-511) when running in ENI mode or AWS VPC CNI chaining mode, which could cause network policy traffic to be incorrectly dropped.
Let's verify if this validation is documented in the upgrade notes:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the cluster ID validation is documented in the upgrade notes
# Expected: Find mentions of cluster ID ranges and the associated bug
# Search for cluster ID related documentation in README/upgrade notes
fd -t f -e md | xargs rg -i "cluster.*id.*128.*255|21330"
Length of output: 59
Script:
#!/bin/bash
# Let's try a broader search for upgrade notes and documentation
# First, let's check for any UPGRADE or migration guides
fd -t f "UPGRADE|MIGRATION|CHANGELOG" -i
# Then, let's check the root README and docs directory
fd -t f "README|\.md$" -d 2
Length of output: 786
Script:
#!/bin/bash
# Let's also check the Helm chart's documentation specifically
fd -t f "README|VALUES|NOTES" -i packages/system/cilium/charts/cilium
Length of output: 388
Script:
#!/bin/bash
# Let's also check for any mentions of this issue with a broader search
rg -i "network polic.*drop|eni.*mode.*cluster|vpc.*cni.*cluster"
Length of output: 952
Script:
#!/bin/bash
# Let's check the content of the Cilium chart's README and values files
# to see if this limitation is documented
cat packages/system/cilium/charts/cilium/README.md
Length of output: 65589
Script:
#!/bin/bash
# Also check the values.yaml for any comments or documentation
cat packages/system/cilium/charts/cilium/values.yaml | rg -i -A 5 -B 5 "cluster.*id|eni|aws|vpc"
Length of output: 11261
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced new configuration options for socket-based load balancing tracing and initial fetch timeout settings in the Cilium deployment. - Enhanced validation checks for deprecated options to prevent misconfigurations. - **Bug Fixes** - Improved error messaging for deprecated or invalid settings. - **Documentation** - Updated version numbers in README and configuration files to reflect the new version (1.16.4). - **Chores** - Updated Dockerfile and image tags to reference the latest version (1.16.4). <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores