Refactor IPTable Rules #2697
Conversation
This will leave non-empty AWS chains in iptables after upgrades even though we are not using them. I think it is better to delete the AWS chains during nodeInit; we rebuild the chains anyway, so it should be safe. Let's discuss it offline.
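For context, a hedged sketch of what deleting the stale AWS chains during nodeInit could look like, assuming the coreos/go-iptables library; the package name, chain-name prefix, table, and two-pass structure are illustrative, not the cleanup this PR actually implements:

```go
package nodeinit

import (
	"strings"

	"github.com/coreos/go-iptables/iptables"
)

// cleanupStaleSNATChains removes leftover AWS-SNAT-CHAIN-* chains from the
// nat table so they can be rebuilt from scratch during node initialization.
// (Hypothetical helper for illustration.)
func cleanupStaleSNATChains(ipt *iptables.IPTables) error {
	chains, err := ipt.ListChains("nat")
	if err != nil {
		return err
	}
	var stale []string
	for _, chain := range chains {
		if strings.HasPrefix(chain, "AWS-SNAT-CHAIN-") {
			stale = append(stale, chain)
		}
	}
	// First pass: drop the POSTROUTING jump (if present) and flush each chain,
	// so no chain is still referenced when we delete it.
	for _, chain := range stale {
		_ = ipt.Delete("nat", "POSTROUTING", "-j", chain) // ignore "rule not found"
		if err := ipt.ClearChain("nat", chain); err != nil {
			return err
		}
	}
	// Second pass: delete the now-empty, unreferenced chains.
	for _, chain := range stale {
		if err := ipt.DeleteChain("nat", chain); err != nil {
			return err
		}
	}
	return nil
}
```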
One question; otherwise, all changes look good to me.
A few nits. I think we are good to update the test agent image in this PR now, assuming your testing shows everything passing.
lgtm
What type of PR is this?
bug
Which issue does this PR fix:
#2373
What does this PR do / Why do we need it:
nf_tables mode restricts the jump stack depth to at most 16. We currently use one jump per VPC CIDR/excluded CIDR plus one, so having 15+ CIDRs results in a jump stack deeper than 16, causing failure. This PR refactors our iptables rules to retain the current behavior while keeping the jump stack depth at 2.
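For illustration, here is a minimal Go sketch of the consolidated layout described above, using the coreos/go-iptables library; the library choice, chain name, comments, and CIDR/SNAT values are assumptions for the example, not the exact rules this PR installs. A single chain hangs off POSTROUTING, each VPC/excluded CIDR becomes a RETURN rule inside it, and a final SNAT rule catches everything else, so traversal is POSTROUTING -> one chain (depth 2) regardless of how many CIDRs there are.

```go
package main

import (
	"log"

	"github.com/coreos/go-iptables/iptables"
)

func main() {
	// Illustrative inputs; real values come from IPAMD/VPC metadata.
	vpcCIDRs := []string{"10.0.0.0/16", "10.1.0.0/16", "192.168.0.0/24"}
	snatAddr := "10.0.0.5"

	ipt, err := iptables.New()
	if err != nil {
		log.Fatal(err)
	}

	const chain = "AWS-SNAT-CHAIN-0"
	if err := ipt.NewChain("nat", chain); err != nil {
		// The chain may already exist; a real implementation would inspect the error.
		log.Printf("NewChain: %v", err)
	}

	// Single jump from POSTROUTING into the consolidated chain.
	if err := ipt.AppendUnique("nat", "POSTROUTING",
		"-m", "comment", "--comment", "AWS SNAT CHAIN", "-j", chain); err != nil {
		log.Fatal(err)
	}

	// One RETURN rule per VPC/excluded CIDR instead of one chain (and jump) per CIDR.
	for _, cidr := range vpcCIDRs {
		if err := ipt.AppendUnique("nat", chain, "-d", cidr, "-j", "RETURN"); err != nil {
			log.Fatal(err)
		}
	}

	// Everything else leaving the node is SNATed to the primary ENI address.
	if err := ipt.AppendUnique("nat", chain,
		"-m", "comment", "--comment", "AWS, SNAT",
		"-m", "addrtype", "!", "--dst-type", "LOCAL",
		"-j", "SNAT", "--to-source", snatAddr); err != nil {
		log.Fatal(err)
	}
}
```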
If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
Testing done on this change:
Manually checked iptables rules for 16 CIDRs; aws-node is up and running
Manually tested adding exclusion CIDRs using AWS_VPC_K8S_CNI_EXCLUDE_SNAT_CIDRS
Also manually checked up to the VPC limit (50 CIDRs)
Manually checked packets using tcpdump
Integration Tests:
Ensured traffic to the internet is not dropped while rules are being updated (deleted and added 10+ CIDRs while a ping was running in a pod)
Scale testing: 130, 730, and 5000 pods
Will this PR introduce any new dependencies?:
No
Will this break upgrades or downgrades? Has updating a running cluster been tested?:
No, Yes
Does this change require updates to the CNI daemonset config files to work?:
No
Does this PR introduce any user-facing change?:
No
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.