Kubelet reports secondary InternalIP in AWS with multiple ENIs #61921
Comments
@cheftako Do you have any insight into why the apiserver is deduplicating node addresses of the same |
Reading through a couple of places (below) in API server, it looks like it assumes there is exactly 1 Address per NodeAddressType for a kubelet.
The issue and corresponding PR where the change was made in kubelet to add IPs from all ENIs attached to an EC2 instance. If my understanding above is right, did this PR result in conflict with that assumption leading to this behavior? |
@wlan0 Do you know anything about this one? |
ping @nckturner |
@wlan0 @cheftako Our question is: should kubelet be reporting exactly 1 address for each NodeAddressType? When there are multiple interfaces on a node, kubelet (the AWS cloud provider code) sends a list of addresses, which (I think) leads to the node controller picking 1 without knowing which is the correct one to choose. Basically we think #50112 leads to incorrect behavior. |
The cloud node controller seems to assume this behavior when it computes whether an IP has changed: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/cloud/node_controller.go#L444. I believe this corresponds to https://github.com/kubernetes/kubernetes/blob/master/pkg/util/node/node.go#L81 and https://github.com/kubernetes/kubernetes/blob/master/pkg/util/node/node.go#L99. So right now we do seem to have some baked-in assumptions in various parts of the system about a node having 1 IP per Type. Beyond that I would suggest talking to sig-networking. Maybe @bowei can help? |
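To make the "1 IP per Type" assumption concrete, here is a minimal sketch of the last-wins reduction described in the issue report. It is an illustration only, not the code linked above: the function name, the tuple representation, and the addresses are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical (type, address) pair; not the real NodeAddress struct.
NodeAddress = Tuple[str, str]


def last_address_per_type(addresses: List[NodeAddress]) -> Dict[str, str]:
    """Keep one address per type; later entries overwrite earlier ones."""
    chosen: Dict[str, str] = {}
    for addr_type, addr in addresses:
        chosen[addr_type] = addr
    return chosen


# Assumed report from a node with two ENIs (made-up addresses).
reported = [
    ("InternalIP", "10.0.0.10"),   # assumed primary IP on eth0
    ("InternalIP", "10.0.0.11"),   # assumed secondary IP on eth1
    ("ExternalIP", "203.0.113.5"),
]

collapsed = last_address_per_type(reported)
print(collapsed["InternalIP"])  # prints 10.0.0.11, so the eth0 address is lost
```

Under this assumption, any component that reduces the list to one address per type silently depends on the order in which the kubelet reports addresses.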
Yeah, that's what we found as well. I think in the original PR the user was trying to specify a secondary IP with the |
Unless anyone has some strong objections, what we'd like to do is go back to using the |
I still don't have much context to this issue yet but it seems like setting multiple addresses of the same type on node status is the wrong direction as some people have already mentioned. For handling multiple private addresses that can be set on the node interface we should either create new address types (although I prefer if we don't) or have whatever IP provided by |
Ya this seems fine. I agree with @nckturner's comment above. The exact issue I ran into was that the old code did not allow the user to use a secondary ENI with the |
^ it seems like we override all the node address types if the kubelet sets |
@jlzhao27 thanks for the quick reply, seems like steps forward are:
|
Here's a summary of where I'm at with this. The kubelet informs the API server of its addresses in the PR #50112 changed the AWS implementation of When using the AWS CNI plugin, the secondary ENIs with multiple IPs are not used by the node itself, but assigned to individual pods. In this case, the kubelet doesn't get enough information from the EC2 metadata about which secondary ENIs or IPs are actually used by the node itself, and ends up reporting all IPs (in the case of an m4.large, that is 10 IPs per ENI) to the API server. The API server does a reduction somewhere along the line where it only accepts one Our current workaround is to specify the The kubelet logic around validating the As far as solutions, there are 3 things I've considered.
Short of those 3 solutions, I'm not sure how to allow a user to specify a secondary ENI as the node IP and how to correctly report the primary IP when using the AWS CNI provider. Am I missing anything big on option 3? |
@micahhausler thanks for the comprehensive update! Would you be able to join the cloud provider WG meeting today again so we can hash out some final details? I think we're close to a working solution :). |
I'll be there! |
Summary of the conversation in WG Cloud Provider:
|
Created #63158 around the --node-ip issue |
It seems like getting the |
I think this option (particularly the part of using |
/reopen |
@jnicholls: You can't reopen an issue/PR unless you authored it or you are a collaborator. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/reopen |
@bowei: Reopened this issue. In response to this:
@micahhausler: There are no sig labels on this issue. Please add an appropriate label by using one of the following commands:
Please see the group list for a listing of the SIGs, working groups, and committees available. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@bowei: The label(s) In response to this:
/area provider/aws |
:sigh: |
It doesn't choose blindly, it's supposed to sort by device-number https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/legacy-cloud-providers/aws/aws.go#L1409 so eth0 should be the default. Feel free to create a new issue to discuss further, IMO it's a bit confusing to continue discussion here as technically the original issue has been fixed. |
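The ordering rule mentioned above (sort by device number so that eth0 wins) can be sketched as follows. This does not mirror the linked aws.go code; the `Interface` type, its fields, and the addresses are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interface:
    """Hypothetical stand-in for one ENI attached to an instance."""
    device_number: int      # 0 is assumed to correspond to eth0
    private_ips: List[str]  # first entry is assumed to be the ENI's primary IP


def default_internal_ip(interfaces: List[Interface]) -> str:
    """Return the primary IP of the interface with the lowest device number."""
    if not interfaces:
        raise ValueError("no interfaces reported")
    ordered = sorted(interfaces, key=lambda eni: eni.device_number)
    if not ordered[0].private_ips:
        raise ValueError("lowest-numbered interface has no private IPs")
    return ordered[0].private_ips[0]


# Deliberately listed out of order: eth1 first, eth0 second.
enis = [
    Interface(device_number=1, private_ips=["10.0.0.11", "10.0.0.12"]),
    Interface(device_number=0, private_ips=["10.0.0.10"]),
]
print(default_internal_ip(enis))  # prints 10.0.0.10
```

Sorting before picking makes the result independent of the order in which interfaces happen to be listed, which is the property the comment above relies on.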
/close |
@wongma7: Closing this issue. In response to this:
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
/sig aws
What happened:
When a node gets a second ENI (elastic network interface) attached to it, the kubelet PATCHes all IPs for all network interfaces on the node up to the API server (it gets these from the EC2 metadata). The payload might look like this:
When trying to detect if the node has changed, the node controller manager de-duplicates addresses of the same Type from each node, picking the last one in the list the kubelet reports.
What you expected to happen:
The eth0 IP should be reported as the PrivateIP address on the node's .status.addresses[] list.
How to reproduce it (as minimally and precisely as possible):
Create a new cluster (with kops), install the AWS CNI provider, and check the node's InternalIP:
kubectl get no -o json | jq '.items[].status.addresses'
You'll get something like
Anything else we need to know?:
Environment:
Kubernetes version (kubectl version): 1.9.x
OS: Amazon Linux
Kernel (uname -a): 4.9.85-47.59.amzn2.x86_64
Other reports:
Maybe related to #42125
Since the Node controller only keeps the last InternalIP address in the PATCH, can we have the EC2 metadata lookup query http://169.254.169.254/latest/meta-data/local-ipv4 for the InternalIP? There may be something I'm missing, and I'd love to get pointed in the right direction. If this is unintentional, is this something that we could get a fix for backported onto the 1.9 branch? I'm happy to get a patch out for this.
ping @justinsb @chrislovecnm
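For reference, the metadata lookup proposed in this report could look roughly like the sketch below. It is not kubelet or cloud-provider code: the function names are hypothetical, and it assumes the instance answers plain GET requests on the metadata endpoint (instances that require IMDSv2 session tokens would need an extra token request, which is not shown).

```python
import ipaddress
import urllib.request
from typing import Callable

# URL taken from the issue text above.
LOCAL_IPV4_URL = "http://169.254.169.254/latest/meta-data/local-ipv4"


def fetch_url(url: str, timeout: float = 2.0) -> str:
    """Fetch a URL and return the body as text (only works from inside EC2)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def primary_internal_ip(fetch: Callable[[str], str] = fetch_url) -> str:
    """Return the instance's local-ipv4 value, validated as an IP address.

    `fetch` is injectable so the function can be exercised without network access.
    """
    text = fetch(LOCAL_IPV4_URL).strip()
    ipaddress.ip_address(text)  # raises ValueError if the body is not an IP
    return text
```

On an EC2 instance, calling `primary_internal_ip()` with no arguments would query the endpoint directly; elsewhere, pass a stub such as `lambda url: "10.0.0.10"` to try it out.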