Incorrect FloatingIP workflow #1985
Comments
What does f16855bf-8ba1-4f75-ad8c-763e80134571 look like? Does it have a router? It's not really documented, but we don't create any new ports for the FIPs; we just look for an existing port that the FIP can be attached to, by checking whether there's a port on a subnet whose router is attached to the floating IP network. I've mostly tested it with `spec.ports` omitted in the default setup, but I can test it with something closer to your setup if I know more about how that network is set up.
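The lookup described above can be sketched roughly as follows. This is an illustrative simplification with made-up types, not CAPO's actual code: find a port whose subnet has an interface on a router whose external gateway is the floating IP network.

```go
package main

// Hypothetical, simplified model of the port lookup described above.
// Router and Port are illustrative stand-ins, not real CAPO or gophercloud types.
type Router struct {
	ExternalNetworkID string   // network of the router's external gateway
	SubnetIDs         []string // subnets with a router interface
}

type Port struct {
	ID       string
	SubnetID string
}

// findAttachablePort returns the first port whose subnet is attached to a
// router that can reach the floating IP network, or nil if none qualifies.
func findAttachablePort(ports []Port, routers []Router, fipNetworkID string) *Port {
	for i := range ports {
		for _, r := range routers {
			if r.ExternalNetworkID != fipNetworkID {
				continue // this router cannot reach the FIP network
			}
			for _, s := range r.SubnetIDs {
				if s == ports[i].SubnetID {
					return &ports[i]
				}
			}
		}
	}
	return nil
}
```

If no router bridges the port's subnet and the FIP network, the lookup comes back empty, which is the symptom reported below.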
Yes, I meant that the new port is being created by OpenStack, but not in our cloud. I'm not so familiar with OpenStack internals and don't have access to configurations other than our particular cloud.

`GET https://compute-api:9696/v2.0/networks/f16855bf-8ba1-4f75-ad8c-763e80134571`:

```json
{
  "network": {
    "id": "f16855bf-8ba1-4f75-ad8c-763e80134571",
    "name": "internal",
    "tenant_id": "278fda03174b4fee9358559baffca010",
    "admin_state_up": true,
    "mtu": 8913,
    "default_vnic_type": null,
    "status": "ACTIVE",
    "subnets": ["616388c0-519f-418e-80b4-3687a546a65e"],
    "shared": false,
    "availability_zone_hints": [],
    "availability_zones": ["nova"],
    "ipv4_address_scope": null,
    "ipv6_address_scope": null,
    "router:external": false,
    "description": "",
    "port_security_enabled": true,
    "rbac_policies": [
      {
        "id": "c869c7ef-3c51-4fb6-88f5-c591989fe3ef",
        "action": "access_as_shared",
        "target_tenant": "d278dea8631e47ffba5a908265968fbb"
      }
    ],
    "qos_policy_id": null,
    "tags": [],
    "created_at": "2024-02-06T12:43:10Z",
    "updated_at": "2024-03-20T20:39:09Z",
    "revision_number": 5,
    "project_id": "278fda03174b4fee9358559baffca010",
    "provider:network_type": "vxlan"
  }
}
```

`GET https://compute-api:9696/v2.0/routers/7142d8f1-2b11-4ae2-a343-eacd77a2ceee`:

```json
{
  "router": {
    "id": "7142d8f1-2b11-4ae2-a343-eacd77a2ceee",
    "name": "DefaultRouter",
    "tenant_id": "278fda03174b4fee9358559baffca010",
    "admin_state_up": true,
    "status": "ACTIVE",
    "external_gateway_info": {
      "network_id": "c7c8509d-7083-41c9-b799-e30e855e9bc0",
      "external_fixed_ips": [
        {
          "subnet_id": "aa2bc8f7-fa02-4851-ba13-93e57d4c69e1",
          "ip_address": "69.**.**.**"
        }
      ],
      "enable_snat": true
    },
    "description": "",
    "availability_zones": ["nova"],
    "availability_zone_hints": [],
    "routes": [],
    "flavor_id": null,
    "tags": [],
    "created_at": "2024-02-06T11:49:58Z",
    "updated_at": "2024-03-29T14:41:39Z",
    "revision_number": 17,
    "project_id": "278fda03174b4fee9358559baffca010"
  }
}
```

If a VM has a FIP attached, then outgoing connections are SNAT'ed from that FIP.
`GET https://compute-api:9696/v2.0/ports?device_id=7142d8f1-2b11-4ae2-a343-eacd77a2ceee`:

```json
{
  "ports": [
    {
      "id": "0411af2f-d447-4f3c-88a7-1e8a57e70015",
      "name": "",
      "network_id": "f16855bf-8ba1-4f75-ad8c-763e80134571",
      "tenant_id": "",
      "mac_address": "fa:16:3e:44:38:7e",
      "admin_state_up": true,
      "status": "ACTIVE",
      "device_id": "7142d8f1-2b11-4ae2-a343-eacd77a2ceee",
      "device_owner": "network:router_centralized_snat",
      "fixed_ips": [
        {
          "subnet_id": "616388c0-519f-418e-80b4-3687a546a65e",
          "ip_address": "10.21.11.1"
        }
      ],
      "allowed_address_pairs": [],
      "extra_dhcp_opts": [],
      "security_groups": [],
      "description": "",
      "binding:vnic_type": "normal",
      "port_security_enabled": false,
      "qos_policy_id": null,
      "qos_network_policy_id": null,
      "tags": [],
      "created_at": "2024-02-06T14:02:02Z",
      "updated_at": "2024-03-23T18:11:57Z",
      "revision_number": 40,
      "project_id": ""
    },
    {
      "id": "ded9eafe-3ee0-4f29-9f7f-953470f3a3ae",
      "name": "",
      "network_id": "f16855bf-8ba1-4f75-ad8c-763e80134571",
      "tenant_id": "278fda03174b4fee9358559baffca010",
      "mac_address": "fa:16:3e:48:d2:da",
      "admin_state_up": true,
      "status": "ACTIVE",
      "device_id": "7142d8f1-2b11-4ae2-a343-eacd77a2ceee",
      "device_owner": "network:router_interface_distributed",
      "fixed_ips": [
        {
          "subnet_id": "616388c0-519f-418e-80b4-3687a546a65e",
          "ip_address": "10.21.10.1"
        }
      ],
      "allowed_address_pairs": [],
      "extra_dhcp_opts": [],
      "security_groups": [],
      "description": "",
      "binding:vnic_type": "normal",
      "port_security_enabled": false,
      "qos_policy_id": null,
      "qos_network_policy_id": null,
      "tags": [],
      "created_at": "2024-02-06T14:02:02Z",
      "updated_at": "2024-04-02T10:33:28Z",
      "revision_number": 68,
      "project_id": "278fda03174b4fee9358559baffca010"
    }
  ]
}
```

I've come up with a quick fix already: https://github.com/serge-name/cluster-api-provider-openstack/commit/bb19917957b82959f8406ed9778eebf82ebd7855; it works fine so far. Right now I'm short on time to create a decent PR.
Does it work for you if you replace `network:router_interface` with `network:router_interface_distributed`?
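The distinction here is the Neutron `device_owner` string: DVR deployments tag router interface ports as `network:router_interface_distributed` rather than `network:router_interface`. One possible shape for a fix, sketched as a hypothetical helper (assumption: matching by prefix so both variants qualify; this is not CAPO's actual code):

```go
package main

import "strings"

// isRouterInterfacePort reports whether a Neutron port's device_owner marks
// it as a router interface. Matching by prefix accepts both the legacy
// "network:router_interface" and the DVR "network:router_interface_distributed"
// owner strings. Illustrative only; CAPO's real fix may differ.
func isRouterInterfacePort(deviceOwner string) bool {
	return strings.HasPrefix(deviceOwner, "network:router_interface")
}
```

A prefix match avoids enumerating each variant, at the cost of also matching any future owner string that happens to share the prefix.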
Yes.
@bilbobrovall thanks a lot! Your commit elastx@ce38e8b works fine for me and fixes the issue. There are several minor errors due to premature and frequent (8 API requests in 2 seconds) checks for the FIP. Not a problem for me, just something that can be improved later. Logs follow:
👍 It's probably just Neutron taking some time, and I think the retries should be fine for now since there's an exponential backoff when a reconciler returns the same error, but the initial retries feel a bit tight in this case.
/kind bug
What steps did you take and what happened:
I tried a CAPO build for commit `1d5d2d5e45462dab056e37a6c948361e81875ea9`. Some key details follow:
- `OpenStackFloatingIPPool` (non-relevant fields removed)
- `MachineDeployment` and `OpenStackMachineTemplate`
✅ Floating IP was successfully created. Here we get correct data (`fip.FloatingIP == "185.***.**.**"`, `fip.FloatingNetworkID == "c7c8509d-7083-41c9-b799-e30e855e9bc0"`):
cluster-api-provider-openstack/controllers/openstackmachine_controller.go, lines 440 to 443 in 1d5d2d5
❌ Here we get `port == nil` and an error: "Failed while associating ip from pool: port for floating IP "185...*" on network c7c8509d-7083-41c9-b799-e30e855e9bc0 does not exist":
cluster-api-provider-openstack/controllers/openstackmachine_controller.go, lines 450 to 458 in 1d5d2d5
More details follow.
Here:
cluster-api-provider-openstack/pkg/cloud/services/networking/port.go, line 65 in 1d5d2d5
the OpenStack API returns the following (non-relevant fields skipped):
Please notice that we don't have a port associated with FIP network c7c8509d-7083-41c9-b799-e30e855e9bc0, and neither the FIP network ID nor the FIP itself will appear in the ports info, because in our OpenStack cloud floating IPs are not added to ports directly. Instead, NAT 185.***.**.** → 10.21.10.29 is set up. If the new k8s node got a FIP, it could be found here:
https://compute-api:8774/v2.1/TENANT_ID/servers/d1b99e45-991c-4143-93a3-9a8d3eddb416
And the reply might look like this (non-relevant fields skipped):
Here it tries to find a fixed IP in the FIP network, but in our OpenStack cloud all FIPs have `device_owner == "network:floatingip"`, so it gets just an empty list:
cluster-api-provider-openstack/pkg/cloud/services/networking/port.go, lines 71 to 76 in 1d5d2d5
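The failure mode can be modeled in a few lines. This is a simplified, hypothetical stand-in for the filtering at the referenced location, not the real code: when every port on the FIP network is owned by a floating IP (`device_owner == "network:floatingip"`), the filter yields nothing.

```go
package main

// NetPort is an illustrative stand-in for a Neutron port, not a real CAPO type.
type NetPort struct {
	NetworkID   string
	DeviceOwner string
	FixedIPs    []string
}

// fixedIPsOnNetwork collects fixed IPs from ports on the given network,
// skipping FIP-owned ports. In a cloud where every port on the FIP network
// has device_owner "network:floatingip", the result is always empty.
func fixedIPsOnNetwork(ports []NetPort, fipNetworkID string) []string {
	var ips []string
	for _, p := range ports {
		if p.NetworkID != fipNetworkID {
			continue // port is on a different network
		}
		if p.DeviceOwner == "network:floatingip" {
			continue // FIP-owned ports are excluded, so nothing survives
		}
		ips = append(ips, p.FixedIPs...)
	}
	return ips
}
```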
What did you expect to happen:
Successfully deployed k8s node with FIP attached.
Anything else you would like to add:
None so far, but please ask for any details you need. The issue is reproducible and I can add even more details if you want.
Environment:
- Cluster API Provider OpenStack version (or `git rev-parse HEAD` if manually built): 1d5d2d5e45462dab056e37a6c948361e81875ea9
- Cluster-API version: 1.6.3
- OpenStack version: Virtuozzo (https://virtuozzo.com), based on OpenStack Xena
- Minikube/KIND version: N/A
- Kubernetes version (use `kubectl version`): 1.29.3
- OS (e.g. from /etc/os-release): Talos (https://talos.dev) 1.6.7