xde deletion handler should recreate DLS devnet on failure #181

Closed
rzezeski opened this issue Jul 11, 2022 · 2 comments

@rzezeski

As documented in #178, we can get into a situation where a mac device/OPTE Port exists without a corresponding entry in the DLS devnet table. This happens when delete_xde() only partially succeeds: it deletes the DLS entry but then fails to unregister the mac device and delete the OPTE Port. In this scenario we should recreate the DLS devnet device, as all other mac providers do.
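
To make the proposed behavior concrete, here is a minimal sketch of the destroy/unregister/rollback ordering. It is not the actual xde code: the `Kernel` struct, its fields, and the method signatures are hypothetical stand-ins that only model the state involved. The real `dls_devnet_destroy()`, `dls_devnet_create()`, and `mac_unregister()` take different arguments.

```rust
// Hypothetical model of the state touched by the deletion handler.
// None of these types or signatures come from the real xde driver.
pub const EBUSY: i32 = 16;

#[derive(Debug, Default)]
pub struct Kernel {
    pub devnet_present: bool, // entry exists in the DLS devnet table
    pub mac_registered: bool, // mac device is registered
    pub mac_busy: bool,       // e.g. a VNIC still sits on top of the link
}

impl Kernel {
    // Stand-in for dls_devnet_destroy(): drop the devnet entry.
    pub fn dls_devnet_destroy(&mut self) -> Result<(), i32> {
        self.devnet_present = false;
        Ok(())
    }

    // Stand-in for dls_devnet_create(): (re)create the devnet entry.
    pub fn dls_devnet_create(&mut self) -> Result<(), i32> {
        self.devnet_present = true;
        Ok(())
    }

    // Stand-in for mac_unregister(): fails with EBUSY while the link is in use.
    pub fn mac_unregister(&mut self) -> Result<(), i32> {
        if self.mac_busy {
            return Err(EBUSY);
        }
        self.mac_registered = false;
        Ok(())
    }
}

// Sketch of the deletion handler with rollback on partial failure.
pub fn delete_xde(k: &mut Kernel) -> Result<(), i32> {
    k.dls_devnet_destroy()?;

    if let Err(errno) = k.mac_unregister() {
        // The mac device still exists, so restore its devnet entry to keep
        // the link visible and deletable later. If recreation itself fails,
        // that error is returned instead.
        k.dls_devnet_create()?;
        return Err(errno);
    }

    Ok(())
}
```

With `mac_busy` set, `delete_xde()` returns `Err(EBUSY)` and leaves both the devnet entry and the mac registration in place, so a later retry can succeed once the link is no longer busy.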

@rzezeski

I tested the fix by reproducing the situation in #178 and verifying that the DLS devnet device is recreated on mac_unregister() failure.

sled-agent log:

{"msg":"Stopped and uninstalled zone","v":0,"name":"SledAgent","level":30,"time":"2022-07-12T17:41:52.390694858Z","hostname":"kalm","pid":669,"zone":"oxz_propolis-server_48d20a02-7305-41c0-bc7c-ed5eaeb9ff0b","instance_id":"39a7d233-ea6e-4644-9ac0-7a281632f10e","component":"InstanceManager"}
Failed to delete VNIC: Failed to delete vnic vopte0: Command [/usr/sbin/dladm delete-vnic vopte0] executed and failed with status: exit status: 1. Stdout: , Stderr: dladm: vnic deletion failed: link busy

WARNING: Failed to delete OPTE port 'opte0'

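DTrace one-liner to verify that mac_unregister() fails and that the DLS devnet device is then recreated. Each output line shows a function's return value; 16 is EBUSY, which matches the "link busy" error above.
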
rpz@kalm:~$ pfexec dtrace -qn 'delete_xde:entry { self->t = 1; } dls_devnet_destroy:return,dls_devnet_create:return,mac_unregister:return /self->t/ { printf("%s => %d\n", probefunc, arg1); } delete_xde:return { self->t = 0; }'
dls_devnet_destroy => 0
mac_unregister => 16
dls_devnet_create => 0

Verification that the opte0 xde device is still reported by dladm.

rpz@kalm:~/oxidecomputer/opte/opteadm$ pfexec opteadm list-ports
LINK                             MAC ADDRESS              IPv4 ADDRESS     STATE   
opte0                            A8:40:25:F7:C4:59        172.30.0.5       running 
rpz@kalm:~/oxidecomputer/opte/opteadm$ dladm
LINK        CLASS     MTU    STATE    BRIDGE     OVER
e1000g0     phys      1500   up       --         --
net0        vnic      1500   up       --         e1000g0
net1        vnic      1500   up       --         e1000g0
stub0       etherstub 9000   up       --         --
underlay0   vnic      9000   up       --         stub0
oxControlService0 vnic 9000  up       --         stub0
oxControlStorage0 vnic 9000  up       --         stub0
oxControlStorage1 vnic 9000  up       --         stub0
oxControlStorage2 vnic 9000  up       --         stub0
oxControlStorage3 vnic 9000  up       --         stub0
oxControlStorage4 vnic 9000  up       --         stub0
oxControlPublic0 vnic 1500   up       --         e1000g0
oxControlService1 vnic 9000  up       --         stub0
oxControlService2 vnic 9000  up       --         stub0
opte0       xde       1500   up       --         --
vopte0      vnic      1500   up       --         opte0

Furthermore, with this fix in place, we can now manually delete the outstanding opte0 link and its corresponding vnic (should we ever find ourselves in this situation again).

rpz@kalm:~/oxidecomputer/opte/opteadm$ pfexec dladm delete-vnic vopte0
rpz@kalm:~/oxidecomputer/opte/opteadm$ dladm
LINK        CLASS     MTU    STATE    BRIDGE     OVER
e1000g0     phys      1500   up       --         --
net0        vnic      1500   up       --         e1000g0
net1        vnic      1500   up       --         e1000g0
stub0       etherstub 9000   up       --         --
underlay0   vnic      9000   up       --         stub0
oxControlService0 vnic 9000  up       --         stub0
oxControlStorage0 vnic 9000  up       --         stub0
oxControlStorage1 vnic 9000  up       --         stub0
oxControlStorage2 vnic 9000  up       --         stub0
oxControlStorage3 vnic 9000  up       --         stub0
oxControlStorage4 vnic 9000  up       --         stub0
oxControlPublic0 vnic 1500   up       --         e1000g0
oxControlService1 vnic 9000  up       --         stub0
oxControlService2 vnic 9000  up       --         stub0
opte0       xde       1500   up       --         --

rpz@kalm:~/oxidecomputer/opte/opteadm$ pfexec opteadm delete-xde opte0

rpz@kalm:~/oxidecomputer/opte/opteadm$ pfexec opteadm list-ports
LINK                             MAC ADDRESS              IPv4 ADDRESS     STATE

rpz@kalm:~/oxidecomputer/opte/opteadm$ dladm | grep opte

@rzezeski

Addressed in 815a275.
