-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
route/link: support update operation for link objects #414
base: main
Are you sure you want to change the base?
Conversation
When bridge interfaces are in vlan-aware mode, one of the rtnetlink updates they produce was not handled correctly. When the set of vlans active on a vlan-aware bridge itself it being changed it produces RTM_NEWLINK message with AF_BRIDGE family (e.g. with `bridge vlan add dev br0 vid 3 self`). Unlike other messages arriving with AF_BRIDGE family - this one doesn't refer to one of the member interfaces, but to the bridge itself as if it was its own member. This message doesn't contain full information about bridge - mostly about vlans in IFLA_AF_SPEC attribute. It does, however, refer to the bridge interface itself via ifi_index parameter. `libnl` parses such message into a link object with (AF_BRIDGE, <br-ifindex>) key. At the same time, the regular rtnetlink message describing bridge interface itself (one that arrives with AF_UNSPEC family and IFLA_LINKINFO/IFLA_INFO_KIND == bridge) will also be parsed into link object with the same key (AF_BRIDGE, <br-ifindex>). That is caused by the libnl classifying it internally as bridge (via IFLA_INFO_KIND) and overriding its family to AF_BRIDGE. Such collision results in an unexpected behavior when we have a cache with the information about bridge (from AF_UNSPEC message) and then receive notification produced by `bridge vlan add/del ... self`. As both link entries have the same key - new one will override the old - and we will have a misleading entry in cache, which doesn't contain most of the information we've had about the bridge. It also defies some established expectations about bridge interfaces in libnl. E.g. `rtnl_link_is_bridge` is 1 for such entries, but `rtnl_link_get_type` is NULL. This situation can be amended by implementing update mechanism for link objects. With that, it is possible to properly handle such special rtnetlink messages and merge information available in them into full bridge description.
I must add that current update implementation handles rather narrow edge-case and doesn't cover full scope of things that one could expect from |
It seems hard to understand all the consequence of this patch. But the reasoning and the patch looks good to me, so unless somebody objects, I'll take it... (let's wait a bit). |
if (dst->l_af_ops->ao_update) | ||
return dst->l_af_ops->ao_update(dst, src); | ||
|
||
return -NLE_OPNOTSUPP; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's return early, so we handle failure (OPNOTSUPP) cases first. Also, add a code comment like:
struct rtnl_link *src = (struct rtnl_link *) new_obj;
struct rtnl_link *dst = (struct rtnl_link *) old_obj;
int r;
if (!dst->l_af_ops)
return -NLE_OPNOTSUPP;
if (!dst->l_af_ops->ao_update)
return -NLE_OPNOTSUPP;
r = dst->l_af_ops->ao_update(dst, src);
if (r < 0)
return r;
/* dst was already modified. We can no longer fail at this point. */
return 0;
I'm not sure if you can just replace the bridge data, as the bridge data may be quite incomplete, see #377 for an example. |
Replacing bridge data entirely in the link entry for bridge interfaces doesn't look wrong to me - primarily because bridge interfaces cannot be members of other bridges. As such, only a single operation ( That said, the approach is indeed not general enough and updated bridge data will be lost whenever a new AF_UNSPEC update for bridge itself comes (#414 (comment)). |
Not sure what the bridges (not) being members of other bridges have to do with it. What I mean is that the bridge_data in the AF_BRIDGE update from #377 only contains the vlan info, but not e.g. the MST info - which now would get lost on replacement. Which would also be a memory leak, since this is a linked list of allocated objects. So you would need to update/replace the old bridge_data's vlan info etc with the values from the AF_BRIDGE update, but not touch anything that isn't part of the minimal update. |
I see. You're right, some fields in Thank you for pointing this out, it needs to be addressed. |
Most of the contents of regular AF_BRIDGE updates have to do with the interface status as "port in some bridge". Since bridges can't be members of other bridges - most of the contents of |
Be aware that any AF_UNSPEC updates for links with the IFLA_LINK_TYPE "bridge" are "fixed up" to have the family AF_BRIDGE. This is happening in https://github.com/thom311/libnl/blob/main/lib/route/link.c#L683-L698. |
When bridge interfaces are in vlan-aware mode, one of the rtnetlink updates they produce was not handled correctly.
When the set of vlans active on a vlan-aware bridge itself it being changed it produces RTM_NEWLINK message with AF_BRIDGE family (e.g. with
bridge vlan add dev br0 vid 3 self
).Unlike other messages arriving with AF_BRIDGE family - this one doesn't refer to one of the member interfaces, but to the bridge itself as if it was its own member.
This message doesn't contain full information about bridge - mostly about vlans in IFLA_AF_SPEC attribute. It does, however, refer to the bridge interface itself via ifi_index parameter.
libnl
parses such message into a link object with(AF_BRIDGE, br-ifindex)
key.At the same time, the regular rtnetlink message describing bridge interface itself (one that arrives with AF_UNSPEC family and IFLA_LINKINFO/IFLA_INFO_KIND == bridge) will also be parsed into link object with the same key
(AF_BRIDGE, br-ifindex)
. That is caused by the libnl classifying it internally as bridge (via IFLA_INFO_KIND) and overriding its family to AF_BRIDGE.Such collision results in an unexpected behavior when we have a cache with the information about bridge (from AF_UNSPEC message) and then receive notification produced by
bridge vlan add/del ... self
. As both link entries have the same key - new one will override the old - and we will have a misleading entry in cache, which doesn't contain most of the information we've had about the bridge. It also defies some established expectations about bridge interfaces in libnl. E.g.rtnl_link_is_bridge
is 1 for such entries, butrtnl_link_get_type
is NULL.This situation can be amended by implementing update mechanism for link objects. With that, it is possible to properly handle such special rtnetlink messages and merge information available in them into full bridge description.
Addresses #377