Clarification on ip mptcp endpoint cleanup #281
Comments
Interesting. Do you have any data points you can share? E.g. the problematic configuration and the paired pcap capture?
It should, when the paired IP addresses are deleted from the kernel. That in turn could happen when the cable is actually unplugged, or not, depending on other factors, e.g. whether the system is running NM. Note that the MPTCP protocol, for existing/already established connections, could (and should) decide not to use the path(s) on the broken link(s) well before the endpoint is actually deleted.
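For illustration only (the interface name and address below are placeholders, not taken from this report), the behaviour described above can be checked by hand:

```sh
# Endpoints currently known to the in-kernel path manager
ip mptcp endpoint show

# Remove the paired IP address, as a link-down handled by NM would
sudo ip addr del 192.0.2.10/24 dev eth1

# If the endpoint tied to that address is cleaned up (by mptcpd, NM or a
# script), it should no longer appear here
ip mptcp endpoint show
```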
Hi Marek,
mptcpd's initial goal is to control MPTCP's path manager. In the kernel, there are two types: the in-kernel path manager (the default) and the userspace one (selected with the net.mptcp.pm_type sysctl).
With mptcpd, you can create plugins to control the path manager, either the in-kernel one or the userspace one. In most cases, the in-kernel path manager can do the job, and it can be configured manually with ip mptcp.
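As a rough illustration of that manual configuration (the addresses, limits and device name below are made-up examples, not recommendations):

```sh
# Allow extra subflows and accepted ADD_ADDRs (the defaults are conservative)
sudo ip mptcp limits set subflows 4 add_addr_accepted 4

# Declare a second local address as usable for additional subflows;
# adding "fullmesh" pairs it with every address announced by the peer
sudo ip mptcp endpoint add 192.0.2.10 dev eth1 subflow fullmesh

# Inspect the result
ip mptcp endpoint show
ip mptcp limits show
```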
It depends: a typical deployment is a client with 2 interfaces and a server with 1, and you just want two subflows. The in-kernel PM supports that with or without the fullmesh mode. Now if the server has two interfaces as well, what is best here? Some might want to have 2 subflows, some might prefer to use all possible paths with "fullmesh" and have 4. And if you have 2 interfaces, each with one IPv4 and one IPv6, do you want to have 8 paths? :)
Do not hesitate if you have questions or ideas, especially if you would like to write a nice blog post about that :-D
Thanks for the answer. That clarifies a lot. Indeed, I have a situation where the server is doing ADD_ADDR, which is why the fullmesh is interesting. Can I send RM_ADDR from the server to discourage the use of the primary server IP/port? My server is load-balanced / ECMP'ed, so a secondary subflow against the original server IP/port has low chances of working. Going back to my original question about endpoint cleanup:
My LAN has both IPv4 (192.168 range) and native IPv6. mptcpd is fresh from GitHub. I think maybe it gets confused by something, and that's why it refuses to clean up the old endpoints.
I guess what you need is to have the C flag set in the MP_CAPABLE option: it tells the peer not to establish additional subflows to the source address and port, which fits the ECMP'ed server case.
Mmh, yes, it looks like a bug. After a quick look, I don't even see the error message in the code :-/
I can't quite get mptcpd from master to build, due to, I guess, old autotools (Ubuntu Jammy and a custom 6.4 kernel).
Did you install
Oh, strange, you should have it, no? It is in the
Thanks, yeah, OK. The config is in /usr; is it configurable?
shows
Great!
I think it is standard when you don't pass a
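For what it's worth, with an autotools-based build the install and configuration locations are normally chosen at configure time; a generic sketch, not mptcpd-specific defaults:

```sh
# Regenerate the build system when building from a git checkout
autoreconf --install

# Choose where binaries and configuration land; omitting these options
# leaves you with autotools' own defaults (typically under /usr/local)
./configure --prefix=/usr --sysconfdir=/etc
make
sudo make install
```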
Arf, I don't know how I missed that: that's an error message coming from the kernel side (we can see in the strace output that it got it via Netlink), because an endpoint with this IP already exists. I don't know the code of mptcpd well enough (sorry for that, but I can ask for help tomorrow at our weekly call), but I wonder if it is not due to an mptcpd restart: it tries to re-add endpoints that already exist. I guess it doesn't support a restart in this mode (yet). It should not be blocking, but you might have to launch
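Purely as a guess at a manual workaround (not something the thread confirms), clearing the endpoint table before the daemon re-adds everything would avoid the kernel answering that the endpoints already exist:

```sh
# Drop all endpoints known to the in-kernel path manager so that
# re-adding them does not fail with "already exists" (EEXIST)
sudo ip mptcp endpoint flush

# Then restart the daemon; the unit name is an assumption and may differ
# depending on how mptcpd was packaged
sudo systemctl restart mptcp.service
```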
Hi Marek, sorry to come back to you only now (not easy with the holiday period and other changes). I suggest a small recap, because the discussions went in different directions, just to make sure we are aligned:
OS: Debian GNU/Linux 12
ip mptcp endpoint show:
It's running? Is it normal?
@birdofprey please do not comment on topics not related to the issue here (Clarification on ip mptcp endpoint cleanup).
OK, on Linux you can't have more than 8 subflow configs.
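For reference, the limit mentioned here is the in-kernel path manager one, visible and adjustable (up to the kernel's maximum) via ip mptcp limits; the numbers below are just examples:

```sh
# Current per-namespace limits of the in-kernel path manager
ip mptcp limits show
# e.g. "add_addr_accepted 0 subflows 2" on a default setup

# Both values are capped at 8 on an unpatched kernel
sudo ip mptcp limits set subflows 8 add_addr_accepted 8
```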
As a side note, here's a manage_mptcp_endpoint.sh bash script, which does what I would expect mptcpd to do. Example run:
(during the run I plugged in the enx5855ca236af3 interface and then unplugged it)
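The script itself is not reproduced above; as a rough sketch of the idea only (not the actual manage_mptcp_endpoint.sh, and the address-selection rules are assumptions), something along these lines reacts to address changes and keeps the endpoint list in sync:

```sh
#!/bin/bash
# Sketch: mirror the host's global, non-temporary addresses into the
# MPTCP endpoint table whenever an address is added or removed.
# Requires iproute2 with JSON support and jq.

sync_endpoints() {
    # Start from a clean table, then add one endpoint per selected address
    ip mptcp endpoint flush
    ip -json addr show scope global |
        jq -r '.[] | .ifname as $dev | .addr_info[]
               | select(.temporary != true)
               | "\(.local) \($dev)"' |
        while read -r addr dev; do
            ip mptcp endpoint add "$addr" dev "$dev" subflow fullmesh
        done
}

sync_endpoints
# "ip monitor address" prints a line for every address change; resync each time
ip monitor address | while read -r _line; do
    sync_endpoints
done
```

A real version would also need to respect the endpoint limit discussed below and avoid churning established subflows on every event.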
With the in-kernel path manager, there is indeed a limit (which does not exist in the userspace PM, if I'm not mistaken). Technically, it was supposed to be a limit only on the endpoints if I remember correctly, mainly to save some bytes in kernel structures and because we thought that 8 endpoints (8 public IPs) used at the same time already sounded like a lot. If you have a good use case, we are open to changing it; it is just that at the time we didn't find any realistic one and nobody complained (and we thought that specific use cases might want to use the userspace PM anyway). But regarding the limit on the subflows, it looks like we use the same software limit (8) for the number of subflows we want to establish. I don't see a technical limitation for that, and if you have 8 endpoints you can easily use more than 8 subflows, e.g. in a fullmesh, and then have up to 64 subflows. This simple kernel patch should be enough to try with more than 8 subflows (but still with a limit of 8 endpoints) if it is easy for you to modify the kernel:

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 9661f3812682..b22dd35f94e1 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -1768,7 +1768,7 @@ static int mptcp_nl_cmd_dump_addrs(struct sk_buff *msg,
return msg->len;
}
-static int parse_limit(struct genl_info *info, int id, unsigned int *limit)
+static int parse_limit(struct genl_info *info, int id, unsigned int *limit, unsigned int max)
{
struct nlattr *attr = info->attrs[id];
@@ -1776,7 +1776,7 @@ static int parse_limit(struct genl_info *info, int id, unsigned int *limit)
return 0;
*limit = nla_get_u32(attr);
- if (*limit > MPTCP_PM_ADDR_MAX) {
+ if (*limit > max) {
GENL_SET_ERR_MSG(info, "limit greater than maximum");
return -EINVAL;
}
@@ -1792,12 +1792,12 @@ mptcp_nl_cmd_set_limits(struct sk_buff *skb, struct genl_info *info)
spin_lock_bh(&pernet->lock);
rcv_addrs = pernet->add_addr_accept_max;
- ret = parse_limit(info, MPTCP_PM_ATTR_RCV_ADD_ADDRS, &rcv_addrs);
+ ret = parse_limit(info, MPTCP_PM_ATTR_RCV_ADD_ADDRS, &rcv_addrs, MPTCP_PM_ADDR_MAX);
if (ret)
goto unlock;
subflows = pernet->subflows_max;
- ret = parse_limit(info, MPTCP_PM_ATTR_SUBFLOWS, &subflows);
+ ret = parse_limit(info, MPTCP_PM_ATTR_SUBFLOWS, &subflows, 64);
if (ret)
goto unlock;
Both limits are software ones; they can be modified if needed.
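With such a patch applied, the higher value would then be requested the usual way; illustrative only, and it assumes the patched kernel now accepts it:

```sh
# Rejected with "limit greater than maximum" on an unpatched kernel,
# accepted once the subflow maximum has been raised as in the patch above
sudo ip mptcp limits set subflows 16
ip mptcp limits show
```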
Nice script, thank you for sharing it. I agree mptcpd should do that (+ take into account the endpoints already available when starting). It doesn't look as simple when it is done in C with the Netlink API, etc. :-D I guess this part of mptcpd has not been fully tested; the focus has been more on the userspace PM side (I can ask). Note that NM 1.40+ should be able to do that too (and hopefully better :) ). Because different issues have been discussed here, it might be clearer to open a new issue dedicated to that. I can quickly do that.
OK, thanks for the explanation, it makes total sense. However, from another point of view: assuming both an IPv4 and an IPv6 stack, 8 IPs is 4 interfaces with two IPs each. But for IPv6 you often have multiple addresses (mngtmpaddr/noprefixroute vs temporary), so that means we need to be careful about which ones we add to MPTCP. In my script I choose non-temporary addresses with scope=global (i.e. no site-local). It's obvious in hindsight, but it wasn't obvious on the go. I think the 8-IP limit should be configurable via sysctl; then in the docs we could write that 8 makes sense and that bumping it is more often counterproductive than not. Another issue, which hopefully this GitHub discussion clarifies, was that we saw the error. But anyway, I'm not arguing for bumping it; it's just that I wasn't aware of the limit before.
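For anyone hitting the same surprise, the distinction is visible directly in the address flags reported by iproute2:

```sh
# Global IPv6 addresses with their flags: entries marked "temporary" are
# rotating privacy addresses and poor candidates for MPTCP endpoints;
# the stable ("mngtmpaddr" or static) ones are the ones worth adding
ip -6 addr show scope global
```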
I'm playing with mptcpd.
I don't quite get what the plugins do, and I don't quite understand whether mptcpd supports the "userspace path manager" (the sysctl net.mptcp.pm_type thing).
Having said that, I don't think MPTCP is interesting (from a client point of view) without fullmesh. Or maybe I'm missing something. Without it the handover doesn't seem to be aggressive enough.
With fullmesh I can see mptcpd adding endpoints; ip mptcp endpoint shows the extra paths, which is great. However, I would expect these things to disappear when I yank out the Ethernet cable or disable Wi-Fi, and that doesn't seem to happen. Is mptcpd cleaning up the old endpoints correctly? Is there a way to configure it somehow?
I'm on Ubuntu Jammy with a new 6.4 kernel and custom-compiled mptcpd off git master.