
UDP connections from pods to daemonset are lost when daemonset is replaced #373

Closed
Suckzoo opened this issue Apr 3, 2019 · 7 comments

Suckzoo commented Apr 3, 2019

Background

  • Kubernetes version: 1.11
  • vpc-cni version: 1.3.2
  • EKS version: eks.2

Problem

We deployed a daemonset that accepts UDP packets through hostPort 8125. At first, we observed that other pods were sending packets to the daemonset's pods correctly: each pod sends its UDP packets to its host, and the host redirects them to the daemonset pod running on that host.

Then we replaced and redeployed the daemonset using a YAML file identical to the previous daemonset's. After the redeploy, the replaced daemonset no longer receives packets from the pods. The pods keep sending their packets, but the packets are never delivered to the daemonset.

How to Reproduce

  1. Deploy a daemonset that accepts UDP packets through a hostPort. A UDP echo server is enough (see the sketch after this list).
  2. Deploy pods that continuously send UDP packets to their host's designated hostPort.
  3. Verify that packets are arriving at the daemonset.
  4. Replace the daemonset, e.g. kubectl replace --force -f daemonset.yml
  5. Packets from the pods (specifically, from the long-running process started by the container image) no longer reach the newly deployed daemonset.
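To make step 1 concrete, here is a minimal sketch of the kind of UDP echo server we mean (illustrative only; our real daemonset runs a different workload, and 8125 just has to match the containerPort/hostPort pair with protocol UDP in the daemonset manifest):

```go
// udp-echo.go: minimal UDP echo server listening on the container port
// that the daemonset exposes via hostPort (8125 here, purely illustrative).
package main

import (
	"log"
	"net"
)

func main() {
	addr, err := net.ResolveUDPAddr("udp", ":8125")
	if err != nil {
		log.Fatal(err)
	}
	conn, err := net.ListenUDP("udp", addr)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	buf := make([]byte, 65535)
	for {
		n, remote, err := conn.ReadFromUDP(buf)
		if err != nil {
			log.Printf("read error: %v", err)
			continue
		}
		// Echo the datagram back so the client can confirm the packet
		// actually reached the daemonset pod.
		if _, err := conn.WriteToUDP(buf[:n], remote); err != nil {
			log.Printf("write error: %v", err)
		}
	}
}
```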

Expected Behavior

The replaced daemonset should also receive the packets. In other words, the CNI must reroute the packets to the newly deployed daemonset.

Trivia

  • We ran a shell inside the pods from step 2 and sent UDP packets to the host manually. We captured all traffic arriving at the daemonset with tcpdump and observed that the manually sent packets reach the daemonset correctly.
  • We deleted the running pods from step 2. Likewise, we observed that packets sent from the newly created pods reach the daemonset correctly.

Should you need more information, please let me know by mentioning me.
Thanks in advance.

sethp-nr commented Apr 3, 2019

When we observed similar behavior, it wasn't the fault of the CNI. In our case, the client was doing two unusual things:

  1. Calling connect on a UDP socket
  2. Caching the DNS resolution in a static member variable and never re-resolving it

The effect of [1] was to cause the kernel to "pin" the UDP flow: it only went through the iptables rules once at connect time. When the client sent packets on that socket, they followed that flow. The effect of [2] was exactly what you'd expect from a no-TTL DNS cache, just harder to find (which is why I mention it).

Unfortunately, I don't recall how we solved it: UDP has no in-band way to signal that the receiver's gone away and the client should try reconnecting. Not connecting would have caused a performance hit, but maybe that's acceptable in your case?
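For illustration, a rough Go sketch of the two client behaviors I'm describing (entirely hypothetical; our client wasn't written in Go, and the host IP and port here are made up):

```go
// Two ways a client can send UDP datagrams to its host's port 8125.
package main

import (
	"log"
	"net"
	"time"
)

func main() {
	dst, err := net.ResolveUDPAddr("udp", "10.0.0.1:8125") // host IP, illustrative
	if err != nil {
		log.Fatal(err)
	}

	// (1) "Connected" UDP socket: connect() fixes the destination once,
	// and every later Write reuses that flow (the pinning described above).
	connected, err := net.DialUDP("udp", nil, dst)
	if err != nil {
		log.Fatal(err)
	}
	defer connected.Close()

	// (2) Unconnected socket: each WriteToUDP names the destination
	// explicitly, at the cost of a little extra per-packet work.
	unconnected, err := net.ListenUDP("udp", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer unconnected.Close()

	for i := 0; i < 5; i++ {
		if _, err := connected.Write([]byte("via connected socket")); err != nil {
			log.Printf("connected send: %v", err)
		}
		if _, err := unconnected.WriteToUDP([]byte("via unconnected socket"), dst); err != nil {
			log.Printf("unconnected send: %v", err)
		}
		time.Sleep(time.Second)
	}
}
```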

Suckzoo commented Apr 4, 2019

In our case, we're sending the UDP flows to the host's IP, not to a server name, so we think DNS is irrelevant to our issue.

sethp-nr commented Apr 4, 2019

That makes sense. From #153, it seems to me the hostPort handling is delegated out to the upstream portmap plugin. That's billed as:

portmap: An iptables-based portmapping plugin. Maps ports from the host's address space to the container.

Which suggests to me that the behavior is "expected," or at least an issue more readily fixed in one of the upstream bits (portmap or the Linux kernel, maybe). We never got around to trying it, but another thought we had was to turn down nf_conntrack_udp_timeout to see if that made the kernel forget about "connected" UDP sockets fast enough for an acceptable amount of packet loss.
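Concretely, I mean something along these lines on the node (value in seconds; pick whatever loss window is tolerable in your case):

```
sysctl -w net.netfilter.nf_conntrack_udp_timeout=5
```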

Suckzoo commented Apr 8, 2019

@sethp-nr Thanks for suggesting the workaround :)
We tried setting nf_conntrack_udp_timeout to 0. The result was, as you expected, that packets started flowing into the newly created container. However, there is a big problem: name resolution stops working entirely; the container fails to resolve any address.

Suckzoo commented Apr 8, 2019

For now, I think we can make one of these choices:

  1. Detect container destruction in vpc-cni and flush the stale entries from portmap's iptables rules.
  2. Detect container destruction in portmap and flush the stale entries from portmap's iptables rules.

As I'm not familiar with either portmap or vpc-cni, I'm not sure which to fix. Which would be the best option?
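Whichever layer it ends up in, I picture the cleanup looking roughly like this. This is only a sketch under my own assumptions: it shells out to the node's conntrack tool instead of using any real vpc-cni or portmap API, and the port is hard-coded for illustration.

```go
// Sketch: when a container that owned a hostPort mapping is torn down,
// delete any conntrack entries still steering UDP traffic for that port
// to the old pod, so the next packet re-traverses the DNAT rules.
package main

import (
	"fmt"
	"os/exec"
)

// flushUDPHostPort removes conntrack entries whose original destination
// port matches the hostPort. Requires the conntrack CLI on the node.
// Note: conntrack may exit non-zero when nothing matched; a real
// implementation would want to tolerate that case.
func flushUDPHostPort(port int) error {
	cmd := exec.Command("conntrack", "-D", "-p", "udp", "--dport", fmt.Sprint(port))
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("conntrack -D failed: %v: %s", err, out)
	}
	return nil
}

func main() {
	if err := flushUDPHostPort(8125); err != nil {
		fmt.Println(err)
	}
}
```

If I understand correctly, this is similar in spirit to what kube-proxy does for UDP services when their endpoints change.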

sethp-nr commented Apr 8, 2019

Well, since this CNI delegates to portmap's implementation for host ports, it seems to me that the right place would be the upstream project.

In fact, it looks like there's already an issue about this case: containernetworking/plugins#123

Suckzoo commented Apr 9, 2019

Seems I should go there to discuss :) Closing this issue since it seems the portmap plugin is responsible for this. Thanks a lot @sethp-nr :)
