Calico/eBPF node flooding unknown unicast traffic #8918
tomastigera added the kind/bug, likelihood/low, impact/high, and area/bpf (eBPF Dataplane issues) labels on Jun 17, 2024
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue on Jun 18, 2024
Set the MTU to match the smallest MTU of the host devices, as per the current autodetection. This ensures that if a large MTU is used, the extra device does not become a bottleneck with a small MTU. Also, when the MTU changes on host devices, it gets adjusted automatically. Fixes projectcalico#8918
@tomastigera would it be possible to backport #8921 to 3.27.x?
aitorpazos pushed a commit to team-telnyx/infra-oci-calico-upstream that referenced this issue on Jun 18, 2024
* Fix: when CALI_ST_SKIP_FIB is set on the way to the host, set CALI_CT_FLAG_SKIP_FIB on conntrack, not just when from WEP.
* Add a test for the above and for issue projectcalico#6450.
* In addition to skipping FIB when there is no route to the post-DNAT destination, also skip FIB when there is a route but it is not local and no service was involved. In that case, we are not forwarding a service (NodePort) to another node and we should only forward locally. Let the host decide what to do with such a packet.
Fixes projectcalico#8918 (cherry picked from commit 327c4fd)
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue on Jun 18, 2024
* Fix: when CALI_ST_SKIP_FIB is set on the way to the host, set CALI_CT_FLAG_SKIP_FIB on conntrack, not just when from WEP.
* Add a test for the above and for issue projectcalico#6450.
* In addition to skipping FIB when there is no route to the post-DNAT destination, also skip FIB when there is a route but it is not local and no service was involved. In that case, we are not forwarding a service (NodePort) to another node and we should only forward locally. Let the host decide what to do with such a packet.
Fixes projectcalico#8918
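The FIB-skip rule described in the commit message above can be read as a small decision function. The following is only a sketch of that reading, not the actual Calico eBPF code: the `Route` type, its `is_local` field, and the `should_skip_fib` name are hypothetical, and "local" is assumed to mean that the route points at a destination on this node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    # Hypothetical stand-in for a route lookup result.
    # Assumption: True when the destination lives on this node.
    is_local: bool


def should_skip_fib(route: Optional[Route], service_involved: bool) -> bool:
    """Return True when the packet should be left to the host stack."""
    if route is None:
        # No route to the post-DNAT destination: skip FIB.
        return True
    if not route.is_local and not service_involved:
        # A route exists but is not local, and no service (e.g. NodePort)
        # was involved: do not forward to another node, let the host decide.
        return True
    return False
```

Under this reading, a packet is only forwarded to a non-local destination when a service was involved, which is the NodePort case the commit message mentions.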
This was referenced Jun 18, 2024
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue on Jun 25, 2024
Set the MTU to match the smallest MTU of the host devices, as per the current autodetection. This ensures that if a large MTU is used, the extra device does not become a bottleneck with a small MTU. Also, when the MTU changes on host devices, it gets adjusted automatically. If an overlay is used, the special device MTU is adjusted to the size of the overlay. Fixes projectcalico#8918
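The MTU selection described in the commit message above might be sketched as follows. This is an illustrative reading only: the function name, the dictionary of device MTUs, and the `overlay_mtu` parameter are hypothetical, and treating "adjusted to the size of the overlay" as "use the overlay MTU" is an assumption rather than something the commit message spells out.

```python
from typing import Mapping, Optional


def extra_device_mtu(host_mtus: Mapping[str, int],
                     overlay_mtu: Optional[int] = None) -> int:
    """Pick an MTU for the extra device from the host device MTUs."""
    if not host_mtus:
        raise ValueError("at least one host device MTU is required")
    if overlay_mtu is not None:
        # Assumption: with an overlay, the device follows the overlay MTU.
        return overlay_mtu
    # Without an overlay, match the smallest host device MTU so that the
    # extra device never becomes the bottleneck.
    return min(host_mtus.values())
```

Re-running such a function whenever a host device MTU changes would correspond to the automatic adjustment the commit message mentions.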
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue on Jun 27, 2024
Set the MTU to match the smallest MTU of the host devices, as per the current autodetection. This ensures that if a large MTU is used, the extra device does not become a bottleneck with a small MTU. Also, when the MTU changes on host devices, it gets adjusted automatically. If an overlay is used, the special device MTU is adjusted to the size of the overlay. Fixes projectcalico#8918
tomastigera added a commit to tomastigera/project-calico-calico that referenced this issue on Jun 28, 2024
Set the MTU to match the smallest MTU of the host devices, as per the current autodetection. This ensures that if a large MTU is used, the extra device does not become a bottleneck with a small MTU. Also, when the MTU changes on host devices, it gets adjusted automatically. If an overlay is used, the special device MTU is adjusted to the size of the overlay. Fixes projectcalico#8918
Expected Behavior
Unknown unicast traffic with a dst mac address not local to a K8s/Calico node should be dropped.
Current Behavior
We are having an issue with our Calico/eBPF implementation. When we shut down any VM in the datacenter, some traffic is flooded as unknown unicast. Initially, it is a small number of ICMP echo requests from a probe.
Some K8s nodes running Calico/eBPF accept this traffic (even though the dst MAC in the packet is not local to the node) and forward it back to the network. This creates an amplifying effect, as other K8s nodes pick up this traffic and forward it back to the network again. A few packets become 10K-15K.
In the capture we are taking on the hypervisor, we can also see that the K8s node changes the src MAC address to the address of its local interface.
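To illustrate why a handful of packets can grow this quickly, here is a rough toy model, not a measurement of the reported network. It assumes that every flooded frame reaches each forwarding node, that each node sends one copy back out, that each copy is flooded again, and that the process stops after a fixed number of rounds (for example, because a hop limit expires). The function name and all parameters are hypothetical.

```python
def flooded_copies(initial: int, forwarding_nodes: int, rounds: int) -> int:
    """Total number of re-forwarded copies under the toy model above."""
    if initial < 0 or forwarding_nodes < 0 or rounds < 0:
        raise ValueError("arguments must be non-negative")
    total = 0
    in_flight = initial
    for _ in range(rounds):
        # Every in-flight frame is picked up and re-sent by every node.
        in_flight *= forwarding_nodes
        total += in_flight
    return total
```

Under these assumptions the count grows geometrically with the number of forwarding nodes, which is at least consistent with the kind of amplification described above.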
Possible Solution
Make sure Calico/eBPF drops packets with a dst mac address not local to the node.
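A minimal sketch of the check being asked for, written in Python for readability rather than as eBPF code. The function names and the idea of passing in a list of local MAC addresses are assumptions made for illustration. Frames addressed to a group address (broadcast or multicast, detected by the lowest bit of the first octet) are accepted, since those are legitimately not "local" unicast addresses.

```python
from typing import Iterable


def parse_mac(text: str) -> bytes:
    """Parse a colon-separated MAC address into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {text!r}")
    return bytes(int(part, 16) for part in parts)


def should_accept(dst_mac: str, local_macs: Iterable[str]) -> bool:
    """Accept frames for this node or for a group address; drop the rest."""
    dst = parse_mac(dst_mac)
    if dst[0] & 0x01:
        # Group bit set: broadcast or multicast frame.
        return True
    # Unicast: only accept if the destination is one of our own MACs.
    return dst in {parse_mac(mac) for mac in local_macs}
```

For example, `should_accept("52:54:00:aa:bb:02", ["52:54:00:aa:bb:01"])` returns `False`, which corresponds to the unknown unicast frames in this report being dropped instead of forwarded.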
Steps to Reproduce (for bugs)
3. The K8s node receives this traffic and sends it back to the network (changing the src MAC address), creating an amplification effect and flooding more traffic into the network.
Context
The issue impacts the K8s nodes because of the resources needed to deal with the flooding, and it also impacts the network and other nodes because of the volume of packets per second (pps) being flooded.
Your Environment