Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match dstIP in Classifier to address windows promiscuous mode issue #6528

Merged
merged 1 commit into from
Aug 8, 2024

Conversation

XinShuYang
Copy link
Contributor

@XinShuYang XinShuYang commented Jul 17, 2024

When promiscuous mode is enabled, OVS incorrectly forwards packets destined for
non-local IP addresses from the uplink to the host interface. Due to IP forwarding
being enabled, these packets are re-sent to br-int according to the default route
and are eventually forwarded to the uplink. Since the source MAC of these packets
has been changed to the local host’s MAC, this can potentially cause errors on the switch.

This patch matches dstIP field in ClassifierTable to ensure proper packet handling
and preventing unintended forwarding.

@XinShuYang
Copy link
Contributor Author

/test-windows-e2e

@XinShuYang
Copy link
Contributor Author

/test-windows-all

@XinShuYang XinShuYang requested a review from wenyingd July 17, 2024 03:24
@XinShuYang
Copy link
Contributor Author

/test-latest-all

@XinShuYang
Copy link
Contributor Author

/test-windows-all

@XinShuYang
Copy link
Contributor Author

/test-windows-containerd-e2e

@XinShuYang
Copy link
Contributor Author

/test-windows-containerd-conformance

@XinShuYang XinShuYang changed the title Match dstmac in Classifier to address windows promiscuous mode issue Match dstIP in Classifier to address windows promiscuous mode issue Jul 23, 2024
Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that we used to have a dedicated change trying to merge similar functions from pipeline_windows and pipeline_others into pipeline.go. So maybe we can think about introducing a mutateFunc in the the unique hostBridgeLocalFlows(), and provide different mutate implementations in the OS separate files?

pkg/agent/openflow/pipeline_windows.go Outdated Show resolved Hide resolved
@rajnkamr
Copy link
Contributor

promiscuous mode is usually enabled for debugging, can we add documentation for this case !

@XinShuYang
Copy link
Contributor Author

promiscuous mode is usually enabled for debugging, can we add documentation for this case !

On Windows, promiscuous mode is not enabled by Antrea. This patch is intended to enhance the pipeline by ensuring the packets destined for other nodes are not forwarded to the local port, which is also reasonable for the debugging purpose.

@XinShuYang
Copy link
Contributor Author

/test-windows-containerd-e2e

@rajnkamr
Copy link
Contributor

promiscuous mode is usually enabled for debugging, can we add documentation for this case !

On Windows, promiscuous mode is not enabled by Antrea. This patch is intended to enhance the pipeline by ensuring the packets destined for other nodes are not forwarded to the local port, which is also reasonable for the debugging purpose.

Ok, When promiscuous mode is enabled, the interface ignores the MAC address filter and captures all network traffic it can see. Are we also considering to match in case destip is ipv6 ? what about linux case, this problem can be seen on linux as well !

@XinShuYang
Copy link
Contributor Author

promiscuous mode is usually enabled for debugging, can we add documentation for this case !

On Windows, promiscuous mode is not enabled by Antrea. This patch is intended to enhance the pipeline by ensuring the packets destined for other nodes are not forwarded to the local port, which is also reasonable for the debugging purpose.

Ok, When promiscuous mode is enabled, the interface ignores the MAC address filter and captures all network traffic it can see. Are we also considering to match in case destip is ipv6 ? what about linux case, this problem can be seen on linux as well !

Currently Windows doesn‘t support IPv6, and in Linux, since uplink port is not added to the bridge by default, we didn't find the same issue.

@XinShuYang XinShuYang force-pushed the windowsuplink branch 3 times, most recently from 14f5066 to c74ca81 Compare July 23, 2024 06:52
@XinShuYang
Copy link
Contributor Author

I remember that we used to have a dedicated change trying to merge similar functions from pipeline_windows and pipeline_others into pipeline.go. So maybe we can think about introducing a mutateFunc in the the unique hostBridgeLocalFlows(), and provide different mutate implementations in the OS separate files?

Sure, added mutate function implementation, PTAL.

@XinShuYang XinShuYang requested a review from wenyingd July 24, 2024 02:12
@XinShuYang XinShuYang marked this pull request as ready for review July 24, 2024 02:12
Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, one minor comment.

@@ -23,6 +23,12 @@ import (
binding "antrea.io/antrea/pkg/ovs/openflow"
)

func (f *featurePodConnectivity) uplinkMatchMutate(flowBuilder binding.FlowBuilder) binding.FlowBuilder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some comments here to explain why we need this mutate function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, comments updated.

@XinShuYang
Copy link
Contributor Author

It feels like this could impact Linux Nodes as well, when bridging mode is enabled. However, this is very much an edge case, so not a big deal.

You mentioned that the issue was because the VMs had the same MAC address. Why did this happen in the first place? Is it because we used some VM templates and the MAC address was hardcoded? Duplicate MAC addresses won't work if traffic has to go through a L2 switch.

BTW, it seems that the issue is really because of the duplicate MAC addresses, and maybe not so much about the promiscuous mode (which is just inefficient). When the MAC addresses are different:

* the host interface is likely to drop the packet, even if promiscuous mode is enabled on the uplink / adapter

* even if it doesn't, I assume the OS won't try to do IP forwarding if the MAC address doesn't match?

Hi @antoninbas I double checked and found that my previous description of the root cause was inaccurate. The root cause is when promiscuous mode is enabled, the Windows uplink receives all packets on the network without destination MAC checks. According to the previous OVS pipeline, these packets are forwarded to the local host. Since br-int on Windows must have IP forwarding enabled to support features like NodePort, these unrelated packets are sent back to br-int according to the default route (during this process, their source MAC is changed to the local machine's MAC). Then, again, these packets are forwarded to the uplink and leave the host according to the OVS pipeline. This result could affects the correct forwarding by the switch because it receives packets with incorrect source MAC. Given that we have to keep IP forwarding enabled on br-int, the fix is to add a filter to the existing uplink-to-local port rule. cc @wenyingd

@antoninbas
Copy link
Contributor

@XinShuYang do you know if the host interface is also in promiscuous mode? Otherwise I would expect the packet to be dropped after they are forwarded by OVS.

Since br-int on Windows must have IP forwarding enabled to support features like NodePort, these unrelated packets are sent back to br-int according to the default route

At least on Linux, I don't think the OS will do IP forwarding for packets with the wrong MAC, regardless of any promiscuous mode. Promiscuous mode allows a bridge to receive all packets regardless of their MAC address (useful to bridge VMs to a physical network), but the OS should still do a MAC check before IP / L3 forwarding. I could bd wrong though.

@XinShuYang
Copy link
Contributor Author

@XinShuYang do you know if the host interface is also in promiscuous mode? Otherwise I would expect the packet to be dropped after they are forwarded by OVS.

Since br-int on Windows must have IP forwarding enabled to support features like NodePort, these unrelated packets are sent back to br-int according to the default route

At least on Linux, I don't think the OS will do IP forwarding for packets with the wrong MAC, regardless of any promiscuous mode. Promiscuous mode allows a bridge to receive all packets regardless of their MAC address (useful to bridge VMs to a physical network), but the OS should still do a MAC check before IP / L3 forwarding. I could bd wrong though.

Here is the result I got from powershell:

Get-NetAdapter | Format-List -Property PromiscuousMode, Name
PromiscuousMode : False
Name            : br-int
PromiscuousMode : True
Name            : Ethernet0 2
PromiscuousMode : False
Name            : antrea-gw0

This seems to indicate that even with promiscuous mode disabled on br-int, packets sent to it through the OVS pipeline will bypass the MAC check... BTW, in my vSphere lab network environment, the uplink interface always has promiscuous mode enabled, and only enabling promiscuous mode on the standard switch will send unrelated packets to the Windows host.

@@ -25,6 +25,10 @@ import (
binding "antrea.io/antrea/pkg/ovs/openflow"
)

func (f *featurePodConnectivity) matchUplinkInPort(flowBuilder binding.FlowBuilder) binding.FlowBuilder {
return flowBuilder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment means to update the function implementation like this,

func (f *featurePodConnectivity) matchUplinkInPort(flowBuilder binding.FlowBuilder) binding.FlowBuilder {
	return flowBuilder.MatchInPort(f.uplinkPort)
}

And you shall remove line 2972 in pipeline.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I encountered error "antrea.io/antrea/pkg/ovs/openflow".FlowBuilder has no field or method matchUplinkInPort, is there a better way to add featurePodConnectivity in this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated,PTAL.

pkg/agent/openflow/pipeline_windows.go Outdated Show resolved Hide resolved
@antoninbas
Copy link
Contributor

This seems to indicate that even with promiscuous mode disabled on br-int, packets sent to it through the OVS pipeline will bypass the MAC check...

Ok

BTW, in my vSphere lab network environment, the uplink interface always has promiscuous mode enabled, and only enabling promiscuous mode on the standard switch will send unrelated packets to the Windows host.

Yes, I imagine that promiscuous mode works a bit differently in a virtual environment with a virtual switch.

It's still hard for me to understand why the Windows OS would do IP forwarding on arbitrary packets with an unknown MAC address.

@XinShuYang XinShuYang force-pushed the windowsuplink branch 2 times, most recently from 66e9ffe to 8b7cdc6 Compare August 1, 2024 03:10
@XinShuYang
Copy link
Contributor Author

This seems to indicate that even with promiscuous mode disabled on br-int, packets sent to it through the OVS pipeline will bypass the MAC check...

Ok

BTW, in my vSphere lab network environment, the uplink interface always has promiscuous mode enabled, and only enabling promiscuous mode on the standard switch will send unrelated packets to the Windows host.

Yes, I imagine that promiscuous mode works a bit differently in a virtual environment with a virtual switch.

It's still hard for me to understand why the Windows OS would do IP forwarding on arbitrary packets with an unknown MAC address.

@antoninbas I can see the IP-to-MAC info in the ARP table for packets captured on br-int. Based on the behavior observed from Windows, it appears that packets indeed ignore MAC checks during IP forwarding. My guess is the Windows network stack completes the Layer 2 decapsulation before handling IP forwarding. When it decides to forward the packet back to br-int based on the default route, it encapsulates the packet with Layer 2 info, where the src MAC is the local host's MAC, and the dst MAC is the MAC maintained in the ARP table.

wenyingd
wenyingd previously approved these changes Aug 1, 2024
Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@XinShuYang
Copy link
Contributor Author

/test-windows-all

antoninbas
antoninbas previously approved these changes Aug 1, 2024
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Would it make sense to use f.matchUplinkInPort for other occurrences of MatchInPort(f.uplinkPort) as well? I see 2 of them in pipeline.go

@XinShuYang
Copy link
Contributor Author

LGTM Would it make sense to use f.matchUplinkInPort for other occurrences of MatchInPort(f.uplinkPort) as well? I see 2 of them in pipeline.go

@antoninbas MatchInPort(f.uplinkPort) in podUplinkClassifierFlows is used for IPAM, which is a feature not currently supported by Windows. Another instance is in hostBridgeUplinkVLANFlows. Since the VLANTable is located after the Classifier in the pipeline, I also believe there is no need to modify the rules?

@luolanzone luolanzone added the area/OS/windows Issues or PRs related to the Windows operating system. label Aug 2, 2024
@luolanzone
Copy link
Contributor

@XinShuYang so this is actually a connection issue when the promiscuous mode is enabled on Windows, right?
@antoninbas I am thinking that if we should add the label 'action/release-note' if we consider it as a bug fix.

@antoninbas
Copy link
Contributor

@XinShuYang got it, then maybe we should update the method name again, to something more specific such as matchUplinkInPortInClassifierTable.

@luolanzone sounds good to me. We can also backport as needed, there is very little risk involved with this change IMO.

When promiscuous mode is enabled, OVS incorrectly forwards packets destined for
non-local IP addresses from the uplink to the host interface. Due to IP forwarding
being enabled, these packets are re-sent to br-int according to the default route
and are eventually forwarded to the uplink. Since the source MAC of these packets
has been changed to the local host’s MAC, this can potentially cause errors on the switch.

This patch matches dstIP field in ClassifierTable to ensure proper packet handling
and preventing unintended forwarding.

Signed-off-by: Shuyang Xin <[email protected]>
@XinShuYang
Copy link
Contributor Author

@XinShuYang got it, then maybe we should update the method name again, to something more specific such as matchUplinkInPortInClassifierTable.

Sure, code updated.

@antoninbas
Copy link
Contributor

/test-windows-all

@XinShuYang
Copy link
Contributor Author

Hi @antoninbas , windows CI passed, can we merged this PR?

@antoninbas antoninbas merged commit 2f863f4 into antrea-io:main Aug 8, 2024
53 of 60 checks passed
@antoninbas antoninbas added the action/release-note Indicates a PR that should be included in release notes. label Aug 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. area/OS/windows Issues or PRs related to the Windows operating system.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants