Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windows] CNI Server installs OpenFlow entries by PortStatus #6763

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

wenyingd
Copy link
Contributor

@wenyingd wenyingd commented Oct 23, 2024

This change has introduced a worker in podConfigurator to listen for the OpenFlow PortStatus message when a new OpenFlow port is allocated in OVS. After receiving the message, antrea-agent Windows will install Pod related OpenFlow entries. If the OpenFlow port is not allocated within 30s after the CmdAdd request is responded, an event with type "NetworkNotReady" is added on the Pod; Whenever the Pod networking forwarding rules are installed, an event with type "NetworkIsReady" is added.

Fix: #6721

pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/server_windows_test.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
// unReadyInterfaces is a map to store the OVS ports which is waiting for the PortStatus from OpenFlow switch.
// The key in the map is the OVS port name, and its value is unReadyPodInfo.
// It is used only on Windows now.
unReadyInterfaces sync.Map
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But addUnReadyPodInterface is called from a separate goroutine, no? You cannot have a map be updated from different goroutines.

pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
}()
}

func (m *podIfaceMonitor) updatePodFlows(ifName string, ofPort int32) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fair, I was trying to reduce the number of dependencies that need to be injected into the pod monitor

we can leave it as a future change

pkg/agent/cniserver/pod_configuration.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration.go Outdated Show resolved Hide resolved
@wenyingd wenyingd force-pushed the async_ofport branch 2 times, most recently from 7ed399f to cb09094 Compare October 25, 2024 12:34
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
go wait.Until(m.worker, time.Second, stopCh)

go wait.Until(func() {
ticker := time.Tick(retryInterval)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using time.Tick without stopping the ticker can cause memory leaks. Can we replace it with time.NewTicker, and make sure to defer ticker.Stop()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ticker is removed in the latest change as Antonin suggested.

@wenyingd
Copy link
Contributor Author

/test-windows-all

@wenyingd
Copy link
Contributor Author

/test-windows-networkpolicy
/test-windows-e2e

@XinShuYang
Copy link
Contributor

/test-windows-e2e

pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_monitor.go Outdated Show resolved Hide resolved
@wenyingd wenyingd force-pushed the async_ofport branch 3 times, most recently from 2bc767d to d9ea32b Compare October 30, 2024 10:50
@wenyingd wenyingd force-pushed the async_ofport branch 2 times, most recently from c81059a to 3d917bf Compare November 5, 2024 03:16
@wenyingd wenyingd force-pushed the async_ofport branch 3 times, most recently from 162600d to 6f59f5e Compare November 5, 2024 10:45
@wenyingd
Copy link
Contributor Author

wenyingd commented Nov 5, 2024

/test-all

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after the remaining comments on tests are addressed

pkg/agent/cniserver/pod_configuration.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_linux.go Outdated Show resolved Hide resolved
Comment on lines 97 to 86
kubeClient := fakeclientset.NewClientset()
for _, obj := range objects {
if obj != nil {
kubeClient.Tracker().Add(obj)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't seem updated

pkg/agent/cniserver/pod_configuration_test.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_test.go Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_test.go Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_test.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_windows.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/server.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/server_windows_test.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_test.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_test.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/pod_configuration_test.go Outdated Show resolved Hide resolved
pkg/agent/cniserver/server_windows_test.go Outdated Show resolved Hide resolved
…ge is received

This change has introduced a worker in `podConfigurator` to listen for the
OpenFlow PortStatus message when a new OpenFlow port is allocated in OVS. After
receiving the message, antrea-agent Windows will install Pod related OpenFlow
entries. If the OpenFlow port is not allocated within 30s after the CmdAdd
request is responded, an event with type "NetworkNotReady" is added on the Pod;
Whenever the Pod networking forwarding rules are installed, an event with type
"NetworkIsReady" is added.

Signed-off-by: Wenying Dong <[email protected]>
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wenyingd
Copy link
Contributor Author

wenyingd commented Nov 6, 2024

/test-all

@wenyingd
Copy link
Contributor Author

wenyingd commented Nov 6, 2024

@antoninbas Could you help take another look at this change?

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only very minor comments, so I will go ahead and merge this

if tc.installOpenFlowErr != nil {
require.Equal(t, 1, configurator.unreadyPortQueue.Len())
key, _ := configurator.unreadyPortQueue.Get()
assert.Equal(t, key, podIfName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for future reference, the expected value should come first in assert.Equal

Comment on lines +276 to +280
var gotEvent string
select {
case gotEvent = <-testClients.recorder.Events:
default:
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment indicating that recording events happens synchronously and hence a timeout is not necessary would have been nice here

}
}
}

// initPortStatusMonitor has subscribed a channel to listen for the OpenFlow PortStatus message, and it also
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// initPortStatusMonitor has subscribed a channel to listen for the OpenFlow PortStatus message, and it also
// initPortStatusMonitor subscribes a channel to listen for the OpenFlow PortStatus message, and it also

@antoninbas
Copy link
Contributor

I think we need to run Windows tests again

@antoninbas
Copy link
Contributor

/test-windows-all

@wenyingd
Copy link
Contributor Author

wenyingd commented Nov 7, 2024

@antoninbas Windows CI testbed is down because of the limited resource, it may need another 1d to redeploy the setup and run tests. I have verified the change 3 days ago on Windows testbed, and the later change is mostly about the test code. So maybe we can bypass the re-run?

windows-e2e/windows-networkpolicy had passed, windows-conformance was failed because the dependent image was not able to download.

@antoninbas
Copy link
Contributor

@wenyingd yes we can merge without the testing in that case, but we will need to run the Windows CI jobs with this change prior to the release. cc @luolanzone

@antoninbas antoninbas merged commit 85e437c into antrea-io:main Nov 7, 2024
57 checks passed
@antoninbas antoninbas added area/OS/windows Issues or PRs related to the Windows operating system. kind/bug Categorizes issue or PR as related to a bug. action/release-note Indicates a PR that should be included in release notes. labels Nov 7, 2024
@XinShuYang
Copy link
Contributor

@wenyingd yes we can merge without the testing in that case, but we will need to run the Windows CI jobs with this change prior to the release. cc @luolanzone

The Windows tests have passed with the latest Antrea code. Due to resource limitations, we are unable to deploy the Windows testbed before the 2.2 release. After discussion, we plan to skip Windows tests before this release, and there will be no additional Windows-related patches in release 2.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. area/OS/windows Issues or PRs related to the Windows operating system. kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
5 participants