-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add primary UDN net-attach-def #1841
Conversation
b6faf6a
to
50d9dbd
Compare
50d9dbd
to
00d32ac
Compare
@@ -144,6 +144,31 @@ echo 'Adjust kubevirt-ipam-controller to CNAO' | |||
sed -i '/containers:/i\{{ if .EnableSCC }}\ | |||
serviceAccountName: passt-binding-cni\ | |||
{{ end }}' 003-passtbindingcni.yaml | |||
|
|||
cat <<EOF > NetworkAttachmentDefinition_primary-user-defined-network.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead all the changes in this file, since this yaml is generated by CNAO
you can just keep it in the data folder, and add it to line 176
which means what to not delete
rm -rf data/kubevirt-ipam-controller/!(002-rbac.yaml|<file_name_here>)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea. thanks!
3c20939
to
1960a98
Compare
@@ -49,6 +49,7 @@ func renderKubevirtIPAMController(conf *cnao.NetworkAddonsConfigSpec, manifestDi | |||
} | |||
data.Data["IsOpenshift"] = clusterInfo.OpenShift4 | |||
data.Data["EnableSCC"] = clusterInfo.SCCAvailable | |||
data.Data["EnableNetworkAttachmentDefinition"] = clusterInfo.NetAttachDefAvailable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the CRD is NA, we should add a fatal
so basically this isn't needed (just need to fatal if CRD is NA + conf.ipam == true)
the only advantage of having this var, is that we can disable it easily if needed
but imho it doesnt worth it for this reason
moreover we can check if CRD exists, only if conf.ipam == true, not always
wdyt ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was discussed that we want to fail early.. totally forgot. will fix. thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fail fast, ye but only if conf.ipam == true + nad doesnt exists please imo
1960a98
to
33ccb5f
Compare
maybe it is because the wait was too fast, and wait doesnt wait for the resource to be created unless it is something deeper, and to know we would need logs |
33ccb5f
to
98b864a
Compare
conf := &cnao.NetworkAddonsConfigSpec{ImagePullPolicy: v1.PullAlways, Multus: &cnao.Multus{}, KubevirtIpamController: &cnao.KubevirtIpamController{}, PlacementConfiguration: &cnao.PlacementConfiguration{Workloads: &cnao.Placement{}}} | ||
manifestDir := "../../data" | ||
openshiftNetworkConf := &osv1.Network{} | ||
clusterInfo := &ClusterInfo{SCCAvailable: true, OpenShift4: false} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why SCCAvailable true ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it shouldn't be. fixed.
test/check/check.go
Outdated
@@ -346,6 +347,10 @@ func checkForComponent(component *Component) error { | |||
errsAppend(checkForPrometheusRule(component.PrometheusRule)) | |||
} | |||
|
|||
if component.NetworkAttachmentDefinition != "" { | |||
errsAppend(checkForNetworkAttachmentDefinition(component.PrometheusRule)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PrometheusRule is wrong here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah copy err. thanks!
@@ -649,6 +658,21 @@ func checkForPrometheusRule(name string) error { | |||
return nil | |||
} | |||
|
|||
func checkForNetworkAttachmentDefinition(name string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but why we need all this logic if we just create the nad and if it doesnt exists we have the error that nad doesnt exists without any additional changes?
is it a must because it was added here ?
https://github.com/kubevirt/cluster-network-addons-operator/pull/1841/files#diff-dd3d0d54317e754fd95de914de01a73b5578626b42e755fdad4998186d552600R90
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but why we need all this logic if we just create the nad
This code is for CNAO CI to make sure the NAD is deployed. It's now CNAO's responsibility, so CI should check it's deployed, and also removed.
and if it doesnt exists we have the error that nad doesnt exists without any additional changes?
small correction - only if the NAD's CRD doesn't exists then CNAO will fail to deploy with the error you mention.
is it a must because it was added here ? https://github.com/kubevirt/cluster-network-addons-operator/pull/1841/files#diff-dd3d0d54317e754fd95de914de01a73b5578626b42e755fdad4998186d552600R90
technically yeah, but it's intentional. As I mentioned above - it's CNAO's job to make sure the resources it is in charge of deploying are deployed.
Looks like kubevirt/kubevirt now wait 15m |
this change is 4y ago it seems, but worth to try to extend |
98b864a
to
aee52c8
Compare
conf := &cnao.NetworkAddonsConfigSpec{ImagePullPolicy: v1.PullAlways, Multus: &cnao.Multus{}, KubevirtIpamController: &cnao.KubevirtIpamController{}, PlacementConfiguration: &cnao.PlacementConfiguration{Workloads: &cnao.Placement{}}} | ||
manifestDir := "../../data" | ||
openshiftNetworkConf := &osv1.Network{} | ||
clusterInfo := &ClusterInfo{SCCAvailable: false, OpenShift4: false} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basically since all are false, those are the default values anyhow
so you can just create an empty struct
unless one of them is better to be explicit false, but this is not the case here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DONE
I think the best we cant do is to install kubevirt using kind.sh
|
aee52c8
to
0e96625
Compare
@@ -742,6 +742,9 @@ func GetClusterRole(allowMultus bool) *rbacv1.ClusterRole { | |||
"get", | |||
"list", | |||
"watch", | |||
"create", | |||
"update", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need update ? (maybe the reconcile use it when it recreate?)
please consider moving it under the else of allowMultus
otherwise there are duplicate / overlapping rbacs due to multusClusterRoles in U/S, which i tried to refrain from (well indeed on some places like get list watch i did add, because it was just for NAD and needed on both U/S and D/S, so whatever you prefer)
you can move the flag into multusClusterRoles and it will determine if to add * or to add just the diff needed please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need update ? (maybe the reconcile use it when it recreate?)
We do.
please consider moving it under the else of
allowMultus
otherwise there are duplicate / overlapping rbacs due to multusClusterRoles in U/S, which i tried to refrain fromyou can move the flag into multusClusterRoles and it will determine if to add * or to add just the diff needed please
OK, DONE
Maybe, but it should be done in a separate PR to be sure. |
started lately |
test/e2e/workflow/deployment_test.go
Outdated
timeout := CheckImmediately | ||
if len(components) > 1 { | ||
By("allowing eventual consistency on co-dependant components deployments") | ||
timeout = 10 * time.Second | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about just always allow 10s ?
and then we can drop this code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I tend to agree..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe the commit message of that need to be changed please
eebbf53
to
1e7b6c1
Compare
about the workflow failure - this is not a solution but maybe a temporary workaround and then we can fix it on a follow-up (imo 10s is not enough as we spoke): EDIT - if CNAO waits for processing / ready state too short, it might fail because in the middle the state becomes fail for various errors, only eventually it will be processing / ready - not sure if relevant here, but i did saw it before |
if component.NetworkAttachmentDefinition != "" { | ||
errsAppend(checkForNetworkAttachmentDefinitionRemoval(component.NetworkAttachmentDefinition)) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition can be moved into the function
but we can consider please doing it on a follow-up for all those methods not in this PR
same for the checkForNetworkAttachmentDefinition
family
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we can consider doing it on a follow-up for all those methods not in this PR
+1
1e7b6c1
to
dc9e18b
Compare
The issue was not relevant to timing, something much more dumb. |
Please let me know once all is passing and ready for review, |
dc9e18b
to
f634d35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits, beside that lgtm
thanks
err := testenv.Client.Get(context.Background(), types.NamespacedName{Name: name, Namespace: corev1.NamespaceDefault}, &networkAttachmentDefinition) | ||
if err != nil { | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use
if ...; err != nil {
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a cosmetic change, that should be done to all of the checkForX
functions above. Let's do it in a separate refactor PR that does not change logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well this is a new function, but i wont insist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will make the change in a separate PR after this one is merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thx
if err != nil { | ||
return err | ||
} | ||
|
||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here you can just return err basically, but one can consider it less readable (imho it is fine fwiw)
so either this please or short writing
if ...; err != nil {
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -742,6 +742,9 @@ func GetClusterRole(allowMultus bool) *rbacv1.ClusterRole { | |||
"get", | |||
"list", | |||
"watch", | |||
"create", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one thing we can consider
limit this to specific NS (i.e default atm)
or even better - use it as Role instead ClusterRole
well for that we need a Role / RoleBinding in the default NS (not that this is risky a bit with backward compatibility once the NS is changed - for this reason we can postpone thinking of it and keep it ClusterRole atm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts exactly.. we can decide to use more specifc RBAC when we decide what the final resting place of this NAD
/lgtm cancel please at least consider #1841 (comment) EDIT - since using a Role can be risky later for backward compatibility in case the NS will be changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first pass let's not reference UDN here.
apiVersion: "k8s.cni.cncf.io/v1" | ||
kind: NetworkAttachmentDefinition | ||
metadata: | ||
name: primary-user-defined-network |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this primary-user-defined-network naming just for the name of the kubevirt binding, so it's exposed to users just as VM spec,
We should be specific here same ass we are with passt-binding-cni
DaemonSet
So name should be something like
name: [CNAO namespace]-kubevirt-passt-binding-plugin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[CNAO namespace]
why CNAO namespace? CNAO only deploys, the ns has nothing to do with this NAD..
Also, CNAO's ns change on U/S,D/S, do we want the NAD's name to be different? (I mean, it's not braking anything, it's just kinda strange)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some back and forth let's name this
name: primary-udn-kubevirt-binding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DONE
spec: | ||
config: '{ | ||
"cniVersion": "1.0.0", | ||
"name": "primary-user-defined-network", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's differenciate from nad name, since those are different
name: [CNAO namespace]-kubevirt-passt-binding-plugin-net
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some back and forth let's name this
name: primary-udn-kubevirt-binding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DONE
@@ -148,7 +148,7 @@ echo 'Adjust kubevirt-ipam-controller to CNAO' | |||
|
|||
echo 'Copy manifests' | |||
shopt -s extglob | |||
rm -rf data/kubevirt-ipam-controller/!(002-rbac.yaml) | |||
rm -rf data/kubevirt-ipam-controller/!(002-rbac.yaml|004-primary-udn-networkattachdef.yaml) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: Let's not reference UDN at the nad, just at the kubevirt CR config on HCO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, I'm not against this, but how will anyone ever understand the connection between the NAD and the binding? shouldn't they have at least something in common? a label maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's name this
004-primary-udn-kubevirt-binding-networkattachdef.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just short nad ?
004-primary-udn-kubevirt-binding-nad.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DONE
This resource does not belong to kubevirt-ipam-controller, but is currently piggy-backing this component in order to deploy the primary-udn-kubevirt-binding net-attach-def [0]. This net-attach-def is deployed on default namespace, as this way it will be available to all VMs that need to consume it. Note that for CI, KubevirtIpamController will not be able to deploy without Multus (that deploys the net-attach-def CRD) alongside it, or the net-attach-def deployment will fail, with error: `could not apply (k8s.cni.cncf.io/v1, Kind=NetworkAttachmentDefinition) default/primary-user-defined-network: could not retrieve existing (k8s.cni.cncf.io/v1, Kind=NetworkAttachmentDefinition) default/primary-user-defined-network: no matches for kind "NetworkAttachmentDefinition" in version "k8s.cni.cncf.io/v1"` [0] https://kubevirt.io/user-guide/network/network_binding_plugins/#deployment Signed-off-by: Ram Lavi <[email protected]>
f634d35
to
d59c852
Compare
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: RamLavi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Please create a patch release, so it will be consumed wherever needed |
This reverts commit ccb7ca2. Signed-off-by: Ram Lavi <[email protected]>
What this PR does / why we need it:
This PR adds support to user-defined-network binding by deploying the appropriate NetworkAttachmentDefinition.
This is part of the process described on kubevirt user guide.
The behavior added is currently dev-preview. Hence, the NetworkAttachmentDefinition is not deployed by the proper component, but is deployed by ipam-extentions. This should change in future PRs.
Special notes for your reviewer:
Release note: