Convert cluster.openshift.io/Network in to the NetworkConfig CRD; update Network status #47

squeed · 2018-12-04T11:28:20Z

This adds a second controller that watches the cluster-level network configuration. When that configuration changes, the controller merges the changes into the CRD configuration, creating it if it does not exist.

When the changes are applied, the "real" controller then updates the Network status object.

squeed · 2018-12-04T11:28:31Z

Pending openshift/api#141 merge

danwinship · 2018-12-04T14:18:54Z

But an admin can directly modify the CRD as well, right? (In fact, they have to, to configure some options). What happens if they modify the CRD to contradict the cluster config?

squeed · 2018-12-04T15:03:54Z

@danwinship yeah, I was thinking about that. I can see two possibilities. Either we say "hey, don't do that!", or we remove those fields from the CRD.

Removing overlapping fields from the CRD seems like the right way to me. I don't expect us to be adding much more in the way of fields to the Cluster object (except maybe Isolation).

I'll update the PR (and simplify most of the controller loop) to do this. We'll have to keep the duplicate fields for now, and roll it out in stages.

FWIW, part of this PR makes it possible to get networking up with just the cluster Network object (by defaulting to NetworkPolicy mode for openshift-sdn).

squeed · 2019-01-04T23:31:44Z

Okay, updated this. Now, if there are any fields that were improperly updated in the "downstream" object, revert those changes when we reconcile.
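
As a rough illustration of the behaviour described here (overlapping fields in the "downstream" object are always overwritten from the cluster config, and a write-back is only needed when something actually changed), a minimal sketch might look like the following. The types `ClusterConfig` and `OperatorConfig`, the `Mode` field, and the helper `mergeClusterConfig` are hypothetical stand-ins, not the operator's real API types or code.

```go
package main

import (
	"fmt"
	"reflect"
)

// ClusterNetworkEntry, ClusterConfig and OperatorConfig are simplified,
// hypothetical stand-ins for the real API types. Only the overlapping
// fields discussed in this thread are modelled.
type ClusterNetworkEntry struct {
	CIDR       string
	HostPrefix uint32
}

type ClusterConfig struct {
	ClusterNetwork []ClusterNetworkEntry
	ServiceNetwork []string
	NetworkType    string
}

type OperatorConfig struct {
	ClusterNetwork []ClusterNetworkEntry
	ServiceNetwork []string
	NetworkType    string
	// Mode stands in for an operator-only option that the cluster
	// config does not carry, so the merge must leave it alone.
	Mode string
}

// mergeClusterConfig copies every overlapping field from the cluster
// config into the operator config. It reports whether anything changed,
// i.e. whether the downstream object needs to be written back.
func mergeClusterConfig(op *OperatorConfig, cl ClusterConfig) bool {
	before := *op // shallow copy; the slices below are replaced, not edited
	op.ClusterNetwork = append([]ClusterNetworkEntry(nil), cl.ClusterNetwork...)
	op.ServiceNetwork = append([]string(nil), cl.ServiceNetwork...)
	op.NetworkType = cl.NetworkType
	return !reflect.DeepEqual(before, *op)
}

func main() {
	cl := ClusterConfig{
		ClusterNetwork: []ClusterNetworkEntry{{CIDR: "10.128.0.0/14", HostPrefix: 23}},
		ServiceNetwork: []string{"172.30.0.0/16"},
		NetworkType:    "OpenShiftSDN",
	}
	// The operator config was edited by hand and now contradicts the
	// cluster config in an overlapping field.
	op := OperatorConfig{
		ServiceNetwork: []string{"192.168.0.0/16"},
		NetworkType:    "OpenShiftSDN",
		Mode:           "NetworkPolicy",
	}
	changed := mergeClusterConfig(&op, cl)
	fmt.Println("write back needed:", changed)
	fmt.Println("serviceNetwork:", op.ServiceNetwork, "mode:", op.Mode)
}
```

Returning a "changed" flag keeps the reconcile loop from issuing an update on every pass; a second merge with the same inputs reports no change.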

squeed · 2019-01-07T20:39:21Z

Odd, operator logs seem fine.
/retest

squeed · 2019-01-07T22:14:05Z

Hit another AWS quota issue: Jan 07 21:30:41.565 W persistentvolume=pvc-11be8786-12c3-11e9-ab14-0a9830816660 error deleting EBS volume "vol-03b08c117479245b2": "RequestLimitExceeded: Request limit exceeded.\n\tstatus code: 503, request id: b4ada13a-d246-40a0-9f2e-c95ab2e81a1b"

/retest

squeed · 2019-01-08T00:06:09Z

Flake city.
/retest

squeed · 2019-01-08T19:44:05Z

/retest

squeed · 2019-01-08T22:12:50Z

All green. @danwinship, would you mind reviewing?

pkg/controller/clusterconfig/clusterconfig_controller.go

pkg/names/names.go

danwinship · 2019-01-09T15:18:11Z

pkg/network/cluster_config.go

+		size, _ := cidr.Mask.Size()
+		// The comparison is inverted; smaller number is larger block
+		if cnet.HostPrefix < uint32(size) {
+			return errors.Errorf("hostPrefix %d is larger than its cidr %s",


To match origin's validation, this should be <= not <. Also it checks that HostSubnetLength >= 2 aka HostPrefix <= 30

By my reading, it seems to allow a HostPrefix that is the same size as the CIDR? I'm looking here

huh... somehow i looked at that same code and got the wrong answer.
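
For reference, the check under discussion can be sketched as a standalone function. This is only an illustration: `validateHostPrefix` is a hypothetical name, the strict `<` comparison follows the conclusion of the exchange above (a hostPrefix equal to the CIDR size is accepted), and the upper bound is an assumption taken from the "HostSubnetLength >= 2" remark, not something this thread confirms was adopted.

```go
package main

import (
	"fmt"
	"net"
)

// validateHostPrefix is a hypothetical helper mirroring the check in
// the hunk above. It accepts a hostPrefix equal to the CIDR's own
// prefix length and rejects anything shorter.
func validateHostPrefix(cidr string, hostPrefix uint32) error {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return fmt.Errorf("could not parse cidr %q: %v", cidr, err)
	}
	size, bits := ipnet.Mask.Size()

	// The comparison is inverted: a smaller prefix length is a larger block.
	if hostPrefix < uint32(size) {
		return fmt.Errorf("hostPrefix %d is larger than its cidr %s", hostPrefix, cidr)
	}

	// Assumption from the review comment: each node subnet keeps at
	// least 2 host bits, i.e. hostPrefix <= 30 for IPv4.
	if hostPrefix > uint32(bits)-2 {
		return fmt.Errorf("hostPrefix %d leaves fewer than 2 host bits in %s", hostPrefix, cidr)
	}
	return nil
}

func main() {
	for _, hp := range []uint32{13, 14, 23, 30, 31} {
		fmt.Printf("hostPrefix %d: %v\n", hp, validateHostPrefix("10.128.0.0/14", hp))
	}
}
```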

pkg/controller/networkconfig/cluster.go

danwinship · 2019-01-09T15:26:15Z

pkg/controller/networkconfig/cluster.go

+	// If there are changes to the "downstream" networkconfig, commit it back
+	// to the apiserver
+	log.Println("WARNING: NetworkConfig.networkoperator.openshift.io has fields being overwritten by Network.cluster.openshift.io configuration")
+	cfg.TypeMeta = metav1.TypeMeta{APIVersion: "networkoperator.openshift.io/v1", Kind: "NetworkConfig"}


why is this needed?

Good question. There's some kind of bug with the shared cache that is unsetting the Kind, but only on the second run of the reconcile loop.

Sometimes the straightest path is through the mud :-/

maybe this is one of those things where you're mutating an object you got from a cache rather than DeepCopying it and mutating the copy?

Probably not that, but something similar. The client always copies in to a receiving object, and all the types should be registered with the schema. Just confusing.
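
The failure mode danwinship is asking about can be shown with a toy example. This is not a diagnosis of the bug above (squeed suspects something else); it only illustrates why objects handed out by a shared cache are normally deep-copied before being mutated. The `NetworkConfig` struct, its `DeepCopy` method, and the `cache` type here are simplified stand-ins written for this sketch.

```go
package main

import "fmt"

// NetworkConfig is a simplified stand-in for the real CRD type.
type NetworkConfig struct {
	Kind           string
	ServiceNetwork []string
}

// DeepCopy returns a copy that shares no memory with the receiver.
func (c *NetworkConfig) DeepCopy() *NetworkConfig {
	out := *c
	out.ServiceNetwork = append([]string(nil), c.ServiceNetwork...)
	return &out
}

// cache is a toy stand-in for a shared cache that hands out pointers
// to the objects it stores.
type cache map[string]*NetworkConfig

func main() {
	c := cache{"default": {Kind: "NetworkConfig", ServiceNetwork: []string{"172.30.0.0/16"}}}

	// Mutating a deep copy leaves the cached object intact.
	safe := c["default"].DeepCopy()
	safe.Kind = ""
	fmt.Printf("after mutating a copy:   Kind=%q\n", c["default"].Kind)

	// Mutating the pointer returned by the cache changes what every
	// later reader of the cache sees.
	shared := c["default"]
	shared.Kind = ""
	fmt.Printf("after mutating in place: Kind=%q\n", c["default"].Kind)
}
```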

pkg/controller/networkconfig/cluster.go

pkg/controller/networkconfig/networkconfig_controller.go

pkg/util/ip/addr.go

squeed · 2019-01-10T01:00:30Z

@danwinship Thanks for the excellent review. suggestions incorporated, PTAL.

squeed · 2019-01-10T05:21:53Z

There are a few README changes needed, but I'd rather make them as part of a follow-up PR, since CI is all green and PRs exclusively with doc changes skip CI.

squeed · 2019-01-10T05:22:18Z

README.md

@@ -30,9 +59,14 @@ spec:
 ## Configuring IP address pools
 Users must supply at least two address pools - one for pods, and one for services. These are the ClusterNetworks and ServiceNetwork parameter. Some network plugins, such as OpenShiftSDN, support multiple ClusterNetworks. All address blocks must be non-overlapping. You should select address pools large enough to fit your anticipated workload.

+For future expansion, multiple `serviceNetwork` entries are allowed by the configuration but not actually supported by any network plugins. Supplying multiple addresses is invalid.
+
+Each `clusterNetwork` entry has an additional required parameter, `hostPrefix`, that specifies the address size to assign to assign to each individual node.  For example


Yes, the example is missing, bah.

so either add the example or remove "For example"

squeed · 2019-01-10T05:22:34Z

README.md

+
+Each `clusterNetwork` entry has an additional required parameter, `hostPrefix`, that specifies the address size to assign to assign to each individual node.  For example
+
+IP address pulls are always read from the Cluster configuration and propagated "downwards" in to the Operator configuration. Any changes to the Operator configuration will be ignored.


pulls -> pools.

fix that too

squeed · 2019-01-14T10:42:32Z

@danwinship can I get a LGTM?

danwinship

just doc nits mostly

danwinship · 2019-01-14T14:05:55Z

README.md

+#### Configuration objects
+*Cluster config*
+- *Type Name*: `Network.config.openshift.io`
+- *Instance Name*: `cluster`


(I assume this name matches what other people are doing?)

Yup. Once the installer switches from NetworkConfig to Network, we can align the names if we like.

danwinship · 2019-01-14T14:06:27Z

README.md

+*Cluster config*
+- *Type Name*: `Network.config.openshift.io`
+- *Instance Name*: `cluster`
+- *View Command*: `oc get Network.openshift.io cluster -oyaml`


Network.config.openshift.io

danwinship · 2019-01-14T14:08:07Z

README.md

+  networkType: OpenShiftSDN
+```
+
+*Operator Config*


(Clarify that the operator config in this example would have been auto-generated. Or alternatively, add a non-auto-generatable line to it and then explain that everything except that one line was auto-generated.)

danwinship · 2019-01-14T14:09:23Z

README.md

-The network operator has a complex configuration, but most parameters have a sensible default.
+The network operator gets its configuration from two objects: the Cluster and the Operator configuration. Most users only need to create the Cluster configuration - the operator will generate its configuration automatically. If you need finer-grained configuration of your network, you will need to create both configurations.
+
+Any changes to the Cluster configuration are propagated down in to the Operator configuration.


State explicitly that in case of conflicts, the operator config will be updated to match the cluster config

danwinship · 2019-01-14T14:11:42Z

README.md

@@ -30,9 +59,14 @@ spec:
 ## Configuring IP address pools
 Users must supply at least two address pools - one for pods, and one for services. These are the ClusterNetworks and ServiceNetwork parameter. Some network plugins, such as OpenShiftSDN, support multiple ClusterNetworks. All address blocks must be non-overlapping. You should select address pools large enough to fit your anticipated workload.

+For future expansion, multiple `serviceNetwork` entries are allowed by the configuration but not actually supported by any network plugins. Supplying multiple addresses is invalid.
+
+Each `clusterNetwork` entry has an additional required parameter, `hostPrefix`, that specifies the address size to assign to assign to each individual node.  For example


so either add the example or remove "For example"

danwinship · 2019-01-14T14:11:52Z

README.md

+
+Each `clusterNetwork` entry has an additional required parameter, `hostPrefix`, that specifies the address size to assign to assign to each individual node.  For example
+
+IP address pulls are always read from the Cluster configuration and propagated "downwards" in to the Operator configuration. Any changes to the Operator configuration will be ignored.


fix that too

danwinship · 2019-01-14T14:14:11Z

README.md

+metadata:
+  name: cluster
+spec:
+  serviceNetwork: ["172.30.0.0/16"]


maybe use the expanded syntax for parallel-ness with clusterNetwork? (And to match how oc get will display it.)

    serviceNetwork:
    - "172.30.0.0/16"
    clusterNetwork:
    - cidr: "10.128.0.0/14"
      hostPrefix: 23

There, updated both to match oc get exactly (sans metadata).
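
To make the `hostPrefix` value in this example concrete: a `/14` cluster network carved into `/23` node subnets yields 2^(23-14) = 512 subnets of 2^(32-23) = 512 addresses each. The sketch below does that arithmetic for IPv4; `subnetCounts` is a hypothetical helper written for illustration and is not part of the operator.

```go
package main

import (
	"fmt"
	"net"
)

// subnetCounts returns how many per-node subnets fit in the given
// clusterNetwork CIDR and how many addresses each of them contains.
// This sketch handles IPv4 only.
func subnetCounts(cidr string, hostPrefix int) (uint64, uint64, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return 0, 0, err
	}
	size, bits := ipnet.Mask.Size()
	if bits != 32 {
		return 0, 0, fmt.Errorf("this sketch only handles IPv4, got %s", cidr)
	}
	if hostPrefix < size || hostPrefix > bits {
		return 0, 0, fmt.Errorf("hostPrefix %d does not fit inside %s", hostPrefix, cidr)
	}
	nodeSubnets := uint64(1) << uint(hostPrefix-size)
	addrsPerNode := uint64(1) << uint(bits-hostPrefix)
	return nodeSubnets, addrsPerNode, nil
}

func main() {
	nodes, addrs, err := subnetCounts("10.128.0.0/14", 23)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d node subnets, %d addresses each\n", nodes, addrs)
}
```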

danwinship · 2019-01-14T14:20:18Z

pkg/controller/clusterconfig/clusterconfig_controller.go

+// In other words, it watches Network.config.openshift.io/v1/cluster and updates
+// NetworkConfig.networkoperator.openshift.io/v1/default.
+func (r *ReconcileClusterConfig) Reconcile(request reconcile.Request) (reconcile.Result, error) {
+	log.Printf("Reconciling Network %s/%s\n", request.Namespace, request.Name)


"Network" here is ambiguous. Either Network.config.openshift.io or cluster config.

Also, it's non-namespaced so don't print the namespace. (That may apply to ReconcileNetworkConfig too?)

Good point; fixed (for both)

…twork status This adds a second controller that watches the cluster-level network configuration. If changes are made, it merges those in to (or creates) the CRD configuration. When the changes are applied, the "real" controller then updates the Network status object. In the event that the downstream NetworkConfig object has changes that would be overwritten by the upstream Network object, do the overwrite and update the NetworkConfig object back.

This can be removed if / when the installer does it.

squeed · 2019-01-14T17:24:58Z

@danwinship doc nits and log lines fixed, thanks.

squeed · 2019-01-14T18:32:00Z

/retest

danwinship · 2019-01-14T19:03:42Z

/lgtm
/hold

openshift-ci-robot · 2019-01-14T19:03:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, squeed


Needs approval from an approver in each of these files:

~~OWNERS~~ [danwinship,squeed]


squeed · 2019-01-15T10:44:19Z

/retest

squeed · 2019-01-15T14:28:33Z

/hold cancel

squeed added the do-not-merge/hold label Dec 4, 2018

squeed requested a review from danwinship December 4, 2018 11:28

openshift-ci-robot added the do-not-merge/work-in-progress and size/XL labels Dec 4, 2018

openshift-ci-robot requested review from JacobTanenbaum and pecameron December 4, 2018 11:28

openshift-ci-robot added the approved label Dec 4, 2018

openshift-ci-robot added the size/XXL label and removed the size/XL label Jan 4, 2019

squeed removed the do-not-merge/hold and do-not-merge/work-in-progress labels Jan 4, 2019

squeed changed the title ~~[wip] Convert cluster.openshift.io/Network in to the Network CRD; update Network status~~ Convert cluster.openshift.io/Network in to the NetworkConfig CRD; update Network status Jan 4, 2019

squeed mentioned this pull request Jan 5, 2019

network: add the network.cluster.openshift.io CRD openshift/installer#1001

Closed

squeed mentioned this pull request Jan 8, 2019

network: move from operator-specific CRD to cluster config openshift/installer#1013

Merged

danwinship requested changes Jan 9, 2019


squeed commented Jan 10, 2019


danwinship requested changes Jan 14, 2019


squeed added 3 commits January 14, 2019 18:23

vendor: update openshift/api

0ce84be

Temporarily create Network.cluster.openshift.io object.

8ab8f1f

This can be removed if / when the installer does it.

openshift-ci-robot added the do-not-merge/hold label Jan 14, 2019

openshift-ci-robot assigned danwinship Jan 14, 2019

openshift-ci-robot added the lgtm label Jan 14, 2019

openshift-ci-robot removed the do-not-merge/hold label Jan 15, 2019

openshift-merge-robot merged commit eb4e3a7 into openshift:master Jan 15, 2019

squeed deleted the cluster-config branch March 6, 2019 11:14

