Commit

[Calico] Activate node controller in calico-kube-controllers and add CALICO_K8S_NODE_REF in calico-node, this commit fixes kubernetes#3224 and kubernetes#4533
felipejfc committed Mar 6, 2018
1 parent 9e471fe commit 468d941
Showing 4 changed files with 35 additions and 2 deletions.
9 changes: 9 additions & 0 deletions docs/networking.md
@@ -191,6 +191,15 @@ For help with Calico or to report any issues:

Calico currently uses etcd as a backend for storing information about workloads and policies. Calico does not interfere with normal etcd operations and does not require special handling when upgrading etcd. For more information please visit the [etcd Docs](https://coreos.com/etcd/docs/latest/)

#### Calico troubleshooting
##### New nodes take minutes to sync IP routes and new pods on them can't reach kube-dns
This is caused by stale entries in the Calico etcd nodestore for nodes that no longer exist. Due to the ephemeral nature of AWS EC2 instances, new nodes are brought up with different hostnames, and nodes that are taken offline remain in the Calico nodestore. This is unlike most datacentre deployments, where hostnames in a cluster are mostly static. Read more about this issue at https://github.com/kubernetes/kops/issues/3224
This has been solved in kops 1.8.2. No action is needed when creating a new cluster, but if the cluster was created with an earlier kops version, take the following actions:
* Use kops to update the cluster: ```kops update cluster <name> --yes```
* Delete all calico-node pods in the kube-system namespace, so that they are recreated with the new env CALICO_K8S_NODE_REF and update the current nodes in etcd
* Decommission all invalid nodes, [see here](https://docs.projectcalico.org/v2.6/usage/decommissioning-a-node)
* Nodes that are deleted from the cluster after these actions should be cleaned from Calico's etcd storage, and the delay in programming routes should be resolved.
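
The decommissioning step above requires knowing which Calico node entries are stale. A minimal shell sketch of that comparison follows. It assumes you already have two newline-separated lists of node names, for example gathered from `kubectl get nodes` and `calicoctl get nodes`. The function name `stale_calico_nodes` is hypothetical and is not part of kops or Calico.

```shell
#!/bin/sh
# Sketch only: report Calico node entries that have no matching
# Kubernetes node. Both arguments are assumed to be newline-separated
# lists of node names without whitespace inside a name.

# stale_calico_nodes K8S_NODES CALICO_NODES
# Prints every name in CALICO_NODES that is absent from K8S_NODES.
stale_calico_nodes() {
  k8s_nodes=$1
  calico_nodes=$2
  for node in $calico_nodes; do
    # -x: match the whole line, -F: fixed string, -q: no output
    if ! printf '%s\n' "$k8s_nodes" | grep -qxF -- "$node"; then
      printf '%s\n' "$node"
    fi
  done
}
```

Calling `stale_calico_nodes "$k8s_names" "$calico_names"` prints one candidate per line. Review the list before removing anything, then decommission each entry as described in the guide linked above.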

### Canal Example for CNI and Network Policy

Canal is a project that combines [Flannel](https://github.com/coreos/flannel) and [Calico](http://docs.projectcalico.org/latest/getting-started/kubernetes/installation/hosted/) for CNI Networking. It uses Flannel for networking pod traffic between hosts via VXLAN and Calico for network policy enforcement and pod to pod traffic.
@@ -173,6 +173,11 @@ spec:
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
  value: "true"
# Set noderef for node controller.
- name: CALICO_K8S_NODE_REF
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
# Auto-detect the BGP IP address.
- name: IP
  value: ""
@@ -375,6 +380,13 @@ spec:
  requests:
    cpu: 10m
env:
  # By default only the policy, profile and workloadendpoint controllers are
  # turned on. The node controller decommissions nodes that no longer exist.
  # This and CALICO_K8S_NODE_REF in calico-node fix #3224, but invalid nodes
  # that are already registered in Calico need to be deleted manually, see
  # https://docs.projectcalico.org/v2.6/usage/decommissioning-a-node
  - name: ENABLED_CONTROLLERS
    value: policy,profile,workloadendpoint,node
  # The location of the Calico etcd cluster.
  - name: ETCD_ENDPOINTS
    valueFrom:
@@ -184,6 +184,11 @@ spec:
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
  value: "true"
# Set noderef for node controller.
- name: CALICO_K8S_NODE_REF
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
# Auto-detect the BGP IP address.
- name: IP
  value: ""
@@ -324,6 +329,13 @@ spec:
  requests:
    cpu: 10m
env:
  # By default only the policy, profile and workloadendpoint controllers are
  # turned on. The node controller decommissions nodes that no longer exist.
  # This and CALICO_K8S_NODE_REF in calico-node fix #3224, but invalid nodes
  # that are already registered in Calico need to be deleted manually, see
  # https://docs.projectcalico.org/v2.6/usage/decommissioning-a-node
  - name: ENABLED_CONTROLLERS
    value: policy,profile,workloadendpoint,node
  # The location of the Calico etcd cluster.
  - name: ETCD_ENDPOINTS
    valueFrom:
4 changes: 2 additions & 2 deletions upup/pkg/fi/cloudup/bootstrapchannelbuilder.go
@@ -469,8 +469,8 @@ func (b *BootstrapChannelBuilder) buildManifest() (*channelsapi.Addons, map[stri
 key := "networking.projectcalico.org"
 versions := map[string]string{
     "pre-k8s-1.6": "2.4.2-kops.1",
-    "k8s-1.6": "2.6.7-kops.1",
-    "k8s-1.7": "2.6.7-kops.1",
+    "k8s-1.6": "2.6.8-kops.1",
+    "k8s-1.7": "2.6.8-kops.1",
 }

 {