Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade not running to completion when changing OS image #44

Closed
Xenwar opened this issue Mar 25, 2020 · 10 comments
Closed

Upgrade not running to completion when changing OS image #44

Xenwar opened this issue Mar 25, 2020 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@Xenwar
Copy link
Member

Xenwar commented Mar 25, 2020

What steps did you take and what happened:

  • Provision a control plane node
  • Save the Metal3MachineTemplate resource in the metal3 namespace to disk.
  • Edit its by changing its name, OS image url and hash to new valid values.
  • Then save new file to kubernetes, kubectl create -f ....
  • Edit the KubeadmControlPlane resource in the metal3 namespace by changing
    the infrastructureTemplate.name to that of the new name you used.

Then, wait for the Provisioning of a new node and the de-Provisioning of the
existing one.

However,

  • Both VMs will be in provisioned state
  • The kubernetes API server in both VMs are not running.

What did you expect to happen:
That a new node with the new OS image is Provisioned, and
That the existing node is de-provisioned.

Anything else you would like to add:

  • This is a blocker on the ongoing upgrade workflow
  • The same workflow was successful before 16.03.2020
  • The start of provisioning and de-provisioning might take up to 20 minutes each.
  • Relevant logs are added as comments.
  • Repeated entries from the logs are removed fore brevity.

I have created a similar issue in CAPI, here
It seems that it is already fixed on the CAPI side and it could be the provider components that are the cause.

Environment:

  • Cluster-api version: v1alpha3
  • CAPBM version: v0.3.0
  • Minikube version: v1.8.2
  • metal3-dev-env
  • Kubernetes version: v1.17.3

/kind bug

@metal3-io-bot metal3-io-bot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 25, 2020
@Xenwar
Copy link
Member Author

Xenwar commented Apr 2, 2020

Removing previous log comments and adding new ones with more controllers.

kubectl get bmh -A
NAMESPACE   NAME     STATUS   PROVISIONING STATUS   CONSUMER                   BMC                         HARDWARE PROFILE   ONLINE   ERROR
metal3      node-0   OK       provisioned           newos-zgwg4                ipmi://192.168.111.1:6230   unknown            true     
metal3      node-1   OK       provisioned           test1-controlplane-t7g82   ipmi://192.168.111.1:6231   unknown            true     

node-1 was provisioned first.

@Xenwar
Copy link
Member Author

Xenwar commented Apr 2, 2020

capi-controller-manager-6b584cb4b7-wb6vv.log

I0402 08:03:13.145220       1 listener.go:44] controller-runtime/metrics "msg"="metrics server is starting to listen"  "addr"="127.0.0.1:8080"
I0402 08:03:13.146883       1 main.go:175] setup "msg"="starting manager"  "version"=""
I0402 08:03:13.147601       1 leaderelection.go:242] attempting to acquire leader lease  capi-system/controller-leader-election-capi...
I0402 08:03:13.150161       1 internal.go:356] controller-runtime/manager "msg"="starting metrics server"  "path"="/metrics"
I0402 08:03:13.177366       1 leaderelection.go:252] successfully acquired lease capi-system/controller-leader-election-capi
I0402 08:03:13.251773       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machinehealthcheck" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","selector":{},"unhealthyConditions":null},"status":{}}}
I0402 08:03:13.251934       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machinehealthcheck" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 08:03:13.252763       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="cluster" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 08:03:13.253290       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machine" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 08:03:13.253412       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machine" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 08:03:13.254316       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machineset" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","selector":{},"template":{"metadata":{},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}}}},"status":{"replicas":0}}}
I0402 08:03:13.255699       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machinedeployment" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","selector":{},"template":{"metadata":{},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}}}},"status":{}}}
I0402 08:03:13.354680       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machinehealthcheck" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 08:03:13.354936       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="machinehealthcheck"
I0402 08:03:13.355181       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="machinehealthcheck" "worker count"=10
I0402 08:03:13.355820       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machineset" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 08:03:13.355865       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machineset" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 08:03:13.355896       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machineset" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 08:03:13.355919       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="machineset"
I0402 08:03:13.355941       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="machineset" "worker count"=10
I0402 08:03:13.355944       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="cluster" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 08:03:13.356085       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="cluster"
I0402 08:03:13.356105       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="cluster" "worker count"=10
I0402 08:03:13.356129       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="machine"
I0402 08:03:13.356146       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="machine" "worker count"=10
I0402 08:03:13.358029       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machinedeployment" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","selector":{},"template":{"metadata":{},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}}}},"status":{"replicas":0}}}
I0402 08:03:13.358072       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machinedeployment" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","selector":{},"template":{"metadata":{},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}}}},"status":{"replicas":0}}}
I0402 08:03:13.358101       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machinedeployment" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 08:03:13.358124       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="machinedeployment"
I0402 08:03:13.358141       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="machinedeployment" "worker count"=10
I0402 08:05:51.088690       1 tracker.go:54] controllers/Cluster "msg"="Adding watcher on external object" "cluster"="test1" "namespace"="metal3" "GroupVersionKind"="infrastructure.cluster.x-k8s.io/v1alpha3, Kind=Metal3Cluster"
I0402 08:05:51.088987       1 controller.go:138] controller-runtime/controller "msg"="Starting EventSource"  "controller"="cluster" "source"={"Type":{"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"Metal3Cluster"}}
E0402 08:05:51.218680       1 cluster_controller.go:167] controllers/Cluster "msg"="Reconciliation for Cluster asked to requeue" "error"="could not find controlplane.cluster.x-k8s.io/v1alpha3, Kind=KubeadmControlPlane \"test1-controlplane\" for Cluster \"test1\" in namespace \"metal3\", requeuing: requeue in 30s" "cluster"="test1" "namespace"="metal3" 
E0402 08:05:51.387427       1 cluster_controller.go:167] controllers/Cluster "msg"="Reconciliation for Cluster asked to requeue" "error"="could not find controlplane.cluster.x-k8s.io/v1alpha3, Kind=KubeadmControlPlane \"test1-controlplane\" for Cluster \"test1\" in namespace \"metal3\", requeuing: requeue in 30s" "cluster"="test1" "namespace"="metal3" 
E0402 08:05:51.494141       1 cluster_controller.go:167] controllers/Cluster "msg"="Reconciliation for Cluster asked to requeue" "error"="could not find controlplane.cluster.x-k8s.io/v1alpha3, Kind=KubeadmControlPlane \"test1-controlplane\" for Cluster \"test1\" in namespace \"metal3\", requeuing: requeue in 30s" "cluster"="test1" "namespace"="metal3" 
I0402 08:06:21.423646       1 tracker.go:54] controllers/Cluster "msg"="Adding watcher on external object" "cluster"="test1" "namespace"="metal3" "GroupVersionKind"="controlplane.cluster.x-k8s.io/v1alpha3, Kind=KubeadmControlPlane"
I0402 08:06:21.423840       1 controller.go:138] controller-runtime/controller "msg"="Starting EventSource"  "controller"="cluster" "source"={"Type":{"apiVersion":"controlplane.cluster.x-k8s.io/v1alpha3","kind":"KubeadmControlPlane"}}
I0402 08:06:24.196014       1 tracker.go:54] controllers/Machine "msg"="Adding watcher on external object" "machine"="test1-controlplane-8fj79" "namespace"="metal3" "GroupVersionKind"="bootstrap.cluster.x-k8s.io/v1alpha3, Kind=KubeadmConfig"
I0402 08:06:24.196062       1 controller.go:138] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machine" "source"={"Type":{"apiVersion":"bootstrap.cluster.x-k8s.io/v1alpha3","kind":"KubeadmConfig"}}
I0402 08:06:24.353683       1 tracker.go:54] controllers/Machine "msg"="Adding watcher on external object" "machine"="test1-controlplane-8fj79" "namespace"="metal3" "GroupVersionKind"="infrastructure.cluster.x-k8s.io/v1alpha3, Kind=Metal3Machine"
I0402 08:06:24.353970       1 controller.go:138] controller-runtime/controller "msg"="Starting EventSource"  "controller"="machine" "source"={"Type":{"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"Metal3Machine"}}
I0402 08:06:24.354756       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
E0402 08:06:24.355564       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Bootstrap provider for Machine \"test1-controlplane-8fj79\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
I0402 08:06:24.507936       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
E0402 08:11:59.275817       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Infrastructure provider for Machine \"test1-controlplane-8fj79\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
I0402 08:12:02.238687       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
E0402 08:12:02.238913       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Infrastructure provider for Machine \"test1-controlplane-8fj79\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
I0402 08:12:12.284617       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
E0402 08:12:12.285193       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Infrastructure provider for Machine \"test1-controlplane-8fj79\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" 
E0402 08:12:25.606281       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create client for Cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=32s: dial tcp 192.168.111.249:6443: connect: no route to host"  "controller"="machine" "request"={"Namespace":"metal3","Name":"test1-controlplane-8fj79"}
E0402 08:12:28.597216       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create client for Cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=32s: dial tcp 192.168.111.249:6443: connect: no route to host"  "controller"="machine" "request"={"Namespace":"metal3","Name":"test1-controlplane-8fj79"}
E0402 08:12:31.668110       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create client for Cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=32s: dial tcp 192.168.111.249:6443: connect: no route to host"  "controller"="machine" "request"={"Namespace":"metal3","Name":"test1-controlplane-8fj79"}
E0402 08:14:25.843653       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create client for Cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=32s: dial tcp 192.168.111.249:6443: connect: no route to host"  "controller"="machine" "request"={"Namespace":"metal3","Name":"test1-controlplane-8fj79"}
E0402 08:15:22.907962       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create client for Cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=32s: dial tcp 192.168.111.249:6443: connect: connection refused"  "controller"="machine" "request"={"Namespace":"metal3","Name":"test1-controlplane-8fj79"}
I0402 08:16:23.099040       1 machine_controller_noderef.go:81] controllers/Machine "msg"="Set Machine's NodeRef" "cluster"="test1" "machine"="test1-controlplane-8fj79" "namespace"="metal3" "noderef"="node-1"
I0402 10:11:49.354994       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:11:49.355037       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Bootstrap provider for Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
I0402 10:11:49.538243       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:17:49.642996       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Infrastructure provider for Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
I0402 10:17:58.796377       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:17:58.797307       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Infrastructure provider for Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
I0402 10:18:00.346022       1 machine_controller_noderef.go:53] controllers/Machine "msg"="Machine doesn't have a valid ProviderID yet" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:18:00.346198       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="Infrastructure provider for Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\" is not ready, requeuing: requeue in 30s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:18:08.984650       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="cannot assign NodeRef to Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\", no matching Node: requeue in 10s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:18:09.197566       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="cannot assign NodeRef to Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\", no matching Node: requeue in 10s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:18:09.314930       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="cannot assign NodeRef to Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\", no matching Node: requeue in 10s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
E0402 10:21:40.309073       1 machine_controller.go:215] controllers/Machine "msg"="Reconciliation for Machine asked to requeue" "error"="cannot assign NodeRef to Machine \"test1-controlplane-ds5f9\" in namespace \"metal3\", no matching Node: requeue in 10s" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" 
I0402 10:21:50.423254       1 machine_controller_noderef.go:81] controllers/Machine "msg"="Set Machine's NodeRef" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "namespace"="metal3" "noderef"="node-0"

@Xenwar
Copy link
Member Author

Xenwar commented Apr 2, 2020

capi-kubeadm-bootstrap-controller-manager-689678664d-vrp4f.log

I0402 08:03:04.307841       1 listener.go:44] controller-runtime/metrics "msg"="metrics server is starting to listen"  "addr"="127.0.0.1:8080"
I0402 08:03:04.309313       1 main.go:151] setup "msg"="starting manager"  "version"=""
I0402 08:03:04.310528       1 leaderelection.go:242] attempting to acquire leader lease  capi-kubeadm-bootstrap-system/kubeadm-bootstrap-manager-leader-election-capi...
I0402 08:03:04.310545       1 internal.go:356] controller-runtime/manager "msg"="starting metrics server"  "path"="/metrics"
I0402 08:03:04.338977       1 leaderelection.go:252] successfully acquired lease capi-kubeadm-bootstrap-system/kubeadm-bootstrap-manager-leader-election-capi
I0402 08:03:04.341619       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmconfig" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{},"status":{}}}
I0402 08:03:04.442235       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmconfig" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 08:03:04.543016       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmconfig" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 08:03:04.646700       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="kubeadmconfig"
I0402 08:03:04.646863       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="kubeadmconfig" "worker count"=10
I0402 08:06:24.410673       1 control_plane_init_mutex.go:86] init-locker "msg"="Attempting to acquire the lock" "cluster-name"="test1" "configmap-name"="test1-lock" "machine-name"="test1-controlplane-8fj79" "namespace"="metal3" 
I0402 08:06:24.471547       1 kubeadmconfig_controller.go:297] controllers/KubeadmConfig "msg"="Creating BootstrapData for the init control plane" "kind"="Machine" "kubeadmconfig"={"Namespace":"metal3","Name":"test1-controlplane-jkcgj"} "name"="test1-controlplane-8fj79" "version"="3262" 
I0402 08:06:24.482257       1 kubeadmconfig_controller.go:659] controllers/KubeadmConfig "msg"="Altering ClusterConfiguration" "kubeadmconfig"="metal3/test1-controlplane-jkcgj" "ControlPlaneEndpoint"="192.168.111.249:6443"
I0402 08:06:24.482382       1 kubeadmconfig_controller.go:665] controllers/KubeadmConfig "msg"="Altering ClusterConfiguration" "kubeadmconfig"="metal3/test1-controlplane-jkcgj" "ClusterName"="test1"
I0402 08:06:24.482398       1 kubeadmconfig_controller.go:672] controllers/KubeadmConfig "msg"="Altering ClusterConfiguration" "kubeadmconfig"="metal3/test1-controlplane-jkcgj" "DNSDomain"="cluster.local"
I0402 08:06:24.482413       1 kubeadmconfig_controller.go:678] controllers/KubeadmConfig "msg"="Altering ClusterConfiguration" "kubeadmconfig"="metal3/test1-controlplane-jkcgj" "ServiceSubnet"="10.96.0.0/12"
I0402 08:06:24.482428       1 kubeadmconfig_controller.go:684] controllers/KubeadmConfig "msg"="Altering ClusterConfiguration" "kubeadmconfig"="metal3/test1-controlplane-jkcgj" "PodSubnet"="192.168.0.0/18"
I0402 08:06:24.482676       1 kubeadmconfig_controller.go:691] controllers/KubeadmConfig "msg"="Altering ClusterConfiguration" "kubeadmconfig"="metal3/test1-controlplane-jkcgj" "KubernetesVersion"="v1.18.0"
I0402 10:11:49.331165       1 control_plane_init_mutex.go:105] init-locker "msg"="Checking for lock" "cluster-name"="test1" "configmap-name"="test1-lock" "namespace"="metal3" 
I0402 10:11:49.405934       1 kubeadmconfig_controller.go:621] controllers/KubeadmConfig "msg"="Altering JoinConfiguration.Discovery.BootstrapToken" "kubeadmconfig"="metal3/test1-controlplane-6cl29" "APIServerEndpoint"="192.168.111.249:6443"
I0402 10:11:49.574208       1 kubeadmconfig_controller.go:637] controllers/KubeadmConfig "msg"="Altering JoinConfiguration.Discovery.BootstrapToken" "kubeadmconfig"="metal3/test1-controlplane-6cl29" "Token"="bus6tj.efv2sr82lid8md94"
I0402 10:11:49.578714       1 kubeadmconfig_controller.go:484] controllers/KubeadmConfig "msg"="Creating BootstrapData for the join control plane" "kind"="Machine" "kubeadmconfig"={"Namespace":"metal3","Name":"test1-controlplane-6cl29"} "name"="test1-controlplane-ds5f9" "version"="46528" 
I0402 10:11:49.829941       1 kubeadmconfig_controller.go:215] controllers/KubeadmConfig "msg"="refreshing token until the infrastructure has a chance to consume it" "kind"="Machine" "kubeadmconfig"={"Namespace":"metal3","Name":"test1-controlplane-6cl29"} "name"="test1-controlplane-ds5f9" "version"="46539" 
I0402 10:18:09.154525       1 kubeadmconfig_controller.go:215] controllers/KubeadmConfig "msg"="refreshing token until the infrastructure has a chance to consume it" "kind"="Machine" "kubeadmconfig"={"Namespace":"metal3","Name":"test1-controlplane-6cl29"} "name"="test1-controlplane-ds5f9" "version"="46550" 

@Xenwar
Copy link
Member Author

Xenwar commented Apr 2, 2020

capi-kubeadm-control-plane-controller-manager-57bcb55696-jwkj2.log

I0402 08:03:09.214179       1 listener.go:44] controller-runtime/metrics "msg"="metrics server is starting to listen"  "addr"="127.0.0.1:8080"
I0402 08:03:09.215408       1 main.go:143] setup "msg"="starting manager"  "version"=""
I0402 08:03:09.217851       1 leaderelection.go:242] attempting to acquire leader lease  capi-kubeadm-control-plane-system/kubeadm-control-plane-manager-leader-election-capi...
I0402 08:03:09.218980       1 internal.go:356] controller-runtime/manager "msg"="starting metrics server"  "path"="/metrics"
I0402 08:03:09.240908       1 leaderelection.go:252] successfully acquired lease capi-kubeadm-control-plane-system/kubeadm-control-plane-manager-leader-election-capi
I0402 08:03:09.242313       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmcontrolplane" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"version":"","infrastructureTemplate":{},"kubeadmConfigSpec":{}},"status":{"initialized":false,"ready":false}}}
I0402 08:03:09.348264       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmcontrolplane" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 08:03:09.452042       1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource"  "controller"="kubeadmcontrolplane" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 08:03:09.552763       1 controller.go:171] controller-runtime/controller "msg"="Starting Controller"  "controller"="kubeadmcontrolplane"
I0402 08:03:09.552895       1 controller.go:190] controller-runtime/controller "msg"="Starting workers"  "controller"="kubeadmcontrolplane" "worker count"=10
I0402 08:06:10.716246       1 controller.go:114] controllers/KubeadmControlPlane "msg"="Cluster Controller has not yet set OwnerRef" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 08:06:21.433417       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 08:06:23.925201       1 controller.go:228] controllers/KubeadmControlPlane "msg"="Initializing control plane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" "Desired"=1 "Existing"=0
E0402 08:06:27.123285       1 controller.go:146] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: no route to host" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 08:06:27.178628       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: no route to host"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}
I0402 08:06:27.180820       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 08:06:33.267355       1 controller.go:146] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: no route to host" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 08:06:33.290399       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: no route to host"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}
I0402 08:06:33.291261       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 08:13:02.390225       1 controller.go:146] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: no route to host" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 08:13:02.392407       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: no route to host"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}
I0402 08:16:23.140414       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 08:16:23.988402       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:11:03.766855       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:11:47.310812       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:11:47.311666       1 controller.go:216] controllers/KubeadmControlPlane "msg"="Upgrading Control Plane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:11:50.106923       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:11:50.108887       1 controller.go:216] controllers/KubeadmControlPlane "msg"="Upgrading Control Plane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:21:50.456927       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:21:50.457775       1 controller.go:216] controllers/KubeadmControlPlane "msg"="Upgrading Control Plane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 10:22:07.874435       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="failed to reconcile the remote kubelet RBAC role: failed to determine if kubelet config rbac role \"kubeadm:kubelet-config-1.18\" already exists: etcdserver: leader changed"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}
I0402 10:22:07.877297       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:22:07.878602       1 controller.go:216] controllers/KubeadmControlPlane "msg"="Upgrading Control Plane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 10:23:07.577960       1 controller.go:146] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="Get https://192.168.111.249:6443/api/v1/nodes?labelSelector=node-role.kubernetes.io%2Fmaster%3D\u0026timeout=30s: http2: server sent GOAWAY and closed the connection; LastStreamID=11397, ErrCode=NO_ERROR, debug=\"\"" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 10:23:07.579418       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="Get https://192.168.111.249:6443/api/v1/nodes?labelSelector=node-role.kubernetes.io%2Fmaster%3D\u0026timeout=30s: http2: server sent GOAWAY and closed the connection; LastStreamID=11397, ErrCode=NO_ERROR, debug=\"\""  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}
I0402 10:23:07.580702       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 10:23:17.881225       1 upgrade.go:46]  "msg"="failed to get remote client for workload cluster" "error"="failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused" "cluster-nanme"="test1" "name"="test1-controlplane" "namespace"="metal3" "cluster key"={"Namespace":"metal3","Name":"test1"}
E0402 10:23:17.883882       1 controller.go:146] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 10:23:17.885593       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="[failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused, failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused]"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}
W0402 10:23:27.598256       1 http.go:392] Error reading backend response: unexpected EOF
W0402 10:23:27.598261       1 http.go:392] Error reading backend response: unexpected EOF
W0402 10:23:27.601267       1 http.go:392] Error reading backend response: unexpected EOF
W0402 10:23:27.601315       1 http.go:392] Error reading backend response: unexpected EOF
W0402 10:23:27.605347       1 http.go:392] Error reading backend response: unexpected EOF
I0402 10:23:28.125302       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:23:28.125988       1 controller.go:216] controllers/KubeadmControlPlane "msg"="Upgrading Control Plane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 10:23:28.135422       1 upgrade.go:46]  "msg"="failed to get remote client for workload cluster" "error"="failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused" "cluster-nanme"="test1" "name"="test1-controlplane" "namespace"="metal3" "cluster key"={"Namespace":"metal3","Name":"test1"}
E0402 10:23:28.140855       1 controller.go:146] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 10:23:28.141793       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="[failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused, failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused]"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}
I0402 10:23:48.623826       1 controller.go:170] controllers/KubeadmControlPlane "msg"="Reconcile KubeadmControlPlane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
I0402 10:23:48.627823       1 controller.go:216] controllers/KubeadmControlPlane "msg"="Upgrading Control Plane" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 12:00:41.829053       1 upgrade.go:46]  "msg"="failed to get remote client for workload cluster" "error"="failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused" "cluster-nanme"="test1" "name"="test1-controlplane" "namespace"="metal3" "cluster key"={"Namespace":"metal3","Name":"test1"}
E0402 12:00:41.832277       1 controller.go:146] controllers/KubeadmControlPlane "msg"="Failed to update KubeadmControlPlane Status" "error"="failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused" "cluster"="test1" "kubeadmControlPlane"="test1-controlplane" "namespace"="metal3" 
E0402 12:00:41.834345       1 controller.go:258] controller-runtime/controller "msg"="Reconciler error" "error"="[failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused, failed to create remote cluster client: failed to create client for workload cluster metal3/test1: Get https://192.168.111.249:6443/api?timeout=30s: dial tcp 192.168.111.249:6443: connect: connection refused]"  "controller"="kubeadmcontrolplane" "request"={"Namespace":"metal3","Name":"test1-controlplane"}

@Xenwar
Copy link
Member Author

Xenwar commented Apr 2, 2020

capm3-controller-manager-8d8dd9f68-7cxww.log

I0402 11:59:55.911166       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 11:59:55.929792       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:00:27.319720       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 
I0402 12:00:44.245423       1 metal3cluster_controller.go:108] Metal3Cluster-controller "msg"="Reconciling metal3Cluster" "cluster"="test1" "metal3-cluster"={"Namespace":"metal3","Name":"test1"} 

I0402 12:02:26.797307       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 
I0402 12:02:26.821534       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 
I0402 12:02:26.842085       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 
I0402 12:02:26.859891       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 
I0402 12:02:42.293116       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 
I0402 12:02:42.295407       1 metal3cluster_controller.go:108] Metal3Cluster-controller "msg"="Reconciling metal3Cluster" "cluster"="test1" "metal3-cluster"={"Namespace":"metal3","Name":"test1"} 
I0402 12:02:42.479712       1 metal3cluster_controller.go:108] Metal3Cluster-controller "msg"="Reconciling metal3Cluster" "cluster"="test1" "metal3-cluster"={"Namespace":"metal3","Name":"test1"} 
I0402 12:02:42.513785       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 
I0402 12:02:42.536402       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:02:42.723158       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:02:42.743661       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-ds5f9" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"newos-zgwg4"} 

I0402 12:06:56.969773       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:06:56.988193       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:06:57.019684       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:06:57.033606       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:06:57.058244       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:06:57.101875       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:06:57.171063       1 metal3machine_manager.go:519] controllers/Metal3Machine/Metal3Machine-controller "msg"="Updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 
I0402 12:06:57.191543       1 metal3machine_manager.go:557] controllers/Metal3Machine/Metal3Machine-controller "msg"="Finished updating machine" "cluster"="test1" "machine"="test1-controlplane-8fj79" "metal3-cluster"="test1" "metal3-machine"={"Namespace":"metal3","Name":"test1-controlplane-t7g82"} 

@Xenwar
Copy link
Member Author

Xenwar commented Apr 2, 2020

baremetal-operator.log

{"level":"info","ts":1585822678.8002918,"logger":"baremetalhost_ironic","msg":"provisioning image to host","host":"node-0","state":"deploying"}
{"level":"info","ts":1585822678.8003073,"logger":"baremetalhost_ironic","msg":"checking image settings","host":"node-0","source":"https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img","checksum":"http://172.22.0.1/images/bionic-server-cloudimg-amd64.img.md5sum","same":true,"provisionState":"deploying"}
{"level":"info","ts":1585822678.8003209,"logger":"baremetalhost_ironic","msg":"waiting for host to become available","host":"node-0","state":"deploying","deploy step":{"argsinfo":null,"interface":"deploy","priority":100,"step":"deploy"}}
{"level":"info","ts":1585822678.8003452,"logger":"baremetalhost","msg":"saving host status","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioning","operational status":"OK","provisioning state":"provisioning"}
{"level":"info","ts":1585822678.8235397,"logger":"baremetalhost","msg":"done","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioning","requeue":true,"after":10}
{"level":"info","ts":1585822688.6862218,"logger":"baremetalhost","msg":"Reconciling BareMetalHost","Request.Namespace":"metal3","Request.Name":"node-0"}
{"level":"info","ts":1585822688.6871085,"logger":"baremetalhost","msg":"provisioning","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioning"}
{"level":"info","ts":1585822688.7498462,"logger":"baremetalhost_ironic","msg":"found existing node by ID","host":"node-0"}
{"level":"info","ts":1585822688.7498672,"logger":"baremetalhost_ironic","msg":"provisioning image to host","host":"node-0","state":"active"}
{"level":"info","ts":1585822688.7498777,"logger":"baremetalhost_ironic","msg":"checking image settings","host":"node-0","source":"https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img","checksum":"http://172.22.0.1/images/bionic-server-cloudimg-amd64.img.md5sum","same":true,"provisionState":"active"}
{"level":"info","ts":1585822688.7498937,"logger":"baremetalhost_ironic","msg":"finished provisioning","host":"node-0"}
{"level":"info","ts":1585822688.749899,"logger":"baremetalhost","msg":"updating deployed image in status","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioning"}
{"level":"info","ts":1585822688.7499065,"logger":"baremetalhost","msg":"changing provisioning state","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioning","old":"provisioning","new":"provisioned"}
{"level":"info","ts":1585822688.749922,"logger":"baremetalhost","msg":"saving host status","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioning","operational status":"OK","provisioning state":"provisioned"}
{"level":"info","ts":1585822688.7702422,"logger":"baremetalhost","msg":"publishing event","reason":"ProvisioningComplete","message":"Image provisioning completed for https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img"}
{"level":"info","ts":1585822688.8162453,"logger":"baremetalhost","msg":"done","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioning","requeue":true,"after":0}
{"level":"info","ts":1585822688.8166747,"logger":"baremetalhost","msg":"Reconciling BareMetalHost","Request.Namespace":"metal3","Request.Name":"node-0"}
{"level":"info","ts":1585822688.9209285,"logger":"baremetalhost_ironic","msg":"found existing node by ID","host":"node-0"}
{"level":"info","ts":1585822688.9215071,"logger":"baremetalhost_ironic","msg":"updating hardware state","host":"node-0"}
{"level":"info","ts":1585822689.0648508,"logger":"baremetalhost_ironic","msg":"found existing node by ID","host":"node-0"}
{"level":"info","ts":1585822689.0713723,"logger":"baremetalhost","msg":"saving host status","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioned","operational status":"OK","provisioning state":"provisioned"}
{"level":"info","ts":1585822689.1078503,"logger":"baremetalhost","msg":"done","Request.Namespace":"metal3","Request.Name":"node-0","provisioningState":"provisioned","requeue":true,"after":60}
{"level":"info","ts":1585822689.108544,"logger":"baremetalhost","msg":"Reconciling BareMetalHost","Request.Namespace":"metal3","Request.Name":"node-0"}
{"level":"info","ts":1585822689.2254303,"logger":"baremetalhost_ironic","msg":"found existing node by ID","host":"node-0"}
{"level":"info","ts":1585822689.2259176,"logger":"baremetalhost_ironic","msg":"updating hardware state","host":"node-0"}
{"level":"info","ts":1585822689.347832,"logger":"baremetalhost_ironic","msg":"found existing node by ID","host":"node-0"}
{"level":"info","ts":1585829277.057415,"logger":"baremetalhost","msg":"saving host status","Request.Namespace":"metal3","Request.Name":"node-1","provisioningState":"provisioned","operational status":"OK","provisioning state":"provisioned"}
{"level":"info","ts":1585829277.1114109,"logger":"baremetalhost","msg":"done","Request.Namespace":"metal3","Request.Name":"node-1","provisioningState":"provisioned","requeue":true,"after":60}
{"level":"info","ts":1585829277.112044,"logger":"baremetalhost","msg":"Reconciling BareMetalHost","Request.Namespace":"metal3","Request.Name":"node-1"}
{"level":"info","ts":1585829277.1801221,"logger":"baremetalhost_ironic","msg":"found existing node by ID","host":"node-1"}
{"level":"info","ts":1585829277.1803243,"logger":"baremetalhost_ironic","msg":"updating hardware state","host":"node-1"}
{"level":"info","ts":1585829277.2336922,"logger":"baremetalhost_ironic","msg":"found existing node by ID","host":"node-1"}
{"level":"info","ts":1585829277.2338178,"logger":"baremetalhost","msg":"saving host status","Request.Namespace":"metal3","Request.Name":"node-1","provisioningState":"provisioned","operational status":"OK","provisioning state":"provisioned"}
{"level":"info","ts":1585829277.2845716,"logger":"baremetalhost","msg":"done","Request.Namespace":"metal3","Request.Name":"node-1","provisioningState":"provisioned","requeue":true,"after":60}

@Xenwar
Copy link
Member Author

Xenwar commented Apr 2, 2020

In the above logs, some repeating logs are removed for brevity.

@Jaakko-Os
Copy link
Contributor

Open bugs that seem to impact this issue:
kubernetes-sigs/cluster-api#2915
kubernetes-sigs/cluster-api#2771
kubernetes-sigs/cluster-api#2859

@Xenwar
Copy link
Member Author

Xenwar commented Apr 30, 2020

This issue is reproduced with k8s version upgrade as well. And, this is not related to the earlier block issue.

Gathering more information.

@Xenwar
Copy link
Member Author

Xenwar commented May 11, 2020

This issue was caused by VIP not migrating to the control plane node. That made both old and new control plane nodes none responsive.

@Xenwar Xenwar closed this as completed May 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

No branches or pull requests

3 participants