Cluster creation fails with NodesNotReady status if network policy is calico #905
Comments
This is a bug in v1.12.x versions; earlier versions, and clusters using Azure network policy, still work. It should be fixed in the next AKS release.
Pending; the 4-8-2019 release is in the queue.
@jnoller does https://github.com/Azure/AKS/blob/master/CHANGELOG.md#release-2019-04-08-hotfix contain the fix for this? If so, has it already been deployed to WestEurope? I still see this error.
It does not; that should begin rolling out tomorrow.
I faced the same issue when upgrading the cluster from 1.12.6 to 1.12.7 with calico enabled. Now the cluster is in a failed state. Will the patch fix existing clusters (calico + AKS 1.12), or only newly created clusters?
@thatInfrastructureGuy It will only patch new clusters, as this feature is in preview and should not be enabled on production systems.
The issue has been fixed now. @thatInfrastructureGuy @katrinsharp Could you try to upgrade the cluster, or create a new one? After upgrading, remember to do a cleanup first (this is required, see discussion on aks-engine here):
If other Pods (e.g. the dashboard) are in a crashing state, deleting them and letting Kubernetes create new ones could bring them back.
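For anyone with more than a couple of crashing Pods, here is a minimal sketch of how the affected Pods could be listed before deleting them. It assumes the input is the JSON printed by `kubectl get pods --all-namespaces -o json`, and it only treats the `CrashLoopBackOff` waiting reason as "crashing"; both the helper names and that choice are assumptions for illustration, not something from this thread.

```python
import json

# Assumption: only containers waiting in CrashLoopBackOff count as "crashing".
CRASH_REASON = "CrashLoopBackOff"


def crashing_pods(pod_list_json):
    """Return (namespace, name) pairs for Pods with a crash-looping container."""
    pods = json.loads(pod_list_json).get("items", [])
    found = []
    for pod in pods:
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for container_status in statuses:
            # "waiting" is absent or null when the container is running.
            waiting = container_status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == CRASH_REASON:
                meta = pod["metadata"]
                found.append((meta["namespace"], meta["name"]))
                break  # one crashing container is enough to flag the Pod
    return found


def delete_commands(pods):
    """Build the kubectl commands to run; nothing is executed here."""
    return [f"kubectl delete pod {name} --namespace {ns}" for ns, name in pods]


if __name__ == "__main__":
    import sys

    # Usage: kubectl get pods --all-namespaces -o json | python this_script.py
    for command in delete_commands(crashing_pods(sys.stdin.read())):
        print(command)
```

The script only prints the `kubectl delete pod` commands so they can be reviewed first; Pods owned by a Deployment or DaemonSet are recreated by their controller after deletion.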
Already fixed now. Please upgrade or create a new cluster if you have a failed cluster with calico network policy.
What happened:
A cluster was previously created successfully (Kubernetes 1.12.6) with the same template that fails now. If I remove the networkProfile portion of the template, the deployment finishes successfully; if I add it back, the deployment takes a long time to complete (~40 min), stuck on "Create or Update Managed Cluster", and then fails with the following message:
The MC_* resource group is created successfully.
Downgrading to 1.12.6 and using calico doesn't help; it still has exactly the same issue. If the networkProfile part is omitted from the ARM template, the cluster is created successfully.
Region: US East.
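To compare the two deployments side by side, something like the following sketch could produce the variant of the template without networkProfile. It assumes the template has been loaded as a Python dict and that the cluster is declared as a top-level resource of type Microsoft.ContainerService/managedClusters with networkProfile under properties; the function name and the sample template are hypothetical, and the sample is not the template from this report.

```python
import copy

AKS_RESOURCE_TYPE = "Microsoft.ContainerService/managedClusters"


def without_network_profile(template):
    """Return a copy of an ARM template with networkProfile removed from AKS resources."""
    result = copy.deepcopy(template)  # leave the caller's template untouched
    for resource in result.get("resources", []):
        if resource.get("type") == AKS_RESOURCE_TYPE:
            resource.get("properties", {}).pop("networkProfile", None)
    return result


if __name__ == "__main__":
    # Hypothetical, heavily trimmed template used only to exercise the helper.
    sample_template = {
        "resources": [
            {
                "type": AKS_RESOURCE_TYPE,
                "name": "example-cluster",
                "properties": {
                    "kubernetesVersion": "1.12.7",
                    "networkProfile": {"networkPolicy": "calico"},
                },
            }
        ]
    }
    stripped = without_network_profile(sample_template)
    print(stripped["resources"][0]["properties"])
```

Deploying both variants with otherwise identical parameters isolates networkProfile as the only difference between the working and the failing deployment.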
What you expected to happen:
Cluster deployment is successful.
How to reproduce it (as minimally and precisely as possible):
Use the following as part of the cluster ARM template:
Anything else we need to know?:
Environment:
Kubernetes version (use kubectl version): 1.12.7