[donotmerge] debug aks e2e #1488
@alexeldeib: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
In that one the cluster reached provisioned, but both MPs seem to have gotten stuck. Added some debug logs and trying again. What's weird is that they both seem to have been created in Azure. Toward the end you can see the logs from the agentpools service where we diff the current/desired agentpool, and both pools are up to date. But we don't seem to reach the end of the reconcile, and yet I don't see any errors. Could be failing to call VMSS but somehow not logging it.
/hold just for debug
Hmm. In that run, the cluster failed to provision, but the AMCP is fully populated/ready. Checking the timestamps, it looks like the AMCP was "reconciling kubeconfig" within 5 min,
and then we randomly start failing on identity not present, AFTER the AMCP successfully reconciles?
Interestingly, I do not see the AzureClusterIdentity in the output artifacts: https://gcsweb.k8s.io/gcs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api-provider-azure/1488/pull-cluster-api-provider-azure-e2e-exp/1411107169369067520/artifacts/clusters/bootstrap/resources/capz-e2e-ni5r7m/
/test pull-cluster-api-provider-azure-e2e-exp getting some more data
So for the failures waiting for a control plane machine where it looks "hung", it appears we get stuck looping here: cluster-api-provider-azure/exp/controllers/azuremanagedmachinepool_reconciler.go, line 149 in 03ba941.
Now to find out why.
Update: from a successful run, this is the period during which no VMSS are created, so we're looping trying to find one. I don't get why that makes sense though: if we are able to run pods, the VMs are clearly there. Something is weird with the amounts of time between different events.
/test pull-cluster-api-provider-azure-e2e-exp
In the last run, it looks like pods did come up, but we get no results listing scale sets and eventually time out :/ probably want to dump
So it seems like we're really stuck in that loop, but the nodes from both pools already joined. So I'm trying to understand why we keep listing VMSS and getting zero results from Azure.
/test pull-cluster-api-provider-azure-e2e-exp
/test pull-cluster-api-provider-azure-e2e
This reverts commit 5498946.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
This reverts commit 3ac2a7d.
@alexeldeib: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/close
@CecileRobertMichon: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.