Debugging external-apps-ingress-controller in Prod cluster #29
Comments
See my comment on the PR: the original value was correct (and you can verify that with a simple DNS query; compare the result of looking up |
And to avoid some additional confusion: there is some overlap between this issue and #16. This issue is supposed to be "Why isn't the external ingress controller running?" |
It looks like the external ingressController is not being created because its pods are not schedulable (see the PodsScheduled status at: here). I quickly looked through the nodes available to the prod cluster and I don't see any with the label (zone: external), which, as you can see in the ingressController yaml, is what it's looking for.
On several of the prod cluster's worker nodes there is, however, a label (nerc.mghpcc.org/external-ingress: 'true') which appears to be the label we are actually looking for. I think the namespace label may become an issue as well, but the main error is related to the nodeSelector, so I'm going to open a PR to make this change and will link it below. |
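To illustrate why the selector mismatch leaves the pods unschedulable, here is a minimal Python sketch of equality-based label matching, which is roughly how a nodeSelector's matchLabels is evaluated. The two label sets are the ones mentioned above; the node names (`worker-0`, etc.) and the helper functions are hypothetical and only for illustration, not taken from the cluster or from any Kubernetes client library.

```python
# Sketch of equality-based label matching (as in a nodeSelector's matchLabels).
# Node names below are hypothetical; only the label keys/values come from the thread.

def matches_node_selector(node_labels, match_labels):
    """Return True if every key/value in match_labels is present on the node."""
    return all(node_labels.get(key) == value for key, value in match_labels.items())


def schedulable_nodes(nodes, match_labels):
    """Return the sorted names of nodes whose labels satisfy the selector."""
    return sorted(
        name for name, labels in nodes.items()
        if matches_node_selector(labels, match_labels)
    )


# Hypothetical inventory: two workers carry the external-ingress label, one does not.
nodes = {
    "worker-0": {"nerc.mghpcc.org/external-ingress": "true"},
    "worker-1": {"nerc.mghpcc.org/external-ingress": "true"},
    "worker-2": {},
}

old_selector = {"zone": "external"}                          # what the yaml asked for
new_selector = {"nerc.mghpcc.org/external-ingress": "true"}  # what the nodes carry

print(schedulable_nodes(nodes, old_selector))
print(schedulable_nodes(nodes, new_selector))
```

With the old selector no node qualifies, so the scheduler has nowhere to place the router pods; with the new selector the labeled workers become candidates.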
The patch in OCP-on-NERC/nerc-ocp-config#42 did not result in any change to the error status of the ingressController at: https://console-openshift-console.apps.nerc-ocp-prod.rc.fas.harvard.edu/k8s/ns/openshift-ingress-operator/operator.openshift.io I have also determined that the namespaceSelector field shouldn't be the root of the problem since
and so this should have no impact on the ingressController pods being scheduled. |
After deleting the external-apps-ingress-controller ingressController in the prod cluster and recreating it via an Argo CD sync, the PR to update the nodeSelector fields does appear to have worked, as the ingressController is reporting that pods are now scheduled. There is still something going on here, though, because we're still seeing 0/2 replicas available |
See also: #41 |
This has been closed by recent pull requests. |
I've done some looking around and I'm seeing what may be a typo in the ingressController yaml file on this line
I'm assuming it's meant to be:
As opposed to:
This may not be the only thing that needs changing, since there is also something going on with node scheduling, as seen in the conditions here, but it's a start.