
ingress-controller pod that routes all incoming traffic to k8s pods got evicted? #2322

Closed · 2 tasks done
consideRatio opened this issue Mar 8, 2023 · 1 comment · Fixed by #2324

consideRatio commented Mar 8, 2023

Julius at LEAP reported seeing a popup associated with losing the connection between JupyterLab in the browser and the Jupyter server, in https://2i2c.freshdesk.com/a/tickets/528.

The popup stems from JupyterLab believing the connection to the user server is lost. I don't think this relates to Julius's internet connection, but rather to a disruption of networking for some reason.

  1. Was it the jupyterhub chart's proxy pod being evicted/restarted? No.
  2. Was it the ingress controller pod being evicted/restarted? Yes, I think it was evicted!

Why was the ingress controller pod evicted and a new pod started? I'm quite sure it was evicted from a node because of memory pressure: I see that a node-exporter daemonset pod, which can't be evicted, was restarted after being OOMKilled just a minute before a new ingress controller pod was started on another node.

We can avoid this by setting better memory requests for the ingress controller pod, and by running multiple replicas of it, as sketched below. Keeping these pods highly available and reliably running is very relevant.
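
A rough sketch of what that could look like in our ingress-nginx configuration, assuming we use the upstream chart's controller.replicaCount and controller.resources keys (the numbers here are placeholders, not decided values):

    ingress-nginx:
      controller:
        # Sketch only: run more than one controller pod and give it a larger
        # memory request than the chart's 90Mi default. Exact values to be
        # decided from observed usage.
        replicaCount: 2
        resources:
          requests:
            cpu: 100m
            memory: 250Mi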

Related

Our ingress-nginx chart configuration doesn't specify any CPU or memory requests.

# ingress-nginx is responsible for proxying traffic based on k8s Ingress
# resources defined in the k8s cluster.
#
# Typically, all inbound traffic arrives at some ingress-nginx controller pod
# via a k8s Service that has a public IP, and is thereafter proxied to a k8s
# Service (a Pod selected by a k8s Service) that doesn't have a public IP.
#
# values ref: https://github.com/kubernetes/ingress-nginx/blob/main/charts/ingress-nginx/values.yaml
#
ingress-nginx:
  controller:
    podLabels:
      # nginx-ingress controllers need to be allowed to proxy traffic onwards
      # to JupyterHub's proxy pod (and only the proxy pod) in clusters with
      # NetworkPolicy enforcement enabled. Adding this label on the controller
      # pod allows that.
      #
      # ref: https://z2jh.jupyter.org/en/stable/administrator/security.html#introduction-to-the-chart-s-four-network-policies
      #
      hub.jupyter.org/network-access-proxy-http: "true"
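
For context, here is a minimal, hypothetical example of the kind of k8s Ingress resource this controller acts on, routing traffic for an illustrative hostname to the JupyterHub chart's proxy-public service (a Service without a public IP of its own):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: jupyterhub
    spec:
      ingressClassName: nginx
      rules:
        - host: hub.example.org
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    # the JupyterHub chart's proxy pod is exposed via this
                    # cluster-internal service
                    name: proxy-public
                    port:
                      number: 80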

The default values for ingress-nginx are these:

    ## Define requests resources to avoid probe issues due to CPU utilization in busy nodes
    ## ref: https://github.com/kubernetes/ingress-nginx/issues/4735#issuecomment-551204903
    ## Ideally, there should be no limits.
    ## https://engineering.indeedblog.com/blog/2019/12/cpu-throttling-regression-fix/
    resources:
        ##  limits:
        ##    cpu: 100m
        ##    memory: 90Mi
        requests:
            cpu: 100m
            memory: 90Mi

Action points

@consideRatio
Copy link
Member Author

consideRatio commented Mar 9, 2023

Looking at LEAP, 2i2c, and utoronto, I see a range of memory use between 104-131Mi. I think doubling the default request makes sense, arriving at 180Mi. For a bit more margin, let's make it closer to a doubling of what we've observed so far: 250Mi.

kubectl top pod -n support -l app.kubernetes.io/name=ingress-nginx

NAME                                                CPU(cores)   MEMORY(bytes)   
support-ingress-nginx-controller-6585f58669-9zms5   8m           131Mi
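
Expressed in our configuration, the bump could look like this (a sketch; I'm assuming we keep the chart's default 100m CPU request and only raise the memory request):

    ingress-nginx:
      controller:
        resources:
          requests:
            cpu: 100m
            memory: 250Mi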
