
ingress-controller pod that routes all incoming traffic to k8s pods got evicted? #2322

Closed · 2 tasks done
consideRatio opened this issue Mar 8, 2023 · 1 comment · Fixed by #2324

consideRatio commented Mar 8, 2023

Julius at LEAP reported seeing a popup associated with losing the connection between JupyterLab in the browser and the Jupyter server, in https://2i2c.freshdesk.com/a/tickets/528.

The popup stems from JupyterLab believing the connection to the user server is lost. I don't think this relates to Julius's internet connection, but rather to a disruption of networking for some reason.

  1. Was it the jupyterhub chart's proxy pod being evicted/restarted? No.
  2. Was it the ingress controller pod being evicted/restarted? Yes, I think it was evicted!

Why was the ingress controller pod evicted and a new pod started? I'm quite sure it was evicted from a node because of memory pressure: I see that a node-exporter daemonset pod, which can't be evicted, was restarted after being OOMKilled just a minute before a new ingress controller pod was started on another node.

We can avoid this by setting better memory requests for the ingress controller pod, and by running multiple replicas of it, as sketched below. Keeping these pods highly available and reliably running is very relevant.
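
A rough sketch of what that could look like in our ingress-nginx configuration, assuming we use the upstream chart's controller.replicaCount and controller.resources keys (the numbers here are placeholders, not decided values):

    ingress-nginx:
      controller:
        # Sketch only: run more than one controller pod and give it a larger
        # memory request than the chart's 90Mi default. Exact values to be
        # decided from observed usage.
        replicaCount: 2
        resources:
          requests:
            cpu: 100m
            memory: 250Mi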

Related

Our ingress-nginx chart configuration doesn't specify any CPU or memory requests.

# ingress-nginx is responsible for proxying traffic based on k8s Ingress
# resources defined in the k8s cluster.
#
# Typically, all inbound traffic arrives at some ingress-nginx controller pod
# via a k8s Service that has a public IP, and is thereafter proxied to a k8s
# Service (a Pod selected by a k8s Service) that doesn't have a public IP.
#
# values ref: https://github.com/kubernetes/ingress-nginx/blob/main/charts/ingress-nginx/values.yaml
#
ingress-nginx:
  controller:
    podLabels:
      # nginx-ingress controllers need to be allowed to proxy traffic onwards
      # to JupyterHub's proxy pod (and only the proxy pod) in clusters with
      # NetworkPolicy enforcement enabled. Adding this label on the controller
      # pod allows that.
      #
      # ref: https://z2jh.jupyter.org/en/stable/administrator/security.html#introduction-to-the-chart-s-four-network-policies
      #
      hub.jupyter.org/network-access-proxy-http: "true"
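
For context, here is a minimal, hypothetical example of the kind of k8s Ingress resource this controller acts on, routing traffic for an illustrative hostname to the JupyterHub chart's proxy-public service (a Service without a public IP of its own):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: jupyterhub
    spec:
      ingressClassName: nginx
      rules:
        - host: hub.example.org
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    # the JupyterHub chart's proxy pod is exposed via this
                    # cluster-internal service
                    name: proxy-public
                    port:
                      number: 80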

The default values for ingress-nginx are these:

    ## Define requests resources to avoid probe issues due to CPU utilization in busy nodes
    ## ref: https://github.com/kubernetes/ingress-nginx/issues/4735#issuecomment-551204903
    ## Ideally, there should be no limits.
    ## https://engineering.indeedblog.com/blog/2019/12/cpu-throttling-regression-fix/
    resources:
        ##  limits:
        ##    cpu: 100m
        ##    memory: 90Mi
        requests:
            cpu: 100m
            memory: 90Mi

Action points

@consideRatio
Copy link
Member Author

consideRatio commented Mar 9, 2023

Looking at LEAP, 2i2c, and utoronto, I see a range of memory use between 104-131Mi. I think doubling the default request makes sense, arriving at 180Mi. For a bit more margin, let's make it closer to a doubling of what we've observed so far: 250Mi.

kubectl top pod -n support -l app.kubernetes.io/name=ingress-nginx

NAME                                                CPU(cores)   MEMORY(bytes)   
support-ingress-nginx-controller-6585f58669-9zms5   8m           131Mi
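
Expressed in our configuration, the bump could look like this (a sketch; I'm assuming we keep the chart's default 100m CPU request and only raise the memory request):

    ingress-nginx:
      controller:
        resources:
          requests:
            cpu: 100m
            memory: 250Mi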
