
[webhook] When storage scale-out is pending because of insufficient resources, scale-in cannot be executed; it seems stuck #320

Open
jinyingsunny opened this issue Oct 9, 2023 · 2 comments
Labels
affects/none · severity/none · type/bug

Comments

jinyingsunny commented Oct 9, 2023

With the admission webhook enabled, I scaled out storaged, but the new pod failed to schedule because there was not enough CPU.
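For context, a scale-out like this is typically done by raising the storaged replica count on the NebulaCluster resource. A minimal sketch of such a command, assuming the cluster is named nebulazone (the target count 10 is only inferred from the Pending pod index -9 below):

$ # Sketch: scale out storaged by raising replicas on the NebulaCluster spec.
$ # The count 10 is an assumption inferred from the Pending pod nebulazone-storaged-9.
$ kubectl -n nebula patch nebulacluster nebulazone --type merge \
    -p '{"spec": {"storaged": {"replicas": 10}}}'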

$ kubectl -n nebula describe pod nebulazone-storaged-9
Name:             nebulazone-storaged-9
Namespace:        nebula
Priority:         0
Service Account:  nebula-sa
Node:             <none>
Labels:           app.kubernetes.io/cluster=nebulazone
                  app.kubernetes.io/component=storaged
                  app.kubernetes.io/managed-by=nebula-operator
                  app.kubernetes.io/name=nebula-graph
                  controller-revision-hash=nebulazone-storaged-5b568d554c
                  statefulset.kubernetes.io/pod-name=nebulazone-storaged-9
Annotations:      cloud.google.com/cluster_autoscaler_unhelpable_since: 2023-10-09T09:58:34+0000
                  cloud.google.com/cluster_autoscaler_unhelpable_until: Inf
                  nebula-graph.io/cm-hash: 760645648930d20e
Status:           Pending
IP:
IPs:              <none>
Controlled By:    StatefulSet/nebulazone-storaged
Containers:
  storaged:
    Image:       asia-east2-docker.pkg.dev/nebula-cloud-test/poc/rc/nebula-storaged-ent:v3.5.0-sc
    Ports:       9779/TCP, 19789/TCP, 9778/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Command:
      /bin/sh
      -ecx
      exec /usr/local/nebula/bin/nebula-storaged --flagfile=/usr/local/nebula/etc/nebula-storaged.conf --meta_server_addrs=nebulazone-metad-0.nebulazone-metad-headless.nebula.svc.cluster.local:9559,nebulazone-metad-1.nebulazone-metad-headless.nebula.svc.cluster.local:9559,nebulazone-metad-2.nebulazone-metad-headless.nebula.svc.cluster.local:9559 --local_ip=$(hostname).nebulazone-storaged-headless.nebula.svc.cluster.local --ws_ip=$(hostname).nebulazone-storaged-headless.nebula.svc.cluster.local --daemonize=false --ws_http_port=19789
    Limits:
      cpu:     3
      memory:  16Gi
    Requests:
      cpu:        2
      memory:     8Gi
    Readiness:    http-get http://:19789/status delay=10s timeout=5s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /usr/local/nebula/data from storaged-data (rw,path="data")
      /usr/local/nebula/etc/nebula-storaged.conf from nebulazone-storaged (rw,path="nebula-storaged.conf")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j86r9 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  storaged-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  storaged-data-nebulazone-storaged-9
    ReadOnly:   false
  nebulazone-storaged:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nebulazone-storaged
    Optional:  false
  kube-api-access-j86r9:
    Type:                     Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:   3607
    ConfigMapName:            kube-root-ca.crt
    ConfigMapOptional:        <nil>
    DownwardAPI:              true
QoS Class:                    Burstable
Node-Selectors:               <none>
Tolerations:                  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints:  topology.kubernetes.io/zone:DoNotSchedule when max skew 1 is exceeded for selector app.kubernetes.io/cluster=nebulazone,app.kubernetes.io/component=storaged,app.kubernetes.io/managed-by=nebula-operator,app.kubernetes.io/name=nebula-graph
Events:
  Type     Reason             Age   From                Message
  ----     ------             ----  ----                -------
  Warning  FailedScheduling   48s   nebula-scheduler    0/3 nodes are available: 2 Insufficient cpu, 2 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..
  Warning  FailedScheduling   45s   nebula-scheduler    0/3 nodes are available: 2 Insufficient cpu, 2 Insufficient memory. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod..
  Normal   NotTriggerScaleUp  46s   cluster-autoscaler  pod didn't trigger scale-up:
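At this point, trying to scale storaged back in is what the title describes as stuck. A minimal sketch of such a scale-in attempt, assuming the same NebulaCluster resource (the target count 9 is illustrative):

$ # Sketch: attempt to revert the scale-out while nebulazone-storaged-9 is still Pending.
$ kubectl -n nebula patch nebulacluster nebulazone --type merge \
    -p '{"spec": {"storaged": {"replicas": 9}}}'
$ # With the admission webhook enabled, the change is blocked while the cluster
$ # is in an intermediate (scaling) state, so the spec cannot be reverted.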

Your Environments (required)

nebula-operator: snap1.19

Expected behavior

When the pod is Pending because of insufficient resources, stop the scale-out and return to the previous state.

jinyingsunny added the type/bug label on Oct 9, 2023
github-actions bot added the affects/none and severity/none labels on Oct 9, 2023
jinyingsunny (Author) commented

I resolved the problem by editing the nebula-operator deployment and setting --enable-admission-webhook=false to stop the webhook.

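A sketch of this workaround, assuming the operator runs as a Deployment named nebula-operator-controller-manager (the deployment name and namespace are assumptions; check your install):

$ # Find the operator deployment (name and namespace vary by install).
$ kubectl get deploy -A | grep nebula-operator
$ # Edit the controller-manager container args so that the webhook flag reads:
$ #   --enable-admission-webhook=false
$ kubectl -n nebula-operator-system edit deployment nebula-operator-controller-manager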

MegaByte875 (Contributor) commented

I don't think the insufficient-resources problem is a bug; the admission webhook is used to prevent operations while the cluster is in an intermediate state.
