
[hpa] only set scaleDown, then many error messages #314

Closed
jinyingsunny opened this issue Oct 7, 2023 · 5 comments
Labels
affects/none · process/done · severity/none · type/bug

Comments


jinyingsunny commented Oct 7, 2023

Disabled scale-down (autoscale.yaml below). When the pressure test stops, the pods are kept alive, but there are some errors in the log:

github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler.(*HorizontalController).stabilizeRecommendationWithBehaviors(0xc000517290, {{0xc000fac6f0, 0x18}, 0x0, 0xc0007339e0, 0x2, 0xa, 0x2, 0x1})
	github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler/autoscaler.go:713 +0xc5
github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler.(*HorizontalController).normalizeDesiredReplicasWithBehaviors(0xc000517290, 0xc000c036c0, {0xc000fac6f0?, 0x18?}, 0x2?, 0x1, 0x2?)
	github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler/autoscaler.go:603 +0x1f4
github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler.(*HorizontalController).reconcileAutoscaler(0xc000517290, {0x25044a8, 0xc0007338f0}, 0xc000c03380, {0xc000fac6f0, 0x18})
	github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler/autoscaler.go:340 +0xada
github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler.(*HorizontalController).Reconcile(0xc000517290, {0x25044a8, 0xc0007338f0}, {{{0xc000c97a86?, 0x0?}, {0xc0010bf200?, 0x40e027?}}})
	github.com/vesoft-inc/nebula-operator/pkg/controller/autoscaler/autoscaler.go:232 +0x49b
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x25044a8?, {0x25044a8?, 0xc0007338f0?}, {{{0xc000c97a86?, 0x1dbf260?}, {0xc0010bf200?, 0x24ec878?}}})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:118 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000245b80, {0x2504400, 0xc00009ad20}, {0x1f91a20?, 0xc0000a19e0?})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:314 +0x377
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000245b80, {0x2504400, 0xc00009ad20})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:265 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:226 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222 +0x587
E1006 08:08:38.743381       1 controller.go:324] "Reconciler error" err="panic: runtime error: invalid memory address or nil pointer dereference [recovered]" controller="nebulaautoscaler" controllerGroup="autoscaling.nebula-graph.io" controllerKind="NebulaAutoscaler" NebulaAutoscaler="nebula/nebula-autoscaler" namespace="nebula" name="nebula-autoscaler" reconcileID="4c635937-8fa6-4b43-bb7d-2e1add916132"
I1006 08:08:52.587035       1 autoscaler.go:206] Finished reconciling NebulaAutoscaler [nebula/nebula-autoscaler], spendTime: (9.48868ms)
E1006 08:08:52.587121       1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 200 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1eb31e0?, 0x3634650})
	k8s.io/[email protected]/pkg/util/runtime/runtime.go:75 +0x99
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:107 +0xc5
panic({0x1eb31e0, 0x3634650})
	runtime/panic.go:884 +0x213
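
The trace is consistent with dereferencing an optional pointer field that was never set. Below is a minimal sketch of that failure class; it is an assumption about the cause, not the operator's actual code path, and it uses the upstream k8s.io/api/autoscaling/v2 types, where selectPolicy maps to a pointer field:

package main

import (
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2"
)

func main() {
	// selectPolicy omitted from the YAML leaves the pointer nil.
	rules := &autoscalingv2.HPAScalingRules{}
	// Dereferencing the nil pointer panics with the same
	// "invalid memory address or nil pointer dereference" seen above.
	fmt.Println(*rules.SelectPolicy)
}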

However, when scaleUp is added with its default values, the error messages disappear.
autoscale.yaml is as follows:

apiVersion: autoscaling.nebula-graph.io/v1alpha1
kind: NebulaAutoscaler
metadata:
  name: nebula-autoscaler
  namespace: nebula
spec:
  nebulaClusterRef:
    name: nebulazcert
  graphdPolicy:
    minReplicas: 2
    maxReplicas: 10
    metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 0
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      scaleDown:
        selectPolicy: Disabled
  pollingPeriod: 30s

Your Environments (required)

nebula-operator: s-1.18

How To Reproduce (required)

Steps to reproduce the behavior:

  1. Set up autoscale.yaml.
  2. Send many queries so that many pods are scaled up automatically.
  3. Check the autoscaler logs: kubectl -n nebula-operator-system logs -f nebula-operator-controller-manager-deployment-759f97f578-fq6v6 -c autoscaler

Expected behavior
No error messages.

@jinyingsunny jinyingsunny added the type/bug Type: something is unexpected label Oct 7, 2023
@github-actions github-actions bot added affects/none PR/issue: this bug affects none version. severity/none Severity of bug labels Oct 7, 2023
jinyingsunny (Author) commented:

When configured as in the demo from https://kubernetes.io/zh-cn/docs/tasks/run-application/horizontal-pod-autoscale/, it still reports errors.

MegaByte875 (Contributor) commented:

SelectPolicy is a pointer type and has no default value set; I will give it an initial default value.
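
A minimal sketch of that kind of defaulting, assuming the operator's behavior rules mirror the upstream k8s.io/api/autoscaling/v2 HPAScalingRules type (the helper name defaultSelectPolicy is illustrative, not necessarily the actual patch):

package main

import (
	"fmt"

	autoscalingv2 "k8s.io/api/autoscaling/v2"
)

// defaultSelectPolicy is a hypothetical helper: when selectPolicy is
// omitted from the YAML the pointer is nil, so fill in the upstream
// HPA default (Max) before the controller dereferences it.
func defaultSelectPolicy(rules *autoscalingv2.HPAScalingRules) {
	if rules != nil && rules.SelectPolicy == nil {
		policy := autoscalingv2.MaxChangePolicySelect
		rules.SelectPolicy = &policy
	}
}

func main() {
	rules := &autoscalingv2.HPAScalingRules{} // selectPolicy omitted
	defaultSelectPolicy(rules)
	fmt.Println(*rules.SelectPolicy) // prints "Max" instead of panicking
}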


MegaByte875 commented Oct 9, 2023

#319
#321


jinyingsunny commented Oct 9, 2023

Verified that the following autoscale.yaml configuration can now be applied successfully and takes effect:
The scale-down stabilization window is 2 min; once the 2 min have passed, if CPU usage is still below the average-utilization target, scale-down is performed.
Operator version: s-1.20

$ cat autoscale.yaml
apiVersion: autoscaling.nebula-graph.io/v1alpha1
kind: NebulaAutoscaler
metadata:
  name: nebula-autoscaler
  namespace: nebula
spec:
  nebulaClusterRef:
    name: nebulazone
  graphdPolicy:
    minReplicas: 2
    maxReplicas: 5
    metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 10
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 0
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
        selectPolicy: Max
      scaleDown:
        stabilizationWindowSeconds: 120
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
  pollingPeriod: 30s

jinyingsunny (Author) commented:

Verified the following autoscale.yaml configuration: after apply it no longer reports errors, but it also does not scale up when there is traffic.

Operator version: s-1.21

apiVersion: autoscaling.nebula-graph.io/v1alpha1
kind: NebulaAutoscaler
metadata:
  name: nebula-autoscaler
  namespace: nebula
spec:
  nebulaClusterRef:
    name: nebulazone
  graphdPolicy:
    minReplicas: 2
    maxReplicas: 5
    metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 10
    behavior:
        scaleDown:
          selectPolicy: Disabled
  pollingPeriod: 30s

qiaolei: to clarify, scaleUp and scaleDown must both be configured for autoscaling to take effect, and selectPolicy must be set; otherwise it does not take effect (no error is reported either; the average CPU value just stays at 0).
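
Restating that note as a config sketch: for the behavior section to take effect, both directions are specified and selectPolicy is set explicitly on each, along these lines (values illustrative):

    behavior:
      scaleUp:
        stabilizationWindowSeconds: 0
        selectPolicy: Max
        policies:
        - type: Pods
          value: 4
          periodSeconds: 15
      scaleDown:
        stabilizationWindowSeconds: 120
        selectPolicy: Max
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15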

@github-actions github-actions bot added the process/fixed Process of bug label Oct 10, 2023
@jinyingsunny jinyingsunny added process/done Process of bug and removed process/fixed Process of bug labels Oct 10, 2023