Updates to dynamic flags are not effective #289

Closed
wenhaocs opened this issue Sep 21, 2023 · 6 comments
Labels: affects/none, process/fixed, severity/blocker, type/bug

Comments

wenhaocs commented Sep 21, 2023

Version used: built from the release-1.6 branch ("RUN git clone --single-branch --branch release-1.6 --depth 1 https://github.com/vesoft-inc/nebula-operator/" in the Dockerfile), chart version 1.6.0.

None of the updates to dynamic flags are applied. For example, here is the nc:

[screenshot: nc config]

But the gflags reported by graphd are:

[screenshot: graphd flags]

Here is a full list of the CRD config and flags:
nc.yaml.txt

graphd_flags.txt
metad-flags.txt
storaged-flags.txt

Please make sure all configs in the CRD take effect.
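For reference, a minimal sketch of pushing these dynamic flags through the CRD with a merge patch, assuming the flags live under spec.<component>.config as string values; <namespace> and <cluster-name> are placeholders:

kubectl -n <namespace> patch nc <cluster-name> --type merge \
  -p '{"spec":{"graphd":{"config":{"accept_partial_success":"true"}},"storaged":{"config":{"wal_ttl":"300"}}}}'

After the operator reconciles, the same values should be visible in the flags each daemon reports.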

HarrisChu commented Sep 21, 2023

Tested in GCP.
operator: vesoft/nebula-operator:v1.6.0

[two screenshots]

wenhaocs commented Sep 22, 2023

kubectl get svc -n mau-exp

NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                          AGE
mau-exp-exporter-svc        ClusterIP   172.20.162.199   <none>        9100/TCP                                         136m
mau-exp-graphd-headless     ClusterIP   None             <none>        9669/TCP,19669/TCP,19670/TCP                     136m
mau-exp-graphd-svc          NodePort    172.20.225.149   <none>        9669:32316/TCP,19669:31001/TCP,19670:31367/TCP   136m
mau-exp-metad-headless      ClusterIP   None             <none>        9559/TCP,19559/TCP,19560/TCP                     137m
mau-exp-storaged-headless   ClusterIP   None             <none>        9779/TCP,19779/TCP,19780/TCP,9778/TCP            137m

Set accept_partial_success to false and then back to true in the mau-comm cluster. Now the flag is set to true successfully. Here is the operator log right after the change:
nebula-operator-controller-manager-deployment-bf754fccc-mln4r.txt

However, wal_ttl is still 14400 while it is 300 in the CRD. This is how the value is set: wal_ttl: "300".
mau-comm-storaged-0-flag.txt
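One way to double-check what the running storaged process actually holds is its HTTP flags endpoint (19779 here matches the storaged HTTP port shown in the service listing above; adjust if mau-comm differs, and the /flags path is assumed reachable in-cluster):

curl -s http://mau-comm-storaged-0.mau-comm-storaged-headless.mau-comm.svc.cluster.local:19779/flags | grep wal_ttl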

wenhaocs commented Sep 22, 2023

Here is the operator log from a brand new cluster:

nebula-operator-controller-manager-deployment-768d9f7ff5-zqrd7-mau-expt.txt

nc:
mau-exp-nc.yaml.txt

Again, accept_partial_success is false even though it's set to true in the CRD.
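A similar spot check against graphd on the new cluster (the pod name here is hypothetical; 19669 is the graphd HTTP port from the service listing above, and the /flags endpoint is assumed):

curl -s http://mau-exp-graphd-0.mau-exp-graphd-headless.mau-exp.svc.cluster.local:19669/flags | grep accept_partial_success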

MegaByte875 (Contributor) commented:

I can reproduce the issue when creating a NebulaCluster; I will submit a PR to fix it.

MegaByte875 mentioned this issue on Sep 22, 2023.
MegaByte875 (Contributor) commented:

First operator log:
I0921 04:19:01.059209 1 nebula_cluster_controller.go:145] Finished reconciling NebulaCluster [mau-comm/mau-comm], spendTime: (3.138337827s)
I0921 04:21:52.668638 1 nebula_cluster_controller.go:162] Start to reconcile NebulaCluster
I0921 04:21:55.584295 1 nebulacluster.go:119] NebulaCluster [mau-comm/mau-comm] updated successfully
I0921 04:11:41.931843 1 nebula_cluster_controller.go:145] Finished reconciling NebulaCluster [mau-comm/mau-comm], spendTime: (2.767108368s)
I0921 04:18:29.567944 1 nebula_cluster_controller.go:162] Start to reconcile NebulaCluster
I0921 04:18:32.411361 1 nebula_cluster_controller.go:145] Finished reconciling NebulaCluster [mau-comm/mau-comm], spendTime: (2.843583002s)
I0921 04:18:32.411531 1 nebula_cluster_controller.go:162] Start to reconcile NebulaCluster
I0921 04:18:36.209704 1 nebula_cluster_controller.go:145] Finished reconciling NebulaCluster [mau-comm/mau-comm], spendTime: (3.798248051s)
I0921 04:21:52.668638 1 nebula_cluster_controller.go:162] Start to reconcile NebulaCluster
I0921 04:21:55.584295 1 nebulacluster.go:119] NebulaCluster [mau-comm/mau-comm] updated successfully
I0921 04:21:55.584323 1 nebula_cluster_controller.go:173] NebulaCluster [mau-comm/mau-comm] reconcile details: waiting for nebulacluster ready
I0921 04:21:55.584330 1 nebula_cluster_controller.go:143] Finished reconciling NebulaCluster [mau-comm/mau-comm] (2.915876371s), result: {false 10s}

This time the nebulacluster status was not ready.
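A quick way to confirm that state from the CLI is to dump the object and read its status block (the same fields that appear in the storage status excerpt further below):

kubectl -n mau-comm get nc mau-comm -o yaml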

----- graphd scale out ------

I0921 15:33:15.377704 1 graphd_cluster.go:242] graphd pod [mau-comm/mau-comm-graphd-67] scheduled on node ip-10-92-150-167.ec2.internal in zone us-east-1b
I0921 15:33:15.398134 1 cm.go:98] configMap [mau-comm/mau-comm-graphd-zone] updated successfully

------ storage scale out ------
I0921 18:09:42.145337 1 storaged_cluster.go:321] storaged pod [mau-comm/mau-comm-storaged-16] scheduled on node ip-10-96-72-7.ec2.internal in zone us-east-1d
I0921 18:09:42.197424 1 cm.go:98] configMap [mau-comm/mau-comm-storaged-zone] updated successfully

------- storage scale in ------
I0921 18:24:21.483949 1 nebula_cluster_controller.go:173] NebulaCluster [mau-comm/mau-comm] reconcile details: waiting for nebulacluster ready
I0921 18:24:21.483956 1 nebula_cluster_controller.go:143] Finished reconciling NebulaCluster [mau-comm/mau-comm] (5.344186803s), result: {false 10s}
I0921 18:24:21.484076 1 nebula_cluster_controller.go:162] Start to reconcile NebulaCluster
I0921 18:24:21.793664 1 nebulacluster.go:119] NebulaCluster [mau-comm/mau-comm] updated successfully
I0921 18:24:22.014071 1 storaged_scaler.go:195] storaged cluster [mau-comm/mau-comm-storaged] drop hosts
[HostAddr({Host:mau-comm-storaged-23.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})
HostAddr({Host:mau-comm-storaged-22.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})
HostAddr({Host:mau-comm-storaged-21.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})
HostAddr({Host:mau-comm-storaged-20.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})
HostAddr({Host:mau-comm-storaged-19.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})
HostAddr({Host:mau-comm-storaged-18.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})
HostAddr({Host:mau-comm-storaged-17.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})
HostAddr({Host:mau-comm-storaged-16.mau-comm-storaged-headless.mau-comm.svc.cluster.local Port:9779})] successfully

---- storage status -----

  storaged:
    phase: ScaleIn
    version: nebula-09.19.257-multi-arch
    workload:
      availableReplicas: 14
      collisionCount: 0
      currentReplicas: 16
      currentRevision: mau-comm-storaged-9667fbb98
      observedGeneration: 3
      readyReplicas: 14
      replicas: 16
      updateRevision: mau-comm-storaged-9667fbb98
      updatedReplicas: 16
  version: 2023.09.19-ent

Desired replicas 16
Ready replicas 14
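The gap between desired and ready replicas can be read straight from those workload fields; a sketch, assuming the excerpt above sits under .status.storaged of the NebulaCluster object:

kubectl -n mau-comm get nc mau-comm -o jsonpath='{.status.storaged.workload.readyReplicas}/{.status.storaged.workload.replicas}{"\n"}'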

When to update dynamic flags:
[screenshot]

wenhaocs commented Sep 22, 2023

Verified as working by the customer! Thank you so much, @MegaByte875!
