cockroachdb-vcheck Exit Code: 137 #1006

Open
aep opened this issue Oct 3, 2023 · 1 comment
aep commented Oct 3, 2023

Hi, I'm new to Kubernetes, so I'm not sure whether this is user error.

When going through https://www.cockroachlabs.com/docs/stable/deploy-cockroachdb-with-kubernetes,
at the step "Initialize the cluster" -> "Check that the pods were created:",

I see no pods that look like CockroachDB nodes. Instead I see:

NAME                                          READY   STATUS    RESTARTS   AGE
cockroach-operator-manager-5489bf9cbc-srr2z   1/1     Running   0          9m21s
cockroachdb-vcheck-28272052-8nj59             0/1     Error     0          46s
cockroachdb-vcheck-28272052-bbs7c             0/1     Error     0          59s
cockroachdb-vcheck-28272052-vr7nz             0/1     Error     0          23s
cockroachdb-vcheck-28272053-52b2m             0/1     Error     0          7s
cockroachdb-vcheck-28272053-m4m7w             0/1     Error     0          30s
cockroachdb-vcheck-28272053-nc62h             0/1     Error     0          43s

kubectl describe pod cockroachdb-vcheck-28272052-bbs7c    
Name:             cockroachdb-vcheck-28272052-bbs7c
Namespace:        cockroach-operator-system
Priority:         0
Service Account:  cockroachdb-sa
Node:             uca2k/10.181.22.6
Start Time:       Tue, 03 Oct 2023 10:52:50 +0200
Labels:           batch.kubernetes.io/controller-uid=c004f0fe-f805-4496-b4eb-45d7ebe9ac08
                  batch.kubernetes.io/job-name=cockroachdb-vcheck-28272052
                  controller-uid=c004f0fe-f805-4496-b4eb-45d7ebe9ac08
                  job-name=cockroachdb-vcheck-28272052
Annotations:      cni.projectcalico.org/containerID: e7f02d6498f71000c55edff73e74b9074718e5c1a609018efe52bddfb33ab326
                  cni.projectcalico.org/podIP: 
                  cni.projectcalico.org/podIPs: 
Status:           Failed
IP:               192.168.124.154
IPs:
  IP:           192.168.124.154
Controlled By:  Job/cockroachdb-vcheck-28272052
Containers:
  crdb:
    Container ID:  cri-o://9f4c2693b5043752e68b757b7f6f708a20c8d64f3daef3cfc01294771d3055e0
    Image:         cockroachdb/cockroach:v23.1.4
    Image ID:      docker.io/cockroachdb/cockroach@sha256:83770bbd0e3cbc5d07f47c252d5f5f00f3ff56b22d61c378b3b496fdf0337430
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
    Args:
      -c
      set -eo pipefail; /cockroach/cockroach.sh version | grep 'Build Tag:'| awk '{print $3}'; sleep 150
    State:          Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Tue, 03 Oct 2023 10:52:50 +0200
      Finished:     Tue, 03 Oct 2023 10:52:53 +0200
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     300m
      memory:  256Mi
    Requests:
      cpu:        300m
      memory:     256Mi
    Environment:  <none>
    Mounts:       <none>
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:            <none>
QoS Class:          Guaranteed
Node-Selectors:     <none>
Tolerations:        node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                    node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  108s  default-scheduler  Successfully assigned cockroach-operator-system/cockroachdb-vcheck-28272052-bbs7c to uca2k
  Normal  Pulled     108s  kubelet            Container image "cockroachdb/cockroach:v23.1.4" already present on machine
  Normal  Created    108s  kubelet            Created container crdb
  Normal  Started    108s  kubelet            Started container crdb

I'm not sure if I'm holding it wrong, but I get no log output:

kubectl logs  cockroachdb-vcheck-28272052-bbs7c


(nothing)

So I have no idea why it is failing.
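
For reference, the exit code in the describe output can be read with the usual convention that a code above 128 means "128 + signal number", which would make 137 signal 9 (SIGKILL). A SIGKILL is what the kernel OOM killer sends, but other things can send it too, and the pod here reports Reason: Error rather than OOMKilled, so this only narrows things down. A minimal Python sketch of the decoding follows; the function name `describe_exit_code` is made up for illustration, and the signal name lookup depends on the platform the script runs on.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited successfully"
    if code > 128:
        signum = code - 128
        try:
            # Signal names are platform dependent; fall back if unknown here.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    return f"exited with status {code}"


if __name__ == "__main__":
    # 137 is the value shown under "Exit Code" in the describe output.
    print(describe_exit_code(137))
```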

kubectl logs cockroach-operator-manager-5489bf9cbc-srr2z
rn","ts":1696323344.3604305,"logger":"controller.CrdbCluster","msg":"starting to check the crdb version of the container provided","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"7CkXm2BiHGqgHZQ7y44jSF"}
{"level":"warn","ts":1696323344.3604872,"logger":"controller.CrdbCluster","msg":"User set image.name, using that field instead of cockroachDBVersion","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"7CkXm2BiHGqgHZQ7y44jSF"}
{"level":"warn","ts":1696323344.3657265,"logger":"controller.CrdbCluster","msg":"version checker","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"7CkXm2BiHGqgHZQ7y44jSF","job":"cockroachdb-vcheck-28272055"}
{"level":"warn","ts":1696323344.3716621,"logger":"controller.CrdbCluster","msg":"job pod is ready","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"7CkXm2BiHGqgHZQ7y44jSF"}
{"level":"error","ts":1696323344.3852656,"logger":"controller.CrdbCluster","msg":"crdb version not found","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"7CkXm2BiHGqgHZQ7y44jSF","error":"failed to check the version of the cluster","stacktrace":"github.com/cockroachdb/cockroach-operator/pkg/controller.(*ClusterReconciler).Reconcile\n\tpkg/controller/cluster_controller.go:154\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\texternal/io_k8s_sigs_controller_runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\texternal/io_k8s_sigs_controller_runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\texternal/io_k8s_sigs_controller_runtime/pkg/internal/controller/controller.go:214"}
{"level":"info","ts":1696323344.385327,"logger":"controller.CrdbCluster","msg":"Error on action","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"7CkXm2BiHGqgHZQ7y44jSF","Action":"VersionCheckerAction","err":"failed to check the version of the cluster"}
{"level":"error","ts":1696323344.3853533,"logger":"controller.CrdbCluster","msg":"can't proceed with reconcile","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"7CkXm2BiHGqgHZQ7y44jSF","Action":"VersionCheckerAction","error":"failed to check the version of the cluster","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\texternal/io_k8s_sigs_controller_runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\texternal/io_k8s_sigs_controller_runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\texternal/io_k8s_sigs_controller_runtime/pkg/internal/controller/controller.go:214"}
{"level":"info","ts":1696323346.8818362,"logger":"controller.CrdbCluster","msg":"reconciling CockroachDB cluster","CrdbCluster":"cockroach-operator-system/cockroachdb","ReconcileId":"QxX3kNWGAVU7LhnxiDrPfJ"}
{"level":"info","ts":1696323346.8818865,"logger":"webhooks","msg":"default","name":"cockroachdb"}
@psosnowski

I'm experiencing the identical issue. I deployed CockroachDB using the operator, and the vcheck pods fail with OOM. It seems the default requests/limits are CPU: 300m and memory: 256Mi. Eventually the vcheck succeeds and the pods are created; however, is there a way to configure these resources?
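
To check whether a given vcheck pod was really OOM-killed, it may help to look at the pod's JSON rather than the describe text. The sketch below is a rough illustration, not part of the operator: `summarize_pod` is a hypothetical helper, and `SAMPLE` is hand-built from the describe output earlier in this thread, not real `kubectl` output. It reads the standard Pod fields `spec.containers[].resources.limits` and `status.containerStatuses[].state.terminated`. It does not answer whether the operator lets you configure these resources.

```python
import json


def summarize_pod(pod: dict) -> list:
    """Pair each container's limits with its terminated state, if any."""
    limits = {
        c["name"]: c.get("resources", {}).get("limits", {})
        for c in pod.get("spec", {}).get("containers", [])
    }
    rows = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("state", {}).get("terminated") or {}
        rows.append({
            "container": status["name"],
            "limits": limits.get(status["name"], {}),
            "exit_code": terminated.get("exitCode"),
            "reason": terminated.get("reason"),
        })
    return rows


# Hand-built sample mirroring the describe output above (an assumption,
# trimmed to the fields this sketch reads).
SAMPLE = json.loads("""
{
  "spec": {"containers": [{"name": "crdb",
    "resources": {"limits": {"cpu": "300m", "memory": "256Mi"}}}]},
  "status": {"containerStatuses": [{"name": "crdb",
    "state": {"terminated": {"exitCode": 137, "reason": "Error"}}}]}
}
""")

if __name__ == "__main__":
    for row in summarize_pod(SAMPLE):
        print(row)
```

With a real cluster, the input would come from `kubectl get pod <pod-name> -n cockroach-operator-system -o json`. A reason of `OOMKilled` would confirm the memory limit was hit; a plain `Error` with exit code 137, as in the original report, leaves the cause open.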
