
Redis cluster cannot be recovered by reverting CR after affinity cannot be satisfied #480

Closed
hoyhbx opened this issue Apr 3, 2023 · 1 comment
Labels
bug Something isn't working

Comments

hoyhbx (Contributor) commented Apr 3, 2023

What version of redis operator are you using?

redis-operator version: We are using redis-operator built from HEAD

Does this issue reproduce with the latest release?

Yes, it reproduces with quay.io/opstree/redis-operator:v0.10.0

What operating system and processor architecture are you using (kubectl version)?

kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:26:19Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.9", GitCommit:"6df4433e288edc9c40c2e344eb336f63fad45cd2", GitTreeState:"clean", BuildDate:"2022-05-19T19:53:08Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

What did you do?

I first created a 6-node Redis cluster with 3 leaders and 3 followers by applying the following YAML file. We will refer to this YAML file as the 'original' one in the rest of this issue.

apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: test-cluster
spec:
  clusterSize: 3
  kubernetesConfig:
    image: quay.io/opstree/redis:v6.2.5
    imagePullPolicy: IfNotPresent
    resources:
      limits:
        cpu: 101m
        memory: 128Mi
      requests:
        cpu: 101m
        memory: 128Mi
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi

We later added an affinity rule to the CR, but then realized the rule could not be satisfied: one Redis pod was always left unscheduled. We then tried to recover the cluster by reverting to the original CR, to remove the unsatisfiable affinity rule. The redis-operator does update the StatefulSet to remove the affinity rule, but the already-created pods still carry the old, bad rule, so the Redis cluster stays in the error state.
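
For illustration, the change we applied was roughly of the following shape (a hedged sketch, not the exact rule we used; the label key is made up, and the redisLeader.affinity field is assumed from the CRD):

spec:
  redisLeader:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            # Hypothetical label that no node in the cluster carries, so the next
            # pod rolled out with this template gets stuck in Pending.
            - key: example.com/nonexistent-label
              operator: In
              values:
              - "true"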

What did you expect to see?

We expected the affinity rule to be removed from the pods after removing it from the CR.

What did you see instead?

The Redis cluster continues to run with one fewer replica than desired.

Possible root cause and Comments
It may be caused by this known limitation of StatefulSets: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#forced-rollback
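
A manual recovery sketch based on that limitation (the command and pod name are illustrative, assuming the operator's <name>-leader StatefulSet naming):

# After reverting the CR, the StatefulSet template no longer has the bad affinity,
# but the pod that was already created with it stays Pending and is not replaced
# automatically. Per the forced-rollback note, it has to be deleted by hand so the
# controller recreates it from the reverted template.
$ kubectl delete pod test-cluster-leader-2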

iamabhishek-dubey (Member) commented

We have already introduced force recreation of the StatefulSet for issues like this; please upgrade the operator to the latest version.
