Medusa restore init container fails with operation not permitted exception #1434

Open
luigiba opened this issue Oct 18, 2024 · 0 comments
Labels
bug Something isn't working

luigiba commented Oct 18, 2024

What happened?

I tried to restore a Medusa backup. After I created a MedusaRestoreJob resource, the operator restarted the Cassandra cluster. The medusa-restore init container started the restore operation, but after some time it failed with the following error:

subprocess.CalledProcessError: Command '['chown', '-R', 'root:root', '/var/lib/cassandra/data/dev/test_table-a3b36c407a7211ef8c17e15ea700af1a']' returned non-zero exit status 1.

It seems that the init container tries to change the ownership of the table directory (currently owned by the systemd-coredump user) to root:root, but it lacks the privileges to do so.
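To see why chown exits with status 1, it helps to capture its stderr, since the CalledProcessError message above only carries the exit code. The following is a minimal sketch of that idea, not Medusa's actual restore code; the helper name `run_chown` is hypothetical and only mirrors the command shown in the error.

```python
import subprocess


def run_chown(path, owner="root:root"):
    """Run `chown -R OWNER PATH` and return (exit_code, stderr) instead of raising.

    Hypothetical helper for debugging; it mirrors the command from the
    traceback but is not taken from Medusa.
    """
    try:
        subprocess.run(
            ["chown", "-R", owner, path],
            check=True,           # raises CalledProcessError on a non-zero exit
            capture_output=True,  # keep stderr so the underlying reason is visible
            text=True,
        )
    except subprocess.CalledProcessError as exc:
        return exc.returncode, exc.stderr.strip()
    return 0, ""
```

Running this against the table directory from inside a container with the same user and security context as the init container should show the message chown prints (for example whether it is really "Operation not permitted") next to the exit status.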

Environment

  • K8ssandra Operator version:
    cr.k8ssandra.io/k8ssandra/k8ssandra-operator:v1.18.0

  • Kubernetes version information:
    k8s 1.25 (Rancher)

  • Manifests:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: cassandra-cluster-dev
  namespace: k8ssandra
spec:
  auth: true
  cassandra:
    containers:
      - name: cassandra
        volumeMounts:
          - mountPath: /config/logback.xml
            name: cassandra-cluster-dev-logback-volume
            subPath: logback.xml
    datacenters:
      - config:
          jvmOptions:
            gc: G1GC
            heap_initial_size: 4G
            heap_max_size: 7G
        metadata:
          name: dc1
        resources:
          limits:
            cpu: 2000m
            memory: 8Gi
          requests:
            cpu: 500m
            memory: 8Gi
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
            storageClassName: ''
    extraVolumes:
      volumes:
        - configMap:
            items:
              - key: logback.xml
                path: logback.xml
            name: cassandra-cluster-dev-logback-config
          name: cassandra-cluster-dev-logback-volume
    serverType: cassandra
    serverVersion: 4.0.13
    softPodAntiAffinity: false
    superuserSecretRef:
      name: cassandra-cluster-dev-admin
  medusa:
    storageProperties:
      bucketName: dl-backup-zone-dev
      host: minio-cluster-dev-hl.k8ssandra.svc.cluster.local
      maxBackupAge: 0
      maxBackupCount: 3
      port: 9001
      prefix: cassandra-dev
      secure: true
      storageProvider: s3_compatible
      storageSecretRef:
        name: cassandra-cluster-dev-medusa-secret
---
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupSchedule
metadata:
  name: cassandra-cluster-dev-medusa-backup-schedule
  namespace: k8ssandra
spec:
  backupSpec:
    backupType: differential
    cassandraDatacenter: dc1
  cronSchedule: 0 9 * * 0
  operationType: backup
---
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaRestoreJob
metadata:
  name: restore-backup1
  namespace: k8ssandra
spec:
  cassandraDatacenter: dc1
  backup: backupid
---
  • K8ssandra Operator Logs:
subprocess.CalledProcessError: Command '['chown', '-R', 'root:root', '/var/lib/cassandra/data/dev/test_table-a3b36c407a7211ef8c17e15ea700af1a']' returned non-zero exit status 1.

Anything else we need to know?:
The PersistentVolumes mounted on the pods are hostPath-based. I ran chmod -R 777 on all of the volumes. Below is an example of a PV created for a K8ssandra node:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: server-data-cassandra-cluster-dev-dc1-default-sts-0
spec:
  capacity:
    storage: 5Gi
  hostPath:
    path: /server-data-cassandra-cluster-dev-dc1-default-sts-0
    type: ''
  accessModes:
    - ReadWriteOnce
  claimRef:
    kind: PersistentVolumeClaim
    namespace: k8ssandra
    name: server-data-cassandra-cluster-dev-dc1-default-sts-0
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - nodeName
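
Note that chmod -R 777 only changes the mode bits. On Linux, changing a file's owner to a different user generally requires the CAP_CHOWN capability (normally root), so open permissions alone would not be expected to make the chown succeed. A minimal sketch for checking who owns a data directory and which user the current process runs as; the helper name `describe_ownership` is hypothetical and the check is a simplification that does not inspect capabilities:

```python
import os
import stat


def describe_ownership(path):
    """Report owner, group, and mode of PATH next to the current effective UID.

    Simplified diagnostic: it compares UIDs only and does not inspect
    Linux capabilities such as CAP_CHOWN.
    """
    st = os.stat(path)
    return {
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mode": stat.filemode(st.st_mode),  # e.g. "drwxrwxrwx" after chmod 777
        "process_euid": os.geteuid(),
        "owned_by_process": st.st_uid == os.geteuid(),
    }
```

If `owned_by_process` is false and `process_euid` is not 0, a chown to root:root on that path will most likely fail regardless of the mode shown.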

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: K8OP-278

luigiba added the bug (Something isn't working) label Oct 18, 2024