unsuccessful backup execution still updates lastBackup #421

Closed
trieszklr opened this issue Jun 18, 2020 · 2 comments
@trieszklr
trieszklr commented Jun 18, 2020

Expected Behavior

If the backup script fails for any reason, lastBackup should never be updated.
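
To make the expectation concrete, here is a minimal sketch of the rule "record the number only after a verified success". It is not the operator's code: `run_backup_and_record`, its arguments, and the state file standing in for lastBackup are all hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical sketch, not the operator's implementation.
# Records a backup number only if the backup command exits 0 AND the
# archive <dir>/<number>.tar.gz exists and is non-empty.
# $1: backup command (assumed to take the backup number as its argument)
# $2: backup number
# $3: backup directory
# $4: state file standing in for lastBackup (assumption)
run_backup_and_record() {
    backup_cmd="$1"
    number="$2"
    dir="$3"
    state_file="$4"

    if "$backup_cmd" "$number" && [ -s "${dir}/${number}.tar.gz" ]; then
        # Success path: only now is the recorded number advanced.
        echo "$number" > "$state_file"
    else
        # Failure path: leave the previously recorded number untouched.
        echo "backup ${number} failed, state not updated" >&2
        return 1
    fi
}
```

With a check like this, a failing backup for 13043 would leave the recorded value at 13042, which matches the files actually present.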

Actual Behavior

The backup often fails (due to #415) and the backup file is missing, but lastBackup still gets updated.
If the Jenkins master is restarted in this state, it cannot come back healthy: it comes online, but its configuration is missing.

kubectl --kubeconfig ~/.kube/config.neutral -n jenkins get jenkins -o yaml | grep  Backup
      makeBackupBeforePodDeletion: true
    lastBackup: 13043
    pendingBackup: 13044

while the operator log says:

2020-06-18T11:52:21.106Z    WARN    controller-jenkins    jenkins/jenkins_controller.go:152    Reconcile loop failed 10 times with the same error, giving up: pod exec error o
' stderr 'tar (child): /backup/13043.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
': command terminated with exit code 2    {"cr": "xxxx"}

Current files in /backup:

user@jenkins-klar:/backup$ ls -lah
total 1.2G
drwxrwsr-x 3 root user 4.0K Jun 18 10:47 .
drwxr-xr-x 1 root root   42 Jun 18 11:19 ..
-rw-rw-r-- 1 user user 404M Jun 18 08:49 13040.tar.gz
-rw-rw-r-- 1 user user 406M Jun 18 09:38 13041.tar.gz
-rw-rw-r-- 1 user user 412M Jun 18 10:47 13042.tar.gz
drwxrws--- 2 root user  16K May  4 19:16 lost+found
user@jenkins-klar:/backup$ 

Steps to Reproduce the Problem

  1. Check lastBackup.
  2. Modify the backup script so that it returns a nonzero exit code.
  3. Check lastBackup again, together with the contents of /backup.
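
The comparison in step 3 can be scripted. The sketch below assumes the `<number>.tar.gz` naming seen in the listing above; `backup_exists` is a hypothetical helper name, not something shipped with the operator.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the archive for the given backup
# number exists in the backup directory and is non-empty.
# $1: backup directory, $2: backup number (e.g. the lastBackup value)
backup_exists() {
    dir="$1"
    number="$2"
    [ -s "${dir}/${number}.tar.gz" ]
}

# Example use inside the backup container, with the lastBackup value
# taken from the kubectl output:
#   backup_exists /backup 13043 || echo "lastBackup points at a missing file"
```

If the helper fails for the number reported in lastBackup, the counter was advanced without a matching archive, which is the behavior reported here.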

Additional Info

  • Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.11-eks-af3caf", GitCommit:"af3caf6136cd355f467083651cc1010a499f59b1", GitTreeState:"clean", BuildDate:"2020-03-27T21:51:36Z", GoVersion:"go1.12.17", Compiler:"gc", Platform:"linux/amd64"}
  • Jenkins Operator version:
 v0.4.0

Workaround: manually copy the most recent backup to the missing filename, then wait for the restore process to complete.
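
A rough sketch of that workaround as a script, in case it helps anyone else. `restore_missing_backup` is a hypothetical name, and picking the highest-numbered existing `<number>.tar.gz` as "the last backup" is my assumption.

```shell
#!/bin/sh
# Hypothetical helper for the manual workaround: copy the highest-numbered
# existing archive in the backup directory to the filename that is missing.
# $1: backup directory, $2: missing backup number
restore_missing_backup() {
    dir="$1"
    missing="$2"
    target="${dir}/${missing}.tar.gz"

    # Nothing to do if the expected archive is already there.
    if [ -e "$target" ]; then
        return 0
    fi

    latest=""
    for f in "$dir"/*.tar.gz; do
        [ -e "$f" ] || continue              # glob matched nothing
        n=$(basename "$f" .tar.gz)
        case "$n" in
            ''|*[!0-9]*) continue ;;         # skip non-numeric names
        esac
        if [ -z "$latest" ] || [ "$n" -gt "$latest" ]; then
            latest="$n"
        fi
    done

    if [ -z "$latest" ]; then
        echo "no numbered backups found in ${dir}" >&2
        return 1
    fi

    # -p keeps the original timestamps and permissions on the copy.
    cp -p "${dir}/${latest}.tar.gz" "$target"
}

# Example for the state shown above (run inside the backup container):
#   restore_missing_backup /backup 13043
```

Note that the restored state is then that of the older backup (13042 in the example), so anything changed after it is not recovered.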

@tomaszsek

Fixed https://github.com/jenkinsci/kubernetes-operator/releases/tag/v0.5.0

@MemoAlfa

Fixed https://github.com/jenkinsci/kubernetes-operator/releases/tag/v0.5.0
Does this mean that upgrading the operator should be enough, or do we need to explicitly enable getLatestAction to avoid this issue?

    getLatestAction:
      exec:
        command:
        - /home/user/bin/get-latest.sh # this command is invoked on "backup" container to get last backup number before pod deletion. If you don't omit it in CR, you can lose data

This snippet is from the documentation, and its comment seems to discourage using getLatestAction.
