backup script breaks consistently over large jobs folder #415

trieszklr · 2020-06-17T11:54:44Z

Expected Behavior

backup.sh runs successfully even when the jobs folder is large

Actual Behavior

backup.sh fails almost every time because some files are always changing while it runs

2020-06-17T03:30:46.521Z    WARN    controller-jenkins    jenkins/jenkins_controller.go:171    Reconcile loop failed: pod exec error operation on stream: stdout 'Running backup
backup file '/backup/13019.tar.gz' is empty
' stderr 'tar: jobs/xxxx/jobs/yyyyyy: file changed as we read it
tar: jobs/xxxx/jobs/zzzzzz: file changed as we read it
tar: jobs/xxxx/jobs/vvvvv: file changed as we read it
': command terminated with exit code 1    {"cr": "xxxxxx"}

The backup status below does not change over 24 hours:

kubectl --kubeconfig ~/.kube/config.neutral -n jenkins get jenkins -o yaml | grep  Backup
      makeBackupBeforePodDeletion: true
    lastBackup: 13018
    pendingBackup: 13019
    restoredBackup: 13016

Steps to Reproduce the Problem

Set up a GitHub Organization project with 30 projects and 20+ contributors with commit hooks.
Create branches and run jobs so that the jobs folder grows to 5GB+.
Monitor the backup part of the jenkins-operator logs.

Additional Info

tar returns 1 if files change while archiving.
https://stackoverflow.com/questions/20318852/tar-file-changed-as-we-read-it

This breaks the current backup.sh script because it sets:

set -eo pipefail
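
One way to keep `set -eo pipefail` while tolerating this case is to capture tar's exit status explicitly. The following is only a minimal sketch, not the actual backup.sh or the patch in #416. It assumes GNU tar, where exit status 1 on archive creation means some files changed while being read and 2 means a fatal error. The function name `backup_tolerating_changes` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical helper: archive the contents of src_dir into archive,
# treating tar exit status 1 (files changed while being read) as a warning.
# Assumes GNU tar exit codes: 0 = success, 1 = some files changed, 2 = fatal.
backup_tolerating_changes() {
  local src_dir="$1" archive="$2" rc=0

  # `|| rc=$?` records the status without letting `set -e` abort the script.
  tar -C "$src_dir" -czf "$archive" . || rc=$?

  if [ "$rc" -gt 1 ]; then
    echo "tar failed with exit code $rc" >&2
    return "$rc"
  fi
  if [ "$rc" -eq 1 ]; then
    echo "warning: some files changed while being archived" >&2
  fi

  # Still refuse to report success for a missing or empty archive.
  if [ ! -s "$archive" ]; then
    echo "backup file '$archive' is empty" >&2
    return 1
  fi
}
```

A call such as `backup_tolerating_changes "$JENKINS_HOME" "/backup/${BACKUP_NUMBER}.tar.gz"` would then fail only on a fatal tar error or an empty archive; both variable names here are placeholders, not names taken from the real script.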

Jenkins Operator version:

 v0.4.0

tomaszsek · 2020-06-19T14:11:54Z

Hi @trieszklr,

In my opinion, your Jenkins is too big. Why not split it into smaller Jenkins instances? The backups will be much smaller. If you have that much data, you have to increase the resources for the backup container.

Cheers

trieszklr · 2020-07-02T14:54:29Z

Hi @tomaszsek ,

Thanks for your reply. We are using a GitHub Organization folder, which automagically creates folders and jobs for all projects and branches where a pipeline file is found. We have already reduced build retention to a fairly minimal level, but we do not want to lose this very convenient Jenkins feature. So "splitting up to smaller jenkinses" is not really an option for us, and we are likely not alone with this requirement.

And indeed, increasing the resources for the backup container also helped reduce the occurrences, but that does not fix the bug itself.

trieszklr · 2020-07-16T15:06:00Z

hi @tomaszsek and @waveywaves ,

just wanted to mention that I have built a custom backup container with this patch, rolled it out and it did fix the issue:
#416

Previously, 9 out of 10 times the Jenkins master pod did not come back online after a restart because of this bug.
If you suggest an alternative approach, that is perfectly fine too, but this bug should be addressed IMHO. The implications are very severe, and "splitting up jenkins" is not an option for larger teams using GitHub Organization or Bitbucket Team folders.

Cheers

stale · 2021-07-21T00:03:14Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this issue is still affecting you, just comment with any updates and we'll keep it open. Thank you for your contributions.

stale · 2021-08-22T18:41:59Z

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!

Bakies · 2021-09-01T11:39:08Z

your Jenkins is too big

I don't find this answer acceptable; the backup job should be more resilient. Like trieszklr, I added one organization job and backups started failing constantly. I can't split this up into smaller portions and don't want to manage more Jenkins instances.

trieszklr pushed a commit to trieszklr/kubernetes-operator that referenced this issue Jun 17, 2020

jenkinsci#415 avoid breaking backup script if files change during tar

bffb86e

trieszklr mentioned this issue Jun 17, 2020

#415 avoid breaking backup script if files change during tar #416

Closed

trieszklr pushed a commit to trieszklr/kubernetes-operator that referenced this issue Jun 18, 2020

jenkinsci#415 remove && from line end to evaluate exit code properly

5365407

trieszklr mentioned this issue Jun 18, 2020

unsuccessful backup execution still updates lastBackup #421

Closed

stale bot added the stale label Jul 21, 2021

stale bot closed this as completed Aug 22, 2021
