practical testing of what will become borg 1.2 #4360
I have tested the master branch (borg 1.2.0a3.dev3+g81f9a8cc) with an empty repo on davfs2.
Everything worked perfectly. 83 segments total.
davfs cache_size: 5000 MiB / davfs table_size: 1024 works for me
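For reference, both of those knobs live in davfs2's configuration file. A minimal sketch of such a setup, assuming the system-wide /etc/davfs2/davfs2.conf and the values mentioned above:

```
# /etc/davfs2/davfs2.conf (sketch; a per-user ~/.davfs2/davfs2.conf works too)
# cache_size is given in MiByte, table_size sets the size of the cache hash table
cache_size   5000
table_size   1024
```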
alpha 3 is out. \o/
Tested latest master branch (borg 1.2.0a3.dev8+gde151cd3) with my 31.31 GB repo from yesterday on davfs2.
No errors occurred. 87 segments total.
Interesting: davfs didn't respect the cache_size of 5 GB; all segments were cached until the cache reached 30 GB. When borg exited (rc 0), the cache was cleaned up to < 5 GB, which is expected behaviour once open files are closed. Works for me.
@fantasya-pbem the age threshold of the new lrucache is 4 min, so it would be interesting to test runs taking well over 4 min. Also, the code avoids using old file handles, but on the other hand it does not actively search for them and close them.
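To make that behaviour concrete, here is a minimal Python sketch (not borg's actual code; the class name and details are only illustrative of the description above) of a handle cache that never reuses entries older than the age threshold, but only closes stale ones lazily when they are looked up again:

```python
import time

AGE_LIMIT = 4 * 60  # seconds; the age threshold mentioned above

class AgedHandleCache:
    """Hypothetical sketch: cache open file handles, but never reuse stale ones."""

    def __init__(self):
        self._cache = {}  # path -> (file object, open timestamp)

    def get(self, path):
        entry = self._cache.get(path)
        if entry is not None:
            fd, opened = entry
            if time.monotonic() - opened < AGE_LIMIT:
                return fd          # young enough, reuse it
            fd.close()             # too old: closed lazily, only on access
            del self._cache[path]
        fd = open(path, "rb")
        self._cache[path] = (fd, time.monotonic())
        return fd

    def close_all(self):
        for fd, _ in self._cache.values():
            fd.close()
        self._cache.clear()
```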
I did several more tests in the last two days, with a production repository I used for half a year until some days ago:
First some tests with a 5 GB limited DavFS cache (which is on its own LVM volume that had ~45 GB free space). As expected, borg 1.2.0a3.dev8+gde151cd3 crashed when the DavFS cache was full (45 GB). Therefore I increased the DavFS cache limit to 25 GB. borg 1.1.10.dev5+g7a7e04a4.d20190227 check then finished without errors. Now I tested borg 1.2.0a3.dev8+gde151cd3 again. DavFS deleted the "metadata files" from its cache but no segment files when the 25 GB threshold was exceeded. This is a very interesting behaviour, but I cannot see a plausible explanation why the DavFS cache growth stopped at 41 GB after 31 minutes. I can only draw the conclusion that one needs a reasonably large DavFS cache for borg checks, but it does not need to be as big as the borg repo.
@fantasya-pbem i opened #4427 for further work on the lrucache. if you'd like to help with testing, add a comment there.
Just released 1.2.0 alpha4 to pypi. |
Just for protocol: Tested borg-1.2.0a5 today. Works for me. |
@fantasya-pbem did you also test some of the new features (see changelog)? |
I did a "borg compact --cleanup-commits" which did what it should. I ran a "borg check" afterwards which succeeded too. (Test repo has 5 archives. Will do more tests later with a production repo copy.) I did search for information about the borg check "--max-duration". It seems that this option is not explained in detail anywhere. I'm wondering what would be good and bad max duration values. There should be a section in the docs where this option is discussed. |
Good values depend a bit on your repo size and also on speeds and overheads. But I guess usually one would take enough time to make a decent amount of progress relative to the overall time needed to check the whole repo. E.g. if the whole repo would take 100 h to check, you could spread that over a series of shorter partial runs; see the sketch below.
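As an illustration (repo path, schedule, and duration are made up, not a recommendation): a time-boxed partial repository check run from cron. As far as I understand the feature, borg continues a partial check where the previous run stopped, so successive slices eventually cover the whole repository; partial checks only cover the repository (segment) level, so an occasional full check is still worthwhile.

```sh
# hypothetical crontab entry: spend at most 1 hour per night on a partial repo check
0 2 * * *  borg check --repository-only --max-duration 3600 /path/to/repo
```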
I tested "borg compact", "borg prune" and "borg check" with my big repo on DavFS. Compact and prune worked well (prune's applied-rule output is nice!). Check did crash after 90 segments, but I think this was because of small DavFS cache (5 G) which may be too less for the network throughput of my server and Borg's FD timeout. With 10 G cache borg check did succeed. I had such problems before occasionally with 5 G. So again – 1.2.0a5 works for me. Regarding --max-duration: It is mentioned in "borg check" with one sentence: "do only a partial repo check for max. SECONDS seconds (Default: unlimited)". |
@fantasya-pbem thanks for testing, created #4473 for the docs issue.
Executed |
XXX borgbackup 1.2.0rc1 beta release - do not use this on production backup repositories. XXX
Upstream discussions: borgbackup/borg#6166 borgbackup/borg#4360
* gnu/packages/backup.scm (borg): Update to 1.2.0rc1.
[source]: Adjust the list of Cython files to rebuild. Remove an obsolete substitution. Delete the bundled xxhash. Blake2 is no longer bundled.
[native-inputs]: Add python-dateutil.
[inputs]: Add xxhash. Add python-msgpack-1.2. Remove libb2.
[arguments]: Export BORG_LIBXXHASH_PREFIX to ensure the build script can find xxhash. Adjust the list of skipped tests and make the custom 'check' phase honor tests?. Install some more documentation.
@m3nu borg compact behaves the same no matter from where you invoke it (on the client or on the server). It will just physically remove everything that is logically already deleted. So delete and prune basically mark stuff as deleted, but the data is still physically present in old segment files (these old segment files just have "logical holes" where the deleted data is stored, because a later DELETE has already been committed for that chunk of data). borg compact then cleans up and removes all these logical holes by writing only the stuff that is still in use to new segment files (skipping the deleted stuff). There is a default threshold so that this data shuffling does not occur if there is too little to compact. So, delete and prune do not immediately free space (that's why prune and delete are much faster), but a compact done afterwards will. Also, compact will refuse to run if the repo is in append-only mode.
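A short sketch of the resulting workflow (repo path, archive name, and retention rules are purely illustrative): the logical deletes are fast, and the space only comes back with the separate compact step.

```sh
borg prune --keep-daily 7 --keep-weekly 4 /path/to/repo   # fast: only marks chunks as deleted
borg delete /path/to/repo::old-archive                    # also just a logical delete
borg compact --threshold 10 /path/to/repo                 # rewrite segments with >= 10% holes, freeing space
```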
@m3nu yeah, good idea, borg hosting providers could offer off-peak-hours compacting:
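One way that could look (purely a sketch; the directory layout, schedule, and the idea of looping over hosted repos are assumptions, not an actual provider setup):

```sh
# hypothetical server-side cron job: compact all hosted repos during off-peak hours
0 3 * * *  for repo in /srv/repos/*; do borg compact "$repo"; done
```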
About the create --stats speed: let's continue in #6259. |
Hmm, thinking about it, guess it makes a difference whether the append-only is enforced by repo config or by |
Should be fine, if ‘borg compact’ is rejected for both options. I see the main difference in how both options are enforced. Server-side, vs user-side. The latter is like making my own file RO to avoid deleting it. Stops accidents, but doesn’t add security. Also thanks for the additional background. Will add compaction as opt-in feature. |
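For context, a sketch of the two mechanisms being compared here (paths and the authorized_keys line are illustrative): the repository config flag, which whoever can run borg config against the repo can toggle themselves, versus append-only enforced per SSH key on the serving side.

```sh
# toggled via the repository config ("user-side" from a hosting provider's view):
borg config /path/to/repo append_only 1

# enforced per SSH key on the server, in ~/.ssh/authorized_keys ("server-side"):
# command="borg serve --append-only --restrict-to-repository /path/to/repo",restrict ssh-ed25519 AAAA... client
```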
I noticed that 1.2 now prints out a warning like „file changed while we backed it up“, which seems to be the cause for return code 1 (warning). I never experienced that in 1.1. This might be generated because I add |
@fantasya-pbem well, a warning is not an error, and warnings are kind of expected if you back up an active filesystem. i don't remember whether we have the same code in 1.1 (or backported the change from master to 1.1-maint), but master/1.2 checks the file stat before and after reading an input file and notices if it has changed while we read it (which can mean that we backed up inconsistent crap). iirc tar also emits a similar warning for this case. for files you do not really care about, inconsistency might be acceptable, but borg does not know whether you care or not. you can also exclude unimportant changing files if your goal is a warning-free run. i don't think this warning is related to --progress or --stats.
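A minimal Python sketch of that general idea (not borg's actual implementation; the chosen stat fields are just an illustration): stat the file before and after reading it and warn if the metadata suggests the content changed in between.

```python
import os
import sys

def read_with_change_check(path):
    """Read a file and warn if it appears to have changed while being read."""
    st_before = os.stat(path)
    with open(path, "rb") as f:
        data = f.read()
    st_after = os.stat(path)
    changed = (
        st_before.st_mtime_ns != st_after.st_mtime_ns
        or st_before.st_size != st_after.st_size
        or st_before.st_ino != st_after.st_ino
    )
    if changed:
        print(f"{path}: file changed while we backed it up", file=sys.stderr)
    return data
```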
OK, just wanted to mention that there is about 1 week left before the official 1.2.0 release. Also, all tickets in the 1.2.0 milestone (except "testing it", i.e. this one) were closed or moved to future milestones. So, please use the remaining time for more stress testing and feedback.
In the last couple of days I had one backup per day, a |
Maybe release one more RC, esp. for #6306? I cannot estimate the impact; is this fix relevant for special use cases? ("race condition")
One is a low-level change to how borg deletes files (e.g. segment files in the repo, but also other stuff). The truncate-then-unlink method used there previously had bad consequences if somebody tried to be clever and hardlink-copied a repo (see my discussion post / PSA). The "usual" case is just successfully deleting the file (unlink). The more interesting case is when the delete (os.unlink) fails with ENOSPC (no space left): then it truncates first, IF there is no other hardlink, and retries the unlink operation. The SaveFile fix (race conditions) is only relevant under rather special circumstances, e.g. when running borg init in parallel on a fresh machine that has no borg cache dir yet. We have had that code for a long time and it took some years until someone found it failing in their setup. Most borg ops running in parallel deal with different paths because they deal with different repos and the repo id is part of the path. The top-level borg cachedir (and the CACHETAG in there) is likely the only exception. Not sure it makes sense to create another RC for just a few days of testing.
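A rough Python sketch of the deletion strategy described above (not the actual borg code; the function name is made up): unlink normally, and only fall back to truncate-then-unlink when unlink fails with ENOSPC and the file has no other hardlinks.

```python
import errno
import os

def delete_with_enospc_fallback(path):
    """Delete path; on ENOSPC, free the space via truncate first.

    Sketch of the behaviour described above: the truncate is skipped when the
    file has other hardlinks, so a hardlink-copied repo is not corrupted.
    """
    try:
        os.unlink(path)
    except OSError as err:
        if err.errno != errno.ENOSPC:
            raise
        if os.stat(path).st_nlink > 1:
            raise  # other hardlinks exist: never truncate shared data
        with open(path, "r+b") as f:
            f.truncate(0)  # give the blocks back to the filesystem
        os.unlink(path)    # retry the delete, which should now succeed
```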
https://github.com/borgbackup/borg/tree/1.2.0 i just tagged 1.2.0 in the repo. if somebody has a development setup for borg and wants to give the release some pre-release testing, that's the tag to fetch. i still have to do the platform testing (using vagrant), so the tag might change to another changeset later if anything needing a fix comes up. Twosday late evening is release time. :-)
platform testing with vagrant was successful! Here are the currently planned release files for 2022-02-22 22:02:22: https://paste.thinkmo.de/7ikN34mK#borg-1.2.0-packages. Please give them a try before I officially upload them to github and pypi. The upper ones are the pyinstaller-made fat binaries for misc. OSes and the lower 2 files are the pypi package (each plus gpg signature).
Once again tested on OpenBSD:
@ThomasWaldmann I'll try to get Fedora's spec file updated and run a test build this evening. Might be just in time before beta cutoff for F36 (Feb 22). |
Last minute! :) |
Tested on Fedora rawhide:
What I'd need for a real Fedora build is a tarball (+ ideally a GPG signature). I generated a tarball manually from a git checkout due to setuptools_scm issues, but that won't fly for a real distro package.
@FelixSchwarz It would be good to have it confirmed by @ThomasWaldmann , but the aforementioned link contains a (digitally signed) tarball - borgbackup-1.2.0.tar.gz (second from the bottom) which - as I understood - will be the official one (assuming no disasters reported :-) ). |
@szpak Thanks, I completely missed the link. |
Yeah, that is the plan: release that stuff "as is" if no major issues are reported. |
Hello, I'll give it a test too.
I did test on Debian sid and Ubuntu jammy (openssl 3.0.1). Build was fine after adding some new dependencies (dateutil, pkgconfig).
Thanks for testing! The openssl-related warnings might be interesting. IIRC, when "pyx" shows up in warnings, that is code generated by Cython, and besides using the latest Cython release, we can't do much about it. _hashindex.c is hand-made C code; we can look at whether we can get rid of some warnings (e.g. by adding casts), but as far as I have seen, the warnings are harmless.
1.2.0 is released - let's continue in the related discussion / new github tickets. |
do practical testing with master branch code (or some alpha release).
report anything you find on the issue tracker (as a separate issue), check first if the issue already has been reported. you can post "works for me" in this ticket here.
see end of this ticket for current betas / rc (or directly check the github releases page).
do not run master branch code or borg pre-releases against production repos; always use fresh repos made by borg init!