Have you checked borgbackup docs, FAQ, and open Github issues?
Yes.
Is this a BUG / ISSUE report or a QUESTION?
Neither, it's a suggestion to improve the documentation.
System information. For client/server mode post info for both machines.
Your borg version (borg -V).
borg 1.1.9 (from Debian buster, borgbackup=1.1.9-2+deb10u1) (both client and server)
Operating system (distribution) and version.
Debian 10 (buster) (both client and server)
Hardware / network configuration, and filesystems used.
irrelevant
How much data is handled by borg?
irrelevant
Full borg commandline that led to the problem (leave away excludes and passwords)
borg check --progress --verbose /path/to/repo
Describe the problem you're observing.
I encountered my first case of a failing borg check yesterday that wasn't actually caused by a damaged repository, but by defective memory. This is actually documented in the FAQ:
Checking memory
Intermittent issues, such as borg check finding errors inconsistently between runs, are frequently caused by bad memory.
Run memtest86+ (or an equivalent memory tester) to verify that the memory subsystem is operating correctly.
But because I had several damaged repositories in the past (due to power failures) that I was always able to repair with borg check --repair, I didn't bother to look through the FAQ and proceeded directly with borg check --repair. The timeline looked something like this:
A manual attempt to run the backup failed with File failed integrity check: /path/to/repo/cache/9ce198549ec83582155f288b853891e6cb1d33f41547d52748d4e2c9fb5ada1d/chunks
borg check found errors (Segment entry checksum mismatch), so I figured that the repository was damaged.
borg check --repair suddenly no longer found any errors.
Another borg check again found errors.
Another borg check --repair again didn't find any errors.
At this point I was very confused and wondered if I was doing something wrong or if I had encountered a very strange bug, because it looked to me like I had some kind of repository corruption that is visible to borg check, but not to borg check --repair. I asked for help on IRC, and was advised to use the most recent borg version as the one provided by Debian is rather old.
I downloaded a static build of the most recent borg release (1.1.16) and ran a check with this version. This borg check run didn't find any errors either.
I ran borg 1.1.9 check again to see if the old version was still able to see the corruption, and again it would report Segment entry checksum mismatch.
Only at this point did I actually compare the output of my multiple borg check runs and notice that the segment numbers were different every time.
At this point it was rapidly becoming clear to me that the hardware might be the actual cause, and a memtester run confirmed it:
server ~ # memtester 6G
memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 6144MB (6442450944 bytes)
got 6144MB (6442450944 bytes), trying mlock ...locked.
Loop 1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : testing 4FAILURE: 0x404040404040404 != 0x404040404040400 at offset 0x4a7f0fb0.
Checkerboard : ok
Bit Spread : testing 2FAILURE: 0x00000014 != 0x00000010 at offset 0x4a7f0fb0.
Bit Flip : testing 17FAILURE: 0x00000004 != 0x00000000 at offset 0x4a7f0fb0.
Walking Ones : ok
Walking Zeroes : testing 125FAILURE: 0x00000004 != 0x00000000 at offset 0x4a7f0fb0.
8-bit Writes : ok
16-bit Writes : |FAILURE: 0x77cea9b8efef8c2c != 0x77cea9b8efef8c28 at offset 0x4a7f0fb0.
I think mentioning this potential problem in man borg-check would be worthwhile, to increase the likelihood that people who use borg check are aware of it. E.g. with the following passage:
Note that borg check can report spurious errors when running on defective hardware. If you're seeing errors during borg check but not in a subsequent borg check --repair, run multiple checks and compare the defective segment IDs. If the defective segment IDs vary between checks, check your hardware e.g. with memtest86+ or memtester.
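The comparison suggested in that passage could be sketched roughly as follows. This is only an illustration, not part of borg: the helper names `failing_segments` and `errors_are_consistent` are made up, and the pattern assumes that each "checksum mismatch" line contains the segment ID as `segment <N>`, which may differ between borg versions.

```python
import re

# Assumption: mismatch lines contain the ID as "segment <N>".
# Adjust the pattern if your borg version formats the message differently.
SEGMENT_RE = re.compile(r"segment (\d+)")


def failing_segments(log_text):
    """Return the set of segment IDs found on checksum-mismatch lines."""
    segments = set()
    for line in log_text.splitlines():
        if "checksum mismatch" in line:
            match = SEGMENT_RE.search(line)
            if match:
                segments.add(int(match.group(1)))
    return segments


def errors_are_consistent(logs):
    """True only if every run reported the same non-empty set of segments."""
    sets = [failing_segments(log) for log in logs]
    return bool(sets) and bool(sets[0]) and all(s == sets[0] for s in sets)
```

Feed it the captured output of several borg check runs. If `errors_are_consistent` returns False although at least one run reported errors, the failing segments vary between runs (or vanish entirely, as they did with borg check --repair here), which points towards hardware rather than the repository.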
Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.
irrelevant
Include any warning/errors/backtraces from the system logs
irrelevant