Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Borg extract hangs after about an hour. (strace: Timeout) #3866

Closed
rvanlaar opened this issue May 29, 2018 · 5 comments
Closed

Borg extract hangs after about an hour. (strace: Timeout) #3866

rvanlaar opened this issue May 29, 2018 · 5 comments
Milestone

Comments

@rvanlaar
Copy link

Have you checked borgbackup docs, FAQ, and open Github issues?

yes

Is this a BUG / ISSUE report or a QUESTION?

ISSUE

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

borg 1.1.5

Operating system (distribution) and version.

Ubuntu 18.04

Hardware / network configuration, and filesystems used.

Hardware: UX501VW laptop,
network config: 250Mbit network connection
filesystems used: ext4

How much data is handled by borg?

90 GB

Full borg commandline that lead to the problem (leave away excludes and passwords)

borg -v -p --remote-path=borg1 extract [email protected]:REPONAME::2018-05-28

Describe the problem you're observing.

After a while all activity stops, no cpu usage, no downloads
Strace shows:

select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

I can reproduce the issue. The issues shows up after about an hour.
I tried restarting the export. This gave the same problem.

I would like two things for borg to support.
A better retry after the timeout and borg export to be able to resume after a failure.
Restoring an 80 GB repo isn't fun this way.

Include any warning/errors/backtraces from the system logs

strace: Process 5656 attached
select(8, [5 7], [], [4 5 7], {tv_sec=0, tv_usec=783801}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 0 (Timeout)
select(8, [5 7], [], [4 5 7], {tv_sec=1, tv_usec=0}) = 2 (in [5 7], left {tv_sec=0, tv_usec=512787})
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5659, si_uid=1000, si_status=0, si_utime=29894, si_stime=12996} ---
read(5, "", 10485760)                   = 0
close(3)                                = 0
select(8, [5 7], [4], [4 5 7], {tv_sec=1, tv_usec=0}) = 3 (in [5 7], out [4], left {tv_sec=0, tv_usec=999997})
read(5, "", 10485760)                   = 0
close(4)                                = 0
close(5)                                = 0
wait4(5659, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 5659
close(7)                                = 0
stat("/usr/lib/python3/dist-packages/borg/archiver.py", {st_mode=S_IFREG|0644, st_size=231367, ...}) = 0
openat(AT_FDCWD, "/usr/lib/python3/dist-packages/borg/archiver.py", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=231367, ...}) = 0
ioctl(3, TCGETS, 0x7ffdd699a060)        = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "import argparse\nimport collectio"..., 4096) = 4096
lseek(3, 0, SEEK_CUR)                   = 4096
read(3, "ord arguments\n    :param cache: "..., 8192) = 8192
read(3, "able/changes.html#pre-1-0-9-mani"..., 8192) = 8192
read(3, "eate))\n            print(fmt % ("..., 8192) = 8192
read(3, "                         keep_ex"..., 8192) = 8192read(3, "                         keep_ex"..., 8192) = 8192
read(3, "on.\n        #\n        # Therefor"..., 8192) = 8192
read(3, "dev)\n                tarinfo.dev"..., 8192) = 8192
read(3, "em2)\n                    continu"..., 8192) = 8192
read(3, "import borg.fuse\n        except "..., 8192) = 8192
read(3, "                   '\"keep-weekly"..., 8192) = 8192
read(3, "rgs.checkpoint_interval,\n       "..., 8192) = 8192
read(3, "rn textwrap.indent(json.dumps(d,"..., 8192) = 8192
read(3, "ll path to just before a path se"..., 8192) = 8192
read(3, "ser}\n            The user name ("..., 8192) = 8192
read(3, "igher index.\n            \"\"\"\n   "..., 8192) = 8192
read(3, "  add_common_option('--show-vers"..., 8192) = 8192
read(3, "ACHE_ENTRIES environment variabl"..., 8192) = 8192
read(3, "--------------+-----------------"..., 8192) = 8192
read(3, "'work slower, but using less spa"..., 8192) = 8192
read(3, "sing all paths specified. Paths "..., 8192) = 8192
read(3, "put stats as JSON. Implies ``--s"..., 8192) = 8192
read(3, " gzip\n        - .tar.bz2: bzip2\n"..., 8192) = 8192
read(3, "*',\n                            "..., 8192) = 8192
read(3, "is \"H\", \"d\", \"w\", \"m\", \"y\". For "..., 8192) = 8192
read(3, "RG REPOSITORIES, as the\n        "..., 8192) = 8192
read(3, "')\n        archive_group.add_arg"..., 8192) = 8192
read(3, "escription=self.do_debug_dump_ar"..., 8192) = 8192
read(3, "'required arguments', metavar='<"..., 8192) = 8192
read(3, "interpret outside\n              "..., 8192) = 6087
read(3, "", 8192)                       = 0
close(3)                                = 0
stat("/usr/lib/python3/dist-packages/borg/remote.py", {st_mode=S_IFREG|0644, st_size=55155, ...}) = 0
openat(AT_FDCWD, "/usr/lib/python3/dist-packages/borg/remote.py", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=55155, ...}) = 0
ioctl(3, TCGETS, 0x7ffdd699a060)        = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "import errno\nimport fcntl\nimport"..., 4096) = 4096
lseek(3, 0, SEEK_CUR)                   = 4096
read(3, "': ('repair', 'save_space', ),\n "..., 8192) = 8192
read(3, " e.traceback else logging.DEBUG\n"..., 8192) = 8192
read(3, " bound_args.arguments[name]\n    "..., 8192) = 8192
read(3, "pts = []\n        if args is not "..., 8192) = 8192
read(3, "                   else:\n       "..., 8192) = 8192
read(3, " In JSON mode, we manually check"..., 8192) = 8192
read(3, "yield transformed\n        # Cons"..., 8192) = 1907
read(3, "", 8192)                       = 0
close(3)                                = 0
stat("/usr/lib/python3/dist-packages/borg/archive.py", {st_mode=S_IFREG|0644, st_size=80503, ...}) = 0
openat(AT_FDCWD, "/usr/lib/python3/dist-packages/borg/archive.py", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=80503, ...}) = 0
ioctl(3, TCGETS, 0x7ffdd6999fc0)        = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "import errno\nimport json\nimport "..., 4096) = 4096
lseek(3, 0, SEEK_CUR)                   = 4096
read(3, "               msg = ' ' * colum"..., 8192) = 8192
read(3, ":\n        self.id = id\n        s"..., 8192) = 8192
read(3, "             pi.show(increase=le"..., 8192) = 8192
read(3, "is specific xattr (this may happ"..., 8192) = 8192
read(3, "st, path))\n            return st"..., 8192) = 8192
read(3, ".item_keys = [msgpack.packb(name"..., 8192) = 8192
read(3, "ning('Found defect chunks. They "..., 8192) = 8192
read(3, "replacement chunk detected (Byte"..., 8192) = 8192
read(3, "data), cdata)\n                se"..., 8192) = 8192
read(3, "by exclude_cache and exclude_if_"..., 8192) = 2679
read(3, "", 8192)                       = 0
close(3)                                = 0
uname({sysname="Linux", nodename="MiSus-L", ...}) = 0
openat(AT_FDCWD, "/etc/lsb-release", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=103, ...}) = 0
ioctl(3, TCGETS, 0x7ffdd699b500)        = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR)                   = 0
ioctl(3, TCGETS, 0x7ffdd699b4b0)        = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR)                   = 0
read(3, "DISTRIB_ID=Ubuntu\nDISTRIB_RELEAS"..., 8192) = 103
read(3, "", 8192)                       = 0
close(3)                                = 0
getpid()                                = 5656
getcwd("/home/roland/fromAsus-Prio", 1024) = 27
getpid()                                = 5656
write(2, "Connection closed by remote host"..., 33) = 33
rt_sigaction(SIGUSR2, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, {sa_handler=0x444a20, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, 8) = 0
rt_sigaction(SIGUSR1, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, {sa_handler=0x444a20, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, {sa_handler=0x444a20, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, 8) = 0
rt_sigaction(SIGHUP, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, {sa_handler=0x444a20, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x444a20, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, {sa_handler=0x444a20, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, {sa_handler=0x444a20, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, 8) = 0
munmap(0x7fc0ff132000, 262144)          = 0
munmap(0x7fc0feeb2000, 262144)          = 0
munmap(0x7fc106692000, 135168)          = 0
munmap(0x7fc0ff0b2000, 262144)          = 0
munmap(0x7fc0fedf2000, 262144)          = 0
munmap(0x7fc0feff2000, 262144)          = 0
munmap(0x7fc0ff032000, 262144)          = 0
munmap(0x7fc0fedb2000, 262144)          = 0
munmap(0x7fc0fecb2000, 262144)          = 0
munmap(0x7fc0ff072000, 262144)          = 0
munmap(0x7fc0fec32000, 262144)          = 0
munmap(0x7fc103428000, 262144)          = 0
rt_sigaction(SIGBUS, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, NULL, 8) = 0
rt_sigaction(SIGFPE, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, NULL, 8) = 0
rt_sigaction(SIGABRT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, NULL, 8) = 0
rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7fc106440890}, NULL, 8) = 0
sigaltstack(NULL, {ss_sp=0x1e56b00, ss_flags=0, ss_size=8192}) = 0
sigaltstack({ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=0}, NULL) = 0
futex(0x7fc103d6382c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
exit_group(2)                           = ?
+++ exited with 2 +++
@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented May 30, 2018

Looks like your network or ssh connection broke down.

I also use rsync.net and ubuntu 18.04 now, so I'll try to reproduce this.

If you need the extraction result urgently, I can propose 2 ways that might work for you right now until we find a better solution or the root cause of the problem:

  • you can do partial extracts, like borg extract REPO PATTERN to do multiple smaller extraction runs that complete before your connection has issues
  • you can try using borg mount REPO MOUNTPOINT and rsync -avH from MOUNTPOINT to your desired extraction directory. If the connection breaks down, just repeat that over and over again until rsync does not find anything to do any more. Due to the way borg mount works, this might be less efficient than borg extract for bigger volumes of data.

@ThomasWaldmann ThomasWaldmann changed the title Borg export hangs after about an hour. (strace: Timeout) Borg extract hangs after about an hour. (strace: Timeout) May 30, 2018
@ThomasWaldmann ThomasWaldmann self-assigned this May 30, 2018
@rvanlaar
Copy link
Author

As you said @ThomasWaldmann the ssh connection broke down.

This fixed it for me:

Host *
    ServerAliveInterval 300
    ServerAliveCountMax 2

Is this something we could fix in borg?
Such as, restart the ssh connection or have an extract to be resumed?

@ThomasWaldmann
Copy link
Member

borg just uses ssh, it does not configure ssh (assuming that the user already has a working ssh configuration). if that is not the case, maybe there should be rather a bug been filed against the ssh default configuration of the ssh package.

restarting a broken connection: borg does not do that, please check, iirc we might have a ticket for this already.

borg extract resume: iirc we already have a ticket for that, but it is non-trivial.

@ThomasWaldmann
Copy link
Member

add @rvanlaar's fix to the docs / faq?

@fantasya-pbem
Copy link
Contributor

I will add this to the FAQ / make a PR.

fantasya-pbem added a commit to fantasya-pbem/borg that referenced this issue Oct 12, 2019
fantasya-pbem added a commit to fantasya-pbem/borg that referenced this issue Oct 13, 2019
fantasya-pbem added a commit to fantasya-pbem/borg that referenced this issue Oct 17, 2019
fantasya-pbem added a commit to fantasya-pbem/borg that referenced this issue Oct 19, 2019
fantasya-pbem added a commit to fantasya-pbem/borg that referenced this issue Oct 19, 2019
ThomasWaldmann pushed a commit that referenced this issue Oct 19, 2019
elho pushed a commit to elho/borg that referenced this issue Mar 17, 2020
@ghost ghost mentioned this issue Aug 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants