Borg stalling on unstable connection #3754

Closed
robertschulze opened this issue Apr 3, 2018 · 7 comments

@robertschulze

Hi,
I have set up a Borg backup over an unstable Internet connection (browser requests time out and packets are lost). Instead of timing out and retrying, Borg stalls completely and apparently does nothing anymore (no CPU usage, no network traffic for 30+ minutes). When I cancel with Ctrl+C, I get the following traceback:

Traceback (most recent call last):
  File "/usr/lib/python3.6/subprocess.py", line 269, in call
    return p.wait(timeout=timeout)
  File "/usr/lib/python3.6/subprocess.py", line 1457, in wait
    (pid, sts) = self._try_wait(0)
  File "/usr/lib/python3.6/subprocess.py", line 1404, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/subprocess.py", line 272, in call
    p.wait()
  File "/usr/lib/python3.6/subprocess.py", line 1457, in wait
    (pid, sts) = self._try_wait(0)
  File "/usr/lib/python3.6/subprocess.py", line 1404, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/borgmatic", line 11, in <module>
    sys.exit(main())
  File "/usr/lib/python3.6/site-packages/borgmatic/commands/borgmatic.py", line 163, in main
    run_configuration(config_filename, args)
  File "/usr/lib/python3.6/site-packages/borgmatic/commands/borgmatic.py", line 130, in run_configuration
    remote_path=remote_path,
  File "/usr/lib/python3.6/site-packages/borgmatic/borg/create.py", line 157, in create_archive
    subprocess.check_call(full_command)
  File "/usr/lib/python3.6/subprocess.py", line 286, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib/python3.6/subprocess.py", line 273, in call
    raise
  File "/usr/lib/python3.6/subprocess.py", line 756, in __exit__
    self.wait()
  File "/usr/lib/python3.6/subprocess.py", line 1457, in wait
    (pid, sts) = self._try_wait(0)
  File "/usr/lib/python3.6/subprocess.py", line 1404, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt

Would it be possible to add such timeout-and-retry logic to Borg (or is there already an option for it)?
Best,
Robert

@ThomasWaldmann
Member

borg uses ssh for the connection, so I guess the stalls have to be handled within ssh?
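A minimal sketch of how stalls might be handled within ssh, assuming the connection goes through OpenSSH: the client options `ServerAliveInterval` and `ServerAliveCountMax` make ssh give up on a peer that stops answering, and Borg's `BORG_RSH` environment variable sets the ssh command Borg runs. The helper name `borg_env_with_keepalive` and the values 10 and 3 are illustrative assumptions, not settings taken from this thread.

```python
import os
import shlex


def borg_env_with_keepalive(interval_s=10, count_max=3, base_env=None):
    """Return a copy of the environment with BORG_RSH set to an ssh
    command line that carries client-side keep-alive options.

    interval_s and count_max are illustrative defaults (assumptions).
    """
    env = dict(os.environ if base_env is None else base_env)
    ssh_command = [
        "ssh",
        # Send a keep-alive probe after this many seconds of silence.
        "-o", f"ServerAliveInterval={interval_s}",
        # Disconnect after this many unanswered probes.
        "-o", f"ServerAliveCountMax={count_max}",
    ]
    env["BORG_RSH"] = " ".join(shlex.quote(part) for part in ssh_command)
    return env


if __name__ == "__main__":
    print(borg_env_with_keepalive()["BORG_RSH"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching borg or borgmatic. With these example values ssh should drop a silent connection after roughly 30 seconds; the server side has the analogous sshd options `ClientAliveInterval` and `ClientAliveCountMax`.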

@witten
Contributor

witten commented Oct 14, 2018

Given that you're using borgmatic, this issue may also be of interest to you: https://projects.torsion.org/witten/borgmatic/issues/28

@ThomasWaldmann
Member

Does #3866 (comment) help to at least have it terminate instead of stalling?

@robertschulze
Author

Since that troubled phone line got fixed, I am no longer observing the issue. However, the referenced comment only reduces the global ssh keep-alive settings so that ssh quits earlier on disconnected peers, whereas I observed that borg did not terminate even though ssh had terminated. So I fear it would not help.

@ThomasWaldmann
Member

@robertschulze btw, did you make the corresponding ssh/sshd settings on both sides?

@robertschulze
Author

Indeed, I tried settings such as these, plus a bunch more, on both the client and the server side. But in the end this cannot cure the line: if ssh loses too many packets, it breaks down regardless of the settings. What I observed was that the ssh process exiting did not always cause the borg process to terminate right away. However, (un)fortunately, since the line appears to be fixed now, it is hard to replicate the issue again.
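For a case like the one described above, where the backup process keeps hanging after ssh has gone away, an external watchdog is one possible workaround. The following is only a sketch, not something Borg or borgmatic is known to do: the function name `run_with_timeout_and_retry` and all parameter values are hypothetical, and the timeout is a wall-clock limit on the whole run, so it would have to be set longer than any legitimate backup.

```python
import subprocess
import sys
import time


def run_with_timeout_and_retry(command, timeout_s, retries=3, pause_s=0.0):
    """Run `command`; kill it if it exceeds timeout_s seconds and retry.

    Returns the 1-based attempt number that succeeded. Raises
    RuntimeError once all attempts have timed out or exited non-zero.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            # subprocess.run kills and reaps the child when the timeout
            # expires, then raises TimeoutExpired.
            subprocess.run(command, check=True, timeout=timeout_s)
            return attempt
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError) as exc:
            last_error = exc
            if attempt < retries:
                time.sleep(pause_s)  # optional back-off before the next try
    raise RuntimeError(f"command failed after {retries} attempts") from last_error


if __name__ == "__main__":
    # Stand-in command; a real call would pass the backup command line.
    print(run_with_timeout_and_retry([sys.executable, "-c", "pass"], timeout_s=30))
```

Note that the timeout only kills the direct child process. Helper processes that the child started itself (such as an ssh client) may need separate handling.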

@ThomasWaldmann
Member

Considering the age of this ticket and that it is mostly due to the bad connection, I am closing this.

If somebody can reproduce with a recent borg version, please file a new bug report.
