SSH remote permission error #2535

Closed
ynop opened this issue Sep 25, 2019 · 41 comments · Fixed by #2554
Labels
bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP.

Comments

Contributor

ynop commented Sep 25, 2019

Hi

I get a permission error when trying to push to an SSH remote.
I assume this is because I don't have full permissions on the remote (only from /cluster/data/user onwards).

As far as I can see dvc tries to create directories from the root.

head, tail = posixpath.split(path)
if head:
    self.makedirs(head)
if tail:
    try:
        self.sftp.mkdir(path)
    except IOError as e:
        # Since paramiko errors are very vague we need to recheck
        # whether it's because path already exists or something else
        if e.errno == errno.EACCES or not self.exists(path):
            raise

Is this intended, or am I misunderstanding something?

I tried both password and keyfile authentication.

Config:
['remote "cluster"']
url = ssh://user@host/cluster/data/user
user = user
keyfile = path-to-keyfile

Info:
dvc version 0.60.0
macos
pip installed

Contributor

efiop commented Sep 25, 2019

Hi @ynop !

Please post the full -v log; it makes it much easier for us to debug this. Also, are you able to ssh into your machine as user and create the /cluster/data/user/test directory with mkdir /cluster/data/user/test?

Contributor Author

ynop commented Sep 25, 2019

Yes, creating it manually works.

ERROR: failed to upload '.dvc/cache/70/889565c1fed05122fd06ad9492c9d3' to 'ssh://[email protected]/cluster/data/user/70/889565c1fed05122fd06ad9492c9d3' - [Errno 13] Permission denied
------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/base.py", line 522, in upload
    no_progress_bar=no_progress_bar,
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/__init__.py", line 244, in _upload
    no_progress_bar=no_progress_bar,
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 189, in upload
    self.makedirs(posixpath.dirname(dest))
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
    self.makedirs(head)
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
    self.makedirs(head)
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
    self.makedirs(head)
  [Previous line repeated 2 more times]
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 104, in makedirs
    self.sftp.mkdir(path)
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 460, in mkdir
    self._request(CMD_MKDIR, path, attr)
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 813, in _request
    return self._read_response(num)
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 865, in _read_response
    self._convert_status(msg)
  File "/Users/matthi/zhaw/mt/data/.venv/lib/python3.7/site-packages/paramiko/sftp_client.py", line 896, in _convert_status
    raise IOError(errno.EACCES, text)
PermissionError: [Errno 13] Permission denied
------------------------------------------------------------

This error appears multiple times.
I can post the full log later.

Contributor

efiop commented Sep 25, 2019

@ynop Btw, could you try downgrading to 0.59.2 to see if that one is affected too? We did add a few changes to those lines in 0.60.0, so it might be the cause.

efiop added the bug and p0-critical labels Sep 25, 2019
Contributor Author

ynop commented Sep 25, 2019

I tried 0.59.2, but it ends up with the same errors.

Here is the log (using 0.60.0).
https://gist.github.com/ynop/a8273bf13b76f4b0ff4d1225197c96c6


ghost commented Sep 25, 2019

@ynop , are you using the same user in your test and with DVC?


ghost commented Sep 25, 2019

@ynop , I'll try to replicate it

ghost self-assigned this Sep 26, 2019

ghost commented Sep 26, 2019

@ynop , I wasn't able to reproduce it:

sudo systemctl start sshd
sudo mkdir -p /cluster/data/${USER}
sudo chown ${USER} -R /cluster/data/${USER}

ls -lah /cluster/data/
#
# Permissions Size User    Group Date Modified Name
# drwxr-xr-x     - mroutis root  25 Sep 19:54  mroutis

ssh localhost mkdir /cluster/data/${USER}/test

ls -lah /cluster/data/${USER}
#
# Permissions Size User    Group   Date Modified Name
# drwxr-xr-x     - mroutis mroutis 25 Sep 20:05  test


dvc init --no-scm
dvc remote add ssh ssh://${USER}@localhost/cluster/data/${USER}
echo "foo" > foo
dvc add foo
dvc push -r ssh

ls -R -lah /cluster/data/mroutis
# Permissions Size User    Group   Date Modified Name
# drwxr-xr-x     - mroutis mroutis 25 Sep 21:11  d3
# drwxr-xr-x     - mroutis mroutis 25 Sep 20:05  test
#
# /cluster/data/mroutis/d3:
# Permissions Size User    Group   Date Modified Name
# .rw-r--r--     4 mroutis mroutis 25 Sep 21:11  b07384d113edec49eaa6238ad5ff00


dvc version
#
# DVC version: 0.60.0
# Python version: 3.7.4
# Platform: Linux-5.3.1-arch1-1-ARCH-x86_64-with-arch
# Binary: False
# Cache: reflink - True, hardlink - True, symlink - True
# Filesystem type (cache directory): ('xfs', '/dev/mapper/vg-root')
# Filesystem type (workspace): ('xfs', '/dev/mapper/vg-root')

Looking at your log, it looks like dvc is trying to create /cluster/data/user.

dest should be /cluster/data/user/5a/9ab1413269d550586b4586714e3ffc

So, after 3 head, tail = posixpath.split(path) operations, the head is /cluster/data/user.

...
    self.makedirs(posixpath.dirname(dest))
  File "/Users/useraw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
    self.makedirs(head)
  File "/Users/useraw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
    self.makedirs(head)
  File "/Users/useraw/mt/data/.venv/lib/python3.7/site-packages/dvc/remote/ssh/connection.py", line 100, in makedirs
    self.makedirs(head)
...

Can you make sure the /cluster/data/user directory exists?


ghost commented Sep 26, 2019

> As far as I can see dvc tries to create directories from the root.

@ynop , it is not from the root, it is from the first existing directory:

def makedirs(self, path):
    # Single stat call will say whether this is a dir, a file or a link
    st_mode = self.st_mode(path)
    if stat.S_ISDIR(st_mode):
        return

import posixpath

head, tail = posixpath.split('/cluster/data/user/5a/9ab1413269d550586b4586714e3ffc')

print(head) # '/cluster/data/user/5a'
print(tail)  # '9ab1413269d550586b4586714e3ffc'
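
To make the recursion concrete, here is a minimal sketch that replays the makedirs logic against an in-memory stand-in for the SFTP session. FakeSFTP and its is_dir method are hypothetical helpers invented for this illustration and are not part of dvc or paramiko; the real code uses a stat call instead. The second case assumes a server whose SFTP view does not contain /cluster at all, which is one way to end up with a mkdir attempt right below the root.

```python
import posixpath


class FakeSFTP:
    """In-memory stand-in for an SFTP session (hypothetical, illustration only)."""

    def __init__(self, existing_dirs):
        self.dirs = set(existing_dirs)
        self.mkdir_calls = []

    def is_dir(self, path):
        return path in self.dirs

    def mkdir(self, path):
        # The fake always succeeds; a real server may refuse with EACCES.
        self.mkdir_calls.append(path)
        self.dirs.add(path)


def makedirs(sftp, path):
    # Stop as soon as we reach a directory that already exists.
    if sftp.is_dir(path):
        return
    head, tail = posixpath.split(path)
    if head and head != path:
        makedirs(sftp, head)
    if tail:
        sftp.mkdir(path)


# Case 1: the remote directory is visible, so only the new subdirectory is created.
normal = FakeSFTP({"/", "/cluster", "/cluster/data", "/cluster/data/user"})
makedirs(normal, "/cluster/data/user/5a")
print(normal.mkdir_calls)

# Case 2: the session cannot see /cluster, so the recursion walks up to "/"
# and the first mkdir attempt is "/cluster".
restricted = FakeSFTP({"/", "/data", "/data/user"})
makedirs(restricted, "/cluster/data/user/5a")
print(restricted.mkdir_calls[0])
```

In the second case a real server would presumably reject that first mkdir, which would look like the permission error in the traceback above.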

ghost added the awaiting response label Sep 26, 2019
Contributor Author

ynop commented Sep 26, 2019

If I add log output like this:

if stat.S_ISDIR(st_mode):
    logger.debug('Directory')
    return
else:
    logger.debug('Not a directory')

Then I get output like this:

DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data/user
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster/data
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /cluster
DEBUG: Not a directory
DEBUG: /
DEBUG: Directory
DEBUG: /
DEBUG: Directory
DEBUG: /
DEBUG: Directory
DEBUG: /
DEBUG: Directory

The directory /cluster/data/user does exist, although /cluster is mounted from Ceph storage.
Could that be the problem?

Contributor Author

ynop commented Sep 26, 2019

I logged the file mode that I get:

logger.debug(stat.filemode(st_mode))
DEBUG: /cluster
DEBUG: ?---------

Seems that it can't retrieve the file mode for the mounted directories.
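
A mode string of ?--------- is what Python prints when the mode has no recognizable file-type bits, for example a mode of 0. The sketch below uses only the standard library and shows why the S_ISDIR check then fails. That the mode here is exactly 0 is an assumption at this point in the thread.

```python
import stat

# Assumed value: a mode with no file-type or permission bits set.
st_mode = 0

# No type bits means the mode is not recognized as a directory ...
print(stat.S_ISDIR(st_mode))

# ... nor as a regular file or a symlink.
print(stat.S_ISREG(st_mode), stat.S_ISLNK(st_mode))

# The human-readable form has no permission bits either.
print(stat.filemode(st_mode))

# The file-type portion of the mode is empty.
print(stat.S_IFMT(st_mode))
```

So a result like this says less about the directory itself than about the lookup not returning real stat data for that path.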

Contributor

efiop commented Sep 26, 2019

@ynop Indeed, that is the cause. Very interesting. Does running stat /cluster on that machine show anything interesting?

Contributor Author

ynop commented Sep 26, 2019

  File: /cluster
  Size: 5         	Blocks: 0          IO Block: 65536  directory
Device: 0h/0d	Inode: 1099511661972  Links: 1
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-09-26 08:37:11.922091000 +0200
Modify: 2019-04-25 06:21:22.901076000 +0200
Change: 2019-04-25 06:21:22.901076000 +0200
 Birth: -

Contributor

efiop commented Sep 26, 2019

@ynop It is a bit weird to see Blocks as 0, but at least stat is able to tell that it is a directory. Very interesting. Maybe sftp's implementation of stat is a bit different, or even paramiko's. Let me look into those.

Contributor

efiop commented Sep 26, 2019

@ynop Btw, could you show full st_mode please as seen in dvc itself?

Contributor

efiop commented Sep 26, 2019

@ynop Btw, one more workaround that comes to mind is to create something like ~/dvc-remote symlink that would point to /cluster/data/user and use that as a remote to see if that would help by any chance. Could you give it a try, please?

Contributor Author

ynop commented Sep 26, 2019

When I log st_mode it just outputs 0.

The workaround is not really possible, since all places I have access to are on the mounted volumes.

Contributor

efiop commented Sep 26, 2019

@ynop Btw, and what does stat -L /cluster show? We are using lstat in the dvc itself, and I wonder if it is causing that.

Contributor Author

ynop commented Sep 26, 2019

Exactly the same output.

Contributor

efiop commented Sep 26, 2019

@ynop Did I get you right that you were either using pdb or modifying the dvc code in place and running it? If so, would you mind showing the self.sftp.stat(path).st_mode output?

Contributor

efiop commented Sep 26, 2019

@ynop I've taken a look at paramiko's and openssh's stat functions, and they both seem pretty normal and just pass st_mode intact from the original stat() call on the server. I wonder if the CLI utility stat is doing any tricks with such mounts. Could you check python's stat().st_mode output on the server for /cluster, please?

Contributor Author

ynop commented Sep 26, 2019

On the server via python it seems to work just fine.

>>> os.lstat('/cluster').st_mode
16877
>>> os.stat('/cluster').st_mode
16877
>>> stat.S_ISDIR(os.stat('/cluster').st_mode)
True
>>> stat.S_ISDIR(os.lstat('/cluster').st_mode)
True
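
To read the numbers above: 16877 is the decimal form of octal 40755, that is, a directory with permissions 755, which matches the stat output posted earlier. A quick check with the standard library:

```python
import stat

mode = 16877  # value returned by os.stat('/cluster').st_mode on the server

print(oct(mode))                          # 0o40755
print(stat.filemode(mode))                # drwxr-xr-x
print(stat.S_IFMT(mode) == stat.S_IFDIR)  # True: the type bits say "directory"
print(oct(stat.S_IMODE(mode)))            # 0o755: the permission bits
```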

Contributor Author

ynop commented Sep 26, 2019

When printing self.sftp.stat(path).st_mode, I don't get any output.

with ignore_file_not_found():
    logger.debug(self.sftp.stat(path).st_mode)

Contributor

efiop commented Sep 26, 2019

@ynop maybe it is because it is raising an exception there. How about self.sftp.stat("/cluster").st_mode?

Contributor Author

ynop commented Sep 26, 2019

Then I get FileNotFoundError: [Errno 2] No such file and therefore no output either.

Contributor

efiop commented Sep 26, 2019

@ynop That is weird. I think there is a miscommunication here. Btw, how about we get on a video call together, so I could take a closer look at it and we don't spend so much time going back and forth in the comments for this issue? 🙂

Contributor Author

ynop commented Sep 26, 2019

Unfortunately that's not possible today, as I am in a classroom.

Contributor

efiop commented Sep 26, 2019

@ynop Sure, we can do it tomorrow or whenever you'll have time, if we don't figure this out remotely here. 🙂

Ok, so

> Then I get FileNotFoundError: [Errno 2] No such file and therefore no output either.

You ran that in makedirs when it failed, right? Or in some other way? Are you modifying the dvc package in place? Maybe consider adding import pdb; pdb.set_trace() and using the -j1 flag with dvc push, so you can enter an interactive pdb shell and experiment there.

Contributor Author

ynop commented Sep 26, 2019

I'm modifying the code in-place in the venv.
I run the command within the makedirs method.
I tried with pdb, then I get the FileNotFoundError, when executing the step other_st_mode = self.sftp.stat(path).st_mode.

Contributor

efiop commented Sep 26, 2019

@ynop

> I tried with pdb, then I get the FileNotFoundError, when executing the step other_st_mode = self.sftp.stat(path).st_mode.

But path might be pointing to something that actually doesn't exist, that is why I was asking about using /cluster as path there.

Contributor Author

ynop commented Sep 26, 2019

Ah sorry, overlooked that.
But same results with /cluster as path.

Contributor

efiop commented Sep 26, 2019

@ynop FileNotFoundError when running that command with /cluster? O_o That is extremely odd. Are you sure you are connecting to the correct machine? If not, that would explain every single error we've seen in this thread.

Contributor Author

ynop commented Sep 26, 2019

Yeah, I also checked that by copying the url from the log and scp'd something there.

Contributor Author

ynop commented Sep 26, 2019

Hmm, I just found a solution, although I don't know why it is like that.
I debugged the available paths using self.sftp.listdir(...), starting from /.
There I see the path /data/user, whereas on the server, with scp, or anything else it is /cluster/data/user.

When I change the path in the dvc config, it works.
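
The listdir-based debugging described above can be turned into a small helper. The sketch below walks a path from the root and reports the first prefix the SFTP session cannot stat. It only assumes an object with a paramiko-style stat(path) method that raises IOError for missing paths; first_missing_prefix and FakeChrootedSFTP are hypothetical names used for illustration.

```python
import posixpath


def first_missing_prefix(sftp, path):
    """Return the shortest prefix of `path` that `sftp` cannot stat, or None."""
    parts = [p for p in posixpath.normpath(path).split("/") if p]
    current = "/"
    for part in parts:
        current = posixpath.join(current, part)
        try:
            sftp.stat(current)
        except IOError:
            # Covers FileNotFoundError and PermissionError as well.
            return current
    return None


class FakeChrootedSFTP:
    """Hypothetical stand-in that only knows the paths visible over SFTP."""

    def __init__(self, visible_paths):
        self.visible = set(visible_paths)

    def stat(self, path):
        if path not in self.visible:
            raise FileNotFoundError(2, "No such file")
        return None


sftp = FakeChrootedSFTP({"/data", "/data/user"})
print(first_missing_prefix(sftp, "/cluster/data/user"))  # /cluster
print(first_missing_prefix(sftp, "/data/user"))          # None
```

With a real connection, the same function could be called with the object returned by paramiko's SSHClient.open_sftp(). A missing top-level component such as /cluster hints that the SFTP view of the filesystem differs from the shell view.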

Contributor

efiop commented Sep 26, 2019

@ynop Oh, that makes total sense! So the sftp server on that machine is configured with its root at /cluster and not the usual / .
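
If the SFTP root really is /cluster (an inference from this thread, not something read from the server's configuration), the remote URL needs the path as seen from inside that root. A minimal sketch of the translation, where to_sftp_path is a hypothetical helper and not part of dvc:

```python
import posixpath


def to_sftp_path(server_path, sftp_root):
    """Map an absolute server path to the path seen inside an SFTP root."""
    server_path = posixpath.normpath(server_path)
    sftp_root = posixpath.normpath(sftp_root)
    if sftp_root == "/":
        return server_path
    inside = server_path == sftp_root or server_path.startswith(sftp_root + "/")
    if not inside:
        raise ValueError(
            "{!r} is not inside the SFTP root {!r}".format(server_path, sftp_root)
        )
    rel = posixpath.relpath(server_path, sftp_root)
    return "/" if rel == "." else "/" + rel


# The shell sees /cluster/data/user; the SFTP session sees /data/user.
print(to_sftp_path("/cluster/data/user", "/cluster"))  # /data/user
```

This mirrors the change to the config reported above: the remote path becomes /data/user instead of /cluster/data/user.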

Contributor

efiop commented Sep 26, 2019

@ynop I imagine you don't have access to /etc/ssh/sshd_config on the server, right? But even if you don't, this explains everything perfectly.

Contributor Author

ynop commented Sep 26, 2019

Yeah, seems that way. Thanks for the help and sorry for the circumstances.

Contributor

efiop commented Sep 26, 2019

@ynop Nothing to be sorry for. This was extremely useful and will help debugging similar issues in the future! 🙂 Thank you!

Contributor

efiop commented Sep 26, 2019

@ynop So in the end, this was an sftp server configuration caveat that dvc can't really do much to mitigate. But we can and should improve the error message to something like Unable to create directory '{}', so it is more friendly. Let's keep this issue open until that one is fixed.
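
One possible shape for such a message, as a sketch only: the wrapper below re-raises a failed mkdir with the offending path included. mkdir_with_context and DeniedSFTP are hypothetical names, and this is not claimed to be the change that eventually closed the issue.

```python
import errno


def mkdir_with_context(sftp, path):
    """Call sftp.mkdir(path) and include the path in any resulting error."""
    try:
        sftp.mkdir(path)
    except IOError as exc:
        message = "Unable to create directory '{}'".format(path)
        # Keep the original errno and chain the original exception.
        raise IOError(exc.errno, message) from exc


class DeniedSFTP:
    """Hypothetical stand-in whose mkdir always fails like the traceback above."""

    def mkdir(self, path):
        raise IOError(errno.EACCES, "Permission denied")


try:
    mkdir_with_context(DeniedSFTP(), "/cluster")
except IOError as exc:
    print(exc)
```

With the path in the message, a mkdir attempt on /cluster would have pointed at the mismatch between the configured path and the SFTP view much earlier.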

efiop removed the awaiting response label Sep 26, 2019

ghost commented Sep 26, 2019

@efiop , @ynop, great discussion! We encountered that problem before 😅

@efiop , I was thinking that maybe we can do sftp.getcwd() and then compare it to the path specified in the URL.

Also, adding the error message to makedirs will make it easier to debug 👍

Contributor

efiop commented Sep 27, 2019

@MrOutis

Not sure what you want to compare it with, sftp.getcwd() will return / in /cluster because it is configured to have the root there.


ghost commented Sep 27, 2019

@efiop , true, forgot that chrooting changes also the PWD and friends to reflect the new jailness.
