Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transfer files using a separate paramiko channel #171

Merged
merged 21 commits into from
Aug 23, 2022

Conversation

keewis
Copy link
Contributor

@keewis keewis commented Aug 17, 2022

The lower-level paramiko library allows using a separate channel to execute something similar to:

python -c 'script = "..."; print(script)' | ssh user@host 'cat - >remote_file'

where script is never evaluated by the shell.

I'm not sure if there's something like this in fabric; I couldn't find anything that would avoid using SFTP (which the admins on my HPC disabled). There were references to SCP in the documentation of fabric, but I don't think we should depend on that: as far as I remember, that protocol is deprecated.

I tried to include a test, but I can't seem to understand the test setup so I can't verify if it passes (I guess I will need to wait on CI for that).

@andersy005
Copy link
Member

thank you for this addition, @keewis! The CI has been failing for a while. I plan to look into this today. If you don't mind waiting, we can merge this once the CI is fixed.

Are you able to test this feature on your local HPC machine?

@keewis
Copy link
Contributor Author

keewis commented Aug 17, 2022

no worries, I'm not in a hurry to get this merged so you can take your time (though afterwards it would be great to get a release).

Re testing: of course I did a manual "integration test" on my local HPC (run jupyter-forward --launch-command '...') to verify that it works (it does, but that might be specific to my setup)

@keewis
Copy link
Contributor Author

keewis commented Aug 18, 2022

it would probably help very much with the debugging if we could figure out how to have pytest properly redirect stdin, stdout, and stderr.

My guess is that the reason for broken redirect is that the console is instantiated on import while pytest redirects when entering the test function's context. If that's correct, this might be fixed by instantiating console in the runner's __post_init__ (or by monkeypatching jupyter_forward.core.console).

I can try sending in a PR tomorrow, if that sounds good to you.

@andersy005
Copy link
Member

I can try sending in a PR tomorrow, if that sounds good to you.

Yes, please. That would be very useful...

@andersy005 andersy005 added the enhancement New feature or request label Aug 18, 2022
@keewis
Copy link
Contributor Author

keewis commented Aug 19, 2022

okay, so after some investigation, it seems that this is caused by the -s option to pytest (configured in setup.cfg) and doesn't have anything to do with the console. This is necessary because we're trying to read from stdin using getpass.getpass.

I'm guessing we would have to monkeypatch _authentication_handler / getpass.getpass in all of the core tests to avoid that.

Another option would be to take the authentication handlers as parameters (one for auth_interactive_dumb and one for auth_password), which would allow us to specify dummy authentication handlers.

I'd probably prefer the latter because it results in a cleaner architecture (monkeypatching is usually a code smell).

Edit: as it turns out, invoke also tries to do operations on stdin (even if the command itself never uses it), which is where the OSErrors is coming from if we run pytest without -s. So not sure how to fix this if we don't want to switch to bare paramiko?

Edit2: setting in_stream=False in every call to session.run, e.g. by overriding run.in_stream in fabric's configuration object, might work as well if we don't need to have remote processes read from stdin.

@keewis keewis mentioned this pull request Aug 19, 2022
@keewis
Copy link
Contributor Author

keewis commented Aug 19, 2022

well, turns out the issue was not the write but verifying the contents of the file. That means that we can also ignore all the different shells (we don't actually use them) and just take the default shell.

Copy link
Member

@andersy005 andersy005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, @keewis! Thanks again for this addition 🎉

@andersy005
Copy link
Member

i'm going to merge this shortly unless you have additional changes ...

@mnlevy1981
Copy link
Contributor

well, turns out the issue was not the write but verifying the contents of the file. That means that we can also ignore all the different shells (we don't actually use them) and just take the default shell.

I've been trying to follow this conversation, but I'm a little lost. I saw that you introduced a new test (test_put_file()), and this test was failing on every shell except bash. It sounds like you're okay with this failing on the other shells, and I just want a little clarification about why... is this an issue with the CI environment rather than the code? Is bash the only shell that actually calls runner.put_file()? I guess I'm just asking for a little reassurance that we won't have tcsh users running into the same error that CI was reporting before we removed tcsh from the list of shells that are being tested.

@keewis
Copy link
Contributor Author

keewis commented Aug 19, 2022

I guess I'm just asking for a little reassurance that we won't have tcsh users running into the same error

no worries, I'm happy to answer any questions you might have (I might not be able to answer some of them, though, I'm no expert on shell startup, and in particular I don't know csh / tcsh at all)

Slight correction: the test was passing on every shell except tcsh.

As far as I can tell, the reason it failed was the line of the test where we get the contents of the newly created file to check that it actually contains what we wanted to put there:

out = runner.run_command(f'cat {path}')

Changing the test to call runner.session.run does make it work:

out = runner.session.run(f'cat {path}')
which to me indicates that there's something wrong in run_command, or, more likely, that a common shell startup file (like ~/.profile) contains code incompatible with tcsh, which decided to complain about that to stdout (and all commands executed by _set_log_directory would print the same warning, so...).

Now that the test does not use run_command anymore (which I think is the only time the shell would ever be used), I thought that we wouldn't need to test exactly the same thing for each shell. However, if you don't think that's the case, or just feel safer to run the test on all shells that's fine with me, too.

In any case, TL;DR: the issue was in the test code and not put_file, and put_file works (and is used) regardless of the chosen shell.

@mnlevy1981
Copy link
Contributor

Thanks for clarifying!

Now that the test does not use run_command anymore (which I think is the only time the shell would ever be used), I thought that we wouldn't need to test exactly the same thing for each shell. However, if you don't think that's the case, or just feel safer to run the test on all shells that's fine with me, too.

Given your response, I don't think we need to run this test on every shell so I vote for leaving it bash-only.

@keewis
Copy link
Contributor Author

keewis commented Aug 19, 2022

great! @andersy005, I think this should be ready for merging then. Thanks a lot for the reviews and help!

Edit: and just after posting this I found a few things to improve...

jupyter_forward/core.py Outdated Show resolved Hide resolved
tests/test_core.py Outdated Show resolved Hide resolved
@andersy005 andersy005 merged commit cd1eefe into ncar-xdev:main Aug 23, 2022
@keewis keewis deleted the file-transfer branch August 24, 2022 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

issues with shell escaping in the batch script
3 participants