SSHHook: Using correct hostname for host_key when using non-default ssh port #15964

Merged
merged 3 commits into from
Jul 3, 2021

Conversation

freget
Contributor

@freget freget commented May 20, 2021

When using the SSHHook to connect to an SSH server on a non-default port, the host_key setting was not added to the list of known hosts under the correct hostname. In more detail:

from airflow.providers.ssh.hooks.ssh import SSHHook
import paramiko
from base64 import decodebytes

hook = SSHHook(remote_host="1.2.3.4", port=1234, username="user")
# Usually, host_key would come from the connection_extras, for the sake of this example we set the value manually:
host_key = "abc" # Some public key
hook.host_key = paramiko.RSAKey(data=decodebytes(host_key.encode("utf-8")))
hook.no_host_key_check = False

conn = hook.get_conn()

yielded the exception

paramiko.ssh_exception.SSHException: Server '[1.2.3.4]:1234' not found in known_hosts
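
For context, here is a minimal sketch of the lookup name involved. It assumes the OpenSSH known_hosts convention, which matches the name shown in the exception above: a server on port 22 is looked up under its bare host name, and a server on any other port under "[host]:port". The helper known_hosts_name is hypothetical and is not part of Paramiko or Airflow.

```python
SSH_PORT = 22  # default SSH port; assumed to match paramiko.config.SSH_PORT


def known_hosts_name(hostname: str, port: int = SSH_PORT) -> str:
    """Return the name a known_hosts lookup would use for this server.

    Hypothetical helper for illustration only.
    """
    if port == SSH_PORT:
        # Default port: the entry is stored under the bare host name.
        return hostname
    # Non-default port: OpenSSH-style entries use "[host]:port".
    return f"[{hostname}]:{port}"


# The host key in the report was registered under the bare host name,
# while the lookup for port 1234 used the bracketed form.
registered_as = known_hosts_name("1.2.3.4")
looked_up_as = known_hosts_name("1.2.3.4", 1234)
```

Because the two names differ, a key stored under the bare host name is never found when connecting to port 1234.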

closes: #15963

@boring-cyborg

boring-cyborg bot commented May 20, 2021

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst).
Here are some useful points:

  • Pay attention to the quality of your code (flake8, pylint and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

Comment on lines 221 to 222
remote_host = f"[{self.remote_host}]:{self.port}" if self.port != SSH_PORT else self.remote_host
client_host_keys.add(remote_host, 'ssh-rsa', self.host_key)
Member

As I mentioned in the linked issue, it may be easier to do it this way:

client_host_keys.add(self.remote_host, 'ssh-rsa', self.host_key)
if self.port:
    client_host_keys.add(f"{self.remote_host}:{self.port}", 'ssh-rsa', self.host_key)

Contributor Author

I'm not convinced it is a good idea to add the host key without the port unconditionally. At least in theory, one server could be listening on the default port and another one on self.port, and those servers could have different public keys.

However, an explicit branch might be clearer than the inline if. I will adjust the PR accordingly.
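
The concern about two servers sharing one host can be illustrated with a small sketch. It uses a plain dict as a stand-in for the known-hosts store, and every function name here is hypothetical. The key strings are placeholders, not real keys, and the "[host]:port" naming is assumed from the OpenSSH convention.

```python
SSH_PORT = 22  # default SSH port


def lookup_name(host, port):
    # Name under which a server is looked up (OpenSSH-style convention).
    return host if port == SSH_PORT else f"[{host}]:{port}"


def add_unconditionally(known, host, port, key):
    # Always register the bare host name, plus the port-specific name.
    known.setdefault(host, set()).add(key)
    known.setdefault(lookup_name(host, port), set()).add(key)


def add_by_port(known, host, port, key):
    # Register only the name that matches the port actually used.
    known.setdefault(lookup_name(host, port), set()).add(key)


def is_trusted(known, host, port, key):
    return key in known.get(lookup_name(host, port), set())


# Key belonging to the server on port 1234 (placeholder value).
key_1234 = "key-of-server-on-port-1234"

loose = {}
add_unconditionally(loose, "1.2.3.4", 1234, key_1234)
# The port-1234 key is now also accepted for the server on port 22.
trusted_on_default_port_loose = is_trusted(loose, "1.2.3.4", SSH_PORT, key_1234)

strict = {}
add_by_port(strict, "1.2.3.4", 1234, key_1234)
# With port-specific registration the key is accepted on port 1234 only.
trusted_on_default_port_strict = is_trusted(strict, "1.2.3.4", SSH_PORT, key_1234)
```

In this sketch the unconditional variant makes the key of the port-1234 server valid for a different server on port 22, which is the situation described above.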

Member

@uranusjr uranusjr May 21, 2021

How about this?

if self.port is None or self.port == SSH_PORT:
    client_host_keys.add(self.remote_host, 'ssh-rsa', self.host_key)
if self.port:
    client_host_keys.add(f"{self.remote_host}:{self.port}", 'ssh-rsa', self.host_key)

So SSH servers exposed on the default port can have both registered.

Contributor Author

In line 196 the port is set to SSH_PORT if it is None:

    self.port = self.port or SSH_PORT

Hence, the check self.port is None is always false. Your suggestion results in adding both self.remote_host and f"[{self.remote_host}]:{self.port}" in most cases. Adding the latter for the standard port is not required according to the OpenSSH documentation (https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#~/.ssh/known_hosts). Because Paramiko expects host names in the OpenSSH format, we would just be adding a redundant, never-used entry.
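
To make the argument concrete, here is a rough sketch of which names each variant would register once the port has been normalized. The function names are hypothetical, the bracketed "[host]:port" form is assumed for non-bare entries, and this is not the code merged in this PR.

```python
SSH_PORT = 22  # default SSH port


def normalize_port(port):
    # Mirrors `self.port = self.port or SSH_PORT`: None becomes 22.
    return port or SSH_PORT


def names_with_double_registration(host, port):
    # Variant that registers both forms when the default port is used.
    port = normalize_port(port)
    names = []
    if port is None or port == SSH_PORT:  # `port is None` can never be true here
        names.append(host)
    if port:
        names.append(f"[{host}]:{port}")
    return names


def names_with_explicit_branch(host, port):
    # Variant with one explicit branch: exactly one name per connection.
    port = normalize_port(port)
    if port == SSH_PORT:
        return [host]
    return [f"[{host}]:{port}"]


default_double = names_with_double_registration("1.2.3.4", None)
default_single = names_with_explicit_branch("1.2.3.4", None)
custom_double = names_with_double_registration("1.2.3.4", 1234)
custom_single = names_with_explicit_branch("1.2.3.4", 1234)
```

For the default port, the first variant yields an extra "[1.2.3.4]:22" entry, which a lookup following the OpenSSH format would not consult. For port 1234 both variants register the same single name.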

Member

I actually like how it is now. I am also all for adding the host without a port when the default port is used; this is what OpenSSH usually does.

@kaxil
Member

kaxil commented Jun 19, 2021

ping @uranusjr @potiuk

@freget freget force-pushed the bugfix/ssh_hook_host_key branch from ce6f985 to 86159cd Compare June 20, 2021 16:13
@freget freget force-pushed the bugfix/ssh_hook_host_key branch from 86159cd to 124ef38 Compare June 28, 2021 19:54
Improved formatting
@potiuk potiuk merged commit a2dc01b into apache:main Jul 3, 2021
@freget freget deleted the bugfix/ssh_hook_host_key branch July 3, 2021 14:56
Successfully merging this pull request may close these issues.

SSHHook: host_key is not added properly when using non-default port
4 participants