Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proxmox Inventory: Fix for #4431 #4432

Closed
wants to merge 1 commit into from
Closed

Proxmox Inventory: Fix for #4431 #4432

wants to merge 1 commit into from

Conversation

ralgar
Copy link

@ralgar ralgar commented Apr 1, 2022

SUMMARY

This commit fixes issue #4431, where the Proxmox inventory plugin would
randomly fail to acquire the node's IP address.

Edit: This is now a near complete re-write of _get_node_ip(), which resolves the Proxmox connection URL provided to the plugin, and attempts to match it against one of the PVE node's interfaces. If this process fails, no IP is returned.

This seems more logical to me, as the IP address is used to define the ansible_host var, which should be the same IP used to establish the connection, rather than an arbitrary and unpredictable IP.

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

plugins/inventory/proxmox.py

ADDITIONAL INFORMATION

See #4431

This commit fixes issue #4431, where the Proxmox inventory plugin would
randomly fail to acquire the node's IP address.
@ansibullbot
Copy link
Collaborator

@ansibullbot ansibullbot added bug This issue/PR relates to a bug inventory inventory plugin new_contributor Help guide this first time contributor plugins plugin (any type) small_patch Hopefully easy to review labels Apr 1, 2022
@@ -268,7 +268,9 @@ def _get_node_ip(self, node):
try:
Copy link
Contributor

@ilijamt ilijamt Apr 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your first contribution. I've added some comments.

I would actually change this to:
It's possible for a proxmox node to have a combination:

  • single public and/or multiple private
  • multiple public and multiple private
  • multiple private
        # first try to get the first available public IP if it exists
        for iface in ret:
            addr = iface.get('address', None)
            if addr != None and all([ipaddress.ip_address(addr).is_global]):
                return addr

        # get the first available private IP in the list if no public ones are available, 
        # excluding loopback addresses
        for iface in ret:
            addr = iface.get('address', None)
            if addr != None and all([not ipaddress.ip_address(addr).is_loopback, 
                                     ipaddress.ip_address(addr).is_private]):
                return addr

        # no IP address found
        return None

Copy link
Contributor

@ilijamt ilijamt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        # first try to get the first available public IP if it exists
        for iface in ret:
            addr = iface.get('address', None)
            if addr != None and all([ipaddress.ip_address(addr).is_global]):
                return addr

        # get the first available private IP in the list if no public ones are available, 
        # excluding loopback addresses
        for iface in ret:
            addr = iface.get('address', None)
            if addr != None and all([not ipaddress.ip_address(addr).is_loopback, 
                                     ipaddress.ip_address(addr).is_private]):
                return addr

        # no IP address found
        return None

@ansibullbot ansibullbot added the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR label Apr 1, 2022
@ralgar ralgar marked this pull request as draft April 1, 2022 18:37
@ansibullbot ansibullbot added the WIP Work in progress label Apr 1, 2022
@felixfontein
Copy link
Collaborator

Thanks for your contribution! Also please add a changelog fragment.

@felixfontein felixfontein added check-before-release PR will be looked at again shortly before release and merged if possible. backport-3 labels Apr 1, 2022
@ralgar
Copy link
Author

ralgar commented Apr 1, 2022

Thanks for your review. These are valid points, and something I had planned to consider in a separate issue/PR, since this requires a bit of discussion IMHO. However, I'm happy to work on it here, and I've marked this PR as a draft for now.

Here are my thoughts:

  • We can only return one address here, so we need to make sure that we return the "right" one rather than just the first one, which is completely arbitrary. See Proxmox inventory plugin frequently fails to grab node IP address #4431 for more info, this arbitrary response is what caused the bug in the first place.
  • Since this value fills out the 'ansible_host' variable, I feel it should ideally return the same address that Ansible is currently using to connect to the Proxmox node.
  • A public IP should probably be the last resort, as they would likely belong to a guest, rather than the PVE host. Neither SSH nor the Proxmox API should be available on a public address in a production environment. If a user needs these addresses, they could grab them from either the guest or Proxmox node's facts.
  • Perhaps we should only be returning the IP that Ansible is using to establish a connection, as any other IP would be inappropriate for the ansible_host variable?

That's where my head is at. I'm open to any and all counter-points.

For now, I propose something like this. This simply resolves the provided Proxmox URL, and then verifies that the Proxmox node has the same address.

    def _get_node_ip(self, node):
        ret = self._get_json("%s/api2/json/nodes/%s/network" % (self.proxmox_url, node))
        conn_addr = socket.gethostbyname(re.findall(r'://([\w\-\.]+)', self.proxmox_url)[0])

        # Try to verify the connection IP against an interface IP
        for iface in ret:
            addr = iface.get('address', None)
            if addr is not None and addr == conn_addr:
                return addr

        # no IP address found
        return None

Please let me know what you think. @ilijamt @felixfontein

Also, I will add that changelog fragment, thanks for the reminder.

@ansibullbot ansibullbot removed the small_patch Hopefully easy to review label Apr 4, 2022
@ilijamt
Copy link
Contributor

ilijamt commented Apr 5, 2022

Hi @ralgar sorry for the late reply.

But I see one problem with this approach. You may not get what you expect in this case.

$ cat /etc/hosts
127.0.0.1       localhost
::1             localhost ip6-localhost ip6-loopback
ff02::1         ip6-allnodes
ff02::2         ip6-allrouters

127.0.1.1       brain

$ cat test.py
import socket

hostName    = "brain"

ipAddress   = socket.gethostbyname(hostName)

print("IP address of the host name {} is: {}".format(hostName, ipAddress))

$ python test.py
IP address of the host name brain is: 127.0.1.1

There are valid cases for this, like setting the IP address of the hostname to resolve to the internal IP address of the network and not the public one. More details about this can be found in https://man7.org/linux/man-pages/man3/getaddrinfo.3.html

return None
addr = iface.get('address', None)
if addr is not None and addr == conn_addr:
return addr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem is also that this is a breaking change. It might make some setups stop that worked before.

Also the documentation on want_proxmox_nodes_ansible_host is pretty clear: this will just give you an IP, but not necessarily the one you expect. If you want your own definition, you could use compose (see the last example which shows how to pick another IP).

So maybe it's best to not modify this function (or even plugin) at all?

Copy link
Contributor

@ilijamt ilijamt Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can fix this just by just making sure we get an IP. And in the original code and the one I suggested above we always get the first publicly defined interface, if not available the first private one, and if not available none. Which is still in line with the original documentation. Or just fix it to return any address as suggested above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still a breaking change, if someone was using ansible_host | default(..., true) to replace None by some other value, they might suddenly get another value here. I don't use proxmox so no idea how serious this is, but generally we try to avoid breaking changes to make sure nobody's setup suddenly stops working.

Copy link
Author

@ralgar ralgar Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I see what you are saying here. Should I revert back to my original commit then, which simply fixes the original issue of the for loop not iterating correctly? Then we can leave the rest as-is.

    def _get_node_ip(self, node):
        ret = self._get_json("%s/api2/json/nodes/%s/network" % (self.proxmox_url, node))

        for iface in ret:
            try:
                return iface['address']
            except Exception:
                pass

        return None

Edit: Missed the last part of the convo. I've restored the original commit. Let me know what you decide is best, and I will either mark it ready for review or close it out entirely.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if the original loop is incorrect. The documentation for want_proxmox_nodes_ansible_host says "will use the first available interface". If you pass here, you are considering other interfaces than the first interface.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. It doesn't seem like a very useful function then, but I'll close out the PR and issue. Thanks for the input.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is indeed. That's why the want_proxmox_nodes_ansible_host option was added to disable this thing (see #2263). It's basically just there for backwards compatibility. (Maybe it's time to deprecate the default value true and switch it to false at the end of the deprecation cycle?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes I wasn't aware of all of this.

It certainly wouldn't hurt to deprecate the default the value. If nothin else, it would help to make the characteristics of this function more obvious to the user. I hadn't set that option exclusively, and as such hadn't even read that it may have unexpected behavior. I just saw the erratic behavior, did a basic issue search, and immediately went into bug-fixing mode.

Apologies for the waste of time. This was my first ever PR, so I may have gotten a little overzealous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue/PR relates to a bug has_issue inventory inventory plugin needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR new_contributor Help guide this first time contributor plugins plugin (any type) small_patch Hopefully easy to review WIP Work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants