Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to provision a host (with Libvirt/KVM) based on a qcow2 image without explicitly setting the image_id parameter. #1160

Closed
MLotton opened this issue Mar 4, 2021 · 9 comments · Fixed by #1215

Comments

@MLotton
Copy link

MLotton commented Mar 4, 2021

SUMMARY

I wanted to provision a VM with a Libvirt/KVM compute ressource based on an image provision method with an existing qcow2 image. I encountered an issue where the host was created but the image provisioning wasn't correctly done by using the module host.
The result is not the same by using the hammer CLI or the foreman/satellite web UI (with identical input parameters).

Looking further and comparing with what was done by the Satellite Web UI and Hammer CLI, I found that the host module (python code) doesn't send the image_id to the API call (Note: I didn't investigate a lot, so it is a guess). I tried again by adding the parameter image_id directly in the module parameters (inside compute_attributes) and then it worked well.

I guess that the Ansible module(s) / python code must perform an additional step to resolve the image_id based on the image name and send it to the API call (as it will be more user friendly if the user only has to put the image name as an input parameter).

ISSUE TYPE
  • Bug Report
ANSIBLE VERSION
ansible 2.9.13
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/max/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /home/max/deploy/venvs_python/ansible_epdb_python_2/lib/python2.7/site-packages/ansible
  executable location = /home/max/deploy/venvs_python/ansible_epdb_python_2/bin/ansible
  python version = 2.7.5 (default, Aug 13 2020, 02:51:10) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
COLLECTION VERSION
redhat.satellite:2.0.1

Repo link: https://github.com/RedHatSatellite/satellite-ansible-collection
This was not tested with the community foreman collection version.

KATELLO/FOREMAN VERSION
tfm-rubygem-katello-3.16.0.13-1.el7sat.noarch
foreman-2.1.2.21-1.el7sat.noarch
STEPS TO REPRODUCE
---
- hosts: localhost
  gather_facts: False
  tasks:

    - include_vars: input_variables/vault_secrets.yml

    - name: "Create Satellite host"
      theforeman.foreman.host:
        name: "example-host.example.com"
        architecture: "x86_64"
        build: True
        comment: "Test collection"
        domain: "example.com"
        interfaces_attributes:
          - subnet: "Example subnet provisioning"
            identifier: eth0
            type: interface
            domain: "example.com"
            managed: True
            primary: True
            provision: True
        location: "example_location"
        operatingsystem: "CentOS 7 Latest"
        organization: "example_organization"
        root_pass: "example_password"
        server_url: "https://satellite-host"
        username: "example_user"
        password: "{{ satellite_user_password }}"
        validate_certs: False
        state: present
        provision_method: image
        image: "my-vm-qcow2-image"
        compute_resource: "Example Libvirt ressource"
        compute_profile: "example-compute-profile"
EXPECTED RESULTS

The host is created and well provisioned based on a qcow2 image.

The boot device order is based only on the first disk.
Note: Satellite will also create a virtual CDROM device for user data / cloud-init (like this example, below you will find the definition of this device for libvirt/kvm in a XML format):

<disk type="file" device="cdrom">
  <driver name="qemu" type="raw"/>
  <source file="/var/lib/libvirt/images/example-host.example.com-cloud-init.iso"/>
  <backingStore/>
  <target dev="hdc" bus="ide"/>
  <readonly/>
  <alias name="ide0-1-0"/>
  <address type="drive" controller="0" bus="1" target="0" unit="0"/>
</disk>
ACTUAL RESULTS

The host is created but the VM is "blank" (no OS installed, etc).

The boot device order is first "network" and then "disk".
There is no virtual CDROM device created.

WORKAROUND
---
- hosts: localhost
  gather_facts: False
  tasks:

    - include_vars: input_variables/vault_secrets.yml

    - name: "Create Satellite host"
      theforeman.foreman.host:
        name: "example-host.example.com"
        architecture: "x86_64"
        build: True
        comment: "Test collection"
        domain: "example.com"
        interfaces_attributes:
          - subnet: "Example subnet provisioning"
            type: interface
            domain: "example.com"
            managed: True
            primary: True
            provision: True
        location: "example_location"
        operatingsystem: "CentOS 7 Latest"
        organization: "example_organization"
        root_pass: "example_password"
        server_url: "https://satellite-host"
        username: "example_user"
        password: "{{ satellite_user_password }}"
        validate_certs: False
        state: present
        provision_method: image
        image: "my-vm-qcow2-image"
        compute_resource: "Example Libvirt ressource"
        compute_profile: "example-compute-profile"
        compute_attributes:
          image_id: "/var/lib/libvirt/images/my-vm-qcow2-image.qcow2"

By adding:

compute_attributes:
  image_id: "/var/lib/libvirt/images/my-vm-qcow2-image.qcow2"

It works as expected.

@evgeni
Copy link
Member

evgeni commented Mar 5, 2021

To compare, can you post the hammer call you used?

@MLotton
Copy link
Author

MLotton commented Mar 5, 2021

Yes, here is an example with a Hammer CLI command.

Host creation with hammer CLI
hammer host create \
--name example-host.example.com \
--architecture x86_64 \
--managed true \
--organization bouygues_telecom \
--location example_location \
--domain "example.com" \
--interface "managed=true,primary=true,provision=true,compute_network=Example subnet provisioning" \
--operatingsystem "CentOS 7 Latest" \
--partition-table "Kickstart default" \
--root-password "example_password" \
--provision-method image \
--image "my-vm-qcow2-image" \
--compute-resource "Example Libvirt ressource" \
--compute-profile "example-compute-profile"
API results - examples

To add more details, here is what I can find from /var/log/foreman/production.log

With ansible modules:

Started POST "/api/hosts" for xx.xx.xx.xx at 2021-03-05 13:52:23 +0100
Processing by Api::V2::HostsController#create as JSON
Parameters: {
"organization_id"=>1, 
"host"=>
{
"operatingsystem_id"=>20, 
"root_pass"=>"[FILTERED]", 
"managed"=>true, 
"name"=>"example-host.example.com", 
"interfaces_attributes"=>[{"managed"=>true, "subnet_id"=>35, "type"=>"interface", "primar
y"=>true, "virtual"=>false, "identifier"=>"eth0", "provision"=>true, "domain_id"=>1}], 
"provision_method"=>"image", 
"comment"=>"Test collection", 
"compute_profile_id"=>7, 
"organization_id"=>1, 
"image_id"=>3, 
"architecture_id"=>1, 
"build"=>true, 
"location_id"=>2, 
"domain_id"=>1, "
compute_resource_id"=>3
}, 
"location_id"=>2, 
"apiv"=>"v2"
}

image_id is well defined, so the issue is not linked to that.

With Hammer CLI:

Started POST "/api/hosts" for xx.xx.xx.xx at 2021-03-05 13:57:19 +0100
Processing by Api::V2::HostsController#create as JSON
Parameters: {
"location_id"=>2, 
"organization_id"=>1, 
"host"=>
{
"name"=>"example-host.example.com", 
"location_id"=>2, 
"organization_id"=>1, 
"architecture_id"=>1, 
"domain_id"=>1, 
"operatingsystem_id"=>20, 
"ptable_id"=>116, 
"compute_resource_id"=>3, 
"image_id"=>3, 
"provision_method"=>"image", 
"managed"=>true, 
"compute_profile_id"=>7, 
"compute_attributes"=>
{
"volumes_attributes"=>{}, 
"image_id"=>"/var/lib/libvirt/images/my-vm-qcow2-image.qcow2"
}, 
"content_facet_attributes"=>{}, 
"subscription_facet_attributes"=>{}, 
"build"=>true, 
"overwrite"=>true, 
"interfaces_attributes"=>[{"managed"=>"true", "primary"=>"true", "provision"=>"true", "compute_attributes"=>{"network"=>"Example subnet provisioning"}}], 
"root_pass"=>"[FILTERED]"
}, 
"apiv"=>"v2"
}

Both have one identical image_id parameter at the root of host. But there is another parameter also called image_id and defined in compute_attributes for hammer CLI.
I think this is the the cause of the difference in behavior because of the workaround I detailed in this issue. It might require further investigations. From the integrated Satellite API documentation nothing seems to be mentioned...

@evgeni
Copy link
Member

evgeni commented Mar 16, 2021

@tbrisker @lzap do you know if Foreman itself should do the translation of the Hosts's "image" parameter to a CR image? It feels wrong to have all API clients (hammer, FAM, etc) have to carry the same "find image and set another parameter" code around.

@lzap
Copy link
Member

lzap commented Mar 18, 2021

By translation do you mean i18n translation? I think you mean more "look up"?

Foreman keeps images in its database, this table has image_id (SQL INT), uuid and name fields (varchar) which are used by individual compute resources. Usually, both UI and API should only need to provide image_id which Foreman than uses to look other attributes up (uuid, name).

So I think this is expected, when compute resource attribute is passed (uuid or name) Foreman will merge that and pass it into the VM creation request so it actually works and this is probably a bug, but the correct way is to really find image_id first and then providing that in the request.

In general, compute attributes are implemented in a weird way and @ezr-ondrej actually started an effort to improve so poking him to take this into consideration. Maybe we should create some allow list of compute resource attributes which are allowed to be passed in from UI / CLI.

This also raises a concern about security, Rails have a mechanism to protect parameters from being sent into ActiveRecord, we do however appear to merge all attributes and pass them into fog. At least how I understand this, it looks like an attacker can actually pass image path directly escaping organization boundary and performing a DOS by overwriting an existing image owned by a different organization.

@evgeni
Copy link
Member

evgeni commented Mar 18, 2021

Ah, sorry. "translate" was maybe a bit wrong selection of words on my side.

If you look at the API requests above, Ansible sends:

"image_id"=>3, 

While hammer sends

"image_id"=>3, 
"compute_attributes"=>
{
"image_id"=>"/var/lib/libvirt/images/my-vm-qcow2-image.qcow2"
}, 

And it seems without the second entry, Libvirt just doesn't know what to do.

My naïve assumption was that when I pass image_id: 3 to the API, it will forward the right data to the CR, but it seems that's not the case.

@lzap
Copy link
Member

lzap commented Mar 22, 2021

I would also expect that our API will lookup the image "UUID / path" or whatever we call as image_id. That smells like a bug worth fixing.

@ezr-ondrej
Copy link
Member

Image selection is particulary weird as there is a select it in Host form where I select the method, but that input is actually send as host[compute_attributes][image_id] and in code it's being rendered in the Virtual Machine tab and then moved by JS. So for user it looks fine, but it's created a mess.
Hammer has solved this to mimic this by looking the image_id attribute up and passing it to API as compute_attributes: { image_id: ... } only this will make sure the image gets picked up during orchestration. The image_id on host gets set only after the provisioning has been successful.

This is IMHO very wrong and I might prioritize it in the ongoing compute resources cleanup, but I can't promise any delivery upon that, as I don't fully understand the process for all the compute resources.

This being said, we should probably have fix here sooner. By doing eigher

  1. do what hammer does and just lookup the image id and pass it to the compute_attributes.
  2. create temporary workaround in API to handle the image_id and do the lookup on Foreman side.

@wbclark wbclark added the bug label Mar 30, 2021
@wbclark
Copy link
Contributor

wbclark commented Mar 30, 2021

Following the discussion, ideally this would be fixed on the Foreman side but that is not imminently expected, so we could provide a workaround here similar to what hammer is doing. Therefore I'm labeling this as a bug, although it could also be 'depends on external project'

evgeni added a commit to evgeni/foreman-ansible-modules that referenced this issue May 3, 2021
Yes, the field in the compute attributes is really called `image_id`.
Yes, it really expects the *UUID* of the image there.

Fixes: theforeman#1160
@evgeni
Copy link
Member

evgeni commented May 4, 2021

Thanks! I've opened https://projects.theforeman.org/issues/32501 to track the API side of this, and #1215 has a workaround for now.

evgeni added a commit that referenced this issue May 4, 2021
Yes, the field in the compute attributes is really called `image_id`.
Yes, it really expects the *UUID* of the image there.

Fixes: #1160
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants