Add multipart mime support and refactor #21

gabriel-samfira
Member

Refactor user-data handling

This change adds multipart mime userdata support and refactors the code to allow coreos-cloudinit to run multiple userdata parts as separate steps.

The new tests use testify, which also pulls in yaml.v3. These dependencies make up the bulk of the added lines.

In a future PR, it may be worth replacing the now-archived yaml library that coreos-cloudinit uses with one that is actively maintained.

We try to set the hostname and import SSH keys before anything else happens. If a later step fails but the SSH keys were imported, we can at least log in and debug what happened.

How to use

The old behavior is preserved. New support is added for multipart mime user-data, which means we may get multiple different parts that coreos-cloudinit will now run in the order they are defined in the multipart user-data.

A hostname set in userdata takes precedence over the metadata one, but if multiple #cloud-config parts set a hostname, only the first one is used.
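A minimal Go sketch of the hostname rule, for illustration only. The function name pickHostname and its signature are hypothetical and not the code in this PR; the truncation to the short form is based on the logs further down, where the metadata hostname test-fc.novalocal is applied as test-fc.

```go
package main

import (
	"fmt"
	"strings"
)

// pickHostname is a hypothetical helper. The first non-empty hostname found
// in the cloud-config parts wins; the metadata hostname is only a fallback.
func pickHostname(metadataHostname string, cloudConfigHostnames []string) string {
	chosen := metadataHostname
	for _, h := range cloudConfigHostnames {
		if h != "" {
			chosen = h
			break // later parts that set a hostname are ignored
		}
	}
	// Keep only the short form: "test-fc.novalocal" becomes "test-fc".
	short, _, _ := strings.Cut(chosen, ".")
	return short
}

func main() {
	// Two parts set a hostname; the first one is used.
	fmt.Println(pickHostname("test-fc.novalocal", []string{"", "example", "undercover"})) // example
	// No userdata hostname; fall back to metadata.
	fmt.Println(pickHostname("test-fc.novalocal", nil)) // test-fc
}
```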

Script and cloud-config parts are run. If there is a valid ignition part, we log the event and do nothing. Any other user-data part type that we don't support is labeled as "unknown" and logged.

Testing done

Added new tests for the new user data parser. Built an image with it and deployed virtual machines with all supported types of userdata.

Test Multipart Mime user-data

Userdata:

Content-Type: multipart/mixed; boundary="===============1598784645116016685=="
MIME-Version: 1.0

--===============1598784645116016685==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cloud-config"

ssh_authorized_keys:
  - ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEftQIHTRvUmyDCN7VGve4srz03Jmq6rPnqq+XMHMQUIL9c/b0l7B5tWfQvQecKyLte94HOPzAyMJlktWTVGQnY=

--===============1598784645116016685==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cloud-config"

hostname: "example"

--===============1598784645116016685==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cloud-config"

write_files:
-   encoding: b64
    content: NDI=
    path: /tmp/b64
    permissions: '0644'
-   encoding: base64
    content: NDI=
    path: /tmp/b64_1
    permissions: '0644'
-   encoding: gzip
    content: !!binary |
        H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip
    permissions: '0644'
-   encoding: gz
    content: !!binary |
        H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_1
    permissions: '0644'
-   encoding: gz+base64
    content: H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_base64
    permissions: '0644'
-   encoding: gzip+base64
    content: H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_base64_1
    permissions: '0644'
-   encoding: gz+b64
    content: H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_base64_2
    permissions: '0644'

--===============1598784645116016685==
Content-Type: text/x-shellscript; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="create_file.ps1"

#!/bin/sh
touch /tmp/coreos-cloudinit_multipart.txt

--===============1598784645116016685==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cloud-config"

#test_to_check_if_cloud_config_can_contain_a_comment

--===============1598784645116016685==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="some_text.txt"

This is just some random text.

--===============1598784645116016685==
Content-Type: application/json; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="ignition.txt"

{
  "ignitionVersion": 1,
  "ignition": {
    "version": "3.0.0"
  },
  "systemd": {
    "units": [{
      "name": "example.service",
      "enabled": true,
      "contents": "[Service]\nType=oneshot\nExecStart=/usr/bin/echo Hello World\n\n[Install]\nWantedBy=multi-user.target"
    }]
  }
}

--===============1598784645116016685==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="incognito_cloud_config.txt"

#cloud-config

hostname: "undercover"

--===============1598784645116016685==--

Result:

May 17 21:19:03 test-fc.novalocal bash[1196]: + OEMS=(aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean openstack)
May 17 21:19:03 test-fc.novalocal bash[1197]: + echo aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean openstack
May 17 21:19:03 test-fc.novalocal bash[1199]: + grep -q -x -F openstack
May 17 21:19:03 test-fc.novalocal systemd[1]: Starting oem-cloudinit.service - Run cloudinit...
May 17 21:19:03 test-fc.novalocal bash[1198]: + tr ' ' '
May 17 21:19:03 test-fc.novalocal bash[1198]: '
May 17 21:19:03 test-fc.novalocal bash[1208]: ++ '[' openstack = aws -o openstack = openstack ']'
May 17 21:19:03 test-fc.novalocal bash[1208]: ++ echo ec2-compat
May 17 21:19:03 test-fc.novalocal bash[1206]: + /usr/bin/coreos-cloudinit --oem=ec2-compat
May 17 21:19:03 test-fc.novalocal bash[1206]: 2023/05/17 21:19:03 fetching token...
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 error: token response status code 404
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Checking availability of "cloud-drive"
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Checking availability of "ec2-metadata-service"
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching meta-data from datasource of type "ec2-metadata-service"
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-keys. Attempt #1
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-key. Attempt #1
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Found SSH key for "local"
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching data from http://169.254.169.254/2009-04-04/meta-data/hostname. Attempt #1
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching data from http://169.254.169.254/2009-04-04/meta-data/local-ipv4. Attempt #1
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-ipv4. Attempt #1
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching user-data from datasource of type "ec2-metadata-service"
May 17 21:19:04 test-fc.novalocal bash[1206]: 2023/05/17 21:19:04 Fetching data from http://169.254.169.254/2009-04-04/user-data. Attempt #1
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Set hostname to example
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Authorized SSH keys for core user
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "cloud-config"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/etc/environment"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/etc/environment"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Updated /etc/environment
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd2.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "fleet.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "locksmithd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "cloud-config"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Updated /etc/environment
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd2.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "fleet.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "locksmithd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "cloud-config"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/tmp/b64"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/tmp/b64"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file /tmp/b64 to filesystem
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/tmp/b64_1"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/tmp/b64_1"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file /tmp/b64_1 to filesystem
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/tmp/gzip"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/tmp/gzip"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file /tmp/gzip to filesystem
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/tmp/gzip_1"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/tmp/gzip_1"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file /tmp/gzip_1 to filesystem
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/tmp/gzip_base64"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/tmp/gzip_base64"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file /tmp/gzip_base64 to filesystem
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/tmp/gzip_base64_1"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/tmp/gzip_base64_1"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file /tmp/gzip_base64_1 to filesystem
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/tmp/gzip_base64_2"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/tmp/gzip_base64_2"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file /tmp/gzip_base64_2 to filesystem
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Updated /etc/environment
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd2.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "fleet.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "locksmithd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "script"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/var/lib/coreos-cloudinit/scripts/3013682941"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/var/lib/coreos-cloudinit/scripts/3013682941"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Creating transient systemd unit 'coreos-cloudinit-3013682941.service'
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Writing file to "/var/lib/coreos-cloudinit/scripts/unit-name"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Wrote file to "/var/lib/coreos-cloudinit/scripts/unit-name"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "cloud-config"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Updated /etc/environment
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd2.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "fleet.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "locksmithd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "unknown"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 ignoring part of type unknown
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "ignition"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 ignoring part of type ignition
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Running part "cloud-config"
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Updated /etc/environment
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "etcd2.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "fleet.service" is unmasked
May 17 21:19:04 example bash[1206]: 2023/05/17 21:19:04 Ensuring runtime unit file "locksmithd.service" is unmasked
May 17 21:19:04 example systemd[1]: oem-cloudinit.service: Deactivated successfully.
May 17 21:19:04 example systemd[1]: Finished oem-cloudinit.service - Run cloudinit.

Test cloud-config

Userdata:

#cloud-config

write_files:
-   encoding: b64
    content: NDI=
    path: /tmp/b64
    permissions: '0644'
-   encoding: base64
    content: NDI=
    path: /tmp/b64_1
    permissions: '0644'
-   encoding: gzip
    content: !!binary |
        H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip
    permissions: '0644'
-   encoding: gz
    content: !!binary |
        H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_1
    permissions: '0644'
-   encoding: gz+base64
    content: H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_base64
    permissions: '0644'
-   encoding: gzip+base64
    content: H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_base64_1
    permissions: '0644'
-   encoding: gz+b64
    content: H4sIAGUfoFQC/zMxAgCIsCQyAgAAAA==
    path: /tmp/gzip_base64_2
    permissions: '0644'

Result:

May 17 21:37:40 test-fc.novalocal bash[1184]: + OEMS=(aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean openstack)
May 17 21:37:40 test-fc.novalocal systemd[1]: Starting oem-cloudinit.service - Run cloudinit...
May 17 21:37:40 test-fc.novalocal bash[1185]: + echo aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean openstack
May 17 21:37:40 test-fc.novalocal bash[1187]: + grep -q -x -F openstack
May 17 21:37:40 test-fc.novalocal bash[1186]: + tr ' ' '
May 17 21:37:40 test-fc.novalocal bash[1186]: '
May 17 21:37:40 test-fc.novalocal bash[1190]: ++ '[' openstack = aws -o openstack = openstack ']'
May 17 21:37:40 test-fc.novalocal bash[1190]: ++ echo ec2-compat
May 17 21:37:40 test-fc.novalocal bash[1189]: + /usr/bin/coreos-cloudinit --oem=ec2-compat
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 fetching token...
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 error: token response status code 404
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Checking availability of "cloud-drive"
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Checking availability of "ec2-metadata-service"
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching meta-data from datasource of type "ec2-metadata-service"
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-keys. Attempt #1
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-key. Attempt #1
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Found SSH key for "local"
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching data from http://169.254.169.254/2009-04-04/meta-data/hostname. Attempt #1
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching data from http://169.254.169.254/2009-04-04/meta-data/local-ipv4. Attempt #1
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-ipv4. Attempt #1
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching user-data from datasource of type "ec2-metadata-service"
May 17 21:37:40 test-fc.novalocal bash[1189]: 2023/05/17 21:37:40 Fetching data from http://169.254.169.254/2009-04-04/user-data. Attempt #1
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Set hostname to test-fc
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Authorized SSH keys for core user
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Running part "cloud-config"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/tmp/b64"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/tmp/b64"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file /tmp/b64 to filesystem
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/tmp/b64_1"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/tmp/b64_1"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file /tmp/b64_1 to filesystem
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/tmp/gzip"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/tmp/gzip"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file /tmp/gzip to filesystem
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/tmp/gzip_1"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/tmp/gzip_1"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file /tmp/gzip_1 to filesystem
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/tmp/gzip_base64"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/tmp/gzip_base64"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file /tmp/gzip_base64 to filesystem
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/tmp/gzip_base64_1"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/tmp/gzip_base64_1"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file /tmp/gzip_base64_1 to filesystem
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/tmp/gzip_base64_2"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/tmp/gzip_base64_2"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file /tmp/gzip_base64_2 to filesystem
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Writing file to "/etc/environment"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Wrote file to "/etc/environment"
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Updated /etc/environment
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Ensuring runtime unit file "etcd.service" is unmasked
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Ensuring runtime unit file "etcd2.service" is unmasked
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Ensuring runtime unit file "fleet.service" is unmasked
May 17 21:37:40 test-fc bash[1189]: 2023/05/17 21:37:40 Ensuring runtime unit file "locksmithd.service" is unmasked
May 17 21:37:40 test-fc systemd[1]: oem-cloudinit.service: Deactivated successfully.
May 17 21:37:40 test-fc systemd[1]: Finished oem-cloudinit.service - Run cloudinit.

Test script userdata

Userdata:

#!/bin/bash

echo "Creating file /tmp/coreos-cloudinit_test.txt"
touch /tmp/coreos-cloudinit_test.txt

Result:

May 17 21:43:09 test-fc.novalocal systemd[1]: Starting oem-cloudinit.service - Run cloudinit...
May 17 21:43:09 test-fc.novalocal bash[1195]: + OEMS=(aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean openstack)
May 17 21:43:09 test-fc.novalocal bash[1199]: + grep -q -x -F openstack
May 17 21:43:09 test-fc.novalocal bash[1197]: + echo aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean openstack
May 17 21:43:09 test-fc.novalocal bash[1198]: + tr ' ' '
May 17 21:43:09 test-fc.novalocal bash[1198]: '
May 17 21:43:09 test-fc.novalocal bash[1202]: ++ '[' openstack = aws -o openstack = openstack ']'
May 17 21:43:09 test-fc.novalocal bash[1202]: ++ echo ec2-compat
May 17 21:43:09 test-fc.novalocal bash[1200]: + /usr/bin/coreos-cloudinit --oem=ec2-compat
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 fetching token...
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 error: token response status code 404
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Checking availability of "cloud-drive"
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Checking availability of "ec2-metadata-service"
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching meta-data from datasource of type "ec2-metadata-service"
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-keys. Attempt #1
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-key. Attempt #1
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Found SSH key for "local"
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching data from http://169.254.169.254/2009-04-04/meta-data/hostname. Attempt #1
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching data from http://169.254.169.254/2009-04-04/meta-data/local-ipv4. Attempt #1
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching data from http://169.254.169.254/2009-04-04/meta-data/public-ipv4. Attempt #1
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching user-data from datasource of type "ec2-metadata-service"
May 17 21:43:09 test-fc.novalocal bash[1200]: 2023/05/17 21:43:09 Fetching data from http://169.254.169.254/2009-04-04/user-data. Attempt #1
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Set hostname to test-fc
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Authorized SSH keys for core user
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Running part "script"
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Writing file to "/var/lib/coreos-cloudinit/scripts/3921432685"
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Wrote file to "/var/lib/coreos-cloudinit/scripts/3921432685"
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Creating transient systemd unit 'coreos-cloudinit-3921432685.service'
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Writing file to "/var/lib/coreos-cloudinit/scripts/unit-name"
May 17 21:43:09 test-fc bash[1200]: 2023/05/17 21:43:09 Wrote file to "/var/lib/coreos-cloudinit/scripts/unit-name"
May 17 21:43:09 test-fc systemd[1]: oem-cloudinit.service: Deactivated successfully.
May 17 21:43:09 test-fc systemd[1]: Finished oem-cloudinit.service - Run cloudinit.

Test kops deployment

A k8s deployment was tested with additionalUserData set. The full InstanceGroup config is below:

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-05-17T22:00:07Z"
  generation: 1
  labels:
    kops.k8s.io/cluster: my-cluster.k8s.local
  name: control-plane-nova
spec:
  additionalUserData:
  - content: |
      write_files:
       - encoding: b64
         content: NDI=
         path: /tmp/coreos-cloudinit_test.txt
         permissions: '0644'
    name: ps_cloud_init.txt
    type: text/cloud-config
  image: flatcar-custom
  machineType: m1.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - nova

Result:

A multipart mime userdata was created for the controller. The k8s cluster came up successfully and /tmp/coreos-cloudinit_test.txt was created on the controller with the contents: 42, as expected.

  • Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, etc.

Related: flatcar/scripts#823

@gabriel-samfira gabriel-samfira force-pushed the add-multipart-mime-support branch 2 times, most recently from 6e27d1b to 5c9797d Compare May 17, 2023 22:25
This change adds multipart mime userdata support and refactors the code
to allow coreos-cloudinit to run multiple userdata parts as separate
steps.

Signed-off-by: Gabriel Adrian Samfira <[email protected]>
@hakman
hakman commented May 18, 2023

New support is added for multipart mime user-data, which means we may get multiple different parts that coreos-cloudinit will now run in the order they are defined in the multipart user-data.

I am not 100% sure, but I think we noticed that cloud-init runs the scripts ordered by filename: Content-Disposition: attachment; filename="".

@gabriel-samfira
Member Author

New support is added for multipart mime user-data, which means we may get multiple different parts that coreos-cloudinit will now run in the order they are defined in the multipart user-data.

I am not 100% sure, but I think we noticed that cloud-init runs the scripts ordered by filename: Content-Disposition: attachment; filename="".

Will check it out. Sorting by file name should be an easy change. Will look at the cloud-init code and align with existing expectations users might have.

@gabriel-samfira
Member Author

Seems that cloud-init walks the multipart message and calls the supplied callback:

https://github.com/canonical/cloud-init/blob/main/cloudinit/stages.py#L653

defined here:

https://github.com/canonical/cloud-init/blob/main/cloudinit/handlers/__init__.py#L257-L277

The mime message does not seem to be sorted in any way. It kind of makes sense. This way you can add scripts/cloud-configs in the order you want them to be executed by simply appending to an array before you serialize them in a MIME multipart message.

We can change this later if it turns out I did not understand the code correctly, and they are sorted by filename.

Signed-off-by: Gabriel Adrian Samfira <[email protected]>
@pothos
Member

pothos commented May 22, 2023

Thank you, let's also test this with the regular test suite. For that we need to create a scripts PR that uses your repo and the branch's last commit ID in src/third_party/coreos-overlay/coreos-base/coreos-cloudinit/coreos-cloudinit-9999.ebuild. That and the required Ignition changes to make it work. Then we can see if tests pass. We also need a test written in kola (not many tiny unit tests but more a larger integration test) - the mantle repo PR will result in a pushed container image that we need to refer to in sdk_container/.repo/manifests/mantle-container.
Edit: Forgot that flatcar/scripts#823 exists and the commit should be added there. Pointers for kola: cloud-config is tested in kola/tests/misc/cloudinit.go

Co-authored-by: Krzesimir Nowak <[email protected]>
@gabriel-samfira
Member Author

Thank you, let's also test this with the regular test suite. For that we need to create a scripts PR that uses your repo and the branch's last commit ID in src/third_party/coreos-overlay/coreos-base/coreos-cloudinit/coreos-cloudinit-9999.ebuild. That and the required Ignition changes to make it work. Then we can see if tests pass. We also need a test written in kola (not many tiny unit tests but more a larger integration test) - the mantle repo PR will result in a pushed container image that we need to refer to in sdk_container/.repo/manifests/mantle-container.

A potential kola integration test would have to pretty much run a VM with each of the user-data combinations in the testdata folder in this PR, making sure that old behavior is preserved and new behavior works as expected. Will have a look at all the moving parts in the morning.

Thanks for the review folks!

  * Use textproto to try and read multipart headers
  * No need for generics in parseMimeHeader
  * Remove quadratic

Signed-off-by: Gabriel Adrian Samfira <[email protected]>
…ra/coreos-cloudinit into add-multipart-mime-support
@gabriel-samfira
Member Author

Changes made. Will update the scripts PR to include some integration tests.

@gabriel-samfira
Member Author

Proposed flatcar/mantle#437, but needs approval before workflows run.

gabriel-samfira and others added 2 commits May 23, 2023 15:02
Co-authored-by: Krzesimir Nowak <[email protected]>
Co-authored-by: Krzesimir Nowak <[email protected]>
Co-authored-by: Krzesimir Nowak <[email protected]>
Signed-off-by: Gabriel Adrian Samfira <[email protected]>
@gabriel-samfira gabriel-samfira force-pushed the add-multipart-mime-support branch from 96492d1 to 52d63b4 Compare June 2, 2023 11:56
@gabriel-samfira gabriel-samfira merged commit eb49a8f into flatcar:flatcar-master Jun 8, 2023
Comment on lines +242 to +246
	hostname := determineHostname(metadata, udata)
	if err := initialize.ApplyHostname(hostname); err != nil {
		log.Printf("Failed to set hostname: %v", err)
		mustStop = true
	}
Member

This now overwrites any static hostname with the meta-data hostname. Can we remove these lines again?

Member Author

That was kind of the point of this. If coreos-cloudinit is used, it is responsible for setting the hostname. The hostname is fetched from userdata or meta-data and the short form hostname is set. Userdata has precedence.

Member

In particular, when Ignition ran and coreos-cloudinit gets triggered through the config drive mechanism, this suddenly overwrites the hostname Ignition set up in /etc/hostname.
We could try to skip execution of coreos-cloudinit when Ignition ran but still I wonder if this here wouldn't also cause problems when coreos-cloudinit or some custom image setup was writing to /etc/hostname and now this here overwrites it.

Member Author

IMO, we should use coreos-cloudinit in scenarios where it's needed (like OpenStack or wherever we need multipart mime/cloud-config). Otherwise it should be disabled.

If, however, it's enabled, we should assume it will overwrite some things set by afterburn.

If custom image setup is needed, that should either be done with coreos-cloudinit via userdata, or it should run after coreos-cloudinit.

Otherwise there is no sane way to have coreos-cloudinit run and be useful. Perhaps this is something that needs to be documented?

Member

@pothos pothos Jun 20, 2023

Previously the process terminated with `Detected an Ignition config. Exiting...`, but maybe that is something we had better enforce from the service unit.

Member Author

Then it should probably not run if the userdata is ignition. Is there some other trigger that enables it, besides non-ignition userdata?

If yes, we should probably add flags to disable various bits of it, like setting the hostname and SSH keys (for example).

Member

I think flags are a good idea. We also have this for afterburn, and it would allow one to, e.g., opt out of metadata if it's covered by afterburn for the platform. We should also tweak the units so that it does not run at all on certain platforms: e.g., on DigitalOcean it can run twice, once through the regular oem-cloudinit service and once through the configdrive, and it makes sense to disable it for the config drive (this is the case we ran into).

Member Author

Will look into adding those flags as soon as I can.
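
As a rough idea of what such opt-out flags could look like, here is a minimal sketch using Go's standard `flag` package. The flag names `-skip-hostname` and `-skip-ssh-keys`, the `options` type, and `parseOptions` are all hypothetical; none of them exist in coreos-cloudinit as far as this thread shows.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds hypothetical opt-out switches for this sketch only.
type options struct {
	skipHostname bool
	skipSSHKeys  bool
}

// parseOptions parses the hypothetical flags from args.
// A dedicated FlagSet keeps the sketch independent of global flag state.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("cloudinit-sketch", flag.ContinueOnError)
	fs.BoolVar(&o.skipHostname, "skip-hostname", false, "do not set the hostname")
	fs.BoolVar(&o.skipSSHKeys, "skip-ssh-keys", false, "do not import SSH keys")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions([]string{"-skip-hostname"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	if !o.skipHostname {
		fmt.Println("would set hostname")
	}
	if !o.skipSSHKeys {
		fmt.Println("would import SSH keys")
	}
}
```

Defaulting both flags to `false` keeps the behavior introduced by this PR unless a unit file explicitly opts out, which matches the idea of tweaking the units per platform.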

os.Exit(1)
}
mergedKeys := mergeSSHKeysFromSources(metadata, udata)
if err := initialize.ApplyCoreUserSSHKeys(mergedKeys, env); err != nil {
Member

Is this also a new addition to the code path? I noticed that this conflicts with afterburn writing the keys as well, and this race could maybe lead to a broken setup.

Member Author

afterburn runs in initrd. This runs during system service startup. At most, I think this can add duplicate keys, but that should not break anything. This is not new, just moved around.

Member

[email protected] is running at the same time but in this execution log here it was faster than cloudinit:

Jun 20 08:34:24.519522 coreos-cloudinit[1171]: 2023/06/20 08:34:24 Checking availability of "cloud-drive"
Jun 20 08:34:24.519522 coreos-cloudinit[1171]: 2023/06/20 08:34:24 Fetching meta-data from datasource of type "cloud-drive"
Jun 20 08:34:24.519522 coreos-cloudinit[1171]: 2023/06/20 08:34:24 Attempting to read from "/media/configdrive/openstack/latest/meta_data.json"
Jun 20 08:34:24.519522 coreos-cloudinit[1171]: 2023/06/20 08:34:24 Fetching user-data from datasource of type "cloud-drive"
Jun 20 08:34:24.519522 coreos-cloudinit[1171]: 2023/06/20 08:34:24 Attempting to read from "/media/configdrive/openstack/latest/user_data"
Jun 20 08:34:24.524397 update-ssh-keys[1183]: Updated "/home/core/.ssh/authorized_keys"
Jun 20 08:34:24.521093 systemd[1]: Finished [email protected] - Flatcar Metadata Agent (SSH Keys).

MichaelEischer added a commit to MichaelEischer/coreos-cloudinit that referenced this pull request Jan 15, 2024
The refactoring in flatcar#21
caused hostnames to be set unconditionally, unlike the old behavior
of only setting the hostname if it is not empty.

When running coreos-cloudinit with datasources that do not provide
metadata such as the `file` datasource, the refactored code caused the
hostname to always be reset to `localhost`. This leads to various
problems like preventing k8s nodes from joining their cluster.

This change restores the old behavior by not applying empty hostnames.

Fixes flatcar/Flatcar#1262
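
The guard described in this commit message can be sketched roughly as below. This is not the code from the referenced commit: `applyHostnameIfSet` is a hypothetical wrapper, and the `apply` callback merely stands in for whatever function actually sets the hostname.

```go
package main

import "fmt"

// applyHostnameIfSet is a hypothetical wrapper for illustration.
// It skips empty hostnames so that an existing hostname is left untouched,
// and reports whether apply was actually invoked.
func applyHostnameIfSet(hostname string, apply func(string) error) (bool, error) {
	if hostname == "" {
		// No datasource provided a hostname: do nothing.
		return false, nil
	}
	return true, apply(hostname)
}

func main() {
	// Stand-in setter that only prints; a real one would change system state.
	setter := func(h string) error {
		fmt.Println("setting hostname to", h)
		return nil
	}
	applied, err := applyHostnameIfSet("", setter)
	fmt.Println(applied, err)
	applied, err = applyHostnameIfSet("node1", setter)
	fmt.Println(applied, err)
}
```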
4 participants