FlatCar Beta 3913.1.0 with systemd 255 enables DHCP rapid commit by default #1438

Open
daMupfel opened this issue Apr 25, 2024 · 7 comments
Labels
kind/bug Something isn't working

Comments

@daMupfel

daMupfel commented Apr 25, 2024

Description

The new FlatCar Beta release 3913.1.0 updated systemd to version 255. This version adds support for DHCPv4 rapid commit (the RapidCommit= option), which seems to be enabled by default:

RapidCommit=

    Takes a boolean. The DHCPv4 client can obtain configuration parameters from a DHCPv4 server through a rapid two-message exchange (discover and ack). When the rapid commit option is set by both the DHCPv4 client and the DHCPv4 server, the two-message exchange is used. Otherwise, the four-message exchange (discover, offer, request, and ack) is used. The two-message exchange provides faster client configuration. See [RFC 4039](https://tools.ietf.org/html/rfc4039) for details. Defaults to true when Anonymize=no and neither AllowList= nor DenyList= is specified, and false otherwise.

    Added in version 255.
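
To make the quoted rule concrete, here is a minimal Python sketch of the default and of the resulting message exchange. It models only what the documentation excerpt states; the function names are hypothetical and are not part of systemd. It also assumes a server that implements the option correctly, so it does not capture the faulty server behavior described in this issue.

```python
from typing import List, Sequence


def rapid_commit_default(anonymize: bool,
                         allow_list: Sequence[str],
                         deny_list: Sequence[str]) -> bool:
    """Default of RapidCommit= as described in the quoted documentation.

    True only when Anonymize=no and neither AllowList= nor DenyList= is set.
    """
    return not anonymize and not allow_list and not deny_list


def dhcpv4_exchange(client_rapid_commit: bool,
                    server_rapid_commit: bool) -> List[str]:
    """Messages exchanged, assuming both sides follow the documented behavior."""
    if client_rapid_commit and server_rapid_commit:
        # Two-message exchange: both sides set the rapid commit option.
        return ["discover", "ack"]
    # Otherwise the classic four-message exchange is used.
    return ["discover", "offer", "request", "ack"]
```

With a stock configuration (Anonymize=no, no allow or deny list) the default evaluates to true, which is why the client starts offering rapid commit after the update to systemd 255.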

Our cloud provider (CloudSigma) seems to have a faulty implementation of DHCPv4 rapid commit, which means that we are no longer getting an IP address.

This can be fixed (for existing servers) by copying the default config from /usr/lib/systemd/network/zz-default.network into a custom config file and adapting the DHCPv4 section as follows:

[DHCPv4]
RoutesToDNS=false
RapidCommit=false
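
For anyone who wants to script this workaround, the edit can be sketched in Python as a pure text transformation. This is only a sketch under assumptions: `set_network_option` is a hypothetical helper, not an existing tool, and it handles only the first matching section. Reading the default file and writing the result to a custom config location (for example somewhere under /etc/systemd/network) is left to the caller.

```python
def set_network_option(config_text: str, section: str, key: str, value: str) -> str:
    """Return config_text with key=value set in [section].

    Hypothetical helper: replaces the key if it already exists in the
    section, adds it right after the section header if it does not,
    and appends a new section if the section is missing entirely.
    """
    lines = config_text.splitlines()
    header = f"[{section}]"
    setting = f"{key}={value}"
    header_idx = None
    in_section = False

    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section starts; remember where the target section begins.
            in_section = stripped == header
            if in_section and header_idx is None:
                header_idx = i
        elif in_section and stripped.split("=", 1)[0].strip() == key:
            # Key already present in the target section: overwrite it.
            lines[i] = setting
            return "\n".join(lines) + "\n"

    if header_idx is not None:
        lines.insert(header_idx + 1, setting)
    else:
        lines += ["", header, setting]
    return "\n".join(lines) + "\n"
```

Called as `set_network_option(text, "DHCPv4", "RapidCommit", "false")` on the contents of the default file, this yields a config whose DHCPv4 section disables rapid commit while leaving all other lines untouched.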

Impact

No IP address is assigned. Because the CloudInit process for CloudSigma requires an assigned lease, the whole setup no longer works.

Environment and steps to reproduce

  1. Upload current beta FlatCar CloudSigma vendor image to CloudSigma
  2. Create a new machine
  3. No public IP is assigned and the CloudInit process never runs

Expected behavior

The server is correctly set up with an IP address and the CloudInit config.

Additional information

We are also in discussions with CloudSigma to get their DHCP implementation fixed. We are not sure when or how this will be resolved, though.

This is not really a bug on Flatcar's side but rather a breaking change for us, because the network config behaves differently with the new version.

The question is how this could be fixed (if you are open to do it on the FlatCar side). I currently see the following options:

  • Update the default network config to disable rapid commit
  • Add a custom network config file to the vendored CloudSigma image

I would like to get some feedback on this and can probably provide a PR if you would be fine with one of the proposed solutions :).

@jepio
Member

jepio commented Apr 25, 2024

    Add a custom network config file to the vendored CloudSigma image

This would definitely be a good idea if the default does not cause widespread problems for other platforms.

@t-lo
Member

t-lo commented Apr 25, 2024

@jepio if added only to oem-cloudsigma it shouldn't affect other platforms, should it? And it potentially affects all CloudSigma deployments the way I read the summary.

@daMupfel I would argue that implementing this should be done as an OEM sysext so the change is also distributed to existing nodes when these update (@pothos please keep me honest).
Using an OEM sysext would also allow changing the config with future updates if required. As sysexts cover /usr, the config should go to /usr/lib/systemd/network/.
This is slightly (but only slightly) more complicated than just dropping a config file to the oem-cloudsigma provider. The biggest challenge is to introduce OEM sysext to the cloudsigma image as this image is currently not using OEM sysexts afaict. But that shouldn't keep you from working on a PR, OEM sysexts are used for most other images. The concept should be easily portable to cloudsigma.

@pothos
Member

pothos commented Apr 25, 2024

I think the OEM sysext might get loaded too late? For most clouds the small network config files are part of the base image because they need to be in bootengine and in init.

@t-lo
Member

t-lo commented Apr 25, 2024

Hmmm, good point. Re-reading the summary, it states that bootstrap configuration fails, so this is required in the initrd. No sysext then.

@daMupfel
Author

daMupfel commented May 2, 2024

Hi, thanks for the feedback so far :).

When adding it to the OEM image, it won't be updated on existing installations (the OEM partition seems to keep the state of the original install), is that correct? At least that was my observation so far.
If so, are there any options to make this work for existing installations which update?

@daMupfel
Author

I added a PR regarding this issue in flatcar/scripts. This probably won't fix existing installations (during update), but we can manually fix those in our system quite easily. Please let me know if you think this is a good solution.

@daMupfel
Author

daMupfel commented Aug 28, 2024

Hi,

I created the PR more than 2 months ago and after a first review I haven't received any feedback yet. I don't want to rush anyone, but I'd appreciate an idea of the expected timeline for this review. This information is important for my company to decide whether to invest in a custom build job on our CI system or wait for it to be integrated upstream. Currently, we are building the images manually.

Thank you very much for your work.

Best regards,
David
