kola: provide node to node connectivity to QEMU platform for multi-node tests #2741

Open
tormath1 opened this issue Mar 7, 2022 · 14 comments

Comments

@tormath1

tormath1 commented Mar 7, 2022

Hi,

Some time ago, in the flatcar-linux/mantle fork, we implemented a network setup that provides Internet connectivity to QEMU instances: see this PR, and this commit in particular: flatcar/mantle@d065c5d.

Is this something you would be interested in bringing into your codebase? According to the documentation:

Local platforms do not rely on access to the Internet as a design principle of kola, minimizing external dependencies. Any network services required get built directly into kola itself.

I think the proposed implementation fulfills this requirement:

[...] Any network services required get built directly into kola itself.

We only rely on a virtual Ethernet (veth) pair and NAT.
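
To make the veth-plus-NAT idea concrete, here is a minimal sketch of the host-side commands such a setup typically involves. It is not the code from the linked commit: the namespace name, interface names, and addresses are all hypothetical, and the function only builds the command lines (running them would require root).

```python
import shlex

def veth_nat_commands(netns="kola-ns", host_if="veth-host", ns_if="veth-ns",
                      host_addr="192.168.100.1", ns_addr="192.168.100.2",
                      prefix=24):
    """Build (but do not run) commands for a veth pair with NAT.

    All names and addresses are illustrative assumptions, not values
    taken from mantle or kola.
    """
    subnet = f"{host_addr.rsplit('.', 1)[0]}.0/{prefix}"
    in_ns = ["ip", "netns", "exec", netns]
    return [
        # Namespace that will hold one end of the pair.
        ["ip", "netns", "add", netns],
        # Create the veth pair and move one end into the namespace.
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        ["ip", "link", "set", ns_if, "netns", netns],
        # Address and bring up both ends.
        ["ip", "addr", "add", f"{host_addr}/{prefix}", "dev", host_if],
        ["ip", "link", "set", host_if, "up"],
        in_ns + ["ip", "addr", "add", f"{ns_addr}/{prefix}", "dev", ns_if],
        in_ns + ["ip", "link", "set", ns_if, "up"],
        in_ns + ["ip", "route", "add", "default", "via", host_addr],
        # Let the host forward and masquerade traffic from the subnet.
        ["sysctl", "-w", "net.ipv4.ip_forward=1"],
        ["iptables", "-t", "nat", "-A", "POSTROUTING",
         "-s", subnet, "-j", "MASQUERADE"],
    ]

if __name__ == "__main__":
    for cmd in veth_nat_commands():
        print(shlex.join(cmd))
```

Everything here needs privileges on the host, which is the constraint the rest of this thread turns on.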

@cgwalters
Member

cgwalters commented Mar 7, 2022

Our main pipeline runs unprivileged in Kubernetes/OpenShift. I think some of the veth and iptables stuff can be done in an unprivileged network namespace, but AIUI it's hard to do all useful networking fully unprivileged.

In practice I think what we really want is to do libvirt-based testing, not direct qemu for this use case. In this case libvirt is kind of multitenant but in practice it's so easy to "leak state" and have tests conflict that the model should be:

  • provision host with libvirt (could even be a FCOS system with libvirt package layered)
  • schedule container with mantle (coreos-assembler) on that host that has access to libvirt socket
  • tear down that host

An alternative optimization here is to retain a provisioned libvirt host between tests, but flush all libvirt state.
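
As a rough illustration of what "flush all libvirt state" could mean in practice, the sketch below builds the `virsh` commands to stop and undefine a given set of domains and networks. The helper name, the connection URI default, and the example names are assumptions; a real cleanup would first enumerate the existing objects and might also need to handle storage pools and volumes.

```python
def flush_libvirt_commands(domains, networks=(), uri="qemu:///system"):
    """Build (but do not run) virsh commands that remove the given state.

    `domains` and `networks` are names supplied by the caller; this
    hypothetical helper does not discover them itself.
    """
    base = ["virsh", "--connect", uri]
    cmds = []
    for dom in domains:
        # "destroy" force-stops a running domain; it fails harmlessly
        # if the domain is already shut off.
        cmds.append(base + ["destroy", dom])
        cmds.append(base + ["undefine", dom, "--remove-all-storage"])
    for net in networks:
        cmds.append(base + ["net-destroy", net])
        cmds.append(base + ["net-undefine", net])
    return cmds

if __name__ == "__main__":
    for cmd in flush_libvirt_commands(["test-node-0"], ["test-net"]):
        print(" ".join(cmd))
```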

@dustymabe
Member

Thanks @tormath1 for reaching out to collaborate.

In mantle today we don't have any internet restrictions in our qemu tests. I know this is working because we have a lot of Fedora CoreOS tests that reach out to the network to perform various actions.

There are two things that happened some time ago (before the mantle code base was merged into coreos-assembler) that I think gave us this:

@pothos

pothos commented Mar 9, 2022

Thanks for the details. The "qemu-unpriv machines cannot communicate" limitation is still valid though, isn't it? That's why we extended the QEMU platform (I've read that there are tricks to let machines in the unprivileged slirp setup communicate, but we haven't looked into them).

@dustymabe
Member

Hey @pothos, we use qemu-unpriv for pretty much all of our testing, and we access the network in a large portion of our tests. For example, if you want to run a test that pulls a container from a container registry and runs it, you can do that with qemu-unpriv.

Where does the "qemu-unpriv machines cannot communicate" text that you refer to come from? Our documentation?

@pothos

pothos commented Mar 9, 2022

The kola test annotations for excluding qemu-unpriv: https://github.com/coreos/coreos-assembler/search?q=qemu-unpriv+machines+cannot+communicate&type=

That's the main reason we stick with the other qemu platform: it allows us to run things like the kubeadm test @tormath1 added (I realize that we haven't excluded qemu-unpriv there yet, but we have to, since it won't work).

@bgilbert
Contributor

Most of our kola tests are single-node. qemu-unpriv nodes can communicate with the Internet, but nodes in multiple-node test clusters cannot communicate with each other.

@dustymabe
Member

Ahh, ok now I understand what you were asking.

@dustymabe dustymabe changed the title kola: provide Internet connectivity to QEMU platform kola: provide node to node connectivity to QEMU platform for multi-node tests Mar 15, 2022
@dustymabe
Member

Updated the title to be more accurate.

@dustymabe
Member

IIUC we ripped out the qemu platform so I doubt we're going to reinstate it. In that case this turns into one of the following two options:

  • close this request because we can't/won't reinstate the old code
  • add support for inter-node connectivity to qemu-unpriv somehow
    • not sure if this is possible (might have to get creative)

@pothos

pothos commented Mar 15, 2022

A while ago I searched and found this: https://lists.gnu.org/archive/html/qemu-discuss/2014-11/msg00020.html
and the socket backend (-netdev socket,id=mynet0,listen=:1234 and -netdev socket,id=mynet0,connect=:1234)
Edit: this looks doable: https://gist.github.com/mcastelino/88195a7d99811a177f5e643d1465e19e
Edit2: implemented it here: flatcar/mantle#307

Sure, we can close this. It was meant more as a hint in case you were interested.
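
For readers unfamiliar with the socket backend mentioned above, here is a minimal sketch of how the extra QEMU arguments for two nodes could be assembled: one instance listens on a port and the other connects to it. The `-netdev socket` options are the ones quoted in this comment; the helper name, the `virtio-net-pci` device choice, and the MAC addresses are assumptions for illustration and are not taken from flatcar/mantle#307.

```python
def socket_netdev_args(role, port=1234, netdev_id="mynet0",
                       mac="52:54:00:12:34:56"):
    """Return extra QEMU arguments for one end of a socket netdev link.

    role is "listen" for the first node and "connect" for the second.
    Each node needs its own MAC address on the shared link.
    """
    if role not in ("listen", "connect"):
        raise ValueError(f"role must be 'listen' or 'connect', got {role!r}")
    return [
        "-netdev", f"socket,id={netdev_id},{role}=:{port}",
        # Guest-visible NIC attached to the socket backend (assumed model).
        "-device", f"virtio-net-pci,netdev={netdev_id},mac={mac}",
    ]

if __name__ == "__main__":
    node0 = socket_netdev_args("listen", mac="52:54:00:00:00:01")
    node1 = socket_netdev_args("connect", mac="52:54:00:00:00:02")
    print(" ".join(node0))
    print(" ".join(node1))
```

The listening instance has to be started before the connecting one, and the guests still need addresses configured on the resulting interface.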

@cgwalters
Member

So...quite a while ago we merged coreos-assembler and mantle. There were a lot of benefits but also drawbacks to this.

I think what we can try to do is factor out at least our qemu code into a separate Go module. Then it seems relatively straightforward to share maintenance of that with flatcar. WDYT?

@tormath1
Author

tormath1 commented Mar 5, 2024

With flatcar/Flatcar#1386 in mind, I'm wondering again whether we should sit together and see what we can do to merge fcos/mantle and flatcar/mantle back together. Both worlds could benefit from such a merge: for example, we added the Brightbox and Scaleway platforms to our Flatcar fork.
Users will benefit from this merge too as we will cover more test scenarios.

@travier
Member

travier commented Mar 7, 2024

Agree it would be nice if we could converge/merge back the mantle tools. As you mentioned we would get Scaleway support.

A while back, we fully merged the mantle code into our coreos-assembler repo but I think it was mostly to simplify building and testing things in a single PR on our side. It should still be usable "standalone".

How do you use mantle in your CI?

@tormath1
Author

tormath1 commented Mar 8, 2024

In Flatcar's CI, Mantle (kola, plume and ore) is consumed via its Docker image. For each commit in Mantle, a Docker image is built and this image is consumed in the CI. (https://github.com/flatcar/scripts/blob/main/sdk_container/.repo/manifests/mantle-container)
A first step would be to get an overview of the diff between the two projects. We could then decide whether to go with a common library and keep the FCOS- and Flatcar-specific bits downstream, or to merge everything back into a single project.


6 participants