Launching clusters via libvirt is especially useful for operator development.
NOTE: Some aspects of the installation can be customized through the install-config.yaml file. See the documents on how to create an install-config.yaml file and on libvirt platform customization.
It's expected that you will create and destroy clusters often in the course of development. These steps only need to be run once.
Before you begin, install the build dependencies.
Make sure you have KVM enabled by checking for the device:
$ ls -l /dev/kvm
crw-rw-rw-+ 1 root kvm 10, 232 Oct 31 09:22 /dev/kvm
If it is missing, try some of the ideas here.
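The same check can be wrapped in a small script-friendly form, useful if you automate your dev setup (a minimal sketch; the variable name is our own choice):

```shell
# Report whether the KVM device node exists, without relying on ls output.
if [ -e /dev/kvm ]; then
  kvm_status="present"
else
  kvm_status="missing"
fi
echo "/dev/kvm is ${kvm_status}"
```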
On CentOS 7, first enable the kvm-common repository to ensure you get a new enough version of qemu-kvm.
On Fedora, CentOS/RHEL:
sudo yum install libvirt libvirt-devel libvirt-daemon-kvm qemu-kvm
Then start libvirtd:
sudo systemctl enable --now libvirtd
In this example, we'll set the base domain to tt.testing and the cluster name to test1.
git clone https://github.com/openshift/installer.git
cd installer
Libvirt creates a bridged connection to the host machine, but in order for the network bridge to work IP forwarding needs to be enabled. The following command will tell you if forwarding is enabled:
sysctl net.ipv4.ip_forward
If the command output is:
net.ipv4.ip_forward = 0
then forwarding is disabled; proceed with the rest of this section. If IP forwarding is already enabled, skip the rest of this section.
To enable IP forwarding for the current boot:
sysctl net.ipv4.ip_forward=1
or to persist the setting across reboots (recommended):
echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/99-ipforward.conf
sudo sysctl -p /etc/sysctl.d/99-ipforward.conf
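The same check can also be done by reading procfs directly, which is handy in scripts that should not depend on the sysctl binary (a minimal sketch):

```shell
# Read the forwarding flag straight from procfs;
# equivalent to `sysctl net.ipv4.ip_forward`.
state=$(cat /proc/sys/net/ipv4/ip_forward)
echo "net.ipv4.ip_forward = ${state}"
if [ "$state" = "1" ]; then
  echo "forwarding already enabled; skip this section"
fi
```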
The Kubernetes cluster-api components drive deployment of worker machines. The libvirt cluster-api provider will run inside the local cluster, and will need to connect back to the libvirt instance on the host machine to deploy workers.
In order for this to work, you'll need to enable TCP connections for libvirt.
To do this, first modify your /etc/libvirt/libvirtd.conf and set the following:
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"
tcp_port = "16509"
Note that authentication is not currently supported, but should be soon.
On Fedora 31, you also need to enable and start the libvirtd TCP socket, which is managed by systemd:
sudo systemctl enable libvirtd-tcp.socket
sudo systemctl start libvirtd-tcp.socket
after which you need to restart libvirtd.
On Debian/Ubuntu you might also need to configure the security driver for QEMU. The installer uses the Terraform libvirt provider, which has a known issue that can cause unexpected
Could not open '/var/lib/libvirt/images/<FILE_NAME>': Permission denied
errors. Double-check that the security_driver = "none" line is present in /etc/libvirt/qemu.conf and is not commented out.
In addition to the config, you'll have to pass an additional command-line
argument to libvirtd. On Fedora, modify /etc/sysconfig/libvirtd
and set:
LIBVIRTD_ARGS="--listen"
On Debian based distros, modify /etc/default/libvirtd
and set:
libvirtd_opts="--listen"
Next, restart libvirt: systemctl restart libvirtd
Finally, if you have a firewall, you may have to allow connections to the libvirt daemon from the IP range used by your cluster nodes.
The following examples use the default cluster IP range of 192.168.126.0/24 (which is currently not configurable) and a libvirt default subnet of 192.168.122.0/24, which might be different in your configuration. If you're uncertain about the libvirt default subnet, you should be able to see its address using the command ip -4 a show dev virbr0 or by inspecting virsh --connect qemu:///system net-dumpxml default. Ensure the cluster IP range does not overlap your virbr0 IP address.
iptables -I INPUT -p tcp -s 192.168.126.0/24 -d 192.168.122.1 --dport 16509 -j ACCEPT -m comment --comment "Allow insecure libvirt clients"
If using firewalld, the specifics will depend on how your distribution has set up the various zones. On Fedora Workstation, since we don't want to expose the libvirt port externally, we need to actively block it. We then use the preexisting dmz zone for the traffic between VMs.
sudo firewall-cmd --add-rich-rule 'rule service name="libvirt" reject'
sudo firewall-cmd --zone=dmz --change-interface=virbr0
sudo firewall-cmd --zone=dmz --change-interface=tt0
sudo firewall-cmd --zone=dmz --add-service=libvirt
On RHEL8, the bridges used by the VMs are already isolated in their own zones, so we only need to allow traffic on the libvirt port:
sudo firewall-cmd --zone=libvirt --add-service=libvirt
NOTE: When the firewall rules are no longer needed, sudo firewall-cmd --reload
will remove the changes made as they were not permanently added. For persistence,
add --permanent
to the firewall-cmd
commands and run them a second time.
This step allows installer and users to resolve cluster-internal hostnames from your host.
- Tell NetworkManager to use dnsmasq:
echo -e "[main]\ndns=dnsmasq" | sudo tee /etc/NetworkManager/conf.d/openshift.conf
- Tell dnsmasq to use your cluster. The syntax is server=/<baseDomain>/<firstIP>. For this example:
echo server=/tt.testing/192.168.126.1 | sudo tee /etc/NetworkManager/dnsmasq.d/openshift.conf
- Reload NetworkManager to pick up the dns configuration change:
sudo systemctl reload NetworkManager
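The dnsmasq forwarding line above can be built for any base domain; here is a sketch using this guide's example values (tt.testing and 192.168.126.1), with variable names of our own choosing:

```shell
# Build the dnsmasq forwarding line for the cluster's base domain.
# BASE_DOMAIN and CLUSTER_GW are this guide's example values.
BASE_DOMAIN=tt.testing
CLUSTER_GW=192.168.126.1
line="server=/${BASE_DOMAIN}/${CLUSTER_GW}"
echo "$line"
# To install it:
#   echo "$line" | sudo tee /etc/NetworkManager/dnsmasq.d/openshift.conf
```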
Set TAGS=libvirt
to add support for libvirt; this is not enabled by default because libvirt is development only.
TAGS=libvirt hack/build.sh
With libvirt configured, you can proceed with the usual quick-start.
To remove resources associated with your cluster, run:
openshift-install destroy cluster
You can also use virsh-cleanup.sh
, but note that it will currently destroy all libvirt resources.
With the cluster removed, you no longer need to allow libvirt nodes to reach your libvirtd. Restart firewalld to remove your temporary changes as follows:
sudo firewall-cmd --reload
Some things you can do:
The bootstrap node, e.g. test1-bootstrap.tt.testing
, runs the bootstrap process. You can watch it:
ssh "core@${CLUSTER_NAME}-bootstrap.${BASE_DOMAIN}"
sudo journalctl -f -u bootkube -u openshift
You'll have to wait for etcd to reach quorum before this makes any progress.
Using the domain names above will only work if you set up the DNS overlay or have otherwise configured your system to resolve cluster domain names. Alternatively, if you didn't set up DNS on the host, you can use:
virsh -c "${LIBVIRT_URI}" domifaddr "${CLUSTER_NAME}-master-0" # to get the master IP
ssh core@$MASTER_IP
Here LIBVIRT_URI
is the libvirt connection URI which you passed to the installer.
You'll need a kubectl binary on your path and the kubeconfig from your create cluster call.
export KUBECONFIG="${DIR}/auth/kubeconfig"
kubectl get --all-namespaces pods
Alternatively, you can run kubectl
from the bootstrap or master nodes.
Use scp
or similar to transfer your local ${DIR}/auth/kubeconfig
, then SSH in and run:
export KUBECONFIG=/where/you/put/your/kubeconfig
kubectl get --all-namespaces pods
- There isn't a load balancer on libvirt.
If the steps above haven't quite worked, please review this section for well-known issues.
With libvirt there is no wildcard DNS resolution, and the console depends on the route created by the auth operator (Issue #1007). To make it work, we first need to create the manifests and edit the domain in the ingress config before creating the cluster.
- Add another domain entry in the openshift.conf file used by dnsmasq. Here tt.testing is the base domain chosen when running the installer, and the IP in the address entry belongs to one of the worker nodes.
$ cat /etc/NetworkManager/dnsmasq.d/openshift.conf
server=/tt.testing/192.168.126.1
address=/.apps.tt.testing/192.168.126.51
- Make sure you restart NetworkManager after changing openshift.conf
$ sudo systemctl restart NetworkManager
- Create the manifests
$ openshift-install --dir $INSTALL_DIR create manifests
- The domain entry in the cluster-ingress-02-config.yml file should not contain the cluster name
# Assuming `test1` as cluster name
$ sed -i 's/test1.//' $INSTALL_DIR/manifests/cluster-ingress-02-config.yml
- Start the installer to create the cluster
$ openshift-install --dir $INSTALL_DIR create cluster
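To see what the sed step above does, here is the same rewrite applied to a sample ingress domain (using the example cluster name test1; the variable names are illustrative):

```shell
# The ingress domain must not include the cluster name,
# so strip the "test1." component from it.
ingress_domain="apps.test1.tt.testing"
fixed=$(echo "$ingress_domain" | sed 's/test1.//')
echo "$fixed"   # apps.tt.testing
```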
If you're seeing an error similar to
Error: Error refreshing state: 1 error(s) occurred:
* provider.libvirt: virError(Code=38, Domain=7, Message='Unable to resolve address 'localhost' service '-1': Servname not supported for ai_socktype')
FATA[0019] failed to run Terraform: exit status 1
it is likely that your install configuration contains three forward slashes after the protocol (e.g. qemu+tcp:///...), when it should only be two (e.g. qemu+tcp://192.168.122.1/system).
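A quick way to catch this mistake before running the installer is to check the URI for an empty host part; this is a hypothetical helper, not part of the installer:

```shell
# Flag remote libvirt URIs whose host part is empty
# (i.e. three slashes after a remote scheme like qemu+tcp).
check_uri() {
  case "$1" in
    qemu+tcp:///*|qemu+ssh:///*) echo "bad: empty host in $1" ;;
    *) echo "ok: $1" ;;
  esac
}
check_uri 'qemu+tcp:///system'                # bad: empty host
check_uri 'qemu+tcp://192.168.122.1/system'   # ok
```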
Depending on your libvirt version you might encounter a race condition leading to an error similar to:
* libvirt_domain.master.0: Error creating libvirt domain: virError(Code=43, Domain=19, Message='Network not found: no network with matching name 'test1'')
This is also being tracked on the libvirt-terraform-provider but is likely not fixable on the client side, which is why you should upgrade libvirt to >=4.5 or a patched version, depending on your environment.
- Support for libvirt on Mac OS is currently broken and being worked on.
If you're on Arch Linux and get an error similar to
libvirt: “Failed to initialize a valid firewall backend”
or
error: Failed to start network default
error: internal error: Failed to initialize a valid firewall backend
please check out this thread on superuser.
You might find other reports of your problem in the Issues tab for this repository, where we ask you to provide any additional information. If your issue is not reported there, please file one.