Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add with-kvm flag to the create cluster command #9738

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions cmd/talosctl/cmd/mgmt/cluster/create.go
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,7 @@ var (
applyConfigEnabled bool
bootloaderEnabled bool
uefiEnabled bool
kvmEnabled bool
tpm2Enabled bool
extraUEFISearchPaths []string
configDebug bool
Expand Down Expand Up @@ -475,6 +476,7 @@ func create(ctx context.Context) error {
provision.WithBootlader(bootloaderEnabled),
provision.WithUEFI(uefiEnabled),
provision.WithTPM2(tpm2Enabled),
provision.WithKvm(kvmEnabled),
provision.WithDebugShell(debugShellEnabled),
provision.WithExtraUEFISearchPaths(extraUEFISearchPaths),
provision.WithTargetArch(targetArch),
Expand Down Expand Up @@ -1248,6 +1250,7 @@ func init() {
createCmd.Flags().BoolVar(&applyConfigEnabled, "with-apply-config", false, "enable apply config when the VM is starting in maintenance mode")
createCmd.Flags().BoolVar(&bootloaderEnabled, bootloaderEnabledFlag, true, "enable bootloader to load kernel and initramfs from disk image after install")
createCmd.Flags().BoolVar(&uefiEnabled, "with-uefi", true, "enable UEFI on x86_64 architecture")
createCmd.Flags().BoolVar(&kvmEnabled, "with-kvm", true, "enable kvm (QEMU provisioner only)")
createCmd.Flags().BoolVar(&tpm2Enabled, tpm2EnabledFlag, false, "enable TPM2 emulation support using swtpm")
createCmd.Flags().BoolVar(&debugShellEnabled, withDebugShellFlag, false, "drop talos into a maintenance shell on boot, this is for advanced debugging for developers only")
createCmd.Flags().MarkHidden("with-debug-shell") //nolint:errcheck
Expand Down
10 changes: 10 additions & 0 deletions pkg/provision/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,15 @@ func WithTPM2(enabled bool) Option {
}
}

// WithKvm enables or disables KVM with qemu.
func WithKvm(enabled bool) Option {
return func(o *Options) error {
o.KvmEnabled = enabled

return nil
}
}

// WithDebugShell drops into debug shell in initramfs.
func WithDebugShell(enabled bool) Option {
return func(o *Options) error {
Expand Down Expand Up @@ -199,6 +208,7 @@ type Options struct {
JSONLogsEndpoint string

SiderolinkEnabled bool
KvmEnabled bool
}

// DefaultOptions returns default options.
Expand Down
9 changes: 6 additions & 3 deletions pkg/provision/providers/qemu/arch.go
Original file line number Diff line number Diff line change
Expand Up @@ -200,11 +200,14 @@ func generateUEFIPFlashList(uefiSourcePathPrefixes, uefiSourceFiles, uefiVarsFil
}

// QemuExecutable returns name of qemu executable for the arch.
func (arch Arch) QemuExecutable() string {
func (arch Arch) QemuExecutable(kvmEnabled bool) string {
binaries := []string{
"qemu-system-" + arch.QemuArch(),
"qemu-kvm",
"/usr/libexec/qemu-kvm",
}
if kvmEnabled {
binaries = append(binaries,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused a bit, this just changes the search path by removing qemu-kvm ? What does it do in terms of features?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the correct way to handle this parameter would be the function at the bottom of this file:

func (arch Arch) KVMArgs(kvmEnabled bool) []string {

This is because we should first exclude the argument demanding the use of KVM rather than finding a binary that does not include the KVM acceleration support. However qemu-kvm command might force KVM enabled on some distros: not really sure about that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested that myself, but I guess kvm is enabled by default, so we can skip forcing it (which would fix the case when KVM is not available), but without KVM emulation would be super slow

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not enabled by default it seems

https://wiki.archlinux.org/title/QEMU#Enabling_KVM

Copy link
Author

@Orzelius Orzelius Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused a bit, this just changes the search path by removing qemu-kvm

Yes, this line just removes qemu-kvmbinary from the search list if KVM is disabled. Maybe qemu-kvm can be used even without kvm, but I think it'd be confusing and I haven't looked into it.

What does it do in terms of features?

Do you mean the PR in general? It allows to disable KVM usage, which was mandatory before and would error if KVM could not be used. Now a user can optionally set --with-kvm to false and thus run talos via qemu even on a system which doesn't have kvm (e.q. wsl, docker on a non linux host or another vm).

edit: looks like kvm can be used with wsl as of few years ago :o

Copy link
Author

@Orzelius Orzelius Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because we should first exclude the argument demanding the use of KVM rather than finding a binary that does not include the KVM acceleration support.

This is correct. The code to disable KVM was already there for the use case of emulating different architectures. In the pr I just pass the new flag into the existing LaunchConfig.EnableKVMparameter.

which is later passed to func (arch Arch) KVMArgs(kvmEnabled bool) []string

I just thought removing the qemu-kvm from binaries in case of not using KVM also made sense and made that change as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think qemu-kvm is just yet another alias present on some systems, and removing it from the search path doesn't make any difference, or make it more confusing in the end.

talosctl cluster create doesn't work on non-Linux, as CNI binaries used to set up networking won't work on non-Linux hosts, so the only case is Linux with KVM disabled, which probably exists, but rare to find.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it then make more sense to just log a warning and proceed with KVM off when it's not available?
I think it still makes sense as it's not a big change and would aid in development and testing of the create command. At least I would benefit from it (see pr description)

"qemu-kvm",
"/usr/libexec/qemu-kvm")
}

for _, binary := range binaries {
Expand Down
4 changes: 2 additions & 2 deletions pkg/provision/providers/qemu/launch.go
Original file line number Diff line number Diff line change
Expand Up @@ -455,9 +455,9 @@ func launchVM(config *LaunchConfig) error {
)
}

fmt.Fprintf(os.Stderr, "starting %s with args:\n%s\n", config.ArchitectureData.QemuExecutable(), strings.Join(args, " "))
fmt.Fprintf(os.Stderr, "starting %s with args:\n%s\n", config.ArchitectureData.QemuExecutable(config.EnableKVM), strings.Join(args, " "))
cmd := exec.Command(
config.ArchitectureData.QemuExecutable(),
config.ArchitectureData.QemuExecutable(config.EnableKVM),
args...,
)

Expand Down
2 changes: 1 addition & 1 deletion pkg/provision/providers/qemu/node.go
Original file line number Diff line number Diff line change
Expand Up @@ -165,7 +165,7 @@ func (p *provisioner) createNode(state *vm.State, clusterReq provision.ClusterRe
ExtraISOPath: extraISOPath,
PFlashImages: pflashImages,
MonitorPath: state.GetRelativePath(fmt.Sprintf("%s.monitor", nodeReq.Name)),
EnableKVM: opts.TargetArch == runtime.GOARCH,
EnableKVM: opts.TargetArch == runtime.GOARCH && opts.KvmEnabled,
BadRTC: nodeReq.BadRTC,
DefaultBootOrder: defaultBootOrder,
BootloaderEnabled: opts.BootloaderEnabled,
Expand Down
4 changes: 2 additions & 2 deletions pkg/provision/providers/qemu/preflight.go
Original file line number Diff line number Diff line change
Expand Up @@ -64,14 +64,14 @@ func (check *preflightCheckContext) verifyRoot(context.Context) error {
func (check *preflightCheckContext) checkKVM(context.Context) error {
f, err := os.OpenFile("/dev/kvm", os.O_RDWR, 0)
if err != nil {
return fmt.Errorf("error opening /dev/kvm, please make sure KVM support is enabled in Linux kernel: %w", err)
return fmt.Errorf("error opening /dev/kvm, please make sure KVM support is enabled in Linux kernel: %w\nOr disable kvm with --with-kvm=false", err)
}

return f.Close()
}

func (check *preflightCheckContext) qemuExecutable(context.Context) error {
if check.arch.QemuExecutable() == "" {
if check.arch.QemuExecutable(check.options.KvmEnabled) == "" {
return fmt.Errorf("QEMU executable (qemu-system-%s or qemu-kvm) not found, please install QEMU with package manager", check.arch.QemuArch())
}

Expand Down
1 change: 1 addition & 0 deletions website/content/v1.9/reference/cli.md
Original file line number Diff line number Diff line change
Expand Up @@ -213,6 +213,7 @@ talosctl cluster create [flags]
--with-init-node create the cluster with an init node
--with-json-logs enable JSON logs receiver and configure Talos to send logs there
--with-kubespan enable KubeSpan system
--with-kvm enable kvm (QEMU provisioner only) (default true)
--with-network-bandwidth int specify bandwidth restriction (in kbps) on the bridge interface when creating a qemu cluster
--with-network-chaos enable to use network chaos parameters when creating a qemu cluster
--with-network-jitter duration specify jitter on the bridge interface when creating a qemu cluster
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,9 @@ To see a live demo of this writeup, see the video below:

- Linux
- a kernel with
- KVM enabled (`/dev/kvm` must exist)
- `CONFIG_NET_SCH_NETEM` enabled
- `CONFIG_NET_SCH_INGRESS` enabled
- KVM enabled (`/dev/kvm` must exist) if you wish to use KVM
- at least `CAP_SYS_ADMIN` and `CAP_NET_ADMIN` capabilities
- QEMU
- `bridge`, `static` and `firewall` CNI plugins from the [standard CNI plugins](https://github.com/containernetworking/cni), and `tc-redirect-tap` CNI plugin from the [awslabs tc-redirect-tap](https://github.com/awslabs/tc-redirect-tap) installed to `/opt/cni/bin` (installed automatically by `talosctl`)
Expand All @@ -33,7 +33,7 @@ To see a live demo of this writeup, see the video below:
### How to get QEMU

Install QEMU with your operating system package manager.
For example, on Ubuntu for x86:
For example, on Ubuntu for x86 with kvm:

```bash
apt install qemu-system-x86 qemu-kvm
Expand Down