
VM rebooted at every change because of hotPlugChange set to false #640

Closed · fad3t opened this issue Oct 10, 2023 · 8 comments · Fixed by #641

fad3t (Contributor) commented Oct 10, 2023

Hello,

We have noticed that some of our VMs get rebooted every time we run a terraform apply.
Those VMs have a custom boot configuration: they use the UEFI boot mode.

We suspect the following piece of code is the cause of the reboots, as it sets hotPlugChange to false:

	if bc, change := bootConfigHasChange(res.BootConfig, d); !reflect.DeepEqual(*bc, v3.VMBootConfig{}) {
		res.BootConfig = bc
		hotPlugChange = change
	}

The following happens when execution enters the bootConfigHasChange function:

  • hotPlugChange is set to false;
  • none of the conditions are matched, because there are no planned changes related to the boot config of the VM;
  • the function returns the unchanged boot config and false.

Right after that, the DeepEqual check compares the boot config of the VM with an empty/default VM boot config. In our case the boot type is set to UEFI, so DeepEqual returns false, the branch is taken, and hotPlugChange is assigned the value of change, i.e. false.
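
To make the failure mode concrete, here is a minimal, self-contained Go sketch; the VMBootConfig below is a hypothetical stand-in trimmed to one field, not the real v3 SDK type:

package main

import (
	"fmt"
	"reflect"
)

// Stand-in for v3.VMBootConfig, reduced to the one field that matters here.
type VMBootConfig struct {
	BootType *string
}

func main() {
	hotPlugChange := true // the default before the boot-config check

	bootType := "UEFI"
	current := VMBootConfig{BootType: &bootType} // boot config read from the VM

	// bootConfigHasChange found no planned change, so it returned the
	// config untouched together with change == false.
	bc, change := &current, false

	// The provider's condition compares against a zero value, so any VM
	// with a non-default boot config (e.g. UEFI) takes the branch...
	if !reflect.DeepEqual(*bc, VMBootConfig{}) {
		// ...and hotPlugChange is overwritten with false, forcing a
		// cold (reboot) update even though nothing changed.
		hotPlugChange = change
	}

	fmt.Println(hotPlugChange) // false
}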

I would be happy to prepare a PR to fix this (I guess DeepEqual should be comparing bc with res.BootConfig instead, as sketched below), but before that it would be great if somebody could review and confirm our assumptions.
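
A sketch of that suggested comparison (hypothetical, and a fragment mirroring the excerpt above rather than a runnable program; the final PR may look different):

	// Compare the returned config with the VM's current one, so that
	// hotPlugChange is only overridden when the boot config actually
	// changed in the plan.
	if bc, change := bootConfigHasChange(res.BootConfig, d); !reflect.DeepEqual(bc, res.BootConfig) {
		res.BootConfig = bc
		hotPlugChange = change
	}

With this comparison, an untouched UEFI boot config would no longer flip hotPlugChange to false, so the update should go through without a reboot.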

FYI we run Terraform 1.5.5 with the latest version of the Nutanix provider (1.9.3).

Thanks,
Fred

g1df commented Oct 12, 2023

Hello,

I would add the following observations:

The default boot config of a VM is "Legacy - Default boot order".

When creating a VM with these defaults, nothing seems to be written in the API data.

But if we change the boot_config to anything else, it is then written to the API data; even if we later revert to the default values, they remain written, so the provider triggers a reboot after comparing them.

abhimutant (Collaborator) commented

Hi, can you please share your tf config?

fad3t (Contributor, Author) commented Oct 12, 2023

Sure, here's the part calling our custom module:

module "testvm" {
  source  = "custom.registry/vm-windows/nutanix"
  version = "2.3.4"

  name              = "testvm"
  description       = "Test VM Windows"
  vcpu              = 2
  memory_gb         = 8
  disk_size_gb      = 100
  disk_size_logs_gb = 0

  subnet_name = "DEV"

  domain_ou                      = "DEV"
  domain_user                    = "nutanix.svc"
  domain_password                = var.nutanix_windows_domain_password
  windows_administrator_password = var.windows_administrator_password

  category = {
    env = "DEV"
  }
}

The provider config:

terraform {
  required_version = "~> 1.0"

  required_providers {
    nutanix = {
      source  = "nutanix/nutanix"
      version = "~> 1.9.2"
    }
  }

  backend "remote" {
    hostname     = "custom.remote"
    organization = "abc"
    workspaces {
      name = "DEV"
    }
  }
}

provider "nutanix" {
  username     = var.nutanix_username
  password     = var.nutanix_password
  endpoint     = var.nutanix_endpoint
  port         = var.nutanix_port
  insecure     = false
  wait_timeout = 10
}

The module we use is mostly a wrapper around the virtual machine resource. The most interesting part in this case is the boot_type.

resource "nutanix_virtual_machine" "this" {
  name                 = var.name
  description          = var.description
  cluster_uuid         = data.nutanix_cluster.this.id
  num_vcpus_per_socket = var.core_vcpu
  num_sockets          = var.vcpu
  memory_size_mib      = var.memory_gb * 1024
  boot_type            = "UEFI"

  // guest_customization_sysprep removed for brevity

  nutanix_guest_tools = {
    state           = "ENABLED"
    iso_mount_state = "MOUNTED"
  }

  ngt_enabled_capability_list = [
    "SELF_SERVICE_RESTORE",
    "VSS_SNAPSHOT"
  ]

  ngt_credentials = {
    username = "${var.name}\\Administrator"
    password = var.windows_administrator_password
  }

  // nic_list removed for brevity
  // disk_list removed for brevity

  lifecycle {
    create_before_destroy = true
    ignore_changes        = [owner_reference, project_reference, guest_customization_sysprep, categories, nutanix_guest_tools, disk_list.0.data_source_reference]
  }
}

Edit: we also faced this issue when using the nutanix_virtual_machine resource directly.

abhimutant (Collaborator) commented

Also, do you see any change in state while performing tf plan?

fad3t (Contributor, Author) commented Oct 12, 2023

No, there are no state changes during plans.

Edit: unless you meant "do you see any planned changes during the plan"?

abhimutant (Collaborator) commented Oct 12, 2023

So, when you perform terraform apply, do you see the /vms API being called twice? You can check it using export TF_LOG=TRACE.
Also, what's the PC version?
Edit: Also, how are you verifying that the VM is getting rebooted?

fad3t (Contributor, Author) commented Oct 12, 2023

Yes, we see the /vms API endpoint being called multiple times (a few GETs, a couple of PUTs).
The PC version is 2022.6.0.7.
We verify that the VM is rebooted through the Nutanix audit log, VM uptime, etc.

Can you please check the description of the issue? We tried to include as many details as we could, and I even created a PR (which we just tested; it looks OK).

abhimutant (Collaborator) commented

Merged. Closing the issue now.
