
r/virtual_machine: Current limitations and roadmap for disk device management and vMotion #295

Closed · 2 tasks done
vancluever opened this issue Dec 8, 2017 · 6 comments
Labels: enhancement (Type: Enhancement)

vancluever (Contributor) commented Dec 8, 2017

NOTE: Read the update below for more details on the current plan.

This is something we need to block across the board: running storage vMotion on a linked clone does not necessarily work anyway, although there are situations where it can function (so long as vSphere has determined that it can properly link the disks).

However, upon migration of these disks, the filename of the backings changes to a delta disk filename, removing the original name from the snapshot chain. Worse, the new filename gets saved to state. The next plan will then look something like:

  ~ vsphere_virtual_machine.example_virtual_machines[0]
      disk.#:                  "1" => "2"
      disk.0.name:             "vancluever-test0/vancluever-test0_1-000001.vmdk" => "<deleted>"
      disk.1.attach:           "" => "false"
      disk.1.disk_mode:        "" => "persistent"
      disk.1.disk_sharing:     "" => "sharingNone"
      disk.1.eagerly_scrub:    "" => "false"
      disk.1.io_limit:         "" => "-1"
      disk.1.io_reservation:   "" => "0"
      disk.1.io_share_count:   "" => "0"
      disk.1.io_share_level:   "" => "normal"
      disk.1.keep_on_remove:   "" => "false"
      disk.1.key:              "" => "0"
      disk.1.name:             "" => "vancluever-test0.vmdk"
      disk.1.size:             "" => "32"
      disk.1.thin_provisioned: "" => "true"
      disk.1.unit_number:      "" => "0"
      disk.1.write_through:    "" => "false"

This means that the disk will be deleted on the next apply, completely destroying its data.

At this point I can't think of a good way to support this 100%, so in the spirit of keeping the provider as safe as possible, we need to be able to block this completely.

1.1.x mitigation steps

  • Block in plan/CustomizeDiff (see the sketch after this list)
  • Block in apply (if possible). NOTE: This is not necessary with CustomizeDiff
  • Document the limitations of vMotion
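
For illustration, a minimal sketch of what a CustomizeDiff guard along these lines could look like, assuming the 2017-era helper/schema signature. The function name is hypothetical; the attribute paths (datastore_id, clone.0.linked_clone) follow the resource's documented schema, but this is not the provider's actual implementation.

```go
package vsphere

import (
	"errors"

	"github.com/hashicorp/terraform/helper/schema"
)

// blockLinkedCloneStorageVMotion is a hypothetical CustomizeDiff helper that
// rejects a plan combining a datastore change (storage vMotion) with a
// linked clone, rather than letting the apply damage the snapshot chain.
func blockLinkedCloneStorageVMotion(d *schema.ResourceDiff, meta interface{}) error {
	// A new resource has no ID yet and cannot be migrating; only an update
	// that changes datastore_id implies a storage vMotion.
	if d.Id() == "" || !d.HasChange("datastore_id") {
		return nil
	}
	if linked, ok := d.GetOk("clone.0.linked_clone"); ok && linked.(bool) {
		return errors.New("storage vMotion of linked clones is not supported at this time")
	}
	return nil
}
```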
vancluever (Contributor, Author) commented Dec 13, 2017

Further details on this: in light of the vSphere behavior quoted below, I want to get away from using the file name to track virtual disks and switch to another scheme, discussed below.

Migration with Storage vMotion changes virtual machine files on the destination datastore to match the inventory name of the virtual machine. The migration renames all virtual disk, configuration, snapshot, and .nvram files. If the new names exceed the maximum filename length, the migration does not succeed.

Source

This has been confirmed by manual testing.

As such, the stop-gap for now will be to block all storage vMotions where the disk name does not match the standard vSphere naming convention of disk.vmdk and disk_INDEX.vmdk, computed from the VM name post-modification (to catch situations where the VM is renamed in the same apply operation, as this currently happens pre-migration).
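
To make that convention concrete, here is a rough sketch of the kind of name check this implies. The function and regular expression are illustrative only, assuming the usual "[datastore] folder/file.vmdk" path format; the provider's actual validation may differ.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
	"strings"
)

// diskNameIsStandard reports whether a disk's datastore path follows the
// standard "<vmname>.vmdk" / "<vmname>_N.vmdk" convention for the given
// (post-rename) virtual machine name.
func diskNameIsStandard(vmName, diskPath string) bool {
	// Strip the "[datastore] " prefix and any folder, keeping the file name.
	file := path.Base(strings.TrimSpace(diskPath[strings.Index(diskPath, "]")+1:]))
	re := regexp.MustCompile("^" + regexp.QuoteMeta(vmName) + `(_\d+)?\.vmdk$`)
	return re.MatchString(file)
}

func main() {
	fmt.Println(diskNameIsStandard("vancluever-test0", "[ds1] vancluever-test0/vancluever-test0_1.vmdk"))        // true: standard name
	fmt.Println(diskNameIsStandard("vancluever-test0", "[ds1] vancluever-test0/vancluever-test0_1-000001.vmdk")) // false: delta disk name
}
```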

We will target 1.2 for removing this limitation, where name will only dictate the initial file name of a virtual disk. For tracking virtual disks, we will also be switching to using disk UUIDs rather than device address. Actual disk path will be tracked in path, which will only be settable when an external disk is being attached. name will be removed in 2.0 to indicate that we will no longer support the ability to choose the name of non-externally attached virtual disks.
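
As a rough illustration of what "UUID first, device address as a fallback" means in practice, here is a small, hypothetical sketch; the types and field names are stand-ins, not the provider's internal structures.

```go
package vsphere

// stateDisk is what Terraform has recorded for a disk; liveDisk is a disk as
// read back from the virtual machine. Both are illustrative stand-ins.
type stateDisk struct {
	Label      string
	UUID       string // empty for state written before the UUID scheme
	UnitNumber int
}

type liveDisk struct {
	UUID       string
	UnitNumber int
	FileName   string // may have been renamed by a storage vMotion
}

// matchDisk locates the live device for a disk in state: by UUID when we
// have one, falling back to the device address (unit number) for old state.
func matchDisk(s stateDisk, live []liveDisk) (liveDisk, bool) {
	for _, d := range live {
		if s.UUID != "" && d.UUID == s.UUID {
			return d, true
		}
	}
	if s.UUID == "" {
		for _, d := range live {
			if d.UnitNumber == s.UnitNumber {
				return d, true
			}
		}
	}
	return liveDisk{}, false
}
```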

Plan, to recap:

1.1.x

  • During the rest of 1.1, storage vMotions will be blocked if the name on the disk does not match the standard vSphere naming convention of disk.vmdk and disk_INDEX.vmdk, computed from the VM name post-modification (to catch situations where the VM is renamed in the same apply operation, as this currently happens pre-migration).
  • vMotions will also be blocked on clone.linked_clone == true.
  • vMotions for external disks (when attach is set) will also be blocked and will remain blocked after the other limitations are removed.

1.2.0/1.3.0

  • Introduction of three new disk attributes: label, path, and uuid. label, along with uuid, will form a composite key that should be able to track the entire disk lifecycle from creation (or attachment), regardless of whether the disk changes position within vSphere (see the schema sketch after this list). path will only be configurable by the user if attach is set, with changes blocked during the diff validation stage.
  • name will still be a valid option, serving both the role of label and path and determining the initial name of the disk. It will be set to conflict with both attributes so that one scheme or the other has to be chosen. A transition from name to label will be supported in diff, but not the other way around.
  • With the introduction of this functionality we will lift all of the storage vMotion restrictions except for the attach restriction, as we will now be able to track a virtual disk across file name changes without risk to the data.
  • Users will get a deprecation warning for the name attribute.
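
For a sense of how this might surface in the disk sub-resource schema, a hedged sketch using the helper/schema package follows; the attribute names mirror the list above, but the actual defaults, validation, and conflict handling in the provider will differ.

```go
package vsphere

import "github.com/hashicorp/terraform/helper/schema"

// diskLabelSchema is a hypothetical slice of the disk sub-resource schema
// showing the new attributes alongside the deprecated name attribute.
func diskLabelSchema() map[string]*schema.Schema {
	return map[string]*schema.Schema{
		"name": {
			Type:       schema.TypeString,
			Optional:   true,
			Deprecated: "Please use label instead (and path for externally attached disks).",
		},
		"label": {
			Type:     schema.TypeString,
			Optional: true,
		},
		"path": {
			Type:     schema.TypeString,
			Optional: true,
			Computed: true, // user-settable only when attach is set; enforced in diff validation
		},
		"uuid": {
			Type:     schema.TypeString,
			Computed: true,
		},
	}
}
```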

2.0.0

  • The name option will be removed, with a forced state migration to label and path if the change has not been made already. This will effectively remove support for choosing the name of all disks that are not attached externally.

Some notes:

  • It appears disk UUIDs are shared across an entire snapshot chain, so linked clones will share the disk UUID of the template. However, historically, as per here, you cannot add disks with the same UUIDs to a virtual machine. I'm checking to see how up to date this information is. EDIT: This no longer seems to be the case, as I was able to offline-copy a VMDK file and add it as a second disk with the same UUID as the source. I'm trying to think of whether we can work around this. Note that I had to do this on the back end, modifying the VM configuration in a way that would normally not happen in vSphere. We could possibly just refuse to read the state of a virtual machine where this is the case, but that's kind of what r/virtual_machine: Failsafe against disk name mismatch #304 was going for, and that didn't go over so well internally. I'm not sure whether the fact that a duplicate UUID situation would never happen during normal vSphere operation changes this, though.

vancluever changed the title from "Storage vMotion: Block linked clones" to "r/virtual_machine: Current limitations and roadmap for disk device management and vMotion" Dec 13, 2017
vancluever added the enhancement (Type: Enhancement) label Dec 13, 2017
vancluever added a commit that referenced this issue Dec 14, 2017
This commit adds restrictions to storage vMotion:

* Externally-attached disks cannot be migrated.
* Disk names must match the standard vSphere naming convention based on
the current name of the virtual machine.
* Linked clones are blocked from being migrated at this time.

The latter two restrictions will be lifted once #295 is fully realized
and we are tracking disks via a label/UUID combination.
vancluever removed the bug (Type: Bug) label Dec 15, 2017
vancluever (Contributor, Author) commented

Now that 1.1.1 is out and we've mitigated this, I've removed the "bug" label so that this is classified as an enhancement; it will be taken care of in the next set of changes, possibly in 1.2 or 1.3.

vancluever (Contributor, Author) commented Jan 17, 2018

Next checklist!

  • Add in fallback code for label and path naming
  • Adjust diff normalization
  • Ensure that you cannot roll back to using name after switching to label
  • Review import
  • Review migration
  • Ensure disk UUIDs are the source of truth for device location, with device addresses as a fallback
  • Remove (nearly) all vMotion restrictions
  • Smoke test transition from name to label
  • Smoke test VM creation while using old name attribute
  • Smoke test VM creation while using new label attribute
  • Audit the V1 -> V3 migration path
  • Audit/smoke test importing
  • Update most acceptance tests for new config
  • Add acceptance test for transition of name to label
  • ^ Test for blocking rollback after transition
  • ^ Test for transitioning external disks to label/path
  • Acceptance test for state V3 migration
  • Update acceptance test for V2 migration
  • Test vMotion of linked clones (acceptance test)
  • Update docs
  • Swap VM with two disks in state around in the console (smoke test)
  • Test vMotion of linked clones (smoke test)

vancluever (Contributor, Author) commented Jan 18, 2018

Just an update that this is moving along and is looking pretty positive. General behaviour for transitioning is looking stable during smoke testing. Most of our acceptance tests will need to be updated, and we will probably add a couple specifically for the transition, moving some cases that were previously error cases to expect success instead. But otherwise I'm pretty excited about how this is looking 👍

vancluever (Contributor, Author) commented Jan 25, 2018

#363 was merged, so this is pretty much done. Everything works as intended and nothing out of the ordinary came out of the smoke tests, except an unrelated issue for which a fix has been submitted as #371.

One last thing I'm testing before we commit to a release is what happens when someone does something like add multiple disks with the same UUID. Again, this should never happen during regular operation, but in the event it does happen, we need to be prepared with some sort of fail-safe.

  • Add disk with duplicate UUIDs manually through the console, with UUID in state
  • Add disk with duplicate UUIDs manually through the console, without UUID in state
  • Try adding a disk with attach in Terraform

vancluever (Contributor, Author) commented

Alright, all three of these operations are safe in terms of not destroying data, ultimately producing errors when Terraform detects a UUID conflict in the read cycle. The difference between the above three scenarios lies in where Terraform fails:

  • Disk added in console with UUID in state: Terraform fails on the next plan.
  • Disk added in console with UUID not in state: Terraform will read in the disks as orphaned for the next plan, but fails the apply as the read happens again and detects the disks as duplicates.
  • Disks added using attach: The apply passes, but the next read gives an error.

The errors resemble something like:

vsphere_virtual_machine.foo: disk.1: cannot find disk device: multiple virtual disks with UUID **** found

This seems to be safe, and as we established internally in #304, Terraform should not be working any harder than normal to handle situations that should never happen during normal operation (such as a user making major modifications to a virtual machine on the back end, outside of Terraform). The only issue I could see coming up is that the state may be left un-correctable after the latter two scenarios, but these are edge cases. The only one we could possibly track is the last one, and that would take heavy effort on Terraform's part: checking not just state and the current devices on the virtual machine, but also comparing the UUIDs of all disks to be attached, which is not something we handle right now anyway.
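
For illustration only, a fail-safe of the kind described above could look roughly like the following; the helper and types are hypothetical, and only the shape of the error mirrors the one quoted earlier.

```go
package vsphere

import "fmt"

// foundDisk is an illustrative stand-in for a virtual disk read back from
// the virtual machine.
type foundDisk struct {
	UUID     string
	FileName string
}

// findDiskByUUID refuses to guess when a UUID is ambiguous: zero or multiple
// matches are both errors, so a back-end duplication can never be silently
// mapped onto the wrong disk in state.
func findDiskByUUID(uuid string, live []foundDisk) (foundDisk, error) {
	var matches []foundDisk
	for _, d := range live {
		if d.UUID == uuid {
			matches = append(matches, d)
		}
	}
	switch len(matches) {
	case 1:
		return matches[0], nil
	case 0:
		return foundDisk{}, fmt.Errorf("cannot find disk device: virtual disk with UUID %s not found", uuid)
	default:
		return foundDisk{}, fmt.Errorf("cannot find disk device: multiple virtual disks with UUID %s found", uuid)
	}
}
```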

As everything in this issue is essentially complete, save the complete removal of the name attribute (which will be done closer to 2.0 and will have a separate issue associated with it), I'm closing this issue now!

ghost locked and limited conversation to collaborators Apr 19, 2020