
New Resource: azurerm_virtual_machine_implicit_data_disk_from_source #25537

Merged
mbfrahry merged 12 commits into hashicorp:main on May 29, 2024

Conversation

ms-zhenhua
Contributor

@ms-zhenhua ms-zhenhua commented Apr 9, 2024

Community Note

  • Please vote on this PR by adding a 👍 reaction to the original PR to help the community and maintainers prioritize for review
  • Please do not leave "+1" or "me too" comments, they generate extra noise for PR followers and do not help prioritize for review

Description

Note: ideally, this feature should be supported by adding data_disks in azurerm_[linux|windows]_virtual_machine. However, due to technical limitations, a new resource is defined to support this feature.

This new resource supports a new virtual machine feature named implicit disk creation. Unlike the existing resource azurerm_virtual_machine_data_disk_attachment, which attaches an existing managed disk to the VM, this new resource creates a new data disk when updating a VM. It allows the system to place this disk closer to the parent VM for better latency, and to align the disk's fault domain with the parent VM for better reliability.

This resource is similar to azurerm_virtual_machine_data_disk_attachment, and the two share the same ID format. However, since this new resource uses some unique new properties, it is defined as a separate resource. The two resources are distinguished by different values of create_option, which are checked in the import code.
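For illustration, here is a hedged sketch of that create flow in Go. The helper name is hypothetical, and the type and field names are assumed from the compute SDK's virtualmachines package (the API version in the import path is illustrative); none of this is copied from the PR:

	package compute

	import (
		"github.com/hashicorp/go-azure-helpers/lang/pointer"
		"github.com/hashicorp/go-azure-sdk/resource-manager/compute/2024-03-01/virtualmachines"
	)

	// appendImplicitDataDisk is a hypothetical helper: the disk is never created
	// directly - only its configuration is appended to the VM's storage profile,
	// and the platform creates the disk from the source on the next VM update.
	// It assumes Properties/StorageProfile/DataDisks are non-nil; see the
	// nil-guard discussion later in this thread.
	func appendImplicitDataDisk(model *virtualmachines.VirtualMachine, lun int64, sourceId string) {
		newDisk := virtualmachines.DataDisk{
			Lun:          lun,
			CreateOption: virtualmachines.DiskCreateOptionTypesCopy, // distinguishes this resource on import
			SourceResource: &virtualmachines.ApiEntityReference{
				Id: pointer.To(sourceId), // e.g. a snapshot or managed disk ID
			},
		}
		disks := append(*model.Properties.StorageProfile.DataDisks, newDisk)
		model.Properties.StorageProfile.DataDisks = &disks
	}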

PR Checklist

  • I have followed the guidelines in our Contributing Documentation.
  • I have checked to ensure there aren't other open Pull Requests for the same update/change.
  • I have checked if my changes close any open issues. If so please include appropriate closing keywords below.
  • I have updated/added Documentation as required, written in a helpful and kind way to assist users that may be unfamiliar with the resource / data source.
  • I have used a meaningful PR title to help maintainers and other users understand this change and help prevent duplicate work.
    For example: “resource_name_here - description of change e.g. adding property new_property_name_here”

Changes to existing Resource / Data Source

  • I have added an explanation of what my changes do and why I'd like you to include them (This may be covered by linking to an issue above, but may benefit from additional explanation).
  • I have written new tests for my resource or datasource changes & updated any relevant documentation.
  • I have successfully run tests with my changes locally. If not, please provide details on testing challenges that prevented you running the tests.
  • (For changes that include a state migration only). I have manually tested the migration path between relevant versions of the provider.

Testing

  • My submission includes Test coverage as described in the Contribution Guide and the tests pass. (if this is not possible for any reason, please include details of why you did or could not add test coverage)

[screenshots: passing acceptance test results]

Change Log

Below please provide what should go into the changelog (if anything) conforming to the Changelog Format documented here.

  • azurerm_virtual_machine_implicit_data_disk_from_source - a new resource to support the implicit disk creation feature for virtual machines

This is a (please select all that apply):

  • Bug Fix
  • New Feature (ie adding a service, resource, or data source)
  • Enhancement
  • Breaking Change

Related Issue(s)

Fixes #0000

Note

If this PR changes meaningfully during the course of review please update the title and description as required.

Contributor

@tombuildsstuff tombuildsstuff left a comment

hey @ms-zhenhua

I've taken a look through and left some comments inline - but as it stands I suspect the name of this Resource is going to be a source of confusion in its current form.

Since we're creating a Data Disk from a Snapshot and then attaching that - perhaps a better name for this would be azurerm_virtual_machine_implicit_data_disk_from_source?

Thanks!

Comment on lines 317 to 330
// delete data disk if delete_option is set to delete
if toBeDeletedDisk != nil && pointer.From(toBeDeletedDisk.DeleteOption) == virtualmachines.DiskDeleteOptionTypesDelete &&
	toBeDeletedDisk.ManagedDisk != nil && toBeDeletedDisk.ManagedDisk.Id != nil {
	diskClient := metadata.Client.Compute.DisksClient
	diskId, err := commonids.ParseManagedDiskID(*toBeDeletedDisk.ManagedDisk.Id)
	if err != nil {
		return err
	}

	err = diskClient.DeleteThenPoll(ctx, *diskId)
	if err != nil {
		return fmt.Errorf("deleting Managed Disk %s: %+v", *diskId, err)
	}
}
Contributor

This is an API setting - why are we deleting the disk here?

Contributor Author

@ms-zhenhua ms-zhenhua Apr 9, 2024

Because this disk is created by the backend, it's not managed by any TF resource such as azurerm_managed_disk. If we do not delete it explicitly here, it will only be removed from the VM configuration, but the disk itself will still exist. When the VM is deleted, this disk will not be deleted with it, because it has already been removed from the VM configuration. Since the disk still exists in the resource group, the resource group cannot be deleted unless prevent_deletion_if_contains_resources=false. For this reason, I chose to explicitly delete the disk when delete_option is set to Delete.
Could you kindly help confirm whether this is feasible? If not, I can remove the delete_option and rely on prevent_deletion_if_contains_resources=false to delete this resource.

Member

Hmmm @ms-zhenhua, what're your thoughts on deleting the disk when it's unhooked from the vm and not worrying about delete_option? This resource means that Terraform is creating the disk and I would expect Terraform to also delete the disk

Are there use cases when we want to keep the disk around?

Contributor Author

@ms-zhenhua ms-zhenhua May 21, 2024

Hi @mbfrahry, when we create this resource, what we really do is add a data disk configuration to the existing VM, which triggers the Azure backend to create an implicit data disk for the VM. When we delete this resource, what we really do is remove the data disk configuration from the VM; the physical data disk is only detached from the VM and still exists in the resource group. That's why I use the delete option to manually delete this data disk; otherwise, the resource group cannot be deleted normally, because the data disk still exists in it.
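To make those semantics concrete, here is a small hypothetical sketch (helper name assumed, same imports as the earlier sketch; not the PR's code) of what deleting the resource does - it only removes the disk's configuration, so the managed disk survives unless it is deleted separately:

	// removeDataDiskConfig is a hypothetical helper: it drops the disk with the
	// given LUN from the VM's storage profile. After the VM update the managed
	// disk is detached but still exists in the resource group.
	func removeDataDiskConfig(model *virtualmachines.VirtualMachine, lun int64) {
		remaining := make([]virtualmachines.DataDisk, 0)
		for _, disk := range *model.Properties.StorageProfile.DataDisks {
			if disk.Lun != lun {
				remaining = append(remaining, disk)
			}
		}
		model.Properties.StorageProfile.DataDisks = &remaining
	}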

Comment on lines 169 to 172
if model := resp.Model; model != nil {
	if props := model.Properties; props != nil {
		if profile := props.StorageProfile; profile != nil {
			if dataDisks := profile.DataDisks; dataDisks != nil {
Contributor

there are ways to provision a VM without a StorageProfile and/or DataDisks (e.g. when creating from certain types of image) - as such we'd end up doing nothing in those cases here - we'd need to update this to account for that
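A minimal sketch of the kind of guard being asked for (assumed shape; the PR's actual fix may differ):

	// initialize the storage profile and data disk slice when the VM was
	// provisioned without them, so the new disk config can still be appended
	if props.StorageProfile == nil {
		props.StorageProfile = &virtualmachines.StorageProfile{}
	}
	if props.StorageProfile.DataDisks == nil {
		props.StorageProfile.DataDisks = pointer.To(make([]virtualmachines.DataDisk, 0))
	}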

Contributor Author

code updated. Thanks.

Type:     pluginsdk.TypeString,
Required: true,
ValidateFunc: validation.StringInSlice([]string{
	string(virtualmachines.CachingTypesNone),
Contributor

See the None topic in the Contributor Guide

Contributor Author

code updated. Thanks.
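For context, the guide's "None" guidance is typically applied along these lines - a hedged sketch (assumed schema shape and a pluginsdk-style ResourceData `d`, not necessarily this PR's final code): the magic "None" value is dropped from the validation list, the field becomes optional, and an unset value maps back to the API's None.

	"caching": {
		Type:     pluginsdk.TypeString,
		Optional: true,
		ValidateFunc: validation.StringInSlice([]string{
			string(virtualmachines.CachingTypesReadOnly),
			string(virtualmachines.CachingTypesReadWrite),
		}, false),
	},

	// expand: an omitted value maps back to the API's None
	caching := virtualmachines.CachingTypesNone
	if v, ok := d.GetOk("caching"); ok {
		caching = virtualmachines.CachingTypes(v.(string))
	}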

	_ sdk.ResourceWithCustomImporter = VirtualMachineImplicitDataDiskResource{}
)

type VirtualMachineImplicitDataDiskResource struct{}
Contributor

Per the PR description:

this new resource creates a new data disk when updating a VM and allows the system to place this disk closer to the parent VM to provide better latency and align the fault domain of the disk with the parent VM for better reliability.

Can you provide a source showing the performance benefits of this? Last I read, using a Zone/Proximity Placement Group should give the same benefit here - and from what I understood, it was replacing the Fault Domains concept (which is associated with Availability Sets, which I believe aren't recommended anymore)?

Based on this issue (Azure/azure-cli#28686) - since this exists only for copies of a Disk from a Snapshot, rather than a net-new Disk - I'm concerned that the resource name is going to be misleading too?

Contributor Author

Confirmed with the service team: a Proximity Placement Group can reduce the distance between VMs; however, it does not cover data disk resources.
Availability Sets are being replaced by VMSS Flex, which will support storage-aligned FDs in a single zone.
This feature is mainly used to optimize the operations that need to happen when attaching a disk to a VM in a multi-FD deployment. With implicit disk creation, the system creates a new disk and places it in the right Fault Domain, which provides better latency.

@ms-zhenhua ms-zhenhua marked this pull request as draft April 15, 2024 05:45
@ms-zhenhua ms-zhenhua marked this pull request as ready for review April 17, 2024 02:57
@ms-zhenhua
Contributor Author

Hi @tombuildsstuff, thank you for the review. I have updated the code and comments. Could you kindly take another look?

@ms-zhenhua ms-zhenhua changed the title New Resource: azurerm_virtual_machine_implicit_data_disk New Resource: azurerm_virtual_machine_implicit_data_disk_from_source Apr 25, 2024
Member

@mbfrahry mbfrahry left a comment

Hey @ms-zhenhua, I've left some more comments. I'm still hung up on the Delete method and would like some further discussion on that piece.

@@ -0,0 +1,476 @@
package compute
Member

This filename wants to be virtual_machine_implicit_data_disk_from_source_resource.go

Contributor Author

updated. Thanks.

@@ -0,0 +1,992 @@
package compute_test
Member

And this filename wants to be virtual_machine_implicit_data_disk_from_source_resource_test.go

Contributor Author

updated. Thanks.

	return nil, fmt.Errorf("Data Disk %s was not found", *id)
}

if disk.CreateOption != virtualmachines.DiskCreateOptionTypesAttach && disk.CreateOption != virtualmachines.DiskCreateOptionTypesEmpty {
Member

Is it possible to make a test for this?

Contributor Author

Yes, I added a new test case:

=== RUN TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk
=== PAUSE TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk
=== CONT TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk
--- PASS: TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk (594.13s)
PASS

@@ -154,6 +197,15 @@ func resourceVirtualMachineDataDiskAttachmentCreateUpdate(d *pluginsdk.ResourceD
WriteAcceleratorEnabled: pointer.To(writeAcceleratorEnabled),
}

// there are ways to provision a VM without a StorageProfile and/or DataDisks
Member

This should also include a test to confirm that the existing VM won't have any diffs when a disk is attached this way

Contributor Author

@ms-zhenhua ms-zhenhua May 21, 2024

  1. I have re-run all the existing test cases; they pass, and the results are attached in the description.
  2. Though this comment says there are ways to provision a VM without a StorageProfile and/or DataDisks with certain types of images, I do not know which image triggers that scenario. Kindly let me know the image type and I will add the test case. However, this part of the code will not cause any diff, because it only initializes the storage profile and disks when they are nil, as a protection for the existing code.

}

// delete data disk if delete_option is set to delete
if toBeDeletedDisk != nil && pointer.From(toBeDeletedDisk.DeleteOption) == virtualmachines.DiskDeleteOptionTypesDelete &&
Member

Because Terraform is creating this resource, I believe we can make the assumption that Terraform can delete this resource? If there is value in keeping the disk around, we can add a new variable to the features block that will hang on to the disk instead of deleting it. Or vice versa if the default should be to keep the disk around instead

Contributor Author

This resource is a little special. It manages the data disk of a VM, like azurerm_virtual_machine_data_disk_attachment. When creating this resource, what is really done is adding a data disk configuration to the existing VM, which triggers the Azure service to create a new data disk for the VM. When deleting this resource, what is really done is removing the data disk configuration from the existing VM. However, the physical data disk is not removed from the resource group, which may prevent the resource group from being deleted later. That's why I added this ugly delete option. Per your comment, do you mean we should keep the disk by default and add a new variable to the provider features block to control whether to delete the physical data disk when deleting this resource?

Member

This is tough because, from the user's perspective, Terraform is creating a disk (I do see that Azure is creating the disk from what we are sending to the API, but the user will see that it is Terraform that is creating the disk), so running destroy should delete the disk that was created.

My thought is to have the default be to delete this disk and then in a features block, we add an option to detach the disk instead of deleting it.
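Sketched roughly, that approach could look like the following. The features-block field name is assumed for illustration, and the variable names (diskClient, diskId, metadata) reuse those from the snippet quoted earlier in this thread:

	// default: destroying the resource deletes the platform-created disk; an
	// opt-in flag in the provider features block switches to detach-only
	if metadata.Client.Features.VirtualMachine.DetachImplicitDataDiskOnDeletion {
		// the disk was already detached by removing it from the VM's storage
		// profile above, so there is nothing left to do
		return nil
	}

	if err := diskClient.DeleteThenPoll(ctx, *diskId); err != nil {
		return fmt.Errorf("deleting Managed Disk %s: %+v", *diskId, err)
	}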

Contributor Author

That does make sense. I have removed the delete_option and added a variable to the features block.

=== RUN TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk
=== PAUSE TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk
=== CONT TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk
--- PASS: TestAccVirtualMachineDataDiskAttachment_importImplicitDataDisk (447.13s)
PASS
ok github.com/hashicorp/terraform-provider-azurerm/internal/services/compute 498.662s

--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_destroy (395.01s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_basic (414.26s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_requiresImport (418.42s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_managedServiceIdentity (423.98s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_multipleDisks (472.35s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_virtualMachineExtension (499.92s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_updatingWriteAccelerator (509.00s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_updatingCaching (511.20s)
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_virtualMachineApplication (1033.91s)
PASS
ok github.com/hashicorp/terraform-provider-azurerm/internal/services/compute 1045.466s

=== RUN TestAccVirtualMachineImplicitDataDiskFromSource_detachImplicitDataDisk
=== PAUSE TestAccVirtualMachineImplicitDataDiskFromSource_detachImplicitDataDisk
=== CONT TestAccVirtualMachineImplicitDataDiskFromSource_detachImplicitDataDisk
--- PASS: TestAccVirtualMachineImplicitDataDiskFromSource_detachImplicitDataDisk (462.70s)
PASS
ok github.com/hashicorp/terraform-provider-azurerm/internal/services/compute 510.088s

@ms-zhenhua
Contributor Author

Hi @mbfrahry, thank you for your review. I have updated the code and replied to your comments. Please take another look and leave any questions or comments. Thanks.

Member

@mbfrahry mbfrahry left a comment

Hey @ms-zhenhua, I've responded and really do think that we should add a flag in the features block to keep the disk, rather than the DeleteOption approach you've written out.

@ms-zhenhua
Contributor Author

Hi @mbfrahry, thank you for your advice. I removed DeleteOption and made Delete the default operation. I added a variable in the features block to control whether to detach the data disk when destroying the resource. Please take another look. Thanks.

@ms-zhenhua ms-zhenhua requested a review from mbfrahry May 22, 2024 09:03
Member

@mbfrahry mbfrahry left a comment

LGTM! Thanks for working through the reviews @ms-zhenhua!

@mbfrahry mbfrahry merged commit 3ec5554 into hashicorp:main May 29, 2024
33 checks passed
mbfrahry added a commit that referenced this pull request May 29, 2024
@github-actions github-actions bot added this to the v3.106.0 milestone May 29, 2024

github-actions bot commented Jul 1, 2024

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 1, 2024