Support for disk attachment to VMs at creation time #6117

Open
mal opened this issue Mar 15, 2020 · 39 comments
Labels: enhancement, service/virtual-machine, upstream/terraform (This issue is blocked on an upstream issue within Terraform: Terraform Core/CLI, the Plugin SDK, etc.)

Comments

@mal

mal commented Mar 15, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

Azure allows VMs to be booted with managed data disks pre-attached/attached-on-boot. This enables use cases where cloud-init and/or other "on-launch" configuration management tooling is able to prepare them for use as part of the initialisation process.

This provider currently only supports this case for individual VMs with the older, deprecated azurerm_virtual_machine resource. The new azurerm_linux_virtual_machine and azurerm_windows_virtual_machine resources instead opt to push users towards the separate azurerm_virtual_machine_data_disk_attachment which only attaches data disks to an existing VM post-boot, which fails to service the use case laid out above.

This is in contrast to the respective *_scale_set resources which (albeit out of necessity) support this behaviour.

Please could a repeatable data_disk block be added to the new VM resources (analogous to the same block in their scale_set counterparts) in order to allow VMs to be started with managed data disks pre-attached.

Thanks! 😁

New or Affected Resource(s)

  • azurerm_linux_virtual_machine
  • azurerm_windows_virtual_machine

Potential Terraform Configuration

resource "azurerm_linux_virtual_machine" "example" {
  [...]

  os_disk {
    name                 = "example-os"
    caching              = "ReadWrite"
    storage_account_type = "StandardSSD_LRS"
  }

  data_disk {
    name                 = "example-data"
    caching              = "ReadWrite"
    disk_size_gb         = 4096
    lun                  = 0
    storage_account_type = "StandardSSD_LRS"
  }

  [...]
}

References

@rgl
Contributor

rgl commented Mar 15, 2020

azurerm_virtual_machine_data_disk_attachment which only attaches data disks to an existing VM post-boot

Oh, that is really unfortunate... I wish I could try this but I'm not even able to create a managed disk due to #6029

@lanrongwen

lanrongwen commented Jun 10, 2020

If I'm following this thread correctly (as we are still using the legacy disk system and were looking to move over) can you not deploy VMs with disks already attached? Is it truly rebooting VMs for each disk (thread in #6314 above)? This feels like a HUGE step backwards especially if the legacy mode we are using is being deprecated.

@lightdrive

Also how do you deploy and configure a data disk that is in the source reference image if the data disk block is no longer valid?

@rgl
Contributor

rgl commented Aug 21, 2020

@lightdrive, I've worked around it by using ansible at https://github.com/rgl/terraform-ansible-azure-vagrant

@scott1138
Contributor

This is something I just ran across as well, I'd like to be able to use cloud-init to configure the disks. Any news on a resolution?

@jackofallops
Member

This item is next on my list, no ETA yet though sorry. I'll link it to a milestone when I've had chance to size and scope it.

@ilons

ilons commented Nov 2, 2020

It seems that the work done by @jackofallops has been closed with a note that it needs to be implemented in a different way.

Does anyone have a possible work-around for this?

My use case is the same as others have pointed out:

  • Use cloud-init to manage data disks for Linux VMs

Writing my own scripts to do this instead of using cloud-init seems like a waste.
Using the workaround mentioned in #6074 (comment) might be possible, but it seems too hacky and would require some large changes to how resources are created.

@mal
Author

mal commented Nov 2, 2020

Alas, was really looking forward to an official fix for this. 🙁

In lieu of that, however, here's what I came up with about six months ago, having had no option but to make this work, at minimum for newly booted VMs (note: this has not been tested with changes to, or replacements of, the disks; literally just booting new VMs). I'm also not really a Go person, so this is definitely a hack and nothing even approaching a "good" solution, much less sane contents for a PR. Given that, be warned that whatever state is generated is almost certainly destined to be incompatible with whatever shape the official implementation yields, should it ever land. On the off chance it proves useful in some capacity, or simply provides the embers to spark someone else's imagination, here's the horrible change I made to allow booting VMs with disks attached such that cloud-init could run correctly: 6e19897.

Usage:
resource "azurerm_linux_virtual_machine" "example" {
  [...]
  data_disk {
    name                 = "example-data"
    caching              = "ReadWrite"
    disk_size_gb         = 320
    lun                  = 0
    storage_account_type = "StandardSSD_LRS"
  }
  [...]
}

@tombuildsstuff
Contributor

@mal FWIW this is being worked on, however the edge-cases make this more complicated than it appears - in particular we're trying to avoid several limitations from the older VM resources, which is why this isn't being lifted over 1:1 and is taking longer here.

@mal
Author

mal commented Nov 2, 2020

Thanks for the insight @tombuildsstuff, great to know it's still being actively worked on. I put that commit out there in response to the request for possible work-arounds in case it was useful to someone that finds themself in the position I was in previously, where waiting for something to cover all the cases wasn't an option. Please don't take that as any kind of slight or indictment of the ongoing efforts, I definitely support any official solution covering all the cases, in my case it just wasn't possible to wait for it, but I'll be first in line to move definitions over to it when it does land. 😁

@alec-pinson

In case this helps anyone else: the main part to note is the wait loop at the top of the script, which waits for 3 disks to appear before trying to format them.

write_files:
  - content: |
      #!/bin/bash
      # Wait until at least 3 data disks are visible under /dev/disk/azure/scsi1
      while [ "$(ls -l /dev/disk/azure/scsi1 | grep lun | wc -l)" -lt 3 ]; do echo "waiting on disks..."; sleep 5; done

      DISK=$1
      DISK_PARTITION=$DISK"-part1"
      VG=$2
      VOL=$3
      MOUNTPOINT=$4
      # Partition disk
      sed -e 's/\s*\([\+0-9a-zA-Z]*\).*/\1/' << EOF | fdisk $DISK
        n # new partition
        p # primary partition
        1 # partition number 1
          # default - start at beginning of disk
          # default - end of the disk
        w # write the partition table
        q # and we're done
      EOF

      # Create physical volume
      pvcreate $DISK_PARTITION

      # Create volume group
      if [[ -z `vgs | grep $VG` ]]; then
        vgcreate $VG $DISK_PARTITION
      else
        vgextend $VG $DISK_PARTITION
      fi

      # Create logical volume
      if [[ -z $SIZE ]]; then
        SIZE="100%FREE"
      fi

      lvcreate -l $SIZE -n $VOL $VG

      # Create filesystem
      mkfs.ext3 -m 0 /dev/$VG/$VOL

      # Add to fstab
      echo "/dev/$VG/$VOL   $MOUNTPOINT     ext3    defaults        0       2" >> /etc/fstab

      # Create mount point
      mkdir -p $MOUNTPOINT

      # Mount
      mount $MOUNTPOINT
    path: /run/create_fs.sh
    permissions: '0700'

runcmd:
  - /run/create_fs.sh /dev/disk/azure/scsi1/lun1 vg00 vol1 /oracle
  - /run/create_fs.sh /dev/disk/azure/scsi1/lun2 vg00 vol2 /oracle/diag

@ruandersMSFT

This comment has been minimized.

@tombuildsstuff
Contributor

@ruandersMSFT that's what this issue is tracking - you can find the latest update here

As per the community note above: Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request - which is why comments are marked as off-topic - we ask instead that users add a 👍 to the issue.

@andyliddle

This comment was marked as off-topic.

@michaelbaptist

This comment was marked as off-topic.

@TychonautVII

TychonautVII commented Dec 30, 2022

I think a hacky workaround is that Azure deployment templates are able to deploy a VM and attach a disk at creation. So you can:

(1) Make an Azure deployment template for the VM you need. (It's easy to do this in the Azure console by manually configuring the VM and clicking the "Download template for automation" button.)
(2) Deploy that template using "azurerm_resource_group_template_deployment" (or outside of Terraform).

I'd much rather have the terraform resource support this, but I think something like this might be a stopgap. I'm trying to get this integrated into our process now and it's working so far.
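
A minimal sketch of step (2), assuming the exported template has been saved next to the configuration as vm-template.json. The resource names, the file name, the location, and the vmName parameter are all hypothetical; the parameters must match whatever the exported template actually declares.

```hcl
# Hypothetical resource group that the template deploys into.
resource "azurerm_resource_group" "example" {
  name     = "example-rg"
  location = "westeurope"
}

# Deploys the exported ARM template, which can declare the VM together with
# its data disks so that they are attached at creation time.
resource "azurerm_resource_group_template_deployment" "vm" {
  name                = "example-vm-deployment"
  resource_group_name = azurerm_resource_group.example.name
  deployment_mode     = "Incremental"

  # Assumed path: the template downloaded from the Azure portal.
  template_content = file("${path.module}/vm-template.json")

  # Assumed parameter name; replace with the parameters in your template.
  parameters_content = jsonencode({
    vmName = { value = "example-vm" }
  })
}
```

With this approach Terraform tracks the deployment rather than the VM and disks inside it, so changes to those would most likely have to go through the template.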

@jeffwmiles

I expect this will be marked off-topic, but nearly 3 years after it was opened, this issue needs more attention.

The AzureRM provider has put its users in a bad place here. There are critical features of Azure that are now inaccessible as mentioned by others in this thread. Because my shared OS image has data disks, I cannot use dedicated hosts, cloud-init, or proper identity support for my virtual machine, and this list will only continue to grow because the cloud never stops moving.

How can we as a community help here? There is clearly a lot of development effort going into this provider, judging by the changelog and rate of pull requests; can we raise the priority of this issue?

There is certainly an opportunity for more transparency on why this hasn't moved and other items are getting development attention.

@michaelbaptist

If there is a clean way to migrate from virtual_machine to deployment template, I can live with that, but current terraform will try to do unexpected things due to how they've implemented deployment templates as well.

@GraemeMeyerGT

@jackofallops it's hard to tell in this thread, but looks like you may have added this to the "blocked" milestone. It's no longer clear in the thread what is blocking this issue. Can you clarify? We are seeing a lot of activity in this thread and it's the third-most 👍 issue.

@TheBlackMini
Contributor

What is the state of this issue? Is it blocked?

It's currently 3 years old and we still can't build a VM from a template which has data disks?

@kvietmeier

kvietmeier commented May 20, 2023

The azurerm_linux_virtual_machine docs include "storage_data_disk" as a valid block, but terraform plan errors out claiming it is unsupported. I tried a dynamic block and a standard block, both with a pre-created disk to "attach" and with "empty" and no disk created; all failed.

When I've seen this error before it was either a syntax error or a no longer supported block type.

Is this a documentation bug?

Versions:

KV C:\Users\ksvietme\repos\Terraform\azure\VMs\linuxvm_2> terraform version
Terraform v1.4.6
on windows_amd64
+ provider registry.terraform.io/hashicorp/azurerm v3.57.0
+ provider registry.terraform.io/hashicorp/random v3.5.1
+ provider registry.terraform.io/hashicorp/template v2.2.0

Error:

╷
│ Error: Unsupported block type
│
│   on linuxvm_2.main.tf line 162, in resource "azurerm_linux_virtual_machine" "linuxvm01":
│  162:  dynamic "storage_data_disk" {
│
│ Blocks of type "storage_data_disk" are not expected here.
╵

Disk creation (works)


resource "azurerm_managed_disk" "lun1" {
  name                 = "lun17865"
  location             = azurerm_resource_group.linuxvm_rg.location
  resource_group_name  = azurerm_resource_group.linuxvm_rg.name
  storage_account_type = "Standard_LRS"
  create_option        = "Empty"
  disk_size_gb         = 100

  tags = {
    environment = "staging"
  }
}

Call to storage_data_disk:

resource "azurerm_linux_virtual_machine" "linuxvm01" {
  location                        = azurerm_resource_group.linuxvm_rg.location
  resource_group_name             = azurerm_resource_group.linuxvm_rg.name
  size                            = var.vm_size
  
  # Make sure hostname matches public IP DNS name
  name          = var.vm_name
  computer_name = var.vm_name

  # Attach NICs (created in linuxvm_2.network)
  network_interface_ids = [
    azurerm_network_interface.primary.id,
  ]

  # Reference the cloud-init file rendered earlier
  # for post bringup configuration
  custom_data = data.template_cloudinit_config.config.rendered

  ###--- Admin user
  admin_username = var.username
  admin_password = var.password
  disable_password_authentication = false

  admin_ssh_key {
    username   = var.username
    public_key = file(var.ssh_key)
  }

 ###--- End Admin User
 dynamic "storage_data_disk" {
    content {
    name = azurerm_managed_disk.lun1.name
    managed_disk_id   = azurerm_managed_disk.lun1.id
    disk_size_gb = azurerm_managed_disk.lun1.disk_size_gb
    caching = "ReadWrite"
    create_option = "Attach"
    lun = 1
    }
  }
  

  ### Image and OS configuration
  source_image_reference {
    publisher = var.publisher
    offer     = var.offer
    sku       = var.sku
    version   = var.ver
  }

  os_disk {
    name                 = var.vm_name
    caching              = var.caching
    storage_account_type = var.sa_type
  }

  # For serial console and monitoring
  boot_diagnostics {
    storage_account_uri = azurerm_storage_account.diagstorageaccount.primary_blob_endpoint
  }

  tags = {
    # Enable/Disable hyperthreading (requires support ticket to enable feature)
    "platformsettings.host_environment.disablehyperthreading" = "false"
  }

}
###--- End VM Creation

Thanks. I'm sure I'm missing something here.
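
For reference, a sketch of how the same disk could be attached with the separate attachment resource that this thread describes as the supported route. It reuses the resource names from the configuration above and assumes the storage_data_disk block has been removed from the VM resource. The disk is attached only after the VM exists, so this does not cover the attach-at-boot case this issue asks for.

```hcl
# Attaches the pre-created managed disk to the VM after the VM is created.
resource "azurerm_virtual_machine_data_disk_attachment" "lun1" {
  managed_disk_id    = azurerm_managed_disk.lun1.id
  virtual_machine_id = azurerm_linux_virtual_machine.linuxvm01.id
  lun                = 1
  caching            = "ReadWrite"
}
```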

@matteus8

matteus8 commented Sep 20, 2023

So is this not possible? And if not now, will it be possible in the future as azurerm_virtual_machine becomes deprecated?

resource "azurerm_linux_virtual_machine" "example_name" {
  name = "${var.lin_machine_name}"
  #...
  source_image_id = "/subscriptions/XXXXXXXXX/resourceGroups/example_RG/Microsoft.Compute/galleries/example_gallary/images/example_image/versions/0.0.x"

  os_disk {
    name                 = "lin_name"
    caching              = "ReadWrite"
    storage_account_type = "StandardSSD_LRS"
  }

  depends_on = [
    #...
  ]
}

Essentially, my source_image_id refers to an image that includes 2 data disks. However, when running terraform apply I get the following error:

"Original Error: Code="InvalidParameter" Message="StorageProfile.dataDisks.lun does not have required value(s) for image specified in storage profile." Target="storageProfile""

I have tried using the "data_disk" option, but this is not supported as stated above.

data_disks {
  lun                  = 0
  create_option        = "FromImage"
  disk_size_gb         = 1024
  caching              = "None"
  storage_account_type = "Premium_LRS"
}

data_disks {
  lun                  = 1
  create_option        = "FromImage"
  disk_size_gb         = 512
  caching              = "None"
  storage_account_type = "Premium_LRS"
}

Are there any other suggestions, or will this be included in terraform in the near future?

@shaneholder

I feel I must be missing something here as my scenario seems like it would be so common that this issue would need to have been addressed much sooner.

I am trying to use Packer to build CIS/STIG compliant VMs for golden images. Part of the spec requires several folders to live on non-root partitions. To achieve this I added a drive, added the partitions, and moved data around. We also use LVM in order to meet availability requirements if a partition gets full. I used az cli to boot the VM and was also able to add an additional data drive using the --data-disk-sizes-gb option, so I know the control plane will handle it.

When I try to use the VM with Terraform I get the storageAccount error mentioned above. Is there really no viable workaround for building golden images with multiple disks and using TF to create the VMs?

@djryanj
Contributor

djryanj commented Oct 17, 2023

@shaneholder for now, the generally accepted workaround (which I have used successfully) is to use a secondary azurerm_virtual_machine_data_disk_attachment resource to attach the disk, and the cloud-init script recommended by @agehrig in this comment.

It would be great to hear from the developers as to exactly why this is still blocked, since it's unclear to everyone here especially given the popularity of the request.
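
A rough sketch of the attachment half of that workaround for several disks at once. It assumes a resource group and a VM named azurerm_resource_group.example and azurerm_linux_virtual_machine.example are defined elsewhere; those names, the disk names, and the sizes are hypothetical. Because the disks are attached only after the VM has been created, any cloud-init script has to wait for them to appear, as the script earlier in this thread does.

```hcl
locals {
  # Hypothetical disk layout: map key => LUN and size.
  data_disks = {
    data1 = { lun = 1, size_gb = 128 }
    data2 = { lun = 2, size_gb = 256 }
  }
}

# One empty managed disk per entry in local.data_disks.
resource "azurerm_managed_disk" "data" {
  for_each = local.data_disks

  name                 = "example-${each.key}"
  location             = azurerm_resource_group.example.location
  resource_group_name  = azurerm_resource_group.example.name
  storage_account_type = "StandardSSD_LRS"
  create_option        = "Empty"
  disk_size_gb         = each.value.size_gb
}

# Attaches each disk to the existing VM at its configured LUN.
resource "azurerm_virtual_machine_data_disk_attachment" "data" {
  for_each = local.data_disks

  managed_disk_id    = azurerm_managed_disk.data[each.key].id
  virtual_machine_id = azurerm_linux_virtual_machine.example.id
  lun                = each.value.lun
  caching            = "ReadWrite"
}
```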

@shaneholder

@djryanj thanks for the reply. I'm trying to understand it in the context of my problem though. The image in the gallery already has 2 disks, 1 OS and 1 data, and right now I'm not trying to add another disk, but that would be the next logical step. The issue I'm having is that I can't even get to the point where the VM has been created.

I ran TF with a trace and found the PUT command that creates the VM and what I believe is happening is that TF seems to be incorrectly adding a "dataDisks": [] element to the JSON sent in the PUT request. If I take the JSON data for the PUT and remove that element and then run the PUT command manually the VM is created with 2 disks as expected.

@djryanj
Contributor

djryanj commented Oct 17, 2023

@shaneholder ah I understand. If the gallery image has 2 disks and is not deployable via Terraform using the azurerm_linux_virtual_machine resource because of that, I don't think it's solvable using the workaround I suggested. I'm afraid I don't know what to suggest other than moving back to an azurerm_virtual_machine resource, or getting a working ARM template for the deployment and using something like an azurerm_resource_group_template_deployment resource to deploy it, which is awful, but would work.

@tombuildsstuff - I'm sure you can see the activity here. Any input?

@shaneholder

A little more information. I just ran the same TF but used a VM image that does not have a data disk built in. That PUT request also has the "dataDisks": [] element in the JSON but instead of failing it succeeds and builds the VM. So it seems that if a VM image has an existing data disk and the dataDisks element is passed in the JSON then the VM build will fail, however if the VM Image does not have a data disk then the dataDisks element can be sent and the VM will build.

@shaneholder

Another piece to the puzzle. I set the logging option for az cli and noticed that it adds the following dataDisks element when I specify additional disks. The lun:0 object is the disk that is built into the image. If I run similar code in TF the dataDisks property is an empty array rather than an array that includes the dataDiskImages from the VM Image Version combined with the additional disks I asked to be attached.

"dataDisks": [
  {
    "lun": 0,
    "managedDisk": {
      "storageAccountType": null
    },
    "createOption": "fromImage"
  },
  {
    "lun": 1,
    "managedDisk": {
      "storageAccountType": null
    },
    "createOption": "empty",
    "diskSizeGB": 30
  },
  {
    "lun": 2,
    "managedDisk": {
      "storageAccountType": null
    },
    "createOption": "empty",
    "diskSizeGB": 35
  }
]

@shaneholder

Alright, so I cloned the repo and fiddled around a bit. I hacked the linux_virtual_machine_resource.go file around line 512. I changed:

				DataDisks: &[]compute.DataDisk{},

to:

				DataDisks: &[]compute.DataDisk{
					{
						Lun:          utils.Int32(0),
						CreateOption: compute.DiskCreateOptionTypesFromImage,
						ManagedDisk:  &compute.ManagedDiskParameters{},
					},
				},

And I was able to build my VM with the two drives that are declared in the image in our gallery. Additionally I was also able to add a third disk using the azurerm_managed_disk/azurerm_virtual_machine_data_disk_attachment.

I was trying to determine how to find the dataDiskImages from the image in the gallery but I've not been able to suss that out yet. It seems that what needs to be done is the code should pull the dataDiskImages property and do a similar conversion as it does with the osDisk.

Hoping that @tombuildsstuff can help me out then maybe I can PR a change?

@shaneholder

Ok, so on a hunch I completely commented out the DataDisks property and ran it again and it worked, I created a VM with both the included image data drive AND an attached drive.

shaneholder pushed a commit to shaneholder/terraform-provider-azurerm that referenced this issue Oct 25, 2023
- Remove the DataDisks property instead of making it an empty array
  allows for the usage of golden images that have multiple disks
@tombuildsstuff
Contributor

👋 hey folks

To give an update on this one, unfortunately this issue is still blocked due to a combination of the behaviour of the Azure API (specifically the CreateOption field) and limitations of the Terraform Plugin SDK.

We've spent a considerable amount of time trying to solve this; however given the number of use-cases for disks, every technical solution possible using the Terraform Plugin SDK has hit a wall for some subset of users which means that Terraform Plugin Framework is required to solve this. Unfortunately this requires bumping the version of the Terraform Protocol being used - which is going to bump the minimum required version of Terraform.

Although bumping the minimum version of Terraform is something that we've had scheduled for 4.0 for a long time - unfortunately that migration in a codebase this size is non-trivial, due to the design of Terraform Plugin Framework being substantially different to the Terraform Plugin SDK, which (amongst other things) requires breaking configuration changes.

Whilst porting over the existing data_disks implementation seems a reasonable solution, unfortunately the existing implementation is problematic enough that we'd need to introduce further breaking changes to fix this properly once we go to Terraform Plugin Framework. In the interim the way to attach Data Disks to a Virtual Machine is by using the azurerm_virtual_machine_data_disk_attachment resource.

Moving forward we plan to open a Meta Issue tracking Terraform Plugin Framework in the not-too-distant future, however there's a number of items that we need to resolve before doing so.

We understand that's disheartening to hear. We're trying to unblock this and several other larger issues, but equally we don't want to give folks false hope that this is a quick win when doing so would cause larger issues.

Given the amount of activity on this thread - I'm going to temporarily lock this issue for the moment to avoid setting incorrect expectations - but we'll post an update as soon as we can.


To reiterate/TL;DR: adding support for Terraform Plugin Framework is a high priority for us and will unblock work on this feature request. We plan to open a Meta Issue for that in the not-too-distant future - which we'll post an update about here when that becomes available.

Thank you all for your input, please bear with us - and we'll post an update as soon as we can.

@hashicorp hashicorp locked and limited conversation to collaborators Oct 26, 2023