Azurerm VM refresh with delete_os_disk_on_termination=true is failing with a cannot find storage account error. #102
Comments
Hey @djsly, thanks for opening this issue :) I've spent some time looking into this but I'm struggling to reproduce it. The error message being returned states that the Storage Account doesn't exist (or there's an eventual consistency bug in the API); however, I'd expect to be able to reproduce this, and I've been unsuccessful so far. So that we can investigate this further, would you be able to answer the following:
Thanks! |
I am also seeing this issue pretty frequently. We are using an outside script to invoke Terraform but definitely not running it twice. And I have confirmed that the storage account was not deleted outside Terraform before running the destroy (the OS disk for the VM being destroyed is in that storage account so I wouldn't have been able to delete it before the VM was destroyed). Interestingly enough, a second attempt to destroy seems to succeed every time, so this does seem like some sort of consistency/timing issue. I am going to see if I can reproduce when running terraform manually to destroy. |
I can still reproduce this using the exact same config file
|
the provided configuration files from the OP are missing this
|
what I did was to run
and reran |
In my case, there are no changes to the VM name or anything. I've just deployed a VM and then later want to destroy it, and that's when I see the error. Then when I try destroying again, it succeeds. |
Actually I realized we are detaching a secondary disk right before we delete the VM (this is done using the azure CLI). I wonder if Azure is still propagating that change when the delete comes in? I am going to try a delete without the secondary disk involved, and one with the secondary disk, and see if that seems to be related. @djsly did you make any changes to your VM or its configuration outside of terraform? |
I was able to reproduce even without a secondary disk, so that doesn't seem to be it. I created a VM with terraform, waited for a few minutes and then ran "terraform destroy" and saw the issue. I am using a custom VHD file for my VMs, could that be it? It looks like @djsly is also using a custom VHD file. One other interesting thing -- I noticed that when this error happens, even after I run terraform again to destroy, my VM's OS disk still remains in the storage account (I have delete_os_disk_on_termination set to true). Is this error happening when terraform tries to delete the OS disk after terminating the VM? It seems like the second time through, during the refresh it doesn't find the VM in Azure and so it doesn't try again to destroy the OS disk? |
No, I only use Terraform CLI and never log on to the portal.
I'm using the official Ubuntu Image as my Base Image for the sake of this example. So no custom VHD |
Ah sorry I see the Ubuntu image in your terraform config above. @tombuildsstuff did you have delete_os_disk_on_termination = "true" when you were trying it out? I am struggling to find any common "weird stuff" between @djsly and my configs that could explain why we are the only ones seeing this. I started to see this issue maybe 3 weeks ago or so, and it didn't seem to be triggered by any changes to my configs (or a new version of terraform). So I was thinking maybe something changed on the Azure side. I just added a retry since that seemed to work (and hoped that Azure would fix things). It would still be nice to know for sure. |
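For anyone wanting to try the same retry workaround, here is a minimal sketch of what such a wrapper could look like. It is not the script mentioned above: the function name `run_with_retry`, its parameters, and the default of two attempts are all assumptions made for illustration. It only masks the symptom and does nothing about the underlying lookup failure.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=2, delay_seconds=0.0, runner=subprocess.run):
    """Run cmd, retrying when it exits non-zero.

    Hypothetical helper: returns the 1-based attempt number that succeeded
    and raises RuntimeError if every attempt fails. `runner` is injectable
    so the logic can be exercised without invoking Terraform.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        # Pause before the next attempt, but not after the final one.
        if attempt < attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"{cmd!r} failed after {attempts} attempts")
```

A call would look like `run_with_retry(["terraform", "destroy"], attempts=2, delay_seconds=30)`. Note that `terraform destroy` prompts for confirmation by default, and the flag that skips the prompt depends on the Terraform version, so it is left out here.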
FYI: I simply used the official Azure example from terraform's website and I added |
I've pasted some debug output that I'm getting here: https://gist.github.com/bpoland/dd300ccc387a1671b060d01adb4734e6 A colleague noticed that the response from Azure includes no results but does include a "nextLink" -- is it possible the results are paginated and terraform needs to get the next "page" of results to find the storage account? The Azure subscription I'm working in has a lot of resources so maybe others don't see this if they have fewer resources. @djsly are there a lot of resources in the Azure subscription you're using? |
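To make the pagination theory concrete, here is a small self-contained sketch. It is not the provider's code and makes no real Azure calls: the page contents, the account name `mystorageacct`, and both function names are made up for illustration. The only thing taken from the debug output is the response shape, a list of results plus a "nextLink" pointing at the next page.

```python
# Hypothetical in-memory stand-in for a paged list response: the first
# page is empty but carries a nextLink, as seen in the debug output.
PAGES = {
    "page1": {"value": [], "nextLink": "page2"},
    "page2": {"value": [{"name": "mystorageacct", "resourceGroup": "rg1"}]},
}


def fetch_page(url):
    """Return the fake page stored under url (stands in for an HTTP GET)."""
    return PAGES[url]


def find_first_page_only(fetch, url, name):
    """Look for name on the first page only, ignoring any nextLink."""
    page = fetch(url)
    for item in page.get("value", []):
        if item["name"] == name:
            return item
    return None


def find_following_next_link(fetch, url, name):
    """Look for name on every page, following nextLink until it runs out."""
    while url:
        page = fetch(url)
        for item in page.get("value", []):
            if item["name"] == name:
                return item
        url = page.get("nextLink")
    return None
```

With this data, `find_first_page_only(fetch_page, "page1", "mystorageacct")` returns `None` even though the account exists, which would look just like a "cannot find storage account" error. `find_following_next_link` finds it on the second page.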
I'm not sure what I guess it could be identified as |
Haha yeah hard to say what "a lot" is :) @tombuildsstuff when you were trying to reproduce, how many resources did you have in your azure subscription? Any thoughts about the pagination? Thanks! |
Hi @djsly , I used your tf files and the following steps:
But the issue is not reproduced |
Hi @JunyiYi, we moved to managed disks, so we haven't exercised this logic path for a while. I don't mind closing it, as it was probably fixed by now :) |
Has anyone made any changes that they think should fix this problem? I think you need to be using a subscription with a lot of storage accounts in order to see the problem, because some results coming back from Azure are paginated and that causes terraform to not be able to find the storage account. |
@JunyiYi the issue I experienced is the exact one that @djsly reported in this issue. It seems that in order to reproduce you need to have a large number of storage accounts in the subscription. Could you try creating 50 or 100 more storage accounts temporarily in your subscription and then see if you can reproduce it? |
@bpoland is correct, we used to have over 200 storage accounts (one per VM) |
We're still seeing this as well under Terraform v0.11.3, not sure what the provider version was. Same boat as everyone else, destroy fails when |
@JunyiYi @tombuildsstuff would you be able to reopen this issue since it was never actually fixed? |
I know it's bad form to "bump" or add a "me too", but I just ran into this bug. Please could it be re-opened, as it's not fixed? I have some 60-odd Azure storage accounts holding disk images. It therefore looks to be exactly the same issue: Terraform does not paginate the results returned from the Azure API, so it assumes the storage account does not exist. |
I ended up moving to managed disks but we still had problems with it before we switched. My workaround was to add a separate azurerm_storage_blob resource for the OS disk:
Then in the VM itself turn delete_os_disk_on_termination off and add But this is absolutely still an issue with the provider. |
@JunyiYi Could this be re-opened please ? Or would you prefer me to create a new (duplicate) issue ? As mentioned above, we're seeing this exact same issue, and for various reasons cannot move to managed disks to work around the problem, or switch to a separate blob store resource. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks! |
This issue was originally opened by @djsly as hashicorp/terraform#15228. It was migrated here as part of the provider split. The original body of the issue is below.
Terraform Version
0.9.8
Affected Resource(s)
Please list the resources as a list, for example:
Terraform Configuration Files
Debug Output
https://gist.github.com/djsly/11300a541a92432002a843509b1fb1ed
Expected Behavior
the VM refresh should delete the os_disk and proceed with the deletion
Actual Behavior
Errors out trying to delete the blob
Steps to Reproduce
Please list the steps required to reproduce the issue, for example:
terraform apply
terraform apply