File provisioning hangs with Terraform 12.* terraforming a Windows server with Powershell 5 installed #22006
Comments
I have the exact same issue. I switched to 11.14 and now file provisioner step completes no problem. |
Have the same issue on 0.12.3.
|
I also have this problem with terraform 0.12.5 and aws 2.19
What I see is that occasionally 1-2 of the connections succeed, as long as only 1 resource is defined in the .tf file. If more than 1 resource has the file provisioner, none of them connect using WinRM. |
Having the same issue, terraform 0.12 using the google provider. |
Having the same issue, using aws provider
Creating an aws_instance with
hangs on |
Facing the same issue with Windows 10 over WinRM using the vsphere provider and the file provisioner: only some of the files in the folder get copied, and the creation goes into an endless loop. If the folder I am copying contains only 1 file, it always works as expected. |
Hi folks! I'm sorry that y'all are experiencing this behavior. Please do not post "+1" comments here, since it creates noise for others watching the issue and ultimately doesn't influence our prioritization because we can't actually report on these. Instead, react to the original issue comment with 👍, which we can and do report on during prioritization. That being said, if you have an example of this issue that includes new information, please do continue to share! |
I saw this issue as well when upgrading to Terraform v0.12.3, as soon as it became viable for us to update, but I assumed it was something wrong with our Windows template or vSphere instance, as changes had been made to these at the same time. I have noticed this behaviour on Windows 10 versions 1511, 1803 and 1903. During this timeframe we have been using vSphere provider 1.11.0 and 1.12.0. The issue seems inconsistent, however: say I roll out with count=10, perhaps the first 3 make it OK and the rest just hang. We install a bunch of things into the template, but the only things we touch with PowerShell are PowerCLI and the AD module. Hope this info helps.
|
This issue appears to occur with remote-exec's as well. I replaced all my file provisioners with a combination of local-exec and remote-exec that would transfer the files to my instance(s) via AWS S3 - now the unzip stage (Of very small files <1MB) will hang in this same way. Experienced this on Windows Server 2019 DC Edition |
Experiencing the same issue on W2K16 - going to downgrade to |
Having this same issue with the |
Seeing the same issue via remote-exec, 12.7 |
I am getting same issue with
target machine is Windows 2019 DC with PowerShell 5 trace logs are
|
I have the same issue
|
I've posted a workaround to stack overflow which encodes extra files as base64 encoded strings and then decodes them inline back using the desired file name. Posting here in case it is helpful for others as well. E.g. in userdata text:
|
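A minimal sketch of the kind of base64 workaround described above, not the poster's actual code. It assumes an AWS Windows instance whose user data runs a `<powershell>` block at first boot; the resource name, `var.windows_ami`, the script path, and `C:\InstallBinaries` are all hypothetical.

```hcl
# Hypothetical names throughout; adjust to your own configuration.
variable "windows_ami" {
  type = string
}

locals {
  # Read the file on the machine running Terraform and base64-encode it.
  setup_script_b64 = filebase64("${path.module}/scripts/setup.ps1")
}

resource "aws_instance" "windows" {
  ami           = var.windows_ami
  instance_type = "t3.medium"

  # The encoded file travels inside user data, so no WinRM file
  # transfer is involved. Heredocs do not process backslash escapes,
  # so the Windows paths are written as-is.
  user_data = <<-EOT
    <powershell>
    New-Item -ItemType Directory -Force -Path 'C:\InstallBinaries' | Out-Null
    $bytes = [Convert]::FromBase64String('${local.setup_script_b64}')
    [IO.File]::WriteAllBytes('C:\InstallBinaries\setup.ps1', $bytes)
    </powershell>
  EOT
}
```

EC2 user data is size-limited (16 KB before encoding), so this approach only suits small files.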
We see the same issue on openstack when creating Windows Server 2016 instances.
What we usually see is that it works when we create just one instance. But when creating multiple instances in parallel it usually hangs on all of them. Occasionally it works for the first and hangs on the rest. |
Same issue vsphere provider : Version : $ terraform --version
Apply output :
Provisioner code :
    connection {
      host     = "${self.default_ip_address}"
      type     = "winrm"
      port     = 5986
      https    = true
      timeout  = "4m"
      user     = "Administrator"
      password = "${var.local_adminpass}"
      insecure = true
    }
    provisioner "file" {
      source      = "${path.module}/scripts/remote"
      destination = "C:/InstallBinaries"
    }
    provisioner "remote-exec" {
      inline = [
        "powershell.exe -ExecutionPolicy bypass -File C:\\InstallBinaries\\Add-RoleToReg.ps1 -role \"${var.role}\" -envir \"${var.environment}\"",
        "powershell.exe -ExecutionPolicy bypass -File C:\\InstallBinaries\\Install-SCCMClient.ps1",
      ]
    }
Scenarios tested :
Workaround : None... I'm deploying VMs two by two by adding them to my plan |
@jpatigny this may not be an option for you, and it’s a hassle, but I believe we’ve all found that parallel provisioning works if you downgrade to v0.11. The syntax reversion is the most time-consuming part, and you’ll have to determine if you’re using 0.12 features that aren’t supported in v0.11. |
Same issue with the latest azurerm provider and terraform 0.12.9: file and remote provisioners have stopped working altogether when trying to use winrm with a certificate and port 5986. Is this prioritized? I really can't understand how such a major problem has been open since July. It seems like a long-overdue bug that many people are experiencing across providers. We were about to move to 0.12, and I am so happy that we didn't: we only converted around 5 modules out of 30 and are not going to convert any others. From output:
From Log:
|
I certainly don't mean to cast aspersions, but I get a sense that this is being lowly-prioritized in part due to HC/TF's negative position on provisioners in general: https://www.terraform.io/docs/provisioners/index.html#provisioners-are-a-last-resort |
@mcascone interesting read, wow, it really doesn't seem like it is going to be fixed, so once again we need to go ahead and start modifying lots of terraform files. I don't even have any idea how I am going to cope with some of the issues, such as running the remote-exec provisioner only during destroy. Thank you for that link, I didn't know that. Following the history of the docs, I see this was only inserted lately, on September 9th, so I wonder how we are supposed to copy files and execute scripts now. I really don't get why this product-supported software was removed and we need to rely solely on the OS's options. So many other tools are implementing such technology; I would have accepted such a requirement in the problematic Windows world, but not on Linux. I don't get the answers on the page asking to switch to an imaging process. This is just not logical: so many artifacts and items cannot go into images because they need to be dynamically generated after the server is up. Can't believe we had to change so much to get to 0.12, and now in 0.12 so much more just to get things working. I thought 0.12 would make these issues disappear. This really makes me think of other alternatives to terraform that can handle such authentication. Anyhow, thanks @mcascone |
@pixelicous , parallel provisioning works in v0.11, so if you aren't using v0.12-exclusive features, you can roll back the syntax. It's a hassle, for sure. |
rolling back isn't an option for me as i'm using a good deal of semi-complex 12-only features. i had to kill the process yesterday which yielded a corrupt state file that had a handle on dozens of resources so this is a pretty painful bug |
Managed to get past this by changing my file destination path to use double backslashes "\\" instead of forward slashes "/". I recognize others on the forum already used backslashes so it may not be the real solution; it's something to try if you get stuck on ideas. |
@dhekimian No I haven't tested it with an older version, nor have I been able to replicate the issue here. |
I have been struggling for a whole day to get WinRM working with AWS and Win Server 2019. I couldn't get the file provisioner to work. However, the latest implementation by philthynz using https worked for me. I got the file over, but now there is something strange with access rights over WinRM! That may be outside the subject of this issue, but maybe someone knows. The first line in mount_efs.ps1 gives this: The second line gives this: Running the same commands in the user_data PowerShell together with the WinRM setup (without using remote-exec) works though, using the same admin account. I found that a bit strange! Why doesn't the WinRM session allow me to create credentials? Terraform script:
mount_efs.ps1:
The combination that worked for me for mounting an SMB drive from a Linux host (which in turn is an EFS drive because AWS refuse to make EFS work directly on windows). That is, not use remote-exec for this, only for other scripts:
|
I wanted to give an update on this, because I know that the lack of a response has created the impression that this is deprecated. WinRM is supported with the file provisioner, and we intend to fix this. We talked about this issue as a team today and decided to work on it. I don't have an ETA for fixing it yet: the next step will be for an engineer to dig into it, and assess how involved it is. I wanted to give an update so that people who are trying to make decisions based on whether this is supported or not know that it is indeed supported. |
Hi all,

We're still not totally sure what's going on here, but I did some investigation today and wanted to share what I learned both as a starting point for someone possibly picking up this bug to work on later and also in case any of what I've learned causes any theories for anyone reading this who is more familiar with WinRM than I am. (I am essentially totally unfamiliar, so I'm not hard to beat!)

I understand from the discussion above that uploading files over WinRM was working in 0.11.14 but stopped working somewhere before 0.12.3. Given how early 0.12.3 was within the 0.12 series I'm going to assume for the moment that this regression occurred in 0.12.0, as part of the broader internal refactoring that came with that release.

Terraform's WinRM communicator is mainly just a thin wrapper around a third-party library For file copying in particular (which is what the

Based on the above, my sense is that this change in behavior wasn't caused directly by a change to the WinRM components in Terraform, but rather a change to something in the broader system that has changed some assumptions that the WinRM support was relying on.

One possibly-relevant thing that changed in 0.12.0 was switching from the old plugin protocol to the new one based on gRPC. A key implication of that shift is in how it handles "instances" of plugins: the old protocol separated the idea of starting up the child process from the idea of creating an instance of the provisioner type inside it, which I believe meant that each separate

The new protocol changed the model so that each plugin process contains exactly one "instance" of the plugin's main type, which directly answers incoming gRPC requests. Consequently, from Terraform 0.12.0 onwards I believe (though have so far only confirmed by reading code, not active testing) that there is only one instance of the file provisioner being shared across all calls.

With that said, at first look I've been unable to find a reason why that change in model should have a negative effect. Unlike provider plugins, provisioner plugins are intended to retain no state between calls, and indeed the terraform/builtin/provisioners/file/resource_provisioner.go Lines 41 to 49 in 5e3c02b
The file provisioner code is also the same whether it's using

So with that said, I've not been able yet to find a specific link between the plugin model change and the behavior change described in this bug. Further detailed debugging of both versions can hopefully confirm that it is indeed creating an entirely new communicator instance per provisioner call.

Another thing that changed between Terraform 0.11.14 and 0.12.0 is that we began building against a different version of Go. Unfortunately we were not yet tracking specific Go versions in the repository during the 0.11.14 line, but I believe that Terraform 0.11.14 was built with one of the Go 1.11 releases, while Terraform 0.12.0 was built with Go 1.12.4. I don't have any immediate ideas about any specific things that changed between Go 1.11 and Go 1.12, but I just wanted to note that in case it spurs any thoughts from others who might know of some changes to how Go standard library functions behave on Windows between those two releases.

Finally, I considered that Packer is another similar program which has a "file" provisioner that can work over WinRM. I had a look in the Packer repository to see if they'd encountered any similar issues but so far I wasn't able to find anything similar to what's reported here in the issues for their Packer has quite a few issues relating to WinRM though, and timeouts are a common symptom of misconfigured provisioners, so I can't be sure I saw everything. If any readers of this encountered any similar problems with Packer at around the same time as Terraform 0.12.0 was released (I see several mentions of Packer in the comments above, so I assume some of you use it), please let me know! It would be very useful to be able to find a correlated change in Packer to help narrow down what's going on here.

This is all I was able to get from some initial research here. If anything above causes any ideas for anyone else reading, please let me know.
Otherwise, someone on the Terraform team will dig into this deeper in the future and see if we can figure out what's changed here. |
Hi everyone! I'd like to take the next steps digging into this issue, and it looks like I'm going to need some help. I would like to get a working Terraform 0.11 config that fails after upgrading to the latest version of terraform. It took me all day to get the configuration right for a windows instance provisioned with the file provisioner in terraform v0.11 - and it works just fine in 0.13. I did spend 6 hours today looking at timeouts, but they were all caused by configuration issues. Here's a gist with the configuration I used; if anyone here has a reproduction case that works in 0.11 and not 0.12/0.13 I would really appreciate it if you could share. I would prefer AWS, but I believe I can get access to azure as well for testing. Thanks! |
@mildwonkey This issue appears when we use a file provisioner to copy a folder of many files/folders to the remote host. For example, we clone https://github.com/dsccommunity/SqlServerDsc to the local filesystem and then use a file provisioner to copy that to the remote node. The process takes a really long time and eventually just stops working (on specific terraform versions). This works on 0.11.14. When we updated to 0.12.23, that's when we first found this issue. Yesterday, I did some testing and found the following:
I couldn't identify anything obvious here: v0.12.25...v0.12.26, but it seems like something definitely changed from 0.12.25 to 0.12.26. We'll do more testing w/ 0.12.26 to see if we uncover other issues. For reference, here's the provisioner content we use:
We're trying to use 0.13.0-beta2, but we've had issues w/ our providers properly loading w/ |
Thank you @dandunckelman , that's helpful extra info! I'll use that same repository so I have a more realistic set of files to transfer. Can I ask what backend you are using? There may be some changes to the backends worth looking into. |
@mildwonkey we're using https://github.com/terraform-providers/terraform-provider-vsphere on vCenter/ESXi 6.7u3 |
That's good to know too, but what |
@mildwonkey right, that was provider. Backend is local. |
Thanks @dandunckelman I've also been suffering from this issue. Just done some tests with Terraform 0.12.26 and can confirm this now works to deploy multiple Windows VMs and with multiple file provisioners into Azure. Tests with same code using Terraform 0.12.24 fails. Code extract below for anyone else Thank you
|
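For anyone who wants a configuration to refuse to run on the affected releases, a version constraint is one option. This is a sketch based only on the reports in this thread that 0.12.26 behaved correctly where 0.12.24 and 0.12.25 did not; the lower bound is an assumption, not a confirmed fix version.

```hcl
terraform {
  # Assumption drawn from reports in this thread: multi-file WinRM
  # uploads completed again starting with 0.12.26.
  required_version = ">= 0.12.26"
}
```

With this in place, `terraform init` and `terraform apply` exit with an error on older binaries instead of starting a run that may hang.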
I'm still working on this issue, but I have merged a PR that slightly helps matters by resulting in a time out, instead of an endless hanging run. This will be included in the next 0.13 release. The underlying winrmcp library uses
While I absolutely still want to figure out why this was working for you and then stopped working, it's important to know that the file provisioner is not very efficient at uploading larger directories. One possible workaround could be to upload an archive and use remote-exec to extract the archive. |
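The archive workaround suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested fix: the variable names, paths, and the `null_resource` wrapper are hypothetical, the `archive_file` data source comes from the separate archive provider, and `Expand-Archive` requires PowerShell 5.0 or later on the target.

```hcl
# Hypothetical variables; supply values that match your environment.
variable "host" {
  type = string
}

variable "admin_password" {
  type = string
}

# Build one zip locally from the directory that was previously
# uploaded file by file.
data "archive_file" "binaries" {
  type        = "zip"
  source_dir  = "${path.module}/scripts/remote"
  output_path = "${path.module}/scripts/remote.zip"
}

resource "null_resource" "upload_binaries" {
  connection {
    type     = "winrm"
    host     = var.host
    port     = 5986
    https    = true
    insecure = true
    timeout  = "10m"
    user     = "Administrator"
    password = var.admin_password
  }

  # A single file goes over WinRM instead of many small ones.
  provisioner "file" {
    source      = data.archive_file.binaries.output_path
    destination = "C:/InstallBinaries.zip"
  }

  # Extract on the target; -Force overwrites files from earlier runs.
  provisioner "remote-exec" {
    inline = [
      "powershell.exe -ExecutionPolicy Bypass -Command \"Expand-Archive -Path C:\\InstallBinaries.zip -DestinationPath C:\\InstallBinaries -Force\"",
    ]
  }
}
```

Note that an earlier comment in this thread reports remote-exec unzip steps hanging in the same way on affected versions, so this reduces the number of transfers rather than guaranteeing success.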
Just hit this bug on up to date windows 10 with powershell installed and terraform 0.12.28. Even some trivial terraform to write a file locally just hangs forever:
I managed to work round it using a very hacky local-exec and some base64 round tripping (borrowed from SO). Ugly but effective while we wait for a proper fix.
|
@alastairtree I have confirmed the hangs-forever issue you showed with 0.12.26 and the code you showed. This specific example no longer works in 0.13 because about 8 months ago the ability to use the file provisioner locally without a remote host was removed, so I suspect this issue is still there but this is no longer a usable reproduction case. |
I've tried to reproduce this using a reproduction case @mildwonkey put together and linked in her gist, above. Using the current 0.14.0 alpha release, the Windows EC2 instance took 6m20s to provision and accept a winrm connection, and then provisioning ran fairly quickly, not at all the long hang I've previously observed trying to reproduce this. Based on this test result and @darrens280 reporting similar success, I think that this issue is resolved. I've re-published the reproduction case in https://github.com/danieldreier/terraform-issue-reproductions/tree/master/22006. I want to be clear that this is not a catch-all for all Windows provisioning issues: the specific problem that was reported was windows file provisioning hanging entirely, or being unusable, even with tiny files. I was able to reproduce that in earlier 0.12.x versions, and I'm not able to reproduce that failure mode anymore in 0.14.0 alpha, so I would like to close this issue. If you've been encountering these problems, please test again with a recent 0.13.x release or an 0.14.0 pre-release, and if you're able to reproduce it, please share a clear reproduction case or contribute to the one I linked above with a pull request. If we don't have a reproduction case by the time the second 0.14 beta ships, I'm going to consider this fixed and close the issue. Please feel free to reach out and ask for help if you're seeing the issue in practice but are struggling to make a clear reproduction case - if this is still a problem I want to help. |
Closing as we have not seen any more recent reproductions of the issue. If anyone encounters a similar case, please verify against the latest release (which is 0.15-beta2 at this time) and file a new issue. Thanks! |
Having the issue with 0.15beta2 |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
Terraform Version
Terraform Configuration Files
Debug Output
The interrupt received was me cancelling the cmd line process.
Crash Output
N/A
Expected Behavior
On a Windows Server with Powershell 5 installed, be able to use Terraform 12.* to complete file provisioning of TestFolder1, copying all content and subfolder content, and continue to the next file provisioning step.
This works in Terraform 11.*
I am using terraform v0.11.8 successfully with provider.aws v1.35.0
Actual Behavior
Since upgrading to Terraform 12.*, the file provisioning of TestFolder1 copies a couple of the 1 KB files over, seems to copy over what can be done in 1 minute, and then stops copying over content. The Terraform console logging continues to report "Still creating..." without end and doesn't obey the timeout. The Terraform process never completes.
Steps to Reproduce
Setup Windows Server 2012 to have Powershell 5
For AWS, I am using the source AMI filter
2.1 Chocolatey via, "choco install powershell"
2.2 Manually by downloading Win8.1AndW2K12R2-KB3191564-x64.msu from https://www.microsoft.com/en-us/download/details.aspx?id=54616, Windows Management Framework 5.1 (KB3191564).
2.2.1 Install via the GUI manually or run "Win8.1AndW2K12R2-KB3191564-x64.msu /quiet" in a cmd line Administrator elevated prompt.
Another option is to leverage a Windows Server 2016 instance, which natively uses Powershell 5.1. For AWS, I used the "name": "Windows_Server-2016-English-Full-Containers-*" source_ami_filter.
Attempt to terraform
terraform init
terraform apply
Additional Context
Terraform 11.* has never been an issue. The Terraform file isn't new, outside of being upgraded to Terraform 12's syntax using the automated upgrade command. Without PowerShell 5 installed, I can terraform the Windows Server system, with PowerShell 4.0 natively, just fine. After installing PowerShell 5, or against a Windows Server 2016 instance with PowerShell 5.1 natively installed, Terraform hangs and never completes or errors out.
The Windows system being terraformed has a Powershell execution policy set to: localmachine bypass.
"set-executionpolicy bypass -force"
The server I run Terraform from has Powershell 5.1 installed.
References
N/A