Intermittent net/http: TLS handshake timeout error when downloading providers #16448
Comments
Hi @brikis98, Sorry this is causing an issue for you. I have also seen this before, but so far only with VMs on extremely oversubscribed hosts. The default TLS handshake timeout is 10 seconds, which is quite a long time to establish the connection. A minor fix I have coming soon will better re-use connections, reducing the number of handshakes that need to be done. I have a feeling that extending the timeout might not help much either, as I think this is partly the CDN servers' reaction to the extremely slow clients. We need to reproduce this and trace the failing handshakes to be certain. |
Well, if it helps debug the issue, this happens most often when we run tests in CircleCI, which I believe has a ~24 core machine, so there could be as many as a couple dozen of these Many CDNs have throttling built in (DoS protection); any chance this is the cause here? |
Update: For those struggling with this same issue, as a workaround, I'm doing the following:
|
Thanks for sharing that @brikis98! I wasn't previously familiar with CircleCI caching. From a quick read of what that feature does, it may also work to have CircleCI cache the contents of It looks like the mechanism requires using the checksum of some files as a key, which may be tricky in Terraform since the entire config is consulted to decide which plugins to install. However, that could perhaps be worked around by having a separate The output of If you're running I imagine using Terraform's caching mechanism vs. caching the |
Hi @brikis98, Have you had a chance to try out 0.11.2 on CircleCI? That release enabled the DualStack dialer by default for HTTP requests, so Terraform can still contact the release servers on a network with a broken IPv6 configuration. Looking at the CircleCI docs it seems that they don't have complete IPv6 support yet, so I'm guessing it could be related. |
@jbardin We are updating all of our repos to 0.11 now, so I'll let you know once we complete that process! |
I have been getting a similar error and I am unable to get rid of it.
Which ends up in a TLS handshake timeout error |
Hi @crouchjay! The request you see failing there is the one that powers the upgrade and security bulletin checks. This particular request is not required for correct Terraform operation, so you could choose to disable it (using the settings described on the page I linked) if you don't mind Terraform not warning you about new versions being available. If you're still seeing an error like that on 0.11.2 or newer then I'd welcome you to open a new issue describing that, since some details are different for that request (it's in a separate library, subject to different timeouts, etc). The fix we applied for dual-stack dialing should have applied to that call as well, so an error there would suggest you've encountered a new problem, which we can investigate further in a new issue. |
Hi,
But I often have the same issue:
Sometimes it works... but it's very annoying |
Hi @tperelle! Sorry that isn't working as expected. Those particular requests are coming from the |
Hi all, Further to my previous comment, I just wanted to sum up a few different causes we've seen for this kind of issue for future reference:
Since this particular issue was within CircleCI I'm not sure if these solutions apply there, so for the moment I'm going to leave this one open. It is possible that the IPv6 connectivity issue was affecting CircleCI, in which case Terraform should behave correctly there from v0.11.2 onwards. |
Using Terraform v0.11.11, is there any way as a user to adjust the net/http TLS handshake timeout? I need to talk to our proxy admins about performance, but our connections through it are taking ~10.1-10.3 seconds to respond, so naturally Terraform bombs with the timeout.
|
@brikis98 have you continued to have issues like this running a recent version of Terraform on CircleCI? I tried to simulate the following conditions with Terraform 0.12.17 and Apple's Network Link Conditioner, installing the cloudflare and github providers from a trivial main.tf.
Scenario: base case, no delays injected
Scenario: simulated 3G wireless network with 780kbps down, 330kbps up, and 100ms delay
Scenario: 250ms delay on TCP and DNS requests, 0% packet loss
Scenario: 250ms delay on TCP and DNS requests, 20% packet loss
Scenario: 500ms delay on TCP and DNS requests, 0% packet loss
Scenario: 1000ms delay on TCP and DNS requests, 0% packet loss
Based on my testing (especially the test case with packet loss) and the lack of recent updates to this issue, I am inclined to think that the improvements made in 0.12 have sufficiently mitigated this such that Terraform is usable in slow network conditions. I'm going to close this out for now because I'm pretty confident that the 0.12 improvements @apparentlymart described have resolved this. If you're still seeing these types of issues, feel free to re-open or file a new issue linked to this one. I'm definitely interested in hearing about people's experiences using Terraform on slow or intermittent networks. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
Terraform Version
Terraform v0.10.7
Terraform Configuration Files
This happens with just about any configuration.
Expected Behavior
I can run
terraform init
without errors.
Actual Behavior
I get intermittent errors for downloading plugins that look like this:
Note that the particular plugin or file that fails changes randomly.
Steps to Reproduce
terraform init
Important Factoids
This happens more often when working with many modules in parallel, such as on a CI server running many automated tests. Is
releases.hashicorp.com
failing under concurrent load? Or is it intentionally throttling requests?
Either way, this makes automated tests involving Terraform very brittle.
I enabled the plugin cache to reduce the number of necessary downloads, but I still see these errors on a regular basis.