Intermittent net/http: TLS handshake timeout error when downloading providers #16448

brikis98 · 2017-10-25T12:19:10Z

Terraform Version

Terraform v0.10.7

Terraform Configuration Files

This happens with just about any configuration.

Expected Behavior

I can run terraform init without errors.

Actual Behavior

I get intermittent errors for downloading plugins that look like this:

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...

Error installing provider "template": error fetching checksums: Get https://releases.hashicorp.com/terraform-provider-template/1.0.0/terraform-provider-template_1.0.0_SHA256SUMS: net/http: TLS handshake timeout.

Note that the particular plugin or file that fails changes randomly.

Steps to Reproduce

terraform init

Important Factoids

This happens more often when working with many modules in parallel, such as on a CI server running many automated tests. Is releases.hashicorp.com failing under concurrent load? Or is it intentionally throttling requests?

Either way, this makes automated tests involving Terraform very brittle.

I enabled the plugin cache to reduce the number of necessary downloads, but I still see these errors on a regular basis.


jbardin · 2017-10-25T13:25:53Z

Hi @brikis98,

Sorry this is causing an issue for you. I have also seen this before, but so far only with VMs on extremely oversubscribed hosts.

The default TLS handshake timeout is 10 seconds, which is quite a long time to establish the connection. A minor fix I have coming soon will better re-use connections, reducing the number of handshakes that need to be done.

I have a feeling that extending the timeout might not help much either, as I think this is partly the CDN servers' reaction to extremely slow clients. We need to reproduce this and trace the failing handshakes to be certain.

brikis98 · 2017-10-25T17:29:19Z

Well, if it helps debug the issue, this happens most often when we run tests in CircleCI, which I believe has a ~24 core machine, so there could be as many as a couple dozen of these init calls happening from various tests in parallel.

Many CDNs have throttling built in (DoS protection); any chance this is the cause here?

brikis98 · 2017-10-26T14:28:32Z

Update: For those struggling with this same issue, as a workaround, I'm doing the following:

1. Enable the plugin provider cache.
2. Cache the plugin provider cache directory using CircleCI caching.

apparentlymart · 2017-10-26T17:38:19Z

Thanks for sharing that @brikis98! I wasn't previously familiar with CircleCI caching.

From a quick read of what that feature does, it may also work to have CircleCI cache the contents of .terraform/plugins since, after an initial terraform init, that directory should contain all of the plugins for that particular config.

It looks like the mechanism requires using the checksum of some files as a key, which may be tricky in Terraform since the entire config is consulted to decide which plugins to install. However, that could perhaps be worked around by having a separate providers.tf file that contains a provider block for each of the providers you use (including version constraints), and then using just that file as the cache key.

The output of terraform providers might make a reasonable thing to hash to get an overview of the providers used across the whole config, though it may change more often than necessary if e.g. modules are refactored while retaining the same plugin versions.

If you're running terraform init on every run as part of your automation anyway (which I would recommend) then it shouldn't hurt to let the cache persist between runs even if the dependencies do change, since terraform init is able to manage the .terraform/plugins dir automatically and clean up any plugins that are no longer used.

I imagine using Terraform's caching mechanism vs. caching the .terraform/plugins directory are functionally equivalent, since CircleCI caching is immutable, but perhaps caching the local plugin dir is more straightforward since it doesn't require any unusual configuration within Terraform itself, and Terraform is able to remove items from its local dir when they are no longer needed to prevent the cache from growing indefinitely.
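To illustrate the cache key idea, here is a rough Go sketch that derives a stable key from provider names and version constraints only, so that refactoring modules does not change the key while changing a constraint does. This is just an illustration: the function name providerCacheKey, the key prefix, and the example constraints are all made up, and in practice CircleCI computes its own checksum from a file such as the providers.tf described above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// providerCacheKey is a hypothetical helper. It hashes provider names and
// their version constraints in sorted order, so the result does not depend
// on where in the configuration each provider is declared.
func providerCacheKey(constraints map[string]string) string {
	names := make([]string, 0, len(constraints))
	for name := range constraints {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random, so sort first

	h := sha256.New()
	for _, name := range names {
		fmt.Fprintf(h, "%s=%s\n", name, constraints[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Example constraints are assumptions, not taken from a real config.
	key := providerCacheKey(map[string]string{
		"template": "~> 1.0",
		"aws":      "~> 1.0",
	})
	fmt.Println("terraform-plugins-" + key)
}
```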

jbardin · 2018-01-25T22:32:17Z

Hi @brikis98,

Have you had a chance to try out 0.11.2 on CircleCI? That release enabled the DualStack dialer by default for HTTP requests, so Terraform can still contact the release servers on a network with a broken IPv6 configuration. Looking at the CircleCI docs, it seems that they don't have complete IPv6 support yet, so I'm guessing it could be related.

brikis98 · 2018-01-25T23:48:23Z

@jbardin We are updating all of our repos to 0.11 now, so I'll let you know once we complete that process!

crouchjay · 2018-02-06T14:54:46Z

I have been getting a similar error and I am unable to get rid of it.

2018/02/06 15:53:10 [ERR] Checkpoint error: Get https://checkpoint-api.hashicorp.com/v1/check/terraform?arch=amd64&os=darwin&signature=56982404-9b8e-0f76-67e5-26bb3a8299d2&version=0.11.3: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

This ends in a TLS handshake timeout error.

apparentlymart · 2018-02-09T15:47:10Z

Hi @crouchjay! The request you see failing there is the one that powers the upgrade and security bulletin checks. This particular request is not required for correct Terraform operation, so you could choose to disable it (using the settings described on the page I linked) if you don't mind Terraform not warning you about new versions being available.

If you're still seeing an error like that on 0.11.2 or newer then I'd welcome you to open a new issue describing that, since some details are different for that request (it's in a separate library, subject to different timeouts, etc) but the fix we applied for dual-stack dialing should've applied to that call as well and so that would suggest that you've encountered a new problem which we can investigate further in a new issue.

tperelle · 2018-07-02T17:13:21Z

Hi,
I'm using a recent version of Terraform and trying to create droplets on DigitalOcean:

% terraform version
Terraform v0.11.7
+ provider.digitalocean v0.1.3
+ provider.template v1.0.0

But I often hit the same issue:

Error: Error refreshing state: 2 error(s) occurred:

* digitalocean_droplet.ucp_master: 2 error(s) occurred:

* digitalocean_droplet.ucp_master[2]: digitalocean_droplet.ucp_master.2: Error retrieving droplet: Get https://api.digitalocean.com/v2/droplets/100173689: net/http: TLS handshake timeout
* digitalocean_droplet.ucp_master[1]: digitalocean_droplet.ucp_master.1: Error retrieving droplet: Get https://api.digitalocean.com/v2/droplets/100173687: net/http: TLS handshake timeout
* digitalocean_droplet.ucp_worker: 1 error(s) occurred:

* digitalocean_droplet.ucp_worker: digitalocean_droplet.ucp_worker: Error retrieving droplet: Get https://api.digitalocean.com/v2/droplets/100173688: net/http: TLS handshake timeout

Sometimes it works, but it's very annoying.

apparentlymart · 2018-07-02T22:56:58Z

Hi @tperelle! Sorry that isn't working as expected.

Those particular requests are coming from the digitalocean provider itself, so if something needs to be fixed for that it'd need to be done in the provider's own repository. Would you mind opening an issue for this over there? It's possible that the maintainers of that provider would just need to make a similar change to that from #16805 (upgrading the cleanhttp dependency), which is what we changed to make this work better for Terraform Core.

apparentlymart · 2018-08-13T15:39:45Z

Hi all,

Further to my previous comment, I just wanted to sum up a few different causes we've seen for this kind of issue for future reference:

On a host with both IPv4 and IPv6 connectivity, Terraform versions prior to v0.11.2 will prefer IPv6. This can be problematic on systems where the IPv6 connection is slower or is actually inoperable in practice. From v0.11.2 onwards, Terraform implements RFC 6555 to mitigate this problem.
Some environments have either explicit or transparent HTTP proxies that are required for outbound access. Occasionally we've seen reports that poorly-performing or misconfigured proxies have led to timeout and TLS-related issues. In this case, there is no known Terraform-specific workaround and so working with the administrator of that proxy is the primary path to resolution.
Some users run Terraform on WiFi networks with "captive portal" intercepts, which can cause confusion. There are several different approaches to intercepting outgoing traffic to redirect to a captive portal, including DNS intercepts and HTTP-level intercepts, and some of these can lead to Terraform appearing to time out or have TLS handshake issues due to the interference of that system.

Since this particular issue was within CircleCI I'm not sure if these solutions apply there, so for the moment I'm going to leave this one open. It is possible that the IPv6 connectivity issue was affecting CircleCI, in which case Terraform should behave correctly there from v0.11.2 onwards.

jakauppila · 2019-01-23T23:56:16Z

Using Terraform v0.11.11, is there any way as a user to adjust the net/http TLS handshake timeout value?

I need to talk to our proxy admins about performance, but our connections through it are taking ~10.1-10.3 seconds to respond, so naturally Terraform bombs with the timeout.

Error downloading modules: Error loading modules: Failed to request discovery document: Get https://registry.terraform.io/.well-known/terraform.json: net/http: TLS handshake timeout

danieldreier · 2019-12-05T00:48:24Z

@brikis98 have you continued to have issues like this running a recent version of terraform on CircleCI?

I tried to simulate the following conditions with Terraform 0.12.17 and Apple's Network Link Conditioner, installing the cloudflare and github providers from a trivial main.tf.

Scenario: base case, no delays injected
Result: Success, time 0:05

Scenario: simulated 3G wireless network with 780kbps down, 330kbps up, and 100ms delay
Result: Success, time 9:47

Scenario: 250ms delay on TCP and DNS requests, 0% packet loss
Result: Success, time 1:30

Scenario: 250ms delay on TCP and DNS requests, 20% packet loss
Result: Success, time 7:36

Scenario: 500ms delay on TCP and DNS requests, 0% packet loss
Result: Success, time 3:01

Scenario: 1000ms delay on TCP and DNS requests, 0% packet loss
Result: Success, time 3:14

Based on my testing, especially the test case with packet loss, and the lack of recent updates to this issue, I am inclined to think that the improvements made in 0.12 have sufficiently mitigated this such that Terraform is usable in slow network conditions. I'm going to close this out for now because I'm pretty confident that the 0.12 improvements @apparentlymart described have resolved this. If you're still seeing these types of issues, feel free to re-open or file a new issue linked to this one. I'm definitely interested in hearing about people's experiences using Terraform on slow or intermittent networks.

ghost · 2020-03-28T02:12:25Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

jbardin added the bug label Oct 25, 2017

jbardin mentioned this issue Oct 31, 2017

Use pooled http client for fetching providers #16509

Merged

This was referenced Nov 29, 2017

Error loading modules: host registry.terraform.io #16751

Closed

network delays when fetching providers #16804

Closed

thejmazz mentioned this issue Jul 18, 2018

Terraform provider downloads fail with TLS handshake timeout #15817

Closed

brikis98 mentioned this issue Aug 26, 2018

Add retries for storing state and releasing locks #18741

Open

rbojan mentioned this issue Apr 9, 2019

Failure running from macbook rbojan/terraform-hcloud-rke-helm#1

Closed

hashibot added the v0.10 and v0.11 labels Aug 22, 2019

danieldreier closed this as completed Dec 5, 2019

ghost locked and limited conversation to collaborators Mar 28, 2020
