Intermittent remote S3 state failure #10779
Comments
I'm using 0.8.1 and I have the same problem without using an S3 remote state file. I get this error when running get, plan, apply and destroy, but only randomly. Some examples:
and
|
Same here. It only happens in the ca-central-1 region for us but, to be fair, it's the only region we've been working on in the last few days so it may just be happenstance. |
All regions for me - primarily |
Hey James, so I ran this in a loop (configure, reset state, configure) for the past ~60 minutes on Mac and Linux and I was never able to see an issue. It has probably configured and synced remote state about 300 times during that time (sleep 12 seconds, 5 times per minute). I've also heard of other people getting issues recently, though, so I'm not discounting your claim. I just don't know what causes it. I still doubt it's any change we made, since we haven't touched any of the remote state code or the HTTP client initialization code. Any ideas? |
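A repeat-and-count loop like the one described above could be sketched as follows. This is only an illustration, not the script that was actually run: the helper name `count_failures`, the `marker` default of `"x509"` (taken from the x509 errors reported later in this thread), and the pause value are all assumptions, and the command to run is left to the caller.

```python
import subprocess
import sys
import time


def count_failures(command, runs, marker="x509", pause_seconds=0.0):
    """Run `command` `runs` times and count the runs that look like failures.

    A run counts as a failure if it exits non-zero or if `marker` appears
    anywhere in its combined stdout/stderr. Both the helper name and the
    default marker are assumptions made for this sketch.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(command, capture_output=True, text=True)
        output = result.stdout + result.stderr
        if result.returncode != 0 or marker in output:
            failures += 1
        # Pause between attempts, e.g. 12 seconds for roughly 5 runs per minute.
        time.sleep(pause_seconds)
    return failures


# Example with a harmless placeholder command; substitute the Terraform
# invocation that triggers the error in your environment.
# count_failures([sys.executable, "-c", "print('ok')"], runs=5, pause_seconds=12)
```

Counting failures rather than stopping at the first one gives a rough failure rate, which makes it easier to compare builds (for example, the same Terraform version compiled with different Go releases).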
I think it is new since 0.8. I've never seen it with 0.7.13. If it was just me I'd put it down to AWS bucket weirdness but the fact that a few people see it too makes me suspect there's a wider issue, again perhaps not TF but still an issue, here. |
We changed to Go 1.7.4, which had very few changes; the only one I can imagine affecting this is golang/go#18141. I'm not saying that's the issue at fault, but that's the only change between 0.7.13 and current that has anything to do with TLS in our code. We probably did update the AWS SDK during that time too, so it's possible the issue is in the AWS SDK. At any rate, we're not doing any special TLS configuration for the AWS SDK or Go directly, so the issue is likely in one of those two. I'd lean towards the former just because I find it unlikely that something like this is broken in Go itself. |
I just ran ten minutes of
|
@jamtur01 I just compiled TF 0.8.1 with Go 1.7.3. Do you mind giving this a shot? Since you have a reliable repro I just want to eliminate the "wtf" that Go might be causing this. https://dl.dropboxusercontent.com/u/46819/terraform_081_go173.zip (Note for the future: I probably deleted the file since it was just in my dropbox) |
Was having the same issue with certificates consistently on 0.8.1. Tried the build with go 1.7.3 linked above and was able to successfully work with remote state again. |
I can confirm that the issue manifests itself in a custom compiled version of Terraform 0.7.13 compiled with go 1.7.4. |
@mitchellh Tried that build with the ten minute test. No errors! |
@jamtur01 Yep, okay, so it is Go 1.7.4 causing this. Bradfitz also offered up a solution that is already a CL for Go (not merged yet though). Ouch! We'll try to resolve this one way or another for 0.8.2, either dropping back to Go 1.7.3 or finding a way to have cgo-enabled builds for Darwin. |
The same thing applies to Illumos builds of Terraform by the look of it - both 0.8 and 0.8.1 exhibit the issue running on SmartOS. |
@mitchellh I can add a bit more confirmation. Installed terraform 0.8.1 via brew and got the x509 issue on sts and s3. It's compiled with 1.7.4 |
0.8.2 will be released today built with Go 1.7.3. That reverts the "security fixes" made in Go 1.7.4 unfortunately but hopefully 0.8.3 will be built with Go 1.8 which will bring all this back with a longer term fix from the Go team. |
@mitchellh unfortunately it still happens on 0.8.2. I just executed
or
I had exactly the same issues on 0.8.1, but have never seen that on 0.7.x
|
@mitchellh can confirm that it's still a problem with 0.8.2 |
Hi @myoung34! Could you try force refreshing the download page? I see the download for 0.8.2 there. |
Weird. It's there, never thought I'd fail to the cache. Compiled master |
Same here:
|
I have been seeing this in 0.8.1. I have not seen it in 7.11, and I run both versions for different environments. Just saying, it seems to be an issue with Terraform's latest releases. |
Same issue with Terraform 0.9.5 and Go 1.8. Has anyone found a reproducible solution?
|
In my case it was an issue with my SSL certs that curl was using. I fixed it by setting |
I'm seeing the same issue with Terraform 0.10.8. Every now and then (1 out of 20 or 30 times, perhaps?) I get a |
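To put a "1 out of 20 or 30" failure rate in perspective, here is a small back-of-the-envelope sketch. It assumes each run fails independently with the same fixed probability, which nothing in this thread establishes, and the function names are hypothetical.

```python
import math


def prob_at_least_one_failure(p, runs):
    """Probability of seeing one or more failures in `runs` independent runs."""
    return 1.0 - (1.0 - p) ** runs


def runs_needed(p, confidence):
    """Smallest number of runs that shows a failure with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# With a per-run failure rate of 1/20, how many runs are needed to be
# 95% likely to hit the error at least once?
print(runs_needed(1 / 20, 0.95))
# And how likely is at least one failure across 30 runs at a rate of 1/30?
print(prob_at_least_one_failure(1 / 30, 30))
```

Under that independence assumption, a clean test run of only a handful of attempts says very little about whether an environment is affected.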
Also intermittently experiencing this issue using Terraform 0.10.4. |
I'm seeing it intermittently on 0.11.1/OSX |
Same issue with 0.11.1 |
Same issue with go version go1.8.3 darwin/amd64 and terraform Terraform v0.11.2 |
As a temporary bandaid you can add @hgallo0 Can you elaborate on your AWS credentials setup? Are you using access/secret keys, using a profile, assumed role, STS with MFA? |
Hi @denniswebb thanks for your quick reply. I am using access/secret key currently stored in my ~/.aws/credentials. no STS or MFA |
I'm seeing this exact issue as well. I'm setting profile in my backend config. The profile is in my ~/.aws/credentials file and the credentials work. Setting ❯ terraform --version |
Just started happening for me too, on OS X 10.12.6. The only recent local update I can think of was installing a specific version of Golang to use some new Kubernetes incubator packages (external-dns). The Terraform issue is intermittent and I can't seem to figure out why. I did notice that if I switch networks it seems to clear up, if only temporarily: get on a VPN and try from there, or hop back off the VPN and try again. No idea if that's just a coincidence or not. Maybe it has something to do with Golang and a stale DNS cache or something; I'm grasping for answers. |
Same here
Terraform v0.11.7 |
Has there been a fix for this? I am also seeing it with 0.11.7. |
Same issue with Terraform v0.11.7 on Alpine. I fixed it by installing the following package: |
I'm seeing these issues quite often on OS X 0.11.7 too. Should this issue be reopened? |
@brikis98 I think this is the same issue being discussed here: hashicorp/terraform-provider-aws#4709. If so, add your comment / upvote to that issue since it's still open. I believe this needs to be solved in the provider, not in terraform core. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
Terraform Version
0.8
Affected Resource(s)
remote state on s3
Debug Output
When running terraform plan/apply or destroy.
Expected Behavior
Should get remote state.