Terraform loses access token and requires az login #502

Closed
jwendl opened this issue Nov 3, 2017 · 17 comments · Fixed by #2387

Comments

@jwendl
Contributor

jwendl commented Nov 3, 2017

Terraform requires az login after 30 minutes of inactivity, even though az account list still works.

Terraform Version

Terraform v0.10.8

Affected Resource(s)

All

If this issue appears to affect multiple resources, it may be an issue with Terraform's core, so please mention this.

Terraform Configuration Files

resource "random_id" "server" {
  keepers = {
    azi_id = 1
  }

  byte_length = 8
}

resource "azurerm_resource_group" "test" {
  name     = "resourceGroup1"
  location = "West Europe"
}

resource "azurerm_cosmosdb_account" "test" {
  name                = "${random_id.server.hex}"
  location            = "${azurerm_resource_group.test.location}"
  resource_group_name = "${azurerm_resource_group.test.name}"
  offer_type          = "Standard"
  consistency_policy {
    consistency_level = "BoundedStaleness"
  }

  failover_policy {
    location = "West Europe"
    priority = 0
  }

  failover_policy {
    location = "East US"
    priority = 1
  }

  tags {
    hello = "world"
  }
}

Debug Output

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.


Error: Error running plan: 1 error(s) occurred:

  • provider.azurerm: No valid (unexpired) Azure CLI Auth Tokens found. Please run az login.

Panic Output

Expected Behavior

az account list => works
terraform plan => works

Actual Behavior

az account list => works
terraform plan => gives the message above
(This happens some time after I ran az login: the command works for a while, but after I go to lunch and come back it shows the error above.)

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform plan

Important Factoids

References

@devigned

devigned commented Nov 3, 2017

I think az will refresh the token when it makes a call. Don't quote me, but I'm pretty sure there are two tokens in az. The access token, which is what is used when making a call to Azure and is generally short lived (30 mins), and the refresh token, which has a longer lifespan and can be used to request new access tokens. @yugangw-msft could probably provide more insight or correct my statement.

The bottom line is that az may need to expose a way to fetch a new access token using the refresh token (if it doesn't already).

@yugangw-msft

yugangw-msft commented Nov 4, 2017

The access token's lifespan can be configured by tenant admins, but the default is 60 minutes. By default, a token retrieved to access common Azure resources, like Resource Manager, Key Vault, etc., will have a paired refresh token in the token response payload. The CLI will save it and use it to get new tokens when the old one expires. By default, Azure refresh tokens are multi-resource refresh tokens, so the CLI will use one to get tokens for other resources. Say you have a refresh token for Resource Manager: the same refresh token can be used to request a token to access Key Vault.
The CLI exposes a command, az account get-access-token, which will handle the refreshing and guarantee that the token will be valid for 5+ minutes. Please note that this command will not work reliably inside Cloud Shell, as the shell doesn't supply refresh tokens to the CLI.
I have not used Terraform, but I hope my explanation above would help.
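
As a rough illustration of the "valid for 5+ minutes" guarantee described above, here is a minimal Python sketch that reads the expiry from the JSON printed by az account get-access-token and computes how long the token remains valid. The sample payload, the field names (accessToken, expiresOn), and the timestamp format are assumptions made for illustration; check the output of your own CLI version before relying on them.

```python
import json
from datetime import datetime, timedelta

# Hypothetical sample of the JSON printed by `az account get-access-token`.
# Field names and timestamp format are assumptions for illustration only.
SAMPLE_OUTPUT = """
{
  "accessToken": "placeholder-not-a-real-token",
  "expiresOn": "2017-11-04 13:05:00.000000",
  "tokenType": "Bearer"
}
"""


def remaining_validity(cli_json: str, now: datetime) -> timedelta:
    """Return how long the access token in cli_json stays valid after `now`."""
    payload = json.loads(cli_json)
    expires_on = datetime.strptime(payload["expiresOn"], "%Y-%m-%d %H:%M:%S.%f")
    return expires_on - now


def is_fresh_enough(cli_json: str, now: datetime,
                    minimum: timedelta = timedelta(minutes=5)) -> bool:
    """True if the token is valid for at least `minimum` from `now`."""
    return remaining_validity(cli_json, now) >= minimum
```

A caller would typically pass datetime.now() as `now`; a fixed value is easier to reason about when checking the logic by hand.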

@jwendl
Contributor Author

jwendl commented Nov 6, 2017

Yeah the explanation makes some sense. I understand the perspective of cloud shell as well because it automatically authenticates you into azure cli (I've never needed to auth into az cli with cloud shell or terraform for that matter).

With regular bash sessions (say through a VM or WSL in my case), though, my Azure CLI login persists even across shutdowns of the laptop. The Terraform Azure provider does not see that token, though, and requires me to re-authenticate with az login.

@ziurjam

ziurjam commented Nov 17, 2017

Any solution to this issue?

@tombuildsstuff
Contributor

hey @jwendl

Sorry for the delayed response to this!

 > I think az will refresh the token when it makes a call. Don't quote me, but I'm pretty sure there are two tokens in az. The access token, which is what is used when making a call to Azure and is generally short lived (30 mins), and the refresh token, which has a longer lifespan and can be used to request new access tokens.

Indeed - this is stored as part of the Access Token used by the Azure CLI - whilst we could use this to generate a new refresh token, since we don't own this file we've opted not to use the token (and write out the refreshed access token at this time) - since were the schema to change, we'd potentially break the Azure CLI (which certainly isn't ideal).

Whilst we could probably take the risk of that - the problem is that the schema of the Access Token is different depending on where Terraform is running (for instance, a different schema is used in CloudShell since the authentication token is generated differently) - which means we'd be at risk of breaking things subtly for a percentage of users, which really isn't ideal.

 > By default, Azure refresh tokens are multi-resource refresh tokens, so the CLI will use one to get tokens for other resources. Say you have a refresh token for Resource Manager: the same refresh token can be used to request a token to access Key Vault.

What's unclear here is if the Refresh Tokens can be used multiple times (I'd presume not) - which would allow us to generate a new Access Token but not persist this to disk. @yugangw-msft would you be able to confirm what the state of the Refresh Token is?

From what I can see - it looks like running the command:

$ az account get-access-token

will update the Azure CLI Access Token to be valid - as such I'm going to suggest running this might be the best workaround for the moment.
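
The workaround above could be scripted along the following lines. This is only a sketch, assuming that both az and terraform are on the PATH; the helper names `_run` and `refresh_then_plan` are hypothetical. The az output is discarded so that the access token is not printed to the terminal.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str], quiet: bool = False) -> int:
    """Run a command and return its exit code, optionally hiding stdout."""
    stdout = subprocess.DEVNULL if quiet else None
    return subprocess.call(list(cmd), stdout=stdout)


def refresh_then_plan(run: Callable[..., int] = _run) -> int:
    """Ask the Azure CLI to refresh its cached token, then run terraform plan.

    Returns the exit code of az if the refresh fails, otherwise the exit
    code of terraform.
    """
    rc = run(["az", "account", "get-access-token"], quiet=True)
    if rc != 0:
        return rc
    return run(["terraform", "plan"])
```

The `run` parameter exists only so the sequencing can be exercised without invoking the real tools; calling `refresh_then_plan()` with no arguments runs the actual commands.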

That said, there's some other options here which we're looking into - but unfortunately I don't believe we've got an immediate path forward for this issue - and as such I'm wondering if it's worth closing this issue for the moment?

Thanks!

@yugangw-msft

yugangw-msft commented Mar 8, 2018

@tombuildsstuff, sorry for the delay. Looks like the git notification in my mailbox fell through the cracks.

 > @yugangw-msft would you be able to confirm what the state of the Refresh Token is?

A refresh token is good for 14 days by default. Each refresh will get back a new refresh token and reset the 14-day window, and this can go on for 3 months by default. After that you need to explicitly log in again using your credentials. There has been talk of removing the 3-month limitation, but I don't know whether that has become official.

 > $ az account get-access-token will update the Azure CLI Access Token to be valid

That is correct. The token will be good for at least 5 minutes, so make sure you use it within that time window, and be sure to call the command again if the token expires, for example if your command/app might take more than 5 minutes.
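
The refresh token lifetimes described above (a 14-day sliding window, capped at about 3 months) can be sketched as a small check. This is an illustration of the rule as stated in this comment, not of any real Azure AD API; the function name is hypothetical and "3 months" is approximated as 90 days, which is an assumption.

```python
from datetime import datetime, timedelta

SLIDING_WINDOW = timedelta(days=14)  # default window stated in the comment above
MAX_LIFETIME = timedelta(days=90)    # "3 months", approximated as 90 days (assumption)


def refresh_token_usable(first_login: datetime,
                         last_refresh: datetime,
                         now: datetime) -> bool:
    """True if a refresh token could still be redeemed at `now`.

    Each redemption resets the 14-day window, but the chain of refreshes
    cannot extend beyond MAX_LIFETIME from the original interactive login.
    """
    within_window = now - last_refresh <= SLIDING_WINDOW
    within_lifetime = now - first_login <= MAX_LIFETIME
    return within_window and within_lifetime
```

For example, a token last refreshed 10 days ago is still usable, while one left idle for 20 days, or one whose original login was more than 90 days ago, is not.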

@jwendl
Contributor Author

jwendl commented May 18, 2018

Hey @tombuildsstuff

"but unfortunately I don't believe we've got an immediate path forward for this issue - and as such I'm wondering if it's worth closing this issue for the moment?"

That's fine for the moment, but we should consider maybe some documentation stating that this is the case so we could possibly revisit at a later date once other dependencies support it?

Thank you!

@StianOvrevage

At least update the error message from terraform to include the workaround using az account get-access-token which is considerably easier than doing az login several times a day 😺

Error: Error running plan: 1 error(s) occurred:
* provider.azurerm: No valid (unexpired) Azure CLI Auth Tokens found. Please run `az login`.

@colemickens

colemickens commented May 31, 2018

Is it time to revisit my original suggestion and have terraform call az account get-access-token directly? I had suggested it specifically to avoid the problem raised here and in other issues. Context: #42 (comment)

@lawrencegripper
Contributor

@tombuildsstuff I've been playing with this recently. My theory was that while the token may be expired the refreshToken will be valid and the Azure Go SDK will automatically use the refresh token to get a new valid token.

I made the following changes which allow TF to use the least expired token and rely on the refresh token being used. Based on some limited testing this appeared to work (I waited until a token expired on my machine then ran a build of this provider and it successfully deployed resources).

Is there something that I've missed with this approach or would you be happy to accept this as a PR following some more testing (Cloudshell etc)?
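
To make the "least expired token" idea above concrete, here is a minimal Python sketch of the selection step. It is not the provider's Go code; the `CachedToken` type and its fields are hypothetical stand-ins for entries in the Azure CLI token cache.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class CachedToken(NamedTuple):
    """Hypothetical, simplified view of one cached token entry."""
    resource: str
    expires_on: datetime
    refresh_token: str


def pick_least_expired(tokens: List[CachedToken],
                       resource: str) -> Optional[CachedToken]:
    """Pick the entry for `resource` with the latest expiry.

    The access token may already be expired; the entry is still useful as
    long as it carries a refresh token that can be redeemed for a new one.
    """
    candidates = [t for t in tokens if t.resource == resource and t.refresh_token]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t.expires_on)
```

The design choice here is to stop filtering on expiry altogether and instead rely on the refresh token, which is the behaviour change the comment proposes.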

@tombuildsstuff
Contributor

@lawrencegripper my understanding is that the refresh token can only be used a single time within the expiry window, although perhaps I'm wrong? If that's not the case then we'd happily accept a PR for it / go through and test it :)

@lawrencegripper
Contributor

Reading the docs here it does suggest that a new refresh token is returned when it is used but it doesn't explicitly say the existing one is single use only.

When you redeem a refresh token for a new access token, you will receive a new refresh token in the token response. You should save the newly issued refresh token, replacing the one you used in the request. This will guarantee that your refresh tokens remain valid for as long as possible.

https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-token-and-claims#refresh-tokens

So the test case would be as follows:

With an expired management endpoint token in the token file, a user should be able to use my modified version of the Azure provider. This will use the refresh token to get a non-expired token and deploy some resource. Then, if the user returns to azure-cli, it should be able to use the same refresh token and still function correctly.

Failure case: The azure provider uses the refresh token (and is returned a new one) and then when azure-cli is next used the refresh token used by the provider is still present in the token file which fails as it is now invalid causing the azure-cli operation to fail.
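
The test case and failure case above can be modelled with a toy simulation. The `ToyAuthority` class below is entirely hypothetical and does not reflect how Azure AD actually treats previously issued refresh tokens, which is exactly the open question in this thread; it only shows how the two outcomes differ depending on whether refresh tokens are single use.

```python
import itertools


class ToyAuthority:
    """Toy stand-in for a token endpoint; not the real Azure AD behaviour."""

    def __init__(self, single_use: bool):
        self.single_use = single_use
        self._counter = itertools.count(1)
        self.valid = {"rt-0"}  # the refresh token azure-cli starts with

    def redeem(self, refresh_token: str) -> str:
        """Redeem a refresh token and return the newly issued one."""
        if refresh_token not in self.valid:
            raise PermissionError("refresh token is no longer valid")
        if self.single_use:
            self.valid.discard(refresh_token)
        new_token = f"rt-{next(self._counter)}"
        self.valid.add(new_token)
        return new_token


def cli_still_works(single_use: bool) -> bool:
    """Simulate: the provider refreshes without writing back, then the CLI refreshes."""
    authority = ToyAuthority(single_use)
    token_file = "rt-0"            # what azure-cli has on disk
    authority.redeem(token_file)   # provider uses it, discards the new token
    try:
        authority.redeem(token_file)  # azure-cli later presents the same token
        return True
    except PermissionError:
        return False
```

In this model `cli_still_works(False)` corresponds to the passing test case and `cli_still_works(True)` to the failure case.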

@lawrencegripper
Contributor

lawrencegripper commented Jul 19, 2018

So I've been reviewing this a bit more and my change appears to be consistently working for me, not causing the azure cli any issues and allowing TF to function as well.

I'm struggling to work out how I can force an update to my refresh token other than waiting and testing...

One thing that did cross my mind is that my changes aren't as much of a change to the behavior as I previously thought.

Currently TF checks that the token has not expired, but this doesn't mean the refresh token isn't being used by the current implementation. If you run a deployment which takes a long time it is entirely possible that the token expires and, as the refresh token is present, it will be used automatically without anything being written back to the Azure token file. Or have I missed a bit of code in the current approach?

@tombuildsstuff
Contributor

@lawrencegripper

 > Currently TF checks that the token has not expired, but this doesn't mean the refresh token isn't being used by the current implementation. If you run a deployment which takes a long time it is entirely possible that the token expires and, as the refresh token is present, it will be used automatically without anything being written back to the Azure token file. Or have I missed a bit of code in the current approach?

The issue is that multiple versions of the AzureRM plugin are initialized to provide parallelization, such that if one of them uses up the Refresh Token I'm unsure whether the other instances can refresh using the same token, since it's not persisted? One option would be to persist the updated token back, but the Azure CLI team were unable to guarantee the file format wouldn't change in the future - maybe we just need to persist the file anyway?

 > I'm struggling to work out how I can force an update to my refresh token other than waiting and testing...

I'm unaware of any other means of doing this unfortunately, I know it's possible to configure the timeout for the Azure CLI tokens (but I believe this is set globally on an Azure AD Domain). When I've tested this in the past I've done some tests, switched to something else and then come back to it later, but it's not exactly a fast process unfortunately.

@tombuildsstuff tombuildsstuff modified the milestones: Temp/To Be Sorted, Future Jul 24, 2018
lawrencegripper added a commit to lawrencegripper/terraform-provider-azurerm that referenced this issue Aug 10, 2018
@lawrencegripper
Contributor

lawrencegripper commented Aug 10, 2018

@tombuildsstuff I've been thinking about this and I've come up with a fix which I think solves this problem

 > One option would be to persist the updated token back, but the Azure CLI team were unable to guarantee the file format wouldn't change in the future - maybe we just need to persist the file anyway?

I've used the ADAL callbacks to get notified when a new refresh token is returned, and then use a simple find-and-replace method to update the Azure CLI file. This means that if the JSON structure does change, the code should continue to function normally, as it only edits the refresh token.

It needs some more testing, but I tried it locally and it appears to behave as expected. I've created a PR here for you to take a look at, to get your thoughts before going further. #1752
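
The find-and-replace idea above can be sketched in a few lines of Python. This is not the code from the PR (which is in Go); the function name and the sample file content, including its key names, are hypothetical. The point is that the file is treated as opaque text, so the edit does not depend on the JSON schema.

```python
def replace_refresh_token(file_text: str, old_token: str, new_token: str) -> str:
    """Swap old_token for new_token in the raw file text without parsing it.

    If the old token is empty or not present, the text is returned unchanged
    so the file is never rewritten based on a guess.
    """
    if not old_token or old_token not in file_text:
        return file_text
    return file_text.replace(old_token, new_token)


# Hypothetical token file content; the key names are illustrative only.
SAMPLE_FILE = '[{"someKey": "value", "refreshToken": "rt-old-123", "other": 1}]'

updated = replace_refresh_token(SAMPLE_FILE, "rt-old-123", "rt-new-456")
```

One caveat with this approach, under the same assumptions, is that it relies on the token appearing in the file exactly as it was returned, with no escaping applied when the file was written.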

@tombuildsstuff
Contributor

hey @jwendl

Apologies for the delayed response here - we've been working on the authentication logic for the last couple of releases, which has led us to split the authentication logic out into its own package so that we're able to iterate on it and then reuse it across multiple providers.

Given this has now been split out, I'm going to migrate this issue over to the new repository (which I've done here) - however we're planning on taking a look into this as a part of the next release :)

Thanks!

@ghost

ghost commented Mar 5, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 5, 2019