Power Provisioning is failing with Token is expired #3611

Closed
sunilagrawal86 opened this issue Feb 23, 2022 · 12 comments · Fixed by #3614
Assignees
Labels
service/Power Systems Issues related to Power Systems

Comments

@sunilagrawal86

Terraform CLI and Terraform IBM Provider Version

Terraform v0.12.21

Affected Resource(s)

  • ibm_pi_instance

Terraform Configuration Files

provider ibm {
  version = "1.8.1"
  region  = "${var.piRegion}"
  zone    = "${var.piZone}"
}

resource "ibm_pi_instance" "pvminstance" {
  depends_on            = ["ibm_pi_volume.powervolumes"]
  pi_memory             = var.memory
  pi_processors         = var.processors
  pi_instance_name      = var.serverName
  pi_proc_type          = var.procType
  pi_image_id           = "${data.ibm_pi_image.powerimages.id}"
  pi_volume_ids         = flatten(["${ibm_pi_volume.powervolumes.*.volume_id}"])
  pi_network_ids        = flatten(["${data.ibm_pi_network.powernetworks.*.name}"])
  pi_key_pair_name      = "${var.sshkeyName}"
  pi_sys_type           = "${var.systemType}"
  pi_replication_policy = "${var.replicationPolicy}"
  pi_replication_scheme = "${var.replicantScheme}"
  pi_replicants         = "${var.replicants}"
  pi_cloud_instance_id  = "${var.powerinstanceId}"
  pi_user_data          = "${var.cloud_init_data}"
  pi_pin_policy         = var.piPinPolicy

  timeouts {
    create = "120m"
    delete = "60m"
  }
}



### Debug Output


### Panic Output


### Expected Behavior

Power provisioning should complete without a token expiration error.

### Actual Behavior
Power provisioning fails with a token expiration error.

Failed to apply Terraform template. 
Error: Failed to Get PVM Instance 0cbf38d8-9943-4ff9-830e-05820d069738 :[GET /pcloud/v1/cloud-instances/{cloud_instance_id}/pvm-instances/{pvm_instance_id}][500] pcloudPvminstancesGetInternalServerError  &{Code:500 Description: Error: Message:not authorized: be…

### Steps to Reproduce
1. Define a Power instance or other resource that takes more than 20 minutes to provision.
2. `terraform apply`


### Important Factoids


### References

@github-actions github-actions bot added the service/Power Systems Issues related to Power Systems label Feb 23, 2022
@christopher-horn

Multiple people are running into this. It seems if a task runs > 60 minutes the bearer token expires and is not refreshed.

@sunilagrawal86
Author

Yes, but it is failing after 20 minutes, which surprises me. Power provisioning takes around 40 minutes, and it fails at the 20-minute mark. I also checked the customer's IAM settings, and they are set to the default values.
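
One way to check how long a given token is actually valid is to decode it and read its expiry time. The sketch below assumes the bearer token is a standard JWT carrying an `exp` claim; nothing in this thread confirms the token format, so treat that as an assumption. `jwt_expiry` and `seconds_remaining` are names made up for illustration, and the signature is deliberately not verified.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> int:
    """Return the `exp` claim (seconds since the Unix epoch) of a JWT.

    Assumption: the token is a three-part JWT. The signature is NOT checked.
    """
    token = token.strip()
    # Accept the token with or without a leading "Bearer " prefix.
    if token.lower().startswith("bearer "):
        token = token[len("bearer "):]
    payload_b64 = token.split(".")[1]
    # base64url payloads usually omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(claims["exp"])


def seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Seconds until the token expires; negative if it has already expired."""
    if now is None:
        now = time.time()
    return jwt_expiry(token) - now
```

Calling `seconds_remaining(token) / 60` right after logging in would show whether the session's token is closer to the 20-minute or the 60-minute lifetime.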

@mjturek

mjturek commented Feb 23, 2022

This is affecting IPI on Power VS as well. We are using an older branch, so it doesn't seem like the issue was introduced through a code change.

@clnperez

Thanks for opening @sunilagrawal86.

We thought initially this was related to the ibmcloud account login issues we'd been seeing. This is affecting several teams and should be treated as an urgent issue. Let us know if you need any help figuring this one out.

The output from my recent encounter with this:

DEBUG 2022/02/23 13:51:45 [TRACE] statemgr.Filesystem: writing snapshot at /tmp/openshift-install-cluster-1150709102/terraform.cluster.tfstate
ERROR
ERROR Error: Failed to Get PVM Instance 24ab8965-93f0-482c-8a33-d7eb9f31497e :[GET /pcloud/v1/cloud-instances/{cloud_instance_id}/pvm-instances/{pvm_instance_id}][500] pcloudPvminstancesGetInternalServerError  &{Code:500 Description: Error: Message:not authorized: bearer token could not be validated: Token is expired}
ERROR
ERROR   on ../../../../tmp/openshift-install-cluster-1150709102/bootstrap/main.tf line 62, in resource "ibm_pi_instance" "bootstrap":
ERROR   62: resource "ibm_pi_instance" "bootstrap" {

@christopher-horn

While this is a legitimate issue (the token should be refreshed), I think something else changed on the PowerVS side that is causing it to surface now.

@yussufsh
Collaborator

A recent fix on the PowerVS side corrected token validation that had been broken and was allowing expired tokens. I will work on refreshing the token via the SDK used by Terraform. The only workaround for now is to run terraform apply again to get past this error.

@sunilagrawal86
Author

@yussufsh - Thank you, but that does not help. ibm_pi_instance takes around 30 minutes to create and the apply fails before then, so the resource never completes. Running terraform apply again recreates the instance and hits the same token expiration error, so we are stuck in a loop. We need urgent help to fix this. It started happening 3 days ago.

@yussufsh
Collaborator

yussufsh commented Feb 24, 2022

Well, it turns out the call that the service broker was using to validate tokens was broken and was not actually validating them, so users could authenticate with expired or invalid tokens. I fixed that in the last sprint, which is probably why you are seeing this.

Tokens generated by [iam.cloud.ibm.com](http://iam.cloud.ibm.com/) are good for 60 minutes, and tokens generated by the ibmcloud CLI are good for up to 20 minutes (they can expire sooner depending on the activity of the login session).

This is information I got from the service broker team. I have not checked the actual timeout or whether it can be changed in account settings, so I cannot confirm the numbers.

The fix is in #3614, which will refresh the token before each API call. We will need to wait for the next release of the provider. If you want to try it sooner, you can build the binary from my branch and set up your terminal to use it.
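
For readers unfamiliar with the pattern, here is a minimal Python sketch of obtaining a valid token before each API call. It is not the code from #3614 (that change is not shown in this thread) and it is a variant of the idea: it re-fetches only when the cached token is expired or within a safety margin of expiry, rather than unconditionally. `RefreshingToken`, `call_api`, the `fetch` callback, and the 60-second margin are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Optional, Tuple


class RefreshingToken:
    """Caches a bearer token and re-fetches it when expired or close to expiry."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        margin: float = 60.0,
        clock: Callable[[], float] = time.time,
    ):
        # Hypothetical callback: returns (token, expires_at in epoch seconds).
        self._fetch = fetch
        # Refresh this many seconds before the stated expiry.
        self._margin = margin
        # Injectable clock so the behaviour can be tested without waiting.
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a token that is valid for at least `margin` more seconds."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch()
        return self._token


def call_api(token_source: RefreshingToken, request: Callable[[str], object]) -> object:
    """Ask the token source for a token immediately before every request."""
    return request(token_source.get())
```

With this shape, a loop that polls instance status for 40 minutes keeps working with a 20-minute token, because each poll goes through `call_api` and picks up a fresh token once the cached one nears expiry.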

@clnperez

Thanks @yussufsh. The max token expiry time is unfortunately an hour but it sounds like your PR should get us around this problem.

@sudarshanaks

@hkantare We are waiting on this fix. Could you please release the version?

@sunilagrawal86
Author

@hkantare - Can you please let me know which version includes the updated code?

@christopher-horn

For those who did not see the note, @hkantare commented in #3614 that the Prod release will be on Monday.


6 participants