Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

intermittent failure to create service principal #535

Closed
dimbleby opened this issue Aug 31, 2021 · 24 comments
Closed

intermittent failure to create service principal #535

dimbleby opened this issue Aug 31, 2021 · 24 comments

Comments

@dimbleby
Copy link

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritise this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritise the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureAD Provider) Version

terraform 1.0.2, azuread 2.0.1

Affected Resource(s)

  • azuread_service_principal

Terraform Configuration Files

Part of larger configuration that I mostly can't share, but configuring service principals like this:

resource "azuread_service_principal" "timer" {
  count                        = var.aad_timer_name == "" ? 0 : 1
  application_id               = azuread_application.timer[0].application_id
  app_role_assignment_required = true
  owners                       = [data.azuread_client_config.current.object_id]
}

Will do my best to provide more details if needed.

Debug Output

Panic Output

Expected Behavior

We have a large configuration that creates several azuread_application and azuread_service_principal, all following the same pattern.

Expect all configuration to succeed,

Actual Behavior

Intermittently, some of the service principals fail to create:

│ Error: Could not create service principal
│ 
│   with module.aad.azuread_service_principal.timer[0],
│   on aad/main.tf line 599, in resource "azuread_service_principal" "timer":
│  599: resource "azuread_service_principal" "timer" {
│ 
│ ServicePrincipalsClient.BaseClient.Post(): unexpected status 403 with OData
│ error: Authorization_RequestDenied: When using this permission, the backing
│ application of the service principal being created must in the local tenant
╵

Can this really be an Azure permissions error?

  • The app that we are using to run terraform has the Application.ReadWrite.OwnedBy permission in the Graph API.
  • some service principals are successfully created in the very same terraform apply where others fail
  • the entire configuration is sometimes successfully applied

Steps to Reproduce

  1. terraform apply

Important Factoids

References

@manicminer
Copy link
Contributor

manicminer commented Aug 31, 2021

Hi @dimbleby, thanks for reporting.

It's possible this could be a consistency-related error. I believe that particular 403 error may sometimes be returned if the linked application can't be found. I will try to reproduce and work out where this can be mitigated if that's the case. However it may indeed be permissions-related.

Can you assign the Application.ReadWrite.All role and see if the error persists?

@dimbleby
Copy link
Author

Alas, I don't have the power to do that in my organisation

@callppatel
Copy link

callppatel commented Sep 1, 2021

I have the same issue and I can confirm Application.ReadWrite.All works, but this is not allowed in my case too. I am using service principal with permission "Application.ReadWrite.OwnedBy"

@manicminer
Copy link
Contributor

@dimbleby, @callppatel I've done some testing and this appears to be an API bug. The docs state that Application.ReadWrite.OwnedBy is sufficient for creating service principals, but this doesn't appear to be the case.

Whilst testing, I needed to have Application.ReadWrite.All, and having only Application.ReadWrite.OwnedBy consistently produced a 403 with the message When using this permission, the backing application of the service principal being created must in the local tenant.

We'll update the provider docs to reflect this, and I will report this to the relevant team. In the meantime we can keep this issue open for tracking.

@dimbleby
Copy link
Author

dimbleby commented Sep 2, 2021

Thanks! My experience is that creating the service principal sometimes works which maybe hints that there is something more complicated going on, but I will watch with interest.

@callppatel
Copy link

@dimbleby, @callppatel I've done some testing and this appears to be an API bug. The docs state that Application.ReadWrite.OwnedBy is sufficient for creating service principals, but this doesn't appear to be the case.

Whilst testing, I needed to have Application.ReadWrite.All, and having only Application.ReadWrite.OwnedBy consistently produced a 403 with the message When using this permission, the backing application of the service principal being created must in the local tenant.

We'll update the provider docs to reflect this, and I will report this to the relevant team. In the meantime we can keep this issue open for tracking.

@manicminer Could you please explain and provide more details about API bug, I used API which is mentioned here https://docs.microsoft.com/en-us/graph/api/serviceprincipal-post-serviceprincipals?view=graph-rest-1.0&tabs=http and I was able to create service principal by using credential which has "Application.ReadWrite.OwnedBy" permission. Mandatory Input for this API is only "displayName"

Which other API terraform use to create this resource? If you provide more details and error then we will contact our MS support as well.

@manicminer
Copy link
Contributor

manicminer commented Sep 2, 2021

Here's a trace of one test attempt: https://gist.github.com/manicminer/9d343cdc8c9cb7765121f74c75acf586

This is the provider making these requests. As you can see the same endpoints are being used (POST /applications and POST /servicePrincipals).

With these permissions:

Screenshot 2021-09-02 at 14 38 23

EDIT: The only difference seems to be the API version. At the moment we are forced to use beta and not 1.0 because there are features missing from 1.0.

@dimbleby
Copy link
Author

dimbleby commented Sep 2, 2021

At the moment we are forced to use beta and not 1.0 because there are features missing from 1.0.

That's kind of scary - Microsoft's take on this is:

Be aware that APIs in preview status are subject to change, and may break existing scenarios without notice. Don't take a production dependency on APIs in the beta endpoint.

Should I infer that it would be unwise to rely on azuread 2.0 in production, until you can move off the beta API?

@manicminer
Copy link
Contributor

At present the API version that's used applies to a given service package (e.g. users, groups, applications) but we are planning to introduce more fine grained control to enable us to select the minimum version for a given resource or operation.

It's quite common for Azure APIs to remain in preview or beta status for extended periods, and we regularly make use of APIs versioned as such in both the AzureRM and AzureAD providers.

The AzureAD provider will be settling into a weekly release cadence just like AzureRM so there will be plenty of opportunity to issue fixes on the rare occasion an API has a breaking change.

@callppatel
Copy link

@manicminer @dimbleby - With my API test, version 1.0 doesn't give any error, while beta API give 403 issue.

@manicminer
Copy link
Contributor

manicminer commented Sep 3, 2021

@callppatel I just tried again, making sure to use the 1.0 API but I still get the same 403 :/

I also checked to see whether we could move to the 1.0 API for this resource, but we use the field publishedPermissionScopes which hasn't yet been added to 1.0

@callppatel
Copy link

publishedPermissionScopes

Could you please share the log for 1.0 API, as pe my test it worked, This was my API call.

POST https://graph.microsoft.com/v1/0/servicePrincipals
      { 
            "appId": "9e7725b0-7988-46be-bbd5-c47f32abdeee"
}

@manicminer
Copy link
Contributor

I just re-tested with version 1.0 and beta and am now getting a 200 response with both 😕

There definitely seem to be some consistency issues here.

@manicminer
Copy link
Contributor

manicminer commented Sep 3, 2021

@dimbleby Have you set the owners for both the application and the service principal in your configuration (to include the calling principal)?

One thing I also noticed is that if you don't set an owner for the application, Terraform will set the initial owner and then remove it, unless you explicitly include the caller in the owners property. This could account for the intermittency of the 403 when creating the SP - as you do need to own the linked application in order to create an SP for it when holding the Application.ReadWrite.OwnedBy role.

@callppatel
Copy link

I just re-tested with version 1.0 and beta and am now getting a 200 response with both 😕

There definitely seem to be some consistency issues here.

my test, never worked with beta and always worked with 1.0 :)

@manicminer
Copy link
Contributor

manicminer commented Sep 3, 2021

@callppatel Did you set owner(s) on both the app and SP when creating them (to include "yourself")? If you assume the API will do that implicitly, it sometimes doesn't.

@dimbleby
Copy link
Author

dimbleby commented Sep 3, 2021

@dimbleby Have you set the owners for both the application and the service principal in your configuration (to include the calling principal)?

yes

@dimbleby

This comment has been minimized.

@callppatel
Copy link

@callppatel Did you set owner(s) on both the app and SP when creating them (to include "yourself")? If you assume the API will do that implicitly, it sometimes doesn't.

No, I didnt set owner. I will check with owner as well.

@callppatel
Copy link

@manicminer @dimbleby
I am able to test and create azuread_application and azuread_service_principal with access of "Application.ReadWrite.OwnedBy". Here is my code. It works all the time.. :)

data "azuread_client_config" "current" {}

#Create Azure AD App Registration
resource "azuread_application" "adapp" {
  display_name = "azured-test-fix"
  owners       = [data.azuread_client_config.current.object_id]
}

#Create Service Principal
resource "azuread_service_principal" "sp" {
  application_id               = azuread_application.adapp.application_id
  app_role_assignment_required = true
  owners                       = [data.azuread_client_config.current.object_id]
}

terraform {
  required_version = ">= 0.14.0"
  required_providers {
    azurerm = ">= 2.48.0"
    azuread = ">=2.1.0"
  }
}

Issue was with resource azuread_application where owner field was missing.

Here is the my finding,

  1. In my terraform azuread_application application was created without owner parameter with (Azured 1.6 version) and by default service principal which create the resource gets added as owner of application. This is the not the case of (Azured 2.1.0 version), if don't pass the owner field, it won’t set. In Azured version(2.1.0 version) owner will be reset to blank if you don’t pass it.
  2. If you mention the same owner in resources “azuread_application” and “azuread_service_principal”, your both resource will get created with "Application.ReadWrite.OwnedBy" as well.
  3. I have tested Azured provider upgrade scenario as well, by just adding owner field in both above resource

@dimbleby
Copy link
Author

dimbleby commented Sep 8, 2021

I also haven't seen the original error recently, but in testing that today I did hit this.

╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to module.aad.azuread_application.sas[0], provider
│ "provider[\"registry.terraform.io/hashicorp/azuread\"]" produced an
│ unexpected new value: Root resource was present, but now absent.
│ 
│ This is a bug in the provider, which should be reported in the provider's
│ own issue tracker.

@manicminer is this all likely to have the same root cause, or shall I raise a fresh issue?

@manicminer
Copy link
Contributor

@dimbleby If you could open a new issue that'd be great. Please include logs (via TF_LOG=DEBUG). Thanks!

@manicminer
Copy link
Contributor

manicminer commented Sep 9, 2021

After speaking with several parties seemingly affected by this issue, for each case it was resolved by double checking the Terraform configuration in question and ensuring that the owners field is specified for both the linked application and the service principal itself. We'll seek to document this to help future practitioners from falling afoul here.

To clarify, owners must be set for both the application and the service principal, e.g.

data "azuread_client_config" "current" {}

resource "azuread_application" "myapp" {
  display_name = "myapp"
  owners = [
    data.azuread_client_config.current.object_id,
    # ... plus any other desired owners
  ]
}

resource "azuread_service_principal" "myapp" {
  application_id = azuread_application.myapp.application_id
  owners = [
    data.azuread_client_config.current.object_id,
    # ... plus any other desired owners
  ]
}

As such I'm going to close this issue as resolved. Terraform appears to be doing the right thing given the correct configuration. If anyone is still experiencing this after double checking their configuration, please feel free to post a further comment with your exact configuration and debug logs, and I will be happy to help investigate.

Thanks!

@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants