
azurerm_key_vault_{key|secret|certificate} - Directly using key_vault_id during read and delete #20326

Closed
wants to merge 2 commits

Conversation

magodo
Collaborator

@magodo commented Feb 6, 2023

The key_vault_id is a required, force-new property, which is always a constant in the state file if the resource was properly created or imported via Terraform. In particular, for import, this property is constructed by the importer and written to the state before Read is called, for all three affected resources. Therefore, there is actually only one place that needs to transmute the key vault URI into its resource ID: the importer. For the other operations:

  • Create/Update: it is always available in the plan input
  • Read/Delete: it is always available in the state

This PR removes the transmutation code from the read and delete, and instead uses the key_vault_id from state directly.

This is a follow up of #17536.

(Finally) Fix #11059.

I've verified that, for a Terraform workspace containing only azurerm_key_vault_key, terraform plan and terraform destroy won't call the resource LIST API.
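
For illustration, a minimal sketch of the shape of the change in Read (simplified and paraphrased from the provider; keyVaultsClient.KeyVaultIDFromBaseUrl here stands in for the lookup being removed, and the surrounding logic is elided):

func resourceKeyVaultKeyRead(d *pluginsdk.ResourceData, meta interface{}) error {
    id, err := parse.ParseNestedItemID(d.Id())
    if err != nil {
        return err
    }

    // Before: resolve the vault's resource ID from its base URI via the
    // Resources LIST API, which depends on a sometimes-stale ARM cache:
    //   keyVaultIdRaw, err := keyVaultsClient.KeyVaultIDFromBaseUrl(ctx, resourcesClient, id.KeyVaultBaseUrl)

    // After: use the value that create/import already persisted to state.
    keyVaultId := d.Get("key_vault_id").(string)
    _ = keyVaultId

    // ...read the nested item using id, returning key_vault_id unchanged...
    return nil
}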

@jackofallops
Member

hi @magodo

Thanks for this PR.

As mentioned previously (I believe in #17536), unfortunately we can't rely on d.Get in the Read and Delete functions, as there's no guarantee that a value will be available (or where it's sourced from, if the value has changed). This is why any usage of d.Get in a Read function across the provider is surrounded by a nil-check, parsing the value only if it's available. We can only guarantee that the ID field is available during both Read and Delete, meaning we need to rely exclusively on that field.
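
As a hedged illustration of that guard pattern (the field and helper names are examples, not the provider's actual code):

// Only parse key_vault_id when a value is actually present; otherwise
// fall back to deriving the vault from the resource's own ID.
if keyVaultIdRaw, ok := d.GetOk("key_vault_id"); ok {
    keyVaultId, err := parse.VaultID(keyVaultIdRaw.(string))
    if err != nil {
        return err
    }
    _ = keyVaultId
}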

Instead, can we look to work around the upstream API in a different manner - perhaps by extending the timeout here?

Thanks!

@magodo
Collaborator Author

magodo commented Feb 7, 2023

@jackofallops Unfortunately, extending the timeout doesn't help here: if the target ARM cache doesn't hold the record of the key vault at a given point, there is no guarantee when the record will eventually be synced.

I agree that during read only the ID is guaranteed to be available, mostly because of importing. However, the key vault nested items are a bit special, given that all of them have the importer defined, e.g.

Importer: pluginsdk.ImporterValidatingResourceIdThen(func(id string) error {
    _, err := parse.ParseNestedItemID(id)
    return err
}, nestedItemResourceImporter),

In the importer, the key_vault_id is always set on the resource data; otherwise the import fails:

d.Set("key_vault_id", keyVaultId)

Given this, it should be safe to get the key_vault_id in the Read. As for Delete, I've also mentioned in #17536 that both "destroy" cases work as expected (a direct terraform destroy, and removing the config followed by terraform apply).

The following is evidence that this works as described:

The d.Get in both "Read during terraform plan/apply" and "Delete" ends up in the following code path:

> github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*MultiLevelFieldReader).ReadFieldMerge() ./vendor/github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema/field_reader_multi.go:53 (PC: 0xe4a109)
    48:                                         "Error reading level %s: %s", l, err)
    49:                         }
    50:
    51:                         // TODO: computed
    52:                         if out.Exists {
=>  53:                                 result = out
    54:                         }
    55:                 }
    56:
    57:                 if l == level {
    58:                         break
(dlv) p l
"state"

This indicates that the attribute is read from the state.
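
To make the mechanism concrete, here is a simplified, self-contained model of what ReadFieldMerge does (my own sketch, not the SDK source; the level ordering "state" < "config" < "diff" < "set" is my reading of helper/schema and worth double-checking):

package main

import "fmt"

// Levels are scanned lowest-priority first; the last level holding a value
// wins, and the scan stops once the requested level has been consulted.
func readFieldMerge(levels []string, values map[string]map[string]string, attr, upTo string) (string, bool) {
    var result string
    var exists bool
    for _, l := range levels {
        if v, ok := values[l][attr]; ok {
            result, exists = v, true
        }
        if l == upTo {
            break
        }
    }
    return result, exists
}

func main() {
    levels := []string{"state", "config", "diff", "set"}
    values := map[string]map[string]string{
        "state": {"key_vault_id": "/subscriptions/.../vaults/example"},
    }
    // During Read the merge is requested up to "state", so the value
    // comes from state even though the config carries nothing.
    fmt.Println(readFieldMerge(levels, values, "key_vault_id", "state"))
}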

@tombuildsstuff
Contributor

@magodo however this isn't guaranteed to work where the config has been removed and a user runs terraform apply to remove the resource (since the refresh that runs prior to it will fail) - unfortunately, by design, we can't rely on d.Get in the Read functions.

Since we can't rely on d.Get within the Read function, unfortunately we're unable to ship this approach and as such I'm going to close this PR for the moment - as @jackofallops has mentioned above, we'll need to use a different approach here instead.

Once we're fully migrated over to hashicorp/go-azure-sdk we can investigate using the regional endpoints which should allow us to workaround the API issue by using the location directly, however this is ultimately an API bug rather than something we can reliably fix in Terraform.

I'm wondering if there's a means of working around this API issue in the interim by invalidating the cache using an ever-changing filter. For example, rather than doing:

filter := fmt.Sprintf("resourceType eq 'Microsoft.KeyVault/vaults' and name eq '%s'", *keyVaultName)

if we were instead to check that the name equals the exact value, and that it doesn't equal the current time in nanoseconds:

filter := fmt.Sprintf("resourceType eq 'Microsoft.KeyVault/vaults' and name eq '%s' and name ne '%d'", *keyVaultName, time.Now().UnixNano())

WDYT?

Thanks!

@magodo
Collaborator Author

magodo commented Feb 9, 2023

@tombuildsstuff My debug session above was tested for both a direct terraform destroy and removing the config followed by terraform apply (with refresh); both end up reading the value from state instead of config.

I'm not sure how (or whether) filter := fmt.Sprintf("resourceType eq 'Microsoft.KeyVault/vaults' and name eq '%s' and name ne '%d'", *keyVaultName, time.Now().UnixNano()) can invalidate the cache - is there some documentation about this trick?

In any case, please reconsider this PR, as the issue has been a real pain for a long while and this PR can almost entirely fix it (except for the import case). I think we should review and merge this, then revert the change once we migrate to hashicorp/go-azure-sdk. What do you think?

@magodo
Collaborator Author

magodo commented Feb 13, 2023

A small update about the actual source of the key_vault_id read during delete: it was not the state (though the state also holds that value) but the diff, as the diff has higher priority than the state.

@magodo
Collaborator Author

magodo commented Mar 1, 2023

@tombuildsstuff Since our provider is still using protocol v5, which has the same definition as v6 here: the ReadResource request doesn't include the config, so the read can't source values from it:

https://github.com/hashicorp/terraform-plugin-go/blob/4d06167ea92b92d8cbd2414f3a7244c5e4b1de99/tfprotov5/internal/tfplugin5/tfplugin5.proto#L251-L265

Can we revisit this PR?

@tombuildsstuff
Contributor

@magodo we've closed multiple variations of this PR stating that we can't use this approach - and we continue to have no plans to support using d.Get inside the Delete function - so unfortunately we can't proceed with this PR.

Since we're unable to take this approach, I'm going to lock this PR for the moment - whilst we're open to other ways of solving this (working around the API issue), unfortunately we can't use d.Get to do it.

Thanks!

@hashicorp hashicorp locked as resolved and limited conversation to collaborators Mar 1, 2023