Allow provider to define uniqueness of a resource #224
Comments
(Updated the issue description to include a references section 👍) In the Terraform AWS Provider, this functionality would be immensely more helpful given the availability of both the resource configuration value(s) (historically via schema.ResourceData) and the provider instance configuration (historically via the meta interface{} argument). A few flavors of uniqueness come up:

Resources where there can only be 1 per provider configuration.

Resources where there can only be 1 per some other resource.

Resources where there can only be 1 per containing resource and name-like identifier.
Hopefully the cases above highlight that determining the uniqueness constraints might be a little more nuanced than a simple field on a single attribute. To put some of the examples in terms of how these values would be determined in a Create/Read/Update/Delete function, using ResourceData and the provider meta interface:
Maybe something conceptually like the below at the resource level could do this?

type UniquenessIdentifierFunc func(ResourceData, interface{}) string

type Resource struct {
    // UniquenessIdentifier is a string identifier used to determine duplicate resource configurations.
    // It is checked globally against all other Terraform resources in a single run.
    // Any values that cannot be determined at plan time, such as those not known until an upstream
    // resource has been applied, will cause this check to be skipped. If the return value is an empty
    // string (""), this check will also be skipped.
    UniquenessIdentifier UniquenessIdentifierFunc
}
// Example in a Terraform AWS Provider resource, such as aws_backup_vault
func PerRegionalNameUniquenessIdentifier(d schema.ResourceData, meta interface{}) string {
    return fmt.Sprintf("%s:%s:%s", meta.(*AWSClient).accountid, meta.(*AWSClient).region, d.Get("name").(string))
}

UniquenessIdentifier: PerRegionalNameUniquenessIdentifier,

// Another example in Terraform AWS Provider, such as aws_ebs_default_kms_key
func PerRegionUniquenessIdentifier(d schema.ResourceData, meta interface{}) string {
    return fmt.Sprintf("%s:%s", meta.(*AWSClient).accountid, meta.(*AWSClient).region)
}
UniquenessIdentifier: PerRegionUniquenessIdentifier,

The nice part here is that it gives resources the freedom to fully synthesize the criteria for uniqueness, although it comes at the expense of being a little verbose. The logic checking these identifiers would need to bypass any that depend on computed attribute values. When writing up the above examples, I found it may make more sense to have the return value explicitly signal when the check should be skipped, for example:

// Example in a Terraform AWS Provider resource, such as aws_backup_vault
func PerRegionalNameUniquenessIdentifier(d schema.ResourceData, meta interface{}) string {
    accountID := meta.(*AWSClient).accountid
    if accountID == "" {
        return "" // or nil if *string
    }
    return fmt.Sprintf("%s:%s:%s", accountID, meta.(*AWSClient).region, d.Get("name").(string))
}

UniquenessIdentifier: PerRegionalNameUniquenessIdentifier,
// Another example in Terraform AWS Provider, such as aws_ebs_default_kms_key
func PerRegionUniquenessIdentifier(d schema.ResourceData, meta interface{}) string {
    accountID := meta.(*AWSClient).accountid
    if accountID == "" {
        return "" // or nil if *string
    }
    return fmt.Sprintf("%s:%s", accountID, meta.(*AWSClient).region)
}
UniquenessIdentifier: PerRegionUniquenessIdentifier,

I also wonder if maybe the return value could be something like:

type UniquenessIdentifierFunc func(ResourceData, interface{}) []string
type Resource struct {
    // UniquenessIdentifier is an identifier used to determine duplicate resource configurations.
    // This is globally checked against all other Terraform resources in a single run.
    // If the return value is an empty string ("") for any element, this check will be skipped.
    // Any values that cannot be determined at plan time, such as those not known until an
    // upstream resource has been applied, will also cause this check to be skipped.
    UniquenessIdentifier UniquenessIdentifierFunc
}
// Example in a Terraform AWS Provider resource, such as aws_backup_vault
func PerRegionalNameUniquenessIdentifier(d schema.ResourceData, meta interface{}) []string {
    return []string{
        meta.(*AWSClient).accountid,
        meta.(*AWSClient).region,
        d.Get("name").(string),
    }
}

UniquenessIdentifier: PerRegionalNameUniquenessIdentifier,

// Another example in Terraform AWS Provider, such as aws_ebs_default_kms_key
func PerRegionUniquenessIdentifier(d schema.ResourceData, meta interface{}) []string {
    return []string{
        meta.(*AWSClient).accountid,
        meta.(*AWSClient).region,
    }
}
UniquenessIdentifier: PerRegionUniquenessIdentifier,

Thanks for the consideration! 😄
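As a rough sketch of how the SDK could consume such functions to detect duplicates within a single run — the plannedInstance type, checkUniqueness, and the hard-coded identifier values below are hypothetical, for illustration only:

```go
// Hypothetical sketch (not an existing terraform-plugin-sdk API): collect
// per-resource uniqueness identifiers and flag collisions within a single run.
// The data argument stands in for *schema.ResourceData.
package main

import "fmt"

type UniquenessIdentifierFunc func(data interface{}, meta interface{}) string

type plannedInstance struct {
    ResourceType string
    Address      string // e.g. "aws_backup_vault.example"
    Data         interface{}
    Identifier   UniquenessIdentifierFunc
}

// checkUniqueness reports every planned instance whose identifier collides
// with an earlier instance of the same resource type. Empty identifiers
// (unknown/computed components) skip the check, as described above.
func checkUniqueness(instances []plannedInstance, meta interface{}) []error {
    var errs []error
    seen := map[string]string{} // identifier -> first address that claimed it

    for _, inst := range instances {
        if inst.Identifier == nil {
            continue
        }
        id := inst.Identifier(inst.Data, meta)
        if id == "" {
            continue
        }
        key := inst.ResourceType + "|" + id
        if first, ok := seen[key]; ok {
            errs = append(errs, fmt.Errorf("%s has the same uniqueness identifier as %s (%q)", inst.Address, first, id))
            continue
        }
        seen[key] = inst.Address
    }
    return errs
}

func main() {
    // Two aws_ebs_default_kms_key-style instances in the same account/region
    // resolve to the same identifier and are reported as duplicates.
    perRegion := func(_ interface{}, _ interface{}) string { return "123456789012:us-east-1" }

    instances := []plannedInstance{
        {ResourceType: "aws_ebs_default_kms_key", Address: "aws_ebs_default_kms_key.a", Identifier: perRegion},
        {ResourceType: "aws_ebs_default_kms_key", Address: "aws_ebs_default_kms_key.b", Identifier: perRegion},
    }

    for _, err := range checkUniqueness(instances, nil) {
        fmt.Println(err)
    }
}
```

Where exactly core would invoke these functions, and how unknown/computed values would be detected, would be part of the design discussion.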
@paddycarver described to me earlier, in a separate conversation, that it would be helpful in the GCP provider to have the ability to validate all occurrences of a given resource type. The main reason was also uniqueness, so I believe the problem definition matches the one in this issue. We couldn't really come up with any other reason to run such validation other than to guarantee/validate some form of uniqueness. Paddy mentioned two GCP-specific examples:
@paddycarver feel free to add/correct anything above.
There is a related upstream Terraform Core issue here: hashicorp/terraform#22094
Oh! This reminds me of very similar problems we have in the Terraform AWS Provider, where there may be two methods for managing a specific piece of infrastructure, such as EC2 Security Groups, but using both methods can cause perpetual differences:
In this case the "uniqueness" is even more nuanced since it involves the values of "sub" resources, but the "sub" resources have no identifiers of their own, and it actually has nothing to do with overlap either (it's matching EC2 Security Group IDs and the rule values). I'm wondering if this class of problem should be considered outside the scope of this feature request due to its inherent additional complexity, and maybe treated with a different type of solution that can take specific other resource types and their attribute keys/values into consideration.
Presumably having this information available would make caching of duplicate Data Sources possible too? hashicorp/terraform#19942
@tombuildsstuff I would say easier. Due to their nature, data sources may require a slightly different approach, though. Unlike resources, which tend to be managed (and changed) entirely via Terraform, data sources may change outside of Terraform more often, what we consider ID today is far less likely to represent unique state, and data sources are expected to always have fresh state, so you wouldn't want to reuse stale results. #7 describes an example of how we may address that. Perhaps a combination of these two suggestions will work - labelling fields as "unique combination". We'll probably need to check what major APIs support today, but I'm guessing some combination of the above will satisfy the majority.
I no longer have rights in this repository to edit the issue description references again, but please note I've consolidated the various Terraform AWS Provider use cases into hashicorp/terraform-provider-aws#14394.
Problem Statement
ID field
Each resource today has the ability to set an ID.

This was, and to some extent still is, treated as a "meta-field" which is special-cased throughout Terraform and the SDK: e.g. it has its own getter/setter (ResourceData.GetId(), ResourceData.SetId()), the existence of an ID was (in Terraform < 0.12) internally used as an indicator of whether a resource exists or not, it is currently the easiest way to import resources (via their ID), and id is always available in CRUD but must not be overridden in the schema.

ID is, however, only assigned once the resource has either started the process of creation or finished creating (in CRUD). Terraform currently has no way of constructing the unique identifier beforehand and therefore no way of predicting whether it will collide with any other unique identifier (it doesn't even attempt to do so, for that reason).
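For context, a minimal sketch of how a resource typically assigns its ID today with the existing helper/schema getter/setter; the resource, the "name" attribute, and createWidget are hypothetical stand-ins, while d.Get, d.SetId, and the CRUD signature are the existing API:

```go
package example

import (
    "github.com/hashicorp/terraform-plugin-sdk/helper/schema"
)

// createWidget stands in for the remote API call that actually allocates the
// identifier (hypothetical); the ID only exists once this call has been made.
func createWidget(name string) (string, error) {
    return "widget-" + name, nil
}

func resourceExampleWidgetRead(d *schema.ResourceData, meta interface{}) error {
    return nil
}

func resourceExampleWidgetCreate(d *schema.ResourceData, meta interface{}) error {
    id, err := createWidget(d.Get("name").(string))
    if err != nil {
        return err
    }

    // d.SetId is the special-cased setter mentioned above; setting it marks
    // the resource as existing in state. Before this point Terraform has no
    // identifier to compare against other resources, so collisions cannot be
    // predicted up front.
    d.SetId(id)

    return resourceExampleWidgetRead(d, meta)
}
```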
Example
DNS is a globally shared namespace, and it's possible that with some DNS providers you are able to create duplicate domains for valid reasons (e.g. migration), but oftentimes this is not possible. Route53 will return an error for one of two identically named zones in the same configuration, depending on which one is scheduled for creation first while Terraform parallelises the requests.
Proposal
Implementation is likely to be subject to RFC & further discussion.
The schema could allow marking certain fields as "identifier"; core would then be able to query the provider for such an enhanced schema and, combined with values from the config, it should be able to tell whether there are any duplicate resources.
(hypothetical) Example
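A sketch of what this might look like, assuming a hypothetical Identifier flag on the schema attribute (no such flag exists in helper/schema today); the resource shown mirrors the Route53 zone example above:

```go
// Hypothetical sketch only: the Identifier field does not exist in
// helper/schema today. It marks the attribute(s) whose configured values make
// up the resource's uniqueness identifier, so that core could detect
// duplicate configurations at plan time.
func resourceRoute53Zone() *schema.Resource {
    return &schema.Resource{
        Schema: map[string]*schema.Schema{
            "name": {
                Type:       schema.TypeString,
                Required:   true,
                ForceNew:   true,
                Identifier: true, // hypothetical flag proposed above
            },
            "comment": {
                Type:     schema.TypeString,
                Optional: true,
            },
        },
    }
}
```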
Then, assuming an HCL config like the Route53 example above with two identically named zones, core could compare the configured identifier values and report the duplicates as errors before any create calls are made.
Better and more detailed diagnostic messages depend on our ability to either display and highlight the multiple fields that can make up the identifier, or to identify the most important one when there is more than one.
References