Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Acceptance testing: Ability to prevent a resource being deleted at the end of a TestCase #85

Closed
ewbankkit opened this issue Dec 16, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@ewbankkit
Copy link
Contributor

ewbankkit commented Dec 16, 2022

SDK version

% go list -m github.com/hashicorp/terraform-plugin-sdk/...
github.com/hashicorp/terraform-plugin-sdk/v2 v2.24.1

Use-cases

Acceptance tests for hashicorp/terraform-provider-aws#26274 need to manipulate an active AWS Direct Connect connection. Activating a newly created Direct Connect connection requires manual intervention at a physical location, so creating the aws_dx_connection and waiting for it to become active in a TestStep is not feasible.
Using the TestStep.ImportStatePersist flag with an existing active connection enables the manipulation of that connection to be tested via the acceptance testing framework, but at the end of the TestCase all resources in the Terraform state created during the test are destroyed and destroying the active connection is not at all desirable given the overhead in getting the connection into the active state.

Attempted Solutions

Each TestStep.Check function is passed a *terraform.State object, so I tried deleting the resource from this object. However this in-memory state is never persisted and is local to each individual TestStep.

Proposal

Add a new TestStep field that indicates to the acceptance testing framework that a resource (or list of resources) is not to be deleted at the end of the TestCase.

References

Upstream Terraform CLI issue for configuration language support of a similar facility: hashicorp/terraform#3874.

@ewbankkit ewbankkit added the enhancement New feature or request label Dec 16, 2022
@bflad
Copy link
Contributor

bflad commented Dec 16, 2022

Implementation Proposals

This implementation should likely reach out to the terraform state rm command, following the current model that the testing framework uses Terraform CLI commands and their outputs as its interface to perform any manipulation of state, etc. The current version of terraform-exec already supports terraform state rm via (tfexec.Terraform).StateRm().

I would also recommend that this implementation accept a list of resource addresses, similar to the Taint field, to simplify provider implementations should it need to be performed across multiple resource instances. []string is okay.

Given the current setup of the testing framework, which is struct type-based, I would encourage we follow the existing pattern of adding fields rather than introduce a separate code paradigm for this feature. In a future major version of the testing framework, we can remove the slippery slope of forever expanding TestCase/TestStep fields that require additional documentation and validation burden to know how to use properly (e.g. switch TestStep to an interface with various implementation types instead).

State Removal TestCase Field

Provider implementation example:

resource.TestCase{
  // ... other fields ...
  Steps: []resource.TestStep{/* ... */},
  RemoveStateBeforeDestroy: []string{/* resource address strings */},
}

The testing framework would inspect this TestCase field to call state removal commands prior to executing its deferred "destroy step". This has the benefit of always running should any TestStep fail, but potentially hard to discover, and does not enable provider developers to customize when the command is called.

State Removal TestStep Field and "Type"

Provider implementation example:

resource.TestCase{
  // ... other fields ...
  Steps: []resource.TestStep{
    // ... other steps ...
    {
      RemoveState: []string{/* resource address strings */},
    },
    // ... potentially other steps ...
  }
}

The testing framework would execute state removal commands at the time of encountering this type of TestStep. I would encourage that this field be treated as its own TestStep "type" different from other TestStep implementations for apply, import, and refresh steps and validate it as such. The Taint field, which is similar in nature, is awkward to reason about as part of the same step as an apply.

This TestStep-based approach has the benefit of allowing provider developers to customize when this logic runs as part of the TestCase but has the downside of not being automatically executed before the testing framework's implied destroy step should any TestStep fail. The TestCase logic could peek ahead after a TestStep failure to attempt to find and call these state removals, but it would have to be okay with ignoring command failures in that case (which might be an acceptable failure mode). Provider developers would not have a good way to ignore that failure mode behavior though, should they want to actually destroy a resource after a failure, although that does feel like a niche use case.

In this scenario, the following validation should be performed:

  • RemoveState is not present in the same TestStep as Config: "not empty", ImportState: true, or RefreshState: true
  • RemoveState is not present in the first TestStep of TestCase.Steps, similar to RefreshState

State Removal on Destroy TestStep Field

Provider implementation example:

resource.TestCase{
  // ... other fields ...
  Steps: []resource.TestStep{
    // ... potentially other steps ...
    {
      Config: "...",
      RemoveStateOnDestroy: []string{/* resource address strings */},
    },
    // ... potentially other steps ...
  }
}

This would combine ideas from the first two proposals, but make it so the testing framework caches the state removals so they are always executed on before destroy, even on step failures. This may be difficult for provider developers to reason about though, similar to the Taint field, since the actual state removal action may happen "at a distance" compared to where it is defined.

Generic Terraform Command TestStep Field

Provider implementation example:

resource.TestCase{
  // ... other fields ...
  Steps: []resource.TestStep{
    // ... other steps ...
    {
      RunCommand: func(ctx context.Context, tf *tfexec.Terraform) error {
        return tf.StateRm(ctx, "resource address")
      },
    },
    // ... potentially other steps ...
  }
}

Another option would be to take the TestStep approach and make it generic for any Terraform command supported by the underlying terraform-exec. The testing framework would just pass the already configured Terraform instance to the provider-defined logic. While this would allow provider developers to do essentially anything without testing framework changes, it could allow provider developers to introduce unexpected and unverified behaviors into their testing. This includes but is not limited to:

  • Adjusting environment variables
  • Working outside the intended working directory
  • Working with data in separate Terraform files in the working directory (we'd need to expose the current files somehow)

Providing a limited interface to *tfexec.Terraform that prevents these issues would be a lot of effort in the context of this feature request. The current implementation of the testing framework is based on higher level abstractions, so keeping that pattern seems ideal. A future iteration of the testing framework can tackle this sort of solution.

@bflad
Copy link
Contributor

bflad commented Feb 28, 2023

With the introduction of the terraform-plugin-testing module (migration guide), we are transferring feature requests relating to the helper/resource "testing framework" in the terraform-plugin-sdk repository over to the new terraform-plugin-testing repository.

@bflad bflad transferred this issue from hashicorp/terraform-plugin-sdk Feb 28, 2023
@vmanilo
Copy link

vmanilo commented Mar 3, 2023

hey guys 👋

I have one recent use-case with testing resource logic, which requires to import existing resource to the acc test flow, then I'm trying to modify it and checking the error message. Due to backend logic I can't create such resource type directly from terraform, but as this resources existing in the system, potentially user may be able to import this resource to the terraform and then try to modify it.
That's why I was looking for how to setup test in the way that I can import resource, but when test finished - prevent it from destroying.

Please take a look at this test https://github.com/Twingate/terraform-provider-twingate/pull/286/files#diff-520bbd3087681ff4b701d292654e17eb48c163fd43a3cf1f87994fcef3614817

@austinvalle
Copy link
Member

Bubbling up this info from the recent closed PR:

In the near/medium term (current forecast looks like 1.7, so later this year), terraform will support the ability to unmanage resources (e.g. the equivalent of state rm) by putting something in the configuration and it’ll be part of the plan like everything else, similar to the import block introduced in 1.5. This means it would be feasible to write another Config step that applies the state removal without any testing framework changes.

@bflad
Copy link
Contributor

bflad commented May 30, 2024

Hi folks 👋

To close the loop here, Terraform 1.7 and later do include the ability to unmanage resources via the removed configuration block. To utilize this in your acceptance testing, add a TestStep with a configuration that does not include the resource block of the resource to unmanage but does include the removed block with destroy = false for that resource.

As a quick example:

[]resource.TestStep{
  {
    Config: `
resource "examplecloud_thing" "test" {}
`,
  },
  // ... potentially other TestStep ...
  {
    Config: `
removed {
  from = examplecloud_thing.test

  lifecycle {
    destroy = false
  }
}
`,
  },
}

This will remove the state for the resource, meaning that the implicit destroy ran at the end of the test steps will no longer try to destroy that resource.

If you have your acceptance testing setup to run across many Terraform versions, there is support for Terraform version checks, such as tfversion.SkipBelow(tfversion.Version1_7_0), which can prevent testing errors from earlier versions of Terraform which do not support the removed block.

Hope this helps!

@bflad bflad closed this as not planned Won't fix, can't repro, duplicate, stale May 30, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
4 participants