Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI Plan Output Not Showing null to "" Differences for Non-Legacy Type System Providers #31887

Open
bflad opened this issue Sep 28, 2022 · 15 comments
Assignees
Labels
bug cli UX User experience enhancements v1.2 Issues (primarily bugs) reported against v1.2 releases

Comments

@bflad
Copy link
Contributor

bflad commented Sep 28, 2022

Terraform Version

Terraform v1.2.9
on linux_amd64

(although this has probably been happening for a very long time to hide terraform-plugin-sdk legacy behaviors)

Terraform Configuration Files

terraform {
  required_providers {
    random = {
      source  = "hashicorp/random"
      version = "3.4.2"
      # version = "3.3.2"
    }
  }
}

resource "random_password" "test" {
  length = 20
}

Debug Output

Can provide if needed.

Expected Behavior

Either no plan difference output or a plan difference output including every attribute change triggering the difference for providers not using the legacy type system flag in the protocol.

Actual Behavior

$ terraform apply 
random_password.test: Refreshing state... [id=none]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # random_password.test will be updated in-place
  ~ resource "random_password" "test" {
        id          = "none"
        # (12 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Steps to Reproduce

  1. terraform init with version 3.3.2
  2. terraform apply
  3. terraform init -upgrade with version 3.4.2
  4. terraform plan

The issue references also contain a few other reproductions of similar behaviors.

Additional Context

Terraform Cloud UI seems like it was actually able to pick up the "addition" of the attribute being set to "" from null, so I suspect this is just a CLI plan output issue.

We understand why the provider bug occurred, it was an issue on our side of the protocol in the provide code. Inspecting the differences in state after applying the "unexpected" plan showed the change:

4c4
<   "serial": 1,
---
>   "serial": 3,
28c28
<             "override_special": null,
---
>             "override_special": "",
33,34c33
<           "sensitive_attributes": [],
<           "private": "eyJzY2hlbWFfdmVyc2lvbiI6IjIifQ=="
---
>           "sensitive_attributes": []

Aside: terraform-plugin-sdk also would do some potentially unexpected things with what it decided to store in state (empty vs null), especially with imported resources and with maps that have null values, so there's other similar reports we've been working through with migrating providers from that to terraform-plugin-framework. This isn't core's fault, however I'm saying this out loud in case there's other bug reports that wind up in this issue tracker about "strange" plans as folks migrate.

References

@bflad bflad added bug UX User experience enhancements new new issue not yet triaged labels Sep 28, 2022
@jbardin
Copy link
Member

jbardin commented Sep 28, 2022

Thanks @bflad!

Do you have any examples of non-legacy providers with issues?
The protocol shims were not sufficient to hide the legacy SDK behavior entirely, so IIRC there are some heuristics in the diff rendering to ignore these minor differences too. There might be some way to plumb the "legacy" flag through to the final rendering, but maybe some examples would show an easier workaround?

@bflad
Copy link
Contributor Author

bflad commented Sep 28, 2022

Hi @jbardin 👋 Could you redescribe what you are looking for? Do you mean start to finish reproduction with the protocol flag never involved? I'm not sure if I have one of those readily available. I was raising this to prevent confusion with migrations and slight differences that might arise.

The reproduction here and some of the similar references are for the random provider, which was migrated off the legacy type system flag in version 3.4.0.

@bflad
Copy link
Contributor Author

bflad commented Sep 28, 2022

From an out-of-band conversion, one thing I didn't say out loud here is the schema changes that occurred here:

  1. It was originally Optional only in the sdk
  2. It was errantly introduced in the framework version as Optional+Computed+Default plan modifier of “” (timeframe of this bug)
  3. It was fixed back to Optional only afterwards.

So the missing plan difference was specifically a null configuration and Computed difference.

@jbardin
Copy link
Member

jbardin commented Sep 28, 2022

Thanks, that makes sense now why it was not reported as an error during the plan.
The existing plan renderer already needs some extensive work, so we can probably plan to consider this type of change when we get there. Since the CLI will have to deal with legacy providers for quite a while, defaulting to hiding these minor differences may take precedence, but since this shouldn't be a common occurrence, perhaps an option like #27547 to show the plan in complete detail would suffice.

@jbardin jbardin added cli and removed new new issue not yet triaged labels Sep 28, 2022
@apparentlymart
Copy link
Contributor

In case it's useful context for someone who works on this in future, I believe this is the current location of the legacy SDK workaround that the view layer implements (as opposed to the one implemented in Terraform Core itself):

// We currently have an opt-out that permits the legacy SDK to return values
// that defy our usual conventions around handling of nesting blocks. To
// avoid the rendering code from needing to handle all of these, we'll
// normalize first.
// (Ideally we'd do this as part of the SDK opt-out implementation in core,
// but we've added it here for now to reduce risk of unexpected impacts
// on other code in core.)
changeV.Change.Before = objchange.NormalizeObjectFromLegacySDK(changeV.Change.Before, schema)
changeV.Change.After = objchange.NormalizeObjectFromLegacySDK(changeV.Change.After, schema)
return changeV

The comment here suggests that we did at one point imagine there being a way to handle this in Terraform Core (where it would have access to the "legacy type system" flag) but did this simpler solution to reduce the already-high risk for the v0.12 release. I don't actually remember what that other way was though, so I think we'll have to design it again from scratch if we want to address it now.

@apparentlymart
Copy link
Contributor

Interestingly, the current implementation of NormalizeObjectFromLegacySDK seems to focus only on normalizing the representation of nested blocks, which I don't think explains why override_special (which is an attribute of type string) would be subject to this normalization. I guess that means there's some other normalization going on somewhere else.

@bflad
Copy link
Contributor Author

bflad commented Oct 11, 2022

I'm not sure if it matters much, but hashicorp/terraform-provider-tls#284 is an example of the opposite issue where "" to null is missing from CLI terminal plan output for a provider without the legacy type system flag:

terraform {
  required_providers {
    tls = {
      source  = "hashicorp/tls"
      version = "3.4.0"
      # version = "4.0.3" # upgrade after initial apply to show missing plan output
    }
  }
}

resource "tls_private_key" "example" {
  algorithm = "ECDSA"
}

resource "tls_self_signed_cert" "example" {
  private_key_pem   = tls_private_key.example.private_key_pem
  is_ca_certificate = true

  subject {
    organization = "Example"
  }

  # 3 Years
  validity_period_hours = 24 * 365 * 3

  allowed_uses = [
    "cert_signing",
    "crl_signing",
  ]
}
$ terraform init
$ terraform apply
$ terraform init -upgrade
$ terraform apply
...
      ~ subject { # forces replacement
            # (1 unchanged attribute hidden)
        }
59,61c59,61
<                 "common_name": "",
<                 "country": "",
<                 "locality": "",
---
>                 "common_name": null,
>                 "country": null,
>                 "locality": null,
63,66c63,66
<                 "organizational_unit": "",
<                 "postal_code": "",
<                 "province": "",
<                 "serial_number": "",
---
>                 "organizational_unit": null,
>                 "postal_code": null,
>                 "province": null,
>                 "serial_number": null,

These attributes are Optional: true only within a block. Please let me know if you'd prefer a separate issue for tracking.

@bflad
Copy link
Contributor Author

bflad commented Dec 14, 2022

It looks like this is causing confusion when these "hidden" value differences are underneath collection types as well: https://discuss.hashicorp.com/t/plugin-framework-state-upgraders/47699

@jbardin
Copy link
Member

jbardin commented Jun 13, 2023

For reference, the new renderer inherited the old behavior which is now implemented here.

@jbardin
Copy link
Member

jbardin commented Jun 21, 2023

@bflad, do you have any example of particularly noisy resource configurations which we could use to evaluate how much output this still suppresses? Now that providers are migrating to the framework where these can be meaningful diffs, I wonder if it's time to just accept the inconvenience that SDK resources might occasionally show larger than expected diff output.

@bflad
Copy link
Contributor Author

bflad commented Feb 21, 2024

I'm not sure about "noisy" resource configurations, but this is trivial to reproduce via the built-in terraform_data managed resource. I'm guessing we would want this behavior conditionalized on the legacy type system flag, if possible, to limit the plan renderer causing much more noise with the legacy SDK.

terraform {
  required_version = "~> 1.7"
}

resource "terraform_data" "test" {
  # apply like this, then comment or set to null
  input = ""
}

e.g.

$ terraform apply -auto-approve

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
  + create

Terraform will perform the following actions:

  # terraform_data.test will be created
  + resource "terraform_data" "test" {
      + id     = (known after apply)
      + output = (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.
terraform_data.test: Creating...
terraform_data.test: Creation complete after 0s [id=f95b13bf-5a6b-f206-3604-c97a8f4c7c5c]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

### comment/nullify input

$ terraform plan               
terraform_data.test: Refreshing state... [id=f95b13bf-5a6b-f206-3604-c97a8f4c7c5c]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # terraform_data.test will be updated in-place
  ~ resource "terraform_data" "test" {
        id     = "f95b13bf-5a6b-f206-3604-c97a8f4c7c5c"
      + output = (known after apply)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

@akinross
Copy link

akinross commented Feb 22, 2024

Hi @jbardin,

Currently with the provider we are developing with plugin framework explicit empty strings are not displayed correctly in plan, see hashicorp/terraform-plugin-framework#921 for the configuration and plan output details. From plugin framework side the issue has been closed and not planned, but I was forwarded upstream to this issue.

From @bflad I understood that this is something this code base sees as non-trivial and is not something that will be pursued, what is the reasoning behind this? Reason I am asking is that to me it seems that explicit empty string in plan output is not working as it should.

Would it be possible to reconsider making changes to plan renderer that would fix this specific case?

@jbardin
Copy link
Member

jbardin commented Feb 28, 2024

Hi @akinross, Unfortunately there is no absolute way to differentiate between changes which were intentional, and changes which were because of a misbehaving provider. Completely removing this bit of shim code for legacy provider compatibility would be nice, but it's not something we can do until most of those resources are updated from the SDK to fix the behavior.

In the meantime I have a minor proposal which can detect some cases where the output can be displayed, and we can evaluate if that complexity is worth adding for the partial solution it offers.

@akinross
Copy link

Hi @jbardin, Thank you for your response. I understand that you do not want to introduce breaking change regarding legacy providers. Completely removing would impact our provider also, since we are still in a muxed state and working on migration of all the resources. Would it be a possibility to detect in the rendering the type of provider "legacy SDK" vs "plugin framework", to differentiate to allow both to render properly?

@jbardin
Copy link
Member

jbardin commented Feb 28, 2024

@akinross, the changes are not recorded with any indication of which SDK the provider was using (both internally, and in the serialized plan), so the renderer has no way to differentiate the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug cli UX User experience enhancements v1.2 Issues (primarily bugs) reported against v1.2 releases
Projects
None yet
Development

No branches or pull requests

4 participants