Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache the Terraform instance state returned from schema.Resource.Apply even if the returned diagnostics contain errors #313

Merged
merged 1 commit into from
Dec 18, 2023

Conversation

ulucinar
Copy link
Collaborator

Description of your changes

Possibly related issues: crossplane-contrib/provider-upjet-aws#1018, crossplane-contrib/provider-upjet-aws#1010. We are still in the process of reproducing these issues and trying to validate the proposed fix actually resolves them.

This PR proposes a change in which upjet's Terraform plugin SDK-based external-client now caches the Terraform instance state returned from schema.Resource.Apply in the external-client's Create function even if the returned diagnostics contain errors. Previously, we were just dismissing this instance state in case of errors.

In most cases, the Terraform plugin SDK's create implementation for a resource comprises multiple steps (with the creation of the external resource being the very first step). In case, the creation succeeds but any of the subsequent steps fail, then upjet's TF plugin SDK-based external client will not record this state losing the only opportunity to associate the MR with the newly provisioned external resource in some cases. We now put this initial state into the upjet's in-memory state cache so that it's now available for the external- client's next observe call.

Please note that the safe thing to do in this situation from the Crossplane provider's perspective is to set the MR's crossplane.io/external-create-failed annotation because the provider does not know the exact state the external resource is in and a manual intervention may be required. But we currently believe associating the external-resource with the MR will just provide a better UX although the external resource may not be in the expected/desired state yet. We are also planning for improvements in the crossplane-runtime's managed reconciler to better support upjet's async operations in this regard. The managed reconciler is more strict for errors encountered during the creation because such errors can potentially leave the external resource in consistent state and the MR might not have been associated with the newly provisioned external-resource yet. But in our case, we do know that an external resource has been provisioned and we have enough information to associate the MR with the external-resource. The caveat is we cannot be sure whether the external-resource is in the expected state. We may need to revisit this in the future according to the feedback we receive.

I have:

  • Read and followed Upjet's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

Tested with a custom upbound/provider-aws build consuming this change and another one that returns an error from the underlying Terraform AWS provider after a successful create SDK call (right after the first external-create step described above).

in external-client's Create even if the returned diagnostics contain
errors.

- In most cases, the Terraform plugin SDK's create implementation
  for a resource comprises multiple steps (with the creation of
  the external resource being the very first step). In case, the
  creation succeeds but any of the subsequent steps fail, then
  upjet's TF plugin SDK-based external client will not record
  this state losing the only opportunity to associate the MR
  with the newly provisioned external resource in some cases.
  We now put this initial state into the upjet's in-memory
  state cache so that it's now available for the external-
  client's next observe call.

Signed-off-by: Alper Rifat Ulucinar <[email protected]>
Copy link
Collaborator

@erhancagirici erhancagirici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, this is a great catch!

Copy link
Member

@sergenyalcin sergenyalcin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ulucinar LGTM!

@ulucinar ulucinar merged commit 9543be9 into crossplane:main Dec 18, 2023
12 checks passed
@ulucinar ulucinar deleted the capture-id branch December 18, 2023 09:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants