Cache the Terraform instance state returned from schema.Resource.Apply even if the returned diagnostics contain errors #313
Description of your changes
Possibly related issues: crossplane-contrib/provider-upjet-aws#1018, crossplane-contrib/provider-upjet-aws#1010. We are still reproducing these issues and validating that the proposed fix actually resolves them.
This PR proposes a change in which upjet's Terraform plugin SDK-based external client caches the Terraform instance state returned from schema.Resource.Apply in the external client's Create function even if the returned diagnostics contain errors. Previously, we dismissed this instance state whenever the diagnostics contained errors.

In most cases, the Terraform plugin SDK's create implementation for a resource comprises multiple steps, with the creation of the external resource being the very first step. If the creation succeeds but any of the subsequent steps fails, upjet's TF plugin SDK-based external client does not record this state, and in some cases loses the only opportunity to associate the MR with the newly provisioned external resource. We now put this initial state into upjet's in-memory state cache so that it is available for the external client's next observe call.
Please note that the safe thing to do in this situation, from the Crossplane provider's perspective, is to set the MR's crossplane.io/external-create-failed annotation, because the provider does not know the exact state the external resource is in and manual intervention may be required. However, we currently believe that associating the external resource with the MR provides a better UX, even though the external resource may not be in the expected/desired state yet. We are also planning improvements in the crossplane-runtime's managed reconciler to better support upjet's async operations in this regard.

The managed reconciler is stricter about errors encountered during creation because such errors can leave the external resource in an inconsistent state, and the MR might not have been associated with the newly provisioned external resource yet. In our case, we do know that an external resource has been provisioned and we have enough information to associate the MR with it. The caveat is that we cannot be sure whether the external resource is in the expected state. We may need to revisit this in the future according to the feedback we receive.

I have:
- Run make reviewable to ensure this PR is ready for review.
- Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested
Tested with a custom upbound/provider-aws build consuming this change and another one that returns an error from the underlying Terraform AWS provider after a successful create SDK call (right after the first external-create step described above).