-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reproduce aws-3880 in a test #1917
Conversation
@@ -102,17 +102,23 @@ func (ta *typeAdapter) NewValue(value any) tftypes.Value { | |||
switch v := value.(type) { | |||
case map[string]any: | |||
values := map[string]tftypes.Value{} | |||
for k, el := range v { | |||
values[k] = fromType(aT[k]).NewValue(el) | |||
for key, expectedType := range aT { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted that tftypes.Value representation insists (via panics) that every attribute has an entry, even if it's a nil entry. It also insisted on no optional attributes. This is now the case in this adapter, it frees the test writer from writing explicit nulls for attributes that do not matter.
} | ||
return n | ||
}, | ||
// Set: func(v interface{}) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was really here for debugging only. The original in AWS does not specify a custom Set.
for _, v := range value.AsValueSet().Values() { | ||
newBlock := body.AppendNewBlock(key, nil) | ||
writeBlock(newBlock.Body(), elem.Schema, v.AsValueMap()) | ||
if !value.IsNull() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tolerate missing values encoded by nulls - as required by tftypes.Value (and translated to cty.Value).
@@ -58,8 +58,8 @@ func runDiffCheck(t T, tc diffTestCase) { | |||
tfwd := t.TempDir() | |||
|
|||
tfd := newTfDriver(t, tfwd, providerShortName, rtype, tc.Resource) | |||
_ = tfd.writePlanApply(t, tc.Resource.Schema, rtype, "example", tc.Config1) | |||
tfDiffPlan := tfd.writePlanApply(t, tc.Resource.Schema, rtype, "example", tc.Config2) | |||
_ = tfd.writePlanApply(t, tc.Resource.SchemaMap(), rtype, "example", tc.Config1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resource may specify Schema or SchemaFunc, and SchemaMap() normalizes to access either, this is the way.
Alright, there is something fairly unexpected going on with what the hash function receives under TF proper for this example during a normal TF lifecycle. I have instrumented the set function:
These are the invocations under TF itself.
|
Under Pulumi there is a new and exciting combination I have not seen under TF:
|
I think this new log line comes from
This sort of makes sense, the makeDetailedDiff/visitPropertyValue call chain assumes that it can compute TF representations directly off |
Let me play a bit more to be absolutely sure that fixing this would resolve the bug. |
Turns out to be a red herring - not the root cause; disabling detailed diff is not affecting the test result here.
So the root cause is closer to the other avenue - Pulumi is confusing 643481770 and 4129856294 hashes So this is about whether custom_response is populated or not under rule[0].action.block.
|
To be continued here - curious at which earliest point Pulumi starts to diverge from TF on the custom_response. In the meanwhile we can try to teach the AWS provider to disregard the custom_response difference for the purposes of hashing. This might workaround the problem for this resource in particular. |
This kind of looks like #1915 |
{ | ||
"action": { | ||
"block": { | ||
"customResponse": null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the smallest repro so far. Very interesting. You can omit "customResponse" here and it still thinks this is a change.
Spelunking further. The custom_resource null entry got injected here by provider2.InstanceState
|
Hmm, so I'm pretty sure this the place we hit. NormalizeObjectFromLegacySDK is the place where we go from |
}) | ||
} | ||
|
||
func TestAws3880Minimal(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's quite a few levels of nesting - if required then we need to update the property based tests to go deeper - this could not have been generated there.
This gets more interesting. I've instrumented resource.SimpleDiff. This is the method at which resource planning bottoms out in both scenarios.
the only change is around rawconfig.. I can fix that by a small change, however this does not fix the test. |
The change is this:
|
This is super fascinating, so under both Pulumi and TF, even if we pass the exact same data into res.SimpleDiff and get exact same InstanceDiff back we get different results. The InstanceDiff object printout is the same in both cases:
|
I need to take a break from this, but my last hypothesis is that TF just ignores the IntanceDiff .. the only reason we use it are legacy reasons. TF might be computing cty.Value PlannedState and then diffing existing state vs PlannedState to show detailed diff. This is kind of implied in the gRPC protocol of PlanResourceChange that does not expose InstanceDiff on the wire. I had to patch the terraform-plugin-sdk to get a hold of this object which clearly is playing against the rules. The reasoning was that our detailedDiff computation was very difficult to edit for historical reasons, and it is written against the InstanceDiff type. I think if this hunch is right here, we need to finally rewrite detailed diff. It can be done by comparing two cty.Value instances - one from prior state and one from PlannedState and translating the diff back to Pulumi domain (this assumes there are no Pulumi properties without matching TF properties). This can clean out a lot of things. |
Or we can indeed diff Pulumi representations as outlined in #1895, with the caveat that we likely will need to special-case set diffing and the set typing information is only available in TF world. |
Interesting. I can confirm that in this case in PlanResourceChange we get:
And yet we have non-zero DiffAttributes
Consequentially this heuristic is the root cause of the wrong decision here, as And this is the root-cause fix: |
With AWS 3880 there is some evidence (derivation in #1917) that sometimes TF has entries in the InstanceDiff.Attributes while still planning to take the resource to the end-state that is identical to the original state. IN these cases, TF does not display a diff but Pulumi does. The root cause here remains unfixed (#1895) - Pulumi bridge is editing terraform-pulgin-sdk to expose the InstanceDiff structure to connect it to the makeDetailedDiff machinery. Pulumi should, like TF, stick to the gRPC protocol and rely only on the PlannedState value. We can incrementally approach the desired behavior with this change though which detects PlannedState=PriorState case and suppresses any diffs in this case. Fixes: - pulumi/pulumi-aws#3880 - pulumi/pulumi-aws#3306 - pulumi/pulumi-aws#3190 - pulumi/pulumi-aws#3454 --------- Co-authored-by: Venelin <[email protected]>
The AWS issue got fixed! Closing this. |
Toward a min-repro for pulumi/pulumi-aws#3880
WebACL Rule attribute has a complicated schema and it looks like something is getting distorted under the Pulumi translation with no user changes, so that a different set element identity is computed under Pulumi during the first and the second
pulumi up
.This resource is already under PlanResourceChange flag.
Narrowing it down from here.