Replies: 3 comments
-
@mbbush In the previous architecture, the problem was that in some cases the provider did not trigger an update even though one was needed (the desired state and the actual state differed). It acted as if everything was normal and waited for the next reconciliation at the poll interval. As a result, update requests were made in the next reconciliation loop rather than immediately, and ideally you want them made immediately. After discovering this problem, we discussed it and the team addressed it in the new architecture. So this feels more like a feature that fixes a bug than a new bug.
-
I'm inclined to agree that this is a feature, but it was a surprising one to discover, and it has negative consequences in certain cases. I think the benefits outweigh the problems, so the only "fix" needed is better documentation and communication about this change and about the scenario in which it could be a problem.
-
We are aiming to make documentation improvements as part of the Upjet 1.2 release and will include this in that scope.
-
What happened?
When the state after apply does not match the desired state, the no-fork architecture seems to put itself in an infinite loop, updating the same resource again as soon as possible. In the fork architecture, the update happened only once every 10 minutes. In both architectures, this is invisible as far as the resource's status is concerned, but an event "Successfully requested update of external resource" is fired for every update.
How can we reproduce it?
When applied, AWS sets the assumeRolePolicy to
(ignoring whitespace differences that the terraform provider correctly ignores)
In provider-aws version 0.43, this update happens once every 10 minutes (the default drift detection interval). In provider-aws version 0.45 (using upjet 1.0.0), it happens roughly once per second, about as fast as the API calls to update the resource take to complete.
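To put rough numbers on that difference, here is a toy model in Python. It is not upjet's reconcile code, just arithmetic under stated assumptions: the diff never clears, each update call takes about one second, and the only thing that changes between versions is how long the controller waits before reconciling again. The function and parameter names are made up for illustration.

```python
def count_updates(window_s, update_latency_s, requeue_delay_s):
    """Estimate update calls in a time window when every reconcile finds a diff.

    Toy model: consecutive reconciles start at least `requeue_delay_s` apart,
    and never closer together than the update call itself takes.
    """
    interval = max(update_latency_s, requeue_delay_s)
    return int(window_s // interval)


HOUR = 3600

# Assumed old behavior: the next reconcile waits for the 10-minute poll interval.
polled = count_updates(HOUR, update_latency_s=1, requeue_delay_s=600)

# Assumed new behavior: the resource is requeued right after the update.
immediate = count_updates(HOUR, update_latency_s=1, requeue_delay_s=0)

print(polled, immediate)
```

Under these assumptions, the same permanently drifting resource goes from a handful of update calls per hour to thousands, which is why a previously harmless diff becomes noticeable.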
I'm honestly not sure if this is a feature or a bug. The best outcome would be for me not to have resources like this, with missing defaults that create a constant diff, and I will fix mine. But I think this should at least be called out in the migration documentation: users may not be aware of resources that were doing a no-op update every 10 minutes, which is mostly harmless, whereas the tight loop seems like more of a resource-usage issue.
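One way to spot such resources before migrating is to count the update events mentioned above. The sketch below is only an illustration: it assumes the events were exported with something like `kubectl get events --all-namespaces -o json > events.json`, and that each item carries the usual core `Event` fields (`message`, `involvedObject`, `count`). The helper name `count_update_events` is hypothetical.

```python
import json
from collections import Counter

# Event message quoted in this discussion.
UPDATE_MESSAGE = "Successfully requested update of external resource"


def count_update_events(events_json):
    """Return a Counter of update events keyed by (kind, name).

    `events_json` is the JSON text of a Kubernetes event list
    (assumed shape: {"items": [...]}).
    """
    counts = Counter()
    for ev in json.loads(events_json).get("items", []):
        if UPDATE_MESSAGE not in (ev.get("message") or ""):
            continue
        obj = ev.get("involvedObject") or {}
        key = (obj.get("kind", "?"), obj.get("name", "?"))
        # Repeated events may be aggregated into one item with a `count`.
        counts[key] += ev.get("count") or 1
    return counts
```

Calling `count_update_events(open("events.json").read()).most_common(10)` would then list the resources with the most update events. A resource that shows a steady stream of them is a candidate for the constant-diff situation described here.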