TF tries to destroy/recreate older projects with GAE enabled upon upgrade #1561
Comments
(assigned to both paddy and vincent since vincent is on duty this week; it's up to the two of you who wants to take it) |
Hi, I'm having the same issue as @stevewolter in #1503 (comment). Could the |
Good idea, will do! |
Hi @stevewolter / @endemics! This is just happening because there's no App Engine block defined inside your |
Hi @paddycarver, got it. I knew how to fix this, and the fix is fine, but my problem is rather the surprise impact. My Terraform GCP plugin updated automatically, my config didn't change, and TF would have suddenly deleted all my projects because it thought it needed to disable GAE. That is an unpleasant surprise, and any mitigation of the surprise would be appreciated. |
Hi @stevewolter, thanks for the clarification. And my sincerest apologies for the surprise. I'll be honest, it was just an oversight on my part, and had I properly considered that scenario, I probably would have done the release a bit differently (either holding it for 2.0.0, or messaging it much more strongly). I can definitely understand how that's an unpleasant surprise to find after an upgrade. As for mitigation of the surprise, I'm weighing my options, but I'm not seeing any super good ones:
If there's a possibility I'm missing, I'd definitely love to hear about it. And again, my sincerest apologies for overlooking that scenario during the release, that was a far more unpleasant surprise than I had intended. |
Same for us here, the main issue is that it is a breaking change in a minor release, and was not advertised. |
@paddycarver My apologies if I came across too strongly. My world is a much better world with the App Engine support, I'm seriously happy about it being launched, and I just had some feedback about the rollout. Thank you for working on this. On my end, a well-placed "prevent_destroy" flag on the project prevented real damage. I realize that all of the options have some drawbacks. Might it be feasible to select between two options with a flag? e.g. have a "turn_down_unexpected" flag that controls the behavior when GAE is found but not configured in TF? The flag might be false by default for now and the default could be switched to true after some announcement and lead time. I'm not sure whether that's an option with the way TF plugins interoperate with TF's core, though. |
Hi, I've checked adding
to my While that breaking change was unfortunate, I understand that fixing the situation in code at this point is not desirable, especially since the affected user population is probably quite limited. Hence, I wouldn't mind having this bug closed provided that the CHANGELOG's Although not directly related to this issue (it is tangential, via #1503), I also believe that the resource documentation itself should be updated with a note about the need to add the "App Engine Admin API" to the list of enabled APIs if one uses |
Hey @endemics, I updated the wording of the warning in the CHANGELOG. Let me know what you think. |
No apologies necessary :) Honestly, this is just a really bad oversight to have on my part, because the consequences can be catastrophic. I feel bad, and want to fix the situation, but I'm not quite sure I can glue this vase that I broke back together. I may have to settle for just being more careful around vases in the future.
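I am interested in this. Did terraform plan (or apply, which now shows the plan output and asks for confirmation) not show the re-creation? Are you using a setup that doesn't show you plan output, effectively blindly applying? (That sounds accusatory, but I don't mean it to be, I'm just trying to make sure my assumptions about what feedback users are seeing are accurate.) To be clear, the question isn't to absolve us of responsibility for not shipping surprising things, but just so my mental model of user workflows is correct.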
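I'd be open to that, but that kind of exists today? That's what lifecycle.ignore_changes does. And I don't know that it would have prevented the surprise, because you'd need to edit your config to take advantage of it. Am I misunderstanding? |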
I could see an argument for this, but we don't really document any of the other APIs that our resources need, partially because they could change over time, and keeping on top of that is tricky. Is there something about the App Engine Admin API you believe makes it different and deserving of special handling? |
On Tue, Jun 19, 2018 at 10:57 PM Paddy ***@***.***> wrote:
> > My apologies if I came across too strongly. My world is a much better
> > world with the App Engine support, I'm seriously happy about it being
> > launched, and I just had some feedback about the rollout. Thank you for
> > working on this.
>
> No apologies necessary :) Honestly, this is just a really bad oversight to
> have on my part, because the consequences can be catastrophic. I feel bad,
> and want to fix the situation, but I'm not quite sure I can glue this vase
> that I broke back together. I may have to settle for just being more
> careful around vases in the future.
>
> > On my end, a well-placed "prevent_destroy" flag on the project prevented
> > real damage.
>
> I am interested in this. Did terraform plan (or apply, which now shows
> the plan output and asks for confirmation) not show the re-creation? Are
> you using a setup that doesn't show you plan output, effectively blindly
> applying? (That sounds accusatory, but I don't mean it to be, I'm just
> trying to make sure my assumptions about what feedback users are seeing are
> accurate.)
> To be clear, the question isn't to absolve us of responsibility for not
> shipping surprising things, but just so my mental model of user workflows
> is correct.
Happy to explain. We are using a setup that applies blindly. For
background, we are a dev team of ~20 people, each of whom runs a personal
dev project, and then we have a couple of nightly setups for trusted
testers. We have no customer-facing prod setup yet. Of the devs, ~3 know
enough about Terraform to actually check the plan, and the rest just
learned to type "yes" and ask no questions, so we disabled the
confirmations and started to apply blindly both in the developer-run update
scripts and in the CI.
When we disabled confirmations, I added prevent_destroy flags on key
resources like GCP projects and databases. I guess this probably saved me a
few hours of fixing projects, but no other damage would have happened.
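
For readers unfamiliar with the flag described above, a minimal sketch of a prevent_destroy guard on a project might look like the following. The resource name, project ID, and organization ID are placeholders chosen for illustration, not values from this thread.

```hcl
# Hypothetical project definition; every identifier here is a placeholder.
resource "google_project" "dev" {
  name       = "example-dev"
  project_id = "example-dev-123456"
  org_id     = "000000000000"

  lifecycle {
    # Terraform rejects with an error any plan that would destroy
    # (or destroy and recreate) this resource, instead of applying it.
    prevent_destroy = true
  }
}
```

With this in place, an unattended apply that wants to recreate the project stops with an error rather than proceeding, which matches the outcome reported here.
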
> > I realize that all of the options have some drawbacks. Might it be
> > feasible to select between two options with a flag? e.g. have a
> > "turn_down_unexpected" flag that controls the behavior when GAE is found
> > but not configured in TF? The flag might be false by default for now and
> > the default could be switched to true after some announcement and lead
> > time. I'm not sure whether that's an option with the way TF plugins
> > interoperate with TF's core, though.
>
> I'd be open to that, but that kind of exists today? That's what
> lifecycle.ignore_changes does. And I don't know that it would have
> prevented the surprise, because you'd need to edit your config to take
> advantage of it. Am I misunderstanding?
Maybe. So what I'm thinking is: Is there a way to make ignore_changes or
prevent_destroy the default behavior only for the case where the absence of
a GAE config would recreate the project? So that a legacy config that
doesn't mention GAE or additional flags at all wouldn't try to destroy a
project that has GAE enabled? I realize that the answer might be "no".
At this time, I guess all the potential damage is already done, so it might
not be worth it at all. Very few folks complained at all, so I guess it's
not a widespread phenomenon in the first place.
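
As a rough illustration of the lifecycle.ignore_changes approach discussed above, the sketch below assumes a provider version from the period of this thread, in which google_project exposes an app_engine block. The resource name and IDs are placeholders, and whether ignoring app_engine fully suppresses the recreation is an assumption based on the suggestion in this thread, not something verified here.

```hcl
# Hypothetical project definition; every identifier here is a placeholder.
resource "google_project" "dev" {
  name       = "example-dev"
  project_id = "example-dev-123456"
  org_id     = "000000000000"

  lifecycle {
    # Terraform 0.11-era syntax with quoted attribute names; newer
    # Terraform versions expect unquoted references instead.
    # Assumption: differences in app_engine between the config and the
    # real project are ignored, so they do not produce a diff.
    ignore_changes = ["app_engine"]
  }
}
```

As noted in the thread, this only helps once the config has been edited, so it would not have protected an unchanged legacy config during the upgrade.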
|
There is, but it changes the default behavior again, and I'm unsure about the potential for benefit here; I think, as you mentioned, the damage was kind of done at this point, unfortunately.
I do want to be really explicit that this is playing with fire and while it may be appropriate for a dev setup where downtime is tolerable, it's really not an appropriate or supported practice where occasional downtime is not acceptable. (No judgement on which your situation is, that's for you to decide, I just want to be really explicit on this point.) Terraform's All this being said, I'm struggling to come up with an action item for this issue, so I'm going to close it. Again, my sincerest apologies for the oversight. Thank you all for your patience and understanding. |
Sounds good to me, and thank you for the warning. Thank you so much for implementing this feature in the first place (the bumps in the rollout are really a minor issue in a great big picture), and for listening to me. I really appreciate it. |
I don't think recreating the project in order to "remove" app engine will actually work, as projects don't really get deleted from GCP for maybe ~30 days. When TF "destroys" a project it just gets moved to a pending deletion status, and the real kicker, you can't create a new project with the same ID, because the old project with that same ID still exists. I don't think the current behavior of trying to recreate a project in order to disable/remove app engine makes sense in any context.
This seems like the most reasonable action to me, although I agree it is not ideal, just that the other options make even less sense. [edit new issue added here #1973] |
Hey @sjungwirth, a new issue would be the best place to discuss this. |
@paddycarver thanks for the detailed explanation. I came here with the same problem of Terraform suddenly trying to wipe out every project. What's Google's take on this? Are they even interested or incentivized to make their cloud resources better suited to config management? Terraforming resources in Google Cloud is a real pain at the moment. It's a mess with IAM, with "enabled" APIs, and now with projects and App Engine. As a Terraform user, I want as many resources as possible managed in code through the provider. However, I'd much prefer that features that can blow up high-level resources not get into the provider. |
Also, terraform behaves properly (not trying to re-create projects) after this snippet is added to project, terraform plan and then deleted.
PS never mind, it had app engine somehow enabled outside of terraform |
Google's been really receptive to making their resources work well with infrastructure as code tools, and has a team internally working on this provider to make it great. They have a tight working relationship with the people that build the products and APIs, and because of their involvement, we're able to do a lot more than we could with just HashiCorp and outside contributors working on this. A lot of our wins go unnoticed, but that tight partnership has warded off a bunch of user pain before it ever got released, and opened a lot of doors that we'd otherwise consider non-starters. That said, some of these APIs--for example, App Engine--predate Google Cloud itself, and so don't get to take advantage of some of the lessons learned there. Work is being done there, and the teams are doing what they can, but APIs are by nature slow-moving things, because it's important to not break consumer implementations. Overall, however, we're trending in the right direction.
I think that's fair, and is feedback we'll take under advisement. I'm currently circulating a proposal internally for another approach to this, which I'll post here as an issue to gather community feedback on. I'd love to hear your feedback on it. I'll try to link it to this issue when it's posted. |
For reference, if your terraform binary isn't updated to the latest, the provider may silently destroy your project without mentioning the change/intent in a plan. Just happened to me with 0.10.7 |
Is there a bug related to this? That is very surprising and sounds very different from this issue (where projects get recreated, but that is absolutely mentioned in the plan). |
I'm very interested in how that could happen as well, that's not intended behaviour at all. I'm sorry for the inconvenience, I had no idea that would happen. I'm curious, but I think we've also established the current design is not working. #2118 looks to be the way we're moving forward. I need to update the thread with the results of implementation, so #2147 isn't an entirely faithful implementation of that proposal, but it's the closest I could get. I'm hoping it's part of the next release. |
@morgante see #2118 (comment) for RCA. Likely the only bug relates to targeted apply not doing what was expected, and some PEBKAC on my part. |
See report at #1503 (comment)
Answer is probably just to set app_engine to computed in the project resource.