Feature request: allow apply on merge #2172

Open
jacekn opened this issue Mar 30, 2022 · 46 comments
Labels
feature (New functionality/enhancement), help wanted (Good feature for contributors)

Comments

@jacekn

jacekn commented Mar 30, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Overview of the Issue

Currently when a PR is merged without atlantis apply we end up in a situation where reality doesn't match desired state described with terraform. There is also no way to apply changes post-merge without raising a dummy PR.

At the same time, in GitHub, it's sometimes not possible to prevent people from merging PRs. For example, anyone with repo admin access gets implicit push access and can therefore merge approved PRs without having to apply first.

In practice this means that there is currently no way to prevent human errors (people merging approved PRs without running atlantis apply first).

I'd like to propose optional functionality that allows atlantis to be configured to apply changes on PR merges. This would allow organizations that cannot remove push or admin access from repos to ensure changes are applied on merge without relying on humans to remember to run atlantis apply.

I realize that there is a possibility that apply fails and we'll end up being out of sync but this is a trade off that some organizations might be willing to accept.

Reproduction Steps

Open a PR and then merge it without running atlantis apply first. The code is now out of sync with reality and there is no way to apply the changes retroactively.
The only workaround I found is to open a second, dummy PR and ensure it's applied before merging.

Logs

Environment details

I'm on atlantis v0.18.2, but I believe older and newer versions are also affected by this.

Additional Context

This feature was requested in 2017 but wasn't implemented back then: #36

There is also a discussion about the limitations of the mergeable requirement here: #1316

Apply on merge would allow us to create a workflow similar to the one HashiCorp describes in its docs. Since HashiCorp itself describes it, I think the functionality is non-controversial.

@jacekn jacekn added the bug Something isn't working label Mar 30, 2022
@chenrui333 chenrui333 added feature New functionality/enhancement help wanted Good feature for contributors and removed bug Something isn't working labels Mar 31, 2022
@edbighead
Contributor

will setting atlantis/apply as a required check in branch protection rules help in that case?
note: might be only available on latest version
image

@jacekn
Author

jacekn commented Apr 6, 2022

will setting atlantis/apply as a required check in branch protection rules help in that case? note: might be only available on latest version

No, it won't work. For atlantis to respect GitHub branch protection rules we need to use the mergeable apply requirement. If we set up a branch protection rule that requires a successful atlantis/apply, then the PR will not become mergeable until we apply, and we can't apply until it's mergeable...
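The circular dependency described above can be made concrete with a toy model. This is only a sketch and not Atlantis code: the function names and the dictionary of check states are assumptions made for illustration.

```python
from __future__ import annotations


def github_mergeable(required_checks: dict[str, bool]) -> bool:
    """Model of GitHub: a PR is mergeable only if every required check is green."""
    return all(required_checks.values())


def try_atlantis_apply(required_checks: dict[str, bool]) -> str:
    """Model of the `mergeable` apply requirement: refuse to apply unless mergeable."""
    if not github_mergeable(required_checks):
        return "blocked"
    return "applied"


# atlantis/apply is required, but it can only turn green after a successful apply.
with_required_apply = {"ci/tests": True, "atlantis/apply": False}
# Same PR, but atlantis/apply is not listed as a required check.
without_required_apply = {"ci/tests": True}

print(try_atlantis_apply(with_required_apply))     # blocked
print(try_atlantis_apply(without_required_apply))  # applied
```

In the first case the check that would unblock the apply is itself produced by the apply, so the PR can never leave the blocked state.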

@satyamz

satyamz commented Apr 18, 2022

Hey all,
To add this feature can we do something like this (Please correct me if I am wrong)? :

  • Add support for an apply-on-merge flag in the atlantis server so that users can enable applying after merge. Add a merged pull_request_event type to PullRequestEventType in server/events/models/models.go. Refer to the go-github API code.
  • Add a condition that allows atlantis to apply when the apply-on-merge flag is set to true and the PR is merged.
  • If apply-on-merge is set to true, bypass the PR state check ctx.Pull.State != models.OpenPullState in server/events/command_runner.go.
  • Make sure that the post-run hooks (server/events/post_workflow_hooks_command_runner.go) use the apply-on-merge flag to skip merging the PR (since it's already merged) and continue with removing the atlantis locks.
  • If the PR is open and an atlantis apply comment event occurs, check the apply-on-merge flag and reject the apply if the flag is set.
  • If a user comments atlantis apply on a closed PR, return as usual.
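The steps in the list above amount to a small decision table. The following is a minimal sketch of that table in Python, not the proposed Go change itself: the `Event`, `PullState`, and `decide` names and the returned action strings are all hypothetical, and the "unlock_only" branch assumes that without the flag a merge only triggers cleanup.

```python
from enum import Enum


class Event(Enum):
    APPLY_COMMENT = "apply_comment"  # someone commented `atlantis apply`
    PR_MERGED = "pr_merged"          # the pull request was merged


class PullState(Enum):
    OPEN = "open"
    CLOSED = "closed"


def decide(event: Event, state: PullState, apply_on_merge: bool) -> str:
    """Return the (hypothetical) action for an incoming event."""
    if event is Event.PR_MERGED:
        # With the flag: apply, skip the merge step, then release locks.
        return "apply_then_unlock" if apply_on_merge else "unlock_only"
    if state is PullState.CLOSED:
        # A comment on a closed PR is handled as it is today.
        return "ignore"
    # A comment on an open PR: rejected when apply-on-merge is enabled.
    return "reject" if apply_on_merge else "apply"


print(decide(Event.PR_MERGED, PullState.CLOSED, apply_on_merge=True))   # apply_then_unlock
print(decide(Event.APPLY_COMMENT, PullState.OPEN, apply_on_merge=True))  # reject
```

Writing the table out this way makes it easy to see that every combination of event, PR state, and flag has exactly one outcome.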

Alternatively, apply-on-merge could be an atlantis apply requirement, so that users can specify it via the config file.
Does this make sense? What else can we improve here to achieve apply after merge?

Thanks.

@jacekn
Author

jacekn commented Apr 21, 2022

Would one of the maintainers be able to assess the above plan from @satyamz? We're considering contributing to the project by adding this functionality but would like to know if the plan looks sound and whether a PR would be accepted.

@jghal

jghal commented Apr 26, 2022

I'm interested in this due to the GitLab API bug mentioned in #1174 that is preventing Atlantis' mergeable apply requirement from functioning. Having apply-on-merge would let us make sure all the MR approvals are adhered to before the plan can be applied.

@fredcooke

I'd like to take this issue one step further:

automatically plan on any change to the PR - failing on other parallel PRs is okay - but maybe make that optional for obvious reasons:

do NOT respond to atlantis apply manual commands on unmerged files

and finally, apply on merge

this relies on the plan being okay and the git provider approvals/results on the PR being acceptable and merge possible - so no smarts needed there, just dump plan output on every push and do apply and dump output on merge.

This tool is almost perfect on the face of it but the above work flow is the only one that makes sense to me for any serious use.

@jacekn
Author

jacekn commented May 26, 2022

I'd like to take this issue one step further:

automatically plan on any change to the PR - failing on other parallel PRs is okay - but maybe make that optional for obvious reasons:

I believe plans are already regenerated on PR changes; there is no need to run atlantis plan when new commits are added.

do NOT respond to atlantis apply manual commands on unmerged files

Yes absolutely, it's a good point that in this scenario atlantis apply should not do anything.

and finally, apply on merge

this relies on the plan being okay and the git provider approvals/results on the PR being acceptable and merge possible - so no smarts needed there, just dump plan output on every push and do apply and dump output on merge.

Yes exactly, this is exactly the workflow I'd like to use as well!

This tool is almost perfect on the face of it but the above work flow is the only one that makes sense to me for any serious use.

It's good to see that there are more people who have the same requirements!

@iandelahorne

We would love something along these lines as a feature, as this happens on a regular basis with engineers who are new to our team or who come from outside our organisation.

The ability to require the atlantis/apply check to pass for users to merge, and for atlantis to consider a PR mergeable if all required checks except for that check pass, would likely be enough for us.
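A minimal sketch of the rule suggested here, assuming a simplified view of a PR (one approval flag plus a map of required check names to states). The function name and data shapes are hypothetical and do not reflect how Atlantis or GitHub represent this internally.

```python
from __future__ import annotations


def mergeable_ignoring_apply(
    approved: bool,
    required_checks: dict[str, str],
    ignored_check: str = "atlantis/apply",
) -> bool:
    """Treat the PR as mergeable if it is approved and every required check
    other than `ignored_check` reports "success"."""
    others = {name: state for name, state in required_checks.items()
              if name != ignored_check}
    return approved and all(state == "success" for state in others.values())


checks = {"ci/tests": "success", "atlantis/plan": "success", "atlantis/apply": "pending"}
print(mergeable_ignoring_apply(approved=True, required_checks=checks))   # True
print(mergeable_ignoring_apply(approved=False, required_checks=checks))  # False
```

GitHub would still block the merge button until atlantis/apply is green, while Atlantis would allow the apply that turns it green.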

@zepeng811
Contributor

zepeng811 commented Aug 3, 2022

We would love to see this feature implemented for our team as well; many people forget to comment atlantis apply and merge the PR straight away. And we can't make atlantis/apply a required status check because of the conflict with mergeable.

@siggimoo

siggimoo commented Aug 3, 2022

Speaking as an interested observer, I'm curious if there's something about Atlantis' design that requires it to only run during the PR phase. I ask because I recently came across a notice that said a failure occurring post-merge "would need" to be fixed by a subsequent PR. Is this really mandatory?

If I have a GitHub Action that fails post-merge, I have the option to just re-run it. Granted, if the fault is something within the repository that should never have been merged I'll most likely fix it in a new PR. But if the issue lies outside the repo, if it's something like an expired token or an intermittent network outage, I just need a button.

Is there something fundamental to Atlantis that prevents this sort of capability?

@jamengual
Contributor

is this still an issue with v0.19.8?

@jamengual jamengual added the waiting-on-response Waiting for a response from the user label Aug 26, 2022
@jacekn
Author

jacekn commented Sep 5, 2022

is this still an issue with v0.19.8?

As far as I can tell, yes, this is still a problem.
The tricky part about the "apply on merge" functionality is that we are mostly dealing with a GitHub limitation. It is currently not possible to confirm whether GitHub merge requirements are met without actually allowing people to merge. And once people merge, we have drift between the code and the actual state.

@nitrocode
Member

Add atlantis/apply as a required check

Set https://www.runatlantis.io/docs/server-configuration.html#gh-allow-mergeable-bypass-apply flag

@jacekn
Author

jacekn commented Jan 17, 2023

@nitrocode regardless of the potential workaround I still think this feature would be very useful. Based on thumbs up there are at least 44 others who agree.

We actually added this capability to our fork of atlantis, and we should be in a position to submit a PR soon so that everyone in the community can benefit from it.

Could this issue be reopened?

@jamengual jamengual reopened this Jan 17, 2023
@jamengual
Contributor

jamengual commented Jan 17, 2023

This can be reopened, but keep in mind that people have tried to do this before in different ways; a lot of logic was added and it broke other functionality, since this part of the code is very tricky due to workarounds and limitations on VCSs.

PRs are welcome, but please give gh-allow-mergeable-bypass-apply, which @nitrocode mentioned, a try.

@nitrocode
Member

nitrocode commented Jan 18, 2023

I don't understand what is wrong with the current solution or workaround of using https://www.runatlantis.io/docs/server-configuration.html#gh-allow-mergeable-bypass-apply. Please explain why this method is not what you prefer @jacekn

@Manan-Kothari

@nitrocode Adding this flag still doesn't work, unless there's some other setting I'm missing. I've set up the atlantis/apply check and added the flag, but after getting an approval I get the message below. Once I remove the atlantis/apply check, I'm able to apply normally.

Setting the flag

    - name: ATLANTIS_GH_ALLOW_MERGEABLE_BYPASS_APPLY
      value: "true"

This is what we have for our apply_requirements

      apply_requirements: [approved, mergeable]

Apply Failed: Pull request must be approved by at least one person other than the author before running apply.

@nitrocode
Member

@Manan-Kothari

Apply Failed: Pull request must be approved by at least one person other than the author before running apply.

You have mergeable set and you have PR requirements of 1 approval or more... so you need an approval on the PR before you can merge the PR via atlantis apply.

@Manan-Kothari

I did have that which is why I'm asking

image

@jacekn
Author

jacekn commented Feb 20, 2023

I don't understand what is wrong with the current solution or workaround of using https://www.runatlantis.io/docs/server-configuration.html#gh-allow-mergeable-bypass-apply. Please explain why this method is not what you prefer @jacekn

Here are the reasons I can think of:

  1. If atlantis just ignores "atlantis/apply" status check then we would require users to know internals of atlantis and know that changes can be applied without this particular status check. Basically there would be no green "merge" button to indicate all requirements are met.
  2. It's not unusual for projects to act on new commits in the main/master branch. For example, those can trigger builds or deploys to dev. When an organization already uses this workflow, it would be nice for atlantis to behave in the same way. It's a matter of choice of course. Driving things from GH comments is a valid way to interact with systems too.
  3. The bypass method does not allow admins to restrict who can land changes. With apply on merge admins can control, via GitHub repo perms, who can merge changes thus apply them. People who can apply can be subset of those who can open PRs
  4. The "apply on merge" workflow is what hashicorp describe, for example here. It would be nice to be able to replicate this workflow in atlantis without the need for terraform cloud. Given that the workflow is described by hashicorp I think this workflow should not be considered controversial.

@jacekn
Author

jacekn commented Feb 21, 2023

@nitrocode quick question about the --gh-allow-mergeable-bypass-apply flag as it's not clear from the docs. With the flag set will atlantis ensure that all non-status-check branch protection rules are passing? For example branches might require conversation resolution or signed commits. Will those be taken into account and will atlantis only allow changes to be applied if all rules are met?

@fredcooke

@jacekn number 4 exactly - terraform cloud works quite nicely in this regard, though it'd be nice to have some alternative to it that isn't proprietary and pay per view.

This stuff is trivial to achieve in something like Jenkins. You need some sort of call back event or webhook on new updates to the default branch ref spec in the remote - eg direct push or PR merge of any kind.

The comment in the docs about lack of status checks in the branch protection rules seems bizarre at first - you can configure them to require any/all status checks and then when some kind of CI/CD registers one against it, suddenly that becomes required, for example.

PR created = plan
Any update to default branch refspec = apply
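A rough sketch of that two-line routing rule against GitHub webhook deliveries. The field names used (`action`, `ref`, `repository.default_branch`) follow GitHub's `pull_request` and `push` webhook payloads; the function name and the "plan"/"apply" return values are hypothetical, and this is not how Atlantis routes events today.

```python
from typing import Optional


def action_for_webhook(event_type: str, payload: dict) -> Optional[str]:
    """Map a webhook delivery to "plan", "apply", or None.

    event_type is the value of the X-GitHub-Event header.
    """
    # PR created or updated with new commits -> plan
    if event_type == "pull_request":
        if payload.get("action") in {"opened", "synchronize", "reopened"}:
            return "plan"
        return None
    # Any update to the default branch (direct push or PR merge) -> apply
    if event_type == "push":
        default_branch = payload["repository"]["default_branch"]
        if payload.get("ref") == f"refs/heads/{default_branch}":
            return "apply"
    return None


push_to_main = {"ref": "refs/heads/main", "repository": {"default_branch": "main"}}
print(action_for_webhook("push", push_to_main))                      # apply
print(action_for_webhook("pull_request", {"action": "opened"}))      # plan
```

The sketch leaves out the hard parts raised elsewhere in this thread, such as which stored plan to apply and where to report a failure.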

How hard can it be? I don't understand how that option helps at all, @nitrocode / @jamengual ? Seems to be orthogonal to the request here.

@jamengual
Contributor

We are not opposed to adding new functionality that can solve this problem.

Many ways have been mentioned that try, in one way or another, to solve this issue, but at the end of the day it looks like something like an --apply-on-merge flag is necessary for admins who do not want to apply their changes before merging.

@ervin-pactum

Disclaimer: I have not installed or used atlantis yet; I'm waiting for this feature to be implemented before trying it out.

Question: could apply-on-merge somehow be achieved with the existing atlantis version, if we triggered a GitHub Action on "push to master" which then made an API call to atlantis? In that case the merge condition could be "atlantis plan passed without errors, and a codeowner gave approval to merge" (which would be handled entirely on the GitHub side). After the merge, the push to master would trigger "something" via the atlantis HTTP API. I can imagine that the main problem would be to infer exactly which prepared plan needs to be run on the atlantis side; after that it would just boil down to streaming the atlantis API output as GitHub Action output for real-time progress reporting and the final result (pass|errored).
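As a sketch of what such a post-merge step might send, the helper below only builds the request and does not perform any network call. It assumes the Atlantis `/api/apply` endpoint, the `X-Atlantis-Token` header, and the `Repository`/`Ref`/`Type`/`Paths` body fields as I understand them from the Atlantis API endpoint docs; verify all of these against the version you run. The function name, host, and token values are hypothetical.

```python
import json


def build_apply_request(base_url: str, api_secret: str, repository: str,
                        ref: str, directory: str = ".",
                        workspace: str = "default") -> dict:
    """Describe a POST to the Atlantis apply endpoint (assumed field names)."""
    body = {
        "Repository": repository,   # e.g. "org/repo"
        "Ref": ref,                 # branch or commit that was just merged
        "Type": "Github",
        "Paths": [{"Directory": directory, "Workspace": workspace}],
    }
    return {
        "method": "POST",
        "url": base_url.rstrip("/") + "/api/apply",
        "headers": {
            "X-Atlantis-Token": api_secret,
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }


request = build_apply_request("https://atlantis.example.com/", "secret",
                              "org/infra", "main")
print(request["url"])  # https://atlantis.example.com/api/apply
```

This does not answer the question of which previously prepared plan should be applied; it only shows the shape of the call a workflow step could make.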

@jamengual
Contributor

jamengual commented Feb 23, 2023 via email

@dimisjim
Contributor

still an issue with 0.27.2

@jacekn I tried to answer your questions here: #2172 (comment) (i.e have a required atlantis/apply status check + atlantis having ATLANTIS_GH_ALLOW_MERGEABLE_BYPASS_APPLY set to true + apply_requirements: [approved, mergeable, undiverged]), and it didn't respect the fact that there was no approving PR review in the branch protection rule :/

cc @nitrocode

@stasostrovskyi
Contributor

stasostrovskyi commented Mar 23, 2024

@dimisjim I think you need to open a separate bug for your case and attach your branch protection configuration. Note that this flag in Atlantis currently works only with branch protection, not with repository rulesets. It also only takes review state and required status checks into account, and it contains a couple of bugs, but there is an open PR to fix them.

@dimisjim
Contributor

Indeed: #4193. Thanks for the tip

@mwos-sl

mwos-sl commented May 6, 2024

GitHub has a feature that can be enabled per repo, the merge queue, which allows running commands between the moment a PR is scheduled for merge and the actual merge.
We think this is the best place for applying terraform.
Has anyone successfully configured atlantis to achieve this?

@brandon-fryslie

ATLANTIS_GH_ALLOW_MERGEABLE_BYPASS_APPLY works great along with the settings to restrict by GitHub team / GitHub user. You can allow 1 team to approve, 1 team to plan, 1 team to apply; users can belong to multiple teams, mix and match as needed (this is on GitHub Enterprise with dozens/hundreds of orgs, so anyone working with those systems needs to be added to the appropriate teams in that org). We have it set up so developers can apply against dev environments as long as another developer approves it, but no one can apply to prod except a small subset of admins. The PR must be mergeable, which requires it to be approved by the admin group. Turning on automerge makes the process pretty seamless (once your Terraform runs successfully, of course).

If atlantis just ignores "atlantis/apply" status check then we would require users to know internals of atlantis and know that changes can be applied without this particular status check. Basically there would be no green "merge" button to indicate all requirements are met.

Having a big green merge button is not actually what you want - the PR is NOT mergeable until the state has been applied, if you want the state of the repo to match reality. Ideally, merges would be locked while Terraform is running, and the PR would be merged immediately upon success. In theory it would be nice if Atlantis could optionally rerun the apply from master, but there is some complexity that makes this undesirable much of the time (in my experience you want to fix forward for the sake of time).

The bypass method does not allow admins to restrict who can land changes. With apply on merge admins can control, via GitHub repo perms, who can merge changes thus apply them. People who can apply can be subset of those who can open PRs

GitHub teams and GitHub branch protection rules cover this use case completely as far as I can tell. Have your PR only be mergeable if approved by a specific team. Have a specific team be the only ones allowed to run apply (or plan, because plan is dangerous and allows arbitrary code execution). And that's it. Your PR will never be mergeable if not approved by your designated team(s) and no one except your configured teams can run plan/apply. Note branch protection rules are configured in GHE, and the Atlantis server side config --gh-team-allowlist="myteam:plan, secteam:apply, DevOps Team:apply" controls this in Atlantis.
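To illustrate how an allowlist string of that shape maps teams to commands, here is a minimal sketch. It is not Atlantis's parser: `parse_team_allowlist` and `is_allowed` are hypothetical helpers, and any extra syntax Atlantis may accept is deliberately not modelled.

```python
from __future__ import annotations


def parse_team_allowlist(raw: str) -> dict[str, set[str]]:
    """Parse "team:command, team:command" pairs into {team: {commands}}."""
    rules: dict[str, set[str]] = {}
    for pair in raw.split(","):
        # Split on the last colon so team names containing spaces stay intact.
        team, sep, command = pair.strip().rpartition(":")
        if not sep or not team.strip() or not command.strip():
            continue  # skip malformed entries in this sketch
        rules.setdefault(team.strip(), set()).add(command.strip())
    return rules


def is_allowed(rules: dict[str, set[str]], user_teams: list[str], command: str) -> bool:
    """A user may run a command if any of their teams is allowed to run it."""
    return any(command in rules.get(team, set()) for team in user_teams)


rules = parse_team_allowlist("myteam:plan, secteam:apply, DevOps Team:apply")
print(is_allowed(rules, ["myteam"], "plan"))        # True
print(is_allowed(rules, ["myteam"], "apply"))       # False
print(is_allowed(rules, ["DevOps Team"], "apply"))  # True
```

Combined with a branch protection rule that requires approval from a specific team, this gives separate control over who can approve, plan, and apply.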

The "apply on merge" workflow is what hashicorp describe, for example here. It would be nice to be able to replicate this workflow in atlantis without the need for terraform cloud. Given that the workflow is described by hashicorp I think this workflow should not be considered controversial.

HashiCorp's recommendations are questionable IMO. I've created some pretty large Terraform setups and in no way would I ever recommend any team starting with Atlantis to start with one module per repo. At a certain point it makes sense, but trying to put one module per repo and versioning everything individually from the beginning is incredibly painful and completely unnecessary.

Hashicorp clearly designed Terraform to run with manual intervention. Running it fully via automation is counter to the workflow Hashicorp requires.


The main reason applying on merge is a bad idea is simple - what if your Terraform apply fails for any reason? If it ran as a server-side pre-merge hook, I could see that: if the apply fails, do not merge the commit. But any other situation will leave you in a bad position. A stale plan (which can happen for a number of reasons), a name that is too long for some resource, an AWS IAM policy that is too large, a duplicate S3 bucket name, any number of reasons. plan doesn't guarantee that apply will succeed. And a failed apply with this setting leaves you with half the changes applied and needing to open another PR to fix it, which loses all of the original context. In some situations (enterprise), you may also have a requirement that each PR have an individual ticket number, and now you need to create another ticket, open a new PR on that ticket, and try again.

There is a lot of complexity and if this feature was implemented it would become very clear for most people why it isn't supported. There are too many undesired behaviors that lead to bad outcomes.

@stasostrovskyi
Contributor

I can't stress enough how much I agree with the post above!

@jacekn
Author

jacekn commented May 29, 2024

If atlantis just ignores "atlantis/apply" status check then we would require users to know internals of atlantis and know that changes can be applied without this particular status check. Basically there would be no green "merge" button to indicate all requirements are met.

Having a big green merge button is not actually what you want - the PR is NOT mergeable until the state has been applied, if you want the state of the repo to match reality.

This was explained somewhere in this issue, but the tl;dr is that you can still end up out of sync even with the atlantis apply workflow if the apply fails. That's because terraform apply is not atomic - apply can succeed only partially.
What the apply on merge feature would do is allow users to choose the workflow they prefer.

The bypass method does not allow admins to restrict who can land changes. With apply on merge admins can control, via GitHub repo perms, who can merge changes thus apply them. People who can apply can be subset of those who can open PRs

GitHub teams and GitHub branch protection rules cover this use case completely as far as I can tell. Have your PR only be mergeable if approved by a specific team. Have a specific team be the only ones allowed to run apply (or plan, because plan is dangerous and allows arbitrary code execution). And that's it. Your PR will never be mergeable if not approved by your designated team(s) and no one except your configured teams can run plan/apply. Note branch protection rules are configured in GHE, and the Atlantis server side config --gh-team-allowlist="myteam:plan, secteam:apply, DevOps Team:apply" controls this in Atlantis.

It's possible that things have changed on GitHub side but historically there was no API to confirm all branch protection rules are met. Approvals are easy but branches might require other things than just approvals (for example you may enforce commit signing at branch protection rule level)

The "apply on merge" workflow is what hashicorp describe, for example here. It would be nice to be able to replicate this workflow in atlantis without the need for terraform cloud. Given that the workflow is described by hashicorp I think this workflow should not be considered controversial.

HashiCorp's recommendations are questionable IMO. I've created some pretty large Terraform setups and in no way would I ever recommend any team starting with Atlantis to start with one module per repo. At a certain point it makes sense, but trying to put one module per repo and versioning everything individually from the beginning is incredibly painful and completely unnecessary.

Hashicorp clearly designed Terraform to run with manual intervention. Running it fully via automation is counter to the workflow Hashicorp requires.

While I agree that some recommendations might be questionable they also are what they are and they come from the same people who created terraform. Not allowing your users to follow them seems like overstepping on the atlantis project side. Having said that I think there is a place for challenging hashicorp's recommendations in the terraform project.

The main reason applying on merge is a bad idea is simple - what if your Terraform apply fails for any reason?

To answer in short - a similar thing will happen if you run an atlantis apply that partially fails. You end up with drift that needs to be rectified. There is no way I can think of to guarantee no drift without changing how terraform itself works.
This feature would allow users to choose which failure mode they prefer based on their individual circumstances, and how they prefer to rectify the drift. What seems to be the feeling from you is that the atlantis project should impose a certain workflow on users even if it goes against upstream recommendations. I personally think that giving the users choice is better for the project and the users themselves.

@GreyTeardrop

The main reason applying on merge is a bad idea is simple - what if your Terraform apply fails for any reason?

I'd like to add my 2c too. I can totally understand that this feature might not be a priority, or, perhaps, is hard to implement due to some limitations of the architecture. However, I can't really accept the argument that this feature is a bad idea cause Terraform apply might fail after the code is merged, at least without some extra arguments. IMO, there's no major difference between the Terraform code as applied to infrastructure by Atlantis/Terraform and any application code as deployed by CI/CD pipelines. Companies that follow continuous deployment GitHub flow (single main branch, pull requests, changes merged to main branch deployed immediately) face the same issue: code changes merged into main branch could break when deployed to production due to migration failures, intermittent networking issues, etc; I would argue that those companies trade the chance of those issues (and the need of manual intervention in that case) for the simplicity and agility of the process when everything works fine.

Having a big green merge button is not actually what you want - the PR is NOT mergeable until the state has been applied, if you want the state of the repo to match reality.

From a formal point of view, repository state can never match real state 100% of the time. It is either occasionally ahead of the real state (if the PR is merged and then applied), or it is occasionally behind the real state (if the PR is applied and then merged).
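The ahead/behind observation can be shown with a toy model of a failing, non-atomic apply. This is a sketch under simplifying assumptions: resources are applied one by one in list order, a failure stops the run, and in the apply-then-merge flow the PR is merged only after a fully successful apply. The `simulate` function and its return strings are hypothetical.

```python
from __future__ import annotations


def simulate(order: str, resources: list[str], fail_at: int | None = None) -> str:
    """Describe how the main branch and reality relate after one change.

    order   -- "apply_then_merge" or "merge_then_apply"
    fail_at -- index of the resource at which apply fails, None for success
    """
    if order not in ("apply_then_merge", "merge_then_apply"):
        raise ValueError(f"unknown order: {order}")
    # Resources before the failure point were really created or changed.
    applied = set(resources if fail_at is None else resources[:fail_at])
    apply_succeeded = fail_at is None
    # The change reaches main if it was merged first, or if apply succeeded.
    if order == "merge_then_apply" or apply_succeeded:
        in_main = set(resources)
    else:
        in_main = set()
    if in_main == applied:
        return "in sync"
    return "main ahead of reality" if applied < in_main else "reality ahead of main"


change = ["vpc", "subnet", "instance"]
print(simulate("apply_then_merge", change, fail_at=1))  # reality ahead of main
print(simulate("merge_then_apply", change, fail_at=1))  # main ahead of reality
print(simulate("merge_then_apply", change))             # in sync
```

Either ordering produces drift once an apply fails partway; the two workflows differ only in which side ends up ahead.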

@stasostrovskyi
Contributor

It's worth mentioning that Atlantis is "Terraform Pull Request Automation", not a "Terraform git automation", so it makes sense that atlantis is scoped to a Pull request.

This was explained somewhere in this issue, but the tl;dr is that you can still end up out of sync even with the atlantis apply workflow if the apply fails. That's because terraform apply is not atomic - apply can succeed only partially.
What the apply on merge feature would do is allow users to choose the workflow they prefer.

Yes, and the whole point is that you can see the result directly in the PR and easily retry or fix whatever is broken there. On top of everything else mentioned here, apply on merge would require some other communication/alerting mechanism. In a PR you can at least have email/Slack notifications for comments.

It's possible that things have changed on GitHub side but historically there was no API to confirm all branch protection rules are met. Approvals are easy but branches might require other things than just approvals (for example you may enforce commit signing at branch protection rule level)

Yes, unfortunately GitHub still doesn't have a single endpoint to check mergeability status. However, there is a PR that tries to check the various rules and decide whether a PR is mergeable or not. We haven't implemented a signed commits check, but I guess it can be added afterwards like any other check. It sucks that we need to re-implement GitHub, but it is what it is.

While I agree that some recommendations might be questionable they also are what they are and they come from the same people who created terraform. Not allowing your users to follow them seems like overstepping on the atlantis project side. Having said that I think there is a place for challenging hashicorp's recommendations in the terraform project.

Most of the beauty and power of Atlantis, at least how I see it, is to be able to do planning and applying via comments/commands in addition to what standard PR experience provides. If the desire is to have apply on merge (which we for example used to have with Jenkins), I personally don't see much reason to use Atlantis at all. It's pretty trivial to create github workflow for plan/apply, that will output plan as comment and rely on standard PR experience. It seems overkill to use Atlantis for this flow. You can actually come up with a better DX for yourself by using GH workflows.

IMO, there's no major difference between the Terraform code as applied to infrastructure by Atlantis/Terraform and any application code as deployed by CI/CD pipelines.

When we are talking about cloud native, I think there is a huge difference between terraform and k8s/argo/flux/whatever. In the k8s case it's safe to push a change that will break a deployment, because the old, working deployment will still be there, and while there will be a difference between git and the actual state, it's not that important. In the terraform case, it may make sense to do the more "risky" operation first - applying terraform - and merge afterwards, because the merge almost never fails.

What seems to be the feeling from you is that the atlantis project should impose a certain workflow on users even if it goes against upstream recommendations. I personally think that giving the users choice is better for the project and the users themselves.

This seems to be a strange thing to say about an open-source project. I think that tools should be picked according to your needs, not the other way around and there are other terraform/infrastructure automation tools out there that work with applying terraform after merge.

@topikachu

In our R&D department, we use Terraform extensively to manage our resources, which works well for us. However, given the nature of R&D, there are times when we need to quickly delete and recreate resources, like when an EC2 instance is misconfigured by an engineer.

Currently, the workaround is to create an empty pull request and run atlantis apply on it, which feels a bit cumbersome. Is there a better way to rerun the atlantis apply command after merging without needing to create these dummy PRs?

Any suggestions or improvements on this workflow would be greatly appreciated.

@alex067

alex067 commented Sep 3, 2024

The main reason applying on merge is a bad idea is simple - what if your Terraform apply fails for any reason?

I cannot disagree with this enough.

Terraform applies are not atomic. Rarely does an apply fail without changing internal state and your environment. Therefore, an apply failing after merge leaves your trunk/main branch intact with the latest changes.

That's much more desirable and reliable than a pull request failing to apply, without merging. That leaves your state in a transitional period where your main branch doesn't reflect the true nature of your environment.

@shinebayar-g

What if terraform apply fails after merge? You answered it yourself. 🍭

That leaves your state in a transitional period where your main branch doesn't reflect the true nature of your environment.

Both approaches have their own pros and cons :)

@alex067

alex067 commented Sep 3, 2024

What if terraform apply fails after merge? You answered it yourself. 🍭

That leaves your state in a transitional period where your main branch doesn't reflect the true nature of your environment.

Both approaches have their own pros and cons :)

You're misquoting me.

I explicitly said, if your apply fails after merge, your trunk branch is still intact with the latest changes since applies are not atomic.

Your quote is me talking about what happens if your apply fails before merge.

@stasostrovskyi
Contributor

When an apply fails after the merge, your trunk branch is NOT in sync with the state of your infrastructure. Both approaches (before and after merge) have the same drawback, because Terraform does not have reconciliation/drift detection built in. So the question here is which situation is easier to recover from: a failed apply on a PR or on the trunk. Since PRs do have a "user interface" in the form of comments, it's easier to understand an error and try to fix/retry quickly. But then again, it heavily depends on who you are trying to optimize the experience for. For regular developers, I would argue that a "nice" UI is more important than trying to dig through Terraform logs. For infra/platform people, a sacred trunk and using the API to trigger Atlantis can be more natural and important.

@alex067

alex067 commented Sep 3, 2024

@stasostrovskyi not understanding how your trunk branch is not intact.

If your apply fails after merge, then your code is already committed to trunk. Your trunk reflects what's in your environment because applies are not atomic. If your apply fails and creates X resources but fails Y, your trunk branch reflects those X resources.

This is inherently more reliable than a pull request applying X resources and failing on Y, with your main branch not reflecting any of the applied resources.
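
To make the X/Y scenario concrete, here is a toy Python model. It is only a sketch: resources are plain strings, and `partial_apply` and `drift` are made-up helpers for illustration, not Terraform or Atlantis APIs. It compares what trunk declares with what exists after a partially failed apply, for both the apply-before-merge and apply-after-merge flows.

```python
# Toy model only: resources are plain strings, "state" is the set of what exists.
# partial_apply and drift are hypothetical helpers, not Terraform or Atlantis APIs.

def partial_apply(state, to_create, fail_on):
    """Create resources in order; stop at the failing one, keeping earlier ones."""
    new_state = set(state)
    for resource in to_create:
        if resource == fail_on:
            return new_state, False  # non-atomic: nothing is rolled back
        new_state.add(resource)
    return new_state, True


def drift(trunk, state):
    """Compare what trunk declares with what actually exists."""
    return {
        "declared_but_missing": sorted(trunk - state),
        "existing_but_undeclared": sorted(state - trunk),
    }


old_config = {"vpc"}
change = ["x1", "x2", "y1"]  # the PR adds three resources

state, ok = partial_apply(old_config, change, fail_on="y1")

# Apply before merge: trunk still holds the old config.
before_merge = drift(old_config, state)
# Apply after merge: trunk already holds the whole change.
after_merge = drift(old_config | set(change), state)

print(ok)
print(before_merge)
print(after_merge)
```

Under these assumptions, the before-merge flow ends with `x1` and `x2` existing but not declared on trunk, while the after-merge flow ends with `y1` declared on trunk but missing from the environment.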

@stasostrovskyi
Contributor

It doesn't matter if the trunk branch reflects your environment 99% or 1%. Both of those cases need to be resolved ASAP, and it's a matter of which way makes recovery easier. There are also cases when you don't really know how to configure a specific resource because the documentation is meh, or you don't even know what it is you want to achieve, and applying from PRs gives you a much faster feedback loop without polluting the trunk with random commits. Sure, there is a case for applying after merge, and we also do that for some things that Atlantis cannot manage (e.g. the Atlantis deployment itself), but for that we are more than happy with a simple GitHub workflow, because in this scenario you are not using all the Atlantis features (commands, locks and outputs) anyway.

@zeemy23

zeemy23 commented Nov 5, 2024

github's merge queues solve all of these issues by backing out upon failure and it would be great to see support for this as an option.

@shinebayar-g

github's merge queues solve all of these issues by backing out upon failure and it would be great to see support for this as an option.

Interesting. How does that solve the problem? Could you elaborate? Just curious.

@zeemy23

zeemy23 commented Nov 7, 2024

github's merge queues solve all of these issues by backing out upon failure and it would be great to see support for this as an option.

Interesting. How does that solve the problem? Could you elaborate? Just curious.

If the apply is a required check, it reruns before merging, and if it fails, it backs out and re-opens the PR.

@stasostrovskyi
Contributor

Well, that's not how a merge queue would work with Terraform. The whole idea of Atlantis is that when you have a PR, you approve both the code change and the plan and then apply that specific plan. With a merge queue you will get something like this:

  • PR 1 enters queue with the same name and starts applying
  • PR 2 will create a plan with combination of changes in PR 1 and PR 2 <- already different from one that was approved
  • PR 1 fails half way, some resources have applied and some haven't
  • PR 1 is removed from queue
  • PR 2 starts planning again; now it will destroy the half-applied changes and plan changes for itself
  • Now PR 2 is just wreaking havoc on your infra, especially if PR 2 also fails.
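
The sequence above can be sketched with a toy Python model. This is an illustration under simplifying assumptions, not how Atlantis, Terraform, or GitHub merge queues are implemented: configs and state are sets of made-up resource names, and `plan` is a hypothetical helper that just diffs the desired config against what exists.

```python
# Toy model only: configs and state are sets of made-up resource names.
# plan() is a hypothetical helper, not a Terraform or Atlantis API.

def plan(desired, state):
    """Diff the desired config against what currently exists."""
    return {
        "create": sorted(desired - state),
        "destroy": sorted(state - desired),
    }


base = {"vpc"}
pr1 = {"a", "b"}  # resources added by PR 1
pr2 = {"c"}       # resources added by PR 2
state = set(base)

# Plan that reviewers approved on PR 2 (branch = base + PR 2).
approved_pr2 = plan(base | pr2, state)

# In the queue, PR 2 is built on top of PR 1, so its plan includes PR 1's changes.
queued_pr2 = plan(base | pr1 | pr2, state)

# PR 1 fails halfway: "a" was created, "b" was not. PR 1 leaves the queue.
state = state | {"a"}

# PR 2 is re-planned without PR 1, so the half-applied "a" is now slated for destroy.
replanned_pr2 = plan(base | pr2, state)

print(approved_pr2)
print(queued_pr2)
print(replanned_pr2)
```

In this model the approved plan for PR 2 only creates `c`, the queued plan creates `a`, `b` and `c`, and the re-plan after PR 1's partial failure creates `c` and destroys `a`, so neither queued plan matches what was reviewed.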

@zeemy23

zeemy23 commented Dec 5, 2024

Well, that's not how a merge queue would work with Terraform. The whole idea of Atlantis is that when you have a PR, you approve both the code change and the plan and then apply that specific plan. With a merge queue you will get something like this:

  • PR 1 enters queue with the same name and starts applying
  • PR 2 will create a plan with combination of changes in PR 1 and PR 2 <- already different from one that was approved
  • PR 1 fails half way, some resources have applied and some haven't
  • PR 1 is removed from queue
  • PR 2 starts planning again; now it will destroy the half-applied changes and plan changes for itself
  • Now PR 2 is just wreaking havoc on your infra, especially if PR 2 also fails.

It's possible to get around this by failing and backing out the PR if the branch requires an update. The half-applied resources are an issue, but a complex enough workflow could solve for that as well.
