[PROPOSAL] Automate merging of branch version bump PRs #37

Open
dbwiddis opened this issue Aug 9, 2024 · 17 comments
Labels
enhancement New feature or request

Comments

@dbwiddis
Member

dbwiddis commented Aug 9, 2024

What/Why

What are you proposing?

We should enable repo/admin-level automation to merge branch version bump PRs, because maintainers aren't doing it.

What users have asked for this feature?

I requested it in the 2.14.0 retrospective.

I have repeatedly nagged other plugin maintainers to merge upstream dependencies so I can close mine.

I have received pushback from other plugin maintainers for the above nagging, blocking my efforts to actually close open PRs on repos that I maintain.

Other mentions of automation or enforcement of version bumping:

What problems are you trying to solve?

This, on a repo I maintain (source).
[Screenshot 2024-08-08 at 10:46:48 PM]

And this on an upstream dependency, blocking me from merging mine (source).
[Screenshot 2024-08-08 at 10:47:20 PM]

And this on another upstream dependency, blocking me from merging mine (source).
[Screenshot 2024-08-08 at 10:49:17 PM]

Organization-wide, opensearch-project has 204 unmerged version bump PRs.

These are literally mouse clicks away from being closed, but it takes upstream repos to lead the way.

What is the developer experience going to be?

Maintainers like me will be able to merge PRs on our own repos because upstream repositories have appropriate versioning.

For maintainers who don't care about these PRs, they won't have to lift a finger. How awesome is that?

More seriously, minor version bump PRs are assumed to have been cut as part of automation (Autosync). In the 2.14 release, multiple developers spent multiple hours dealing with the aftermath of a branch cut that happened before a version bump. That's at least an hour of my time wasted, multiplied by at least 4 other developers.

Are there any security considerations?

Patch version bumps are one of the biggest culprits here, because patch releases rarely happen.

However, when they do, it's usually for a very important issue that can't wait until the next release cycle. In this case, having plugins wait for multiple upstream dependencies to merge their patch version bumps could slow our ability to react to these.

Are there any breaking changes to the API

Nope.

What is the user experience going to be?

Users will see well-maintained plugin repositories, without a huge backlog of ignored PRs that discourages them from contributing.

Are there breaking changes to the User Experience?

Nope.

Why should it be built? Any reason not to?

It should be built because automation already exists to create the PRs; they can be auto-merged by a bot with appropriate powers to do so, when all dependencies are met.

It should be built to save the time and effort of maintainers to do the same. In particular, GitHub action retries expire after 30 days, requiring a minute or so of effort to re-try a version bump PR after a month of upstream repos ignoring it... or similar effort to search and identify whether it's able to be merged. It's a distraction and complete waste of developer time.

What will it take to execute?

Whatever automation creates the PRs can be given the power to merge them if CI checks are green.
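
As a minimal sketch of the "merge only when CI is green" gate, the function below inspects a list of check results shaped like GitHub check run objects (with `status` and `conclusion` fields). The function name, the set of acceptable conclusions, and the rule that a PR with no checks is not mergeable are assumptions for illustration, not existing project policy.

```python
# Conclusions treated as acceptable here. This set is an assumption, not project policy.
PASSING_CONCLUSIONS = {"success", "neutral", "skipped"}


def all_checks_green(check_runs):
    """Return True only if at least one check exists and every check finished acceptably.

    check_runs: list of dicts with "status" and "conclusion" keys.
    """
    if not check_runs:
        return False  # no CI evidence at all: do not merge
    return all(
        run.get("status") == "completed"
        and run.get("conclusion") in PASSING_CONCLUSIONS
        for run in check_runs
    )
```

A bot would call this on the checks of a version bump PR and merge only when it returns `True`; a still-running or failed check keeps the PR open for a later retry.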

Any remaining open questions?

Why don't we just require maintainers to do this? Actually, we do: the Release Checklists specify merging these version bumps. There are 453 open Release Checklist issues, and while 3.0.0 and 2.17.0 account for a couple hundred of those, the total is far more than that.

@dblock
Member

dblock commented Aug 9, 2024

I notice that many of these do not have passing CI :(

@dblock dblock added enhancement New feature or request and removed untriaged labels Aug 9, 2024
@dblock dblock transferred this issue from opensearch-project/.github Aug 9, 2024
@dbwiddis
Member Author

dbwiddis commented Aug 9, 2024

> I notice that many of these do not have passing CI :(

For many, that is because an upstream dependency hadn't bumped when CI first ran, and they don't push or retry.

@gaiksaya
Member

Auto-merge workflow that would automatically merge these PRs if the CI checks pass: https://github.com/opensearch-project/opensearch-build/blob/main/.github/workflows/automatic-merges.yml

@zelinh
Member

zelinh commented Aug 19, 2024

Linking some approaches to rerun failing CIs.
opensearch-project/opensearch-build#2706

@prudhvigodithi
Member

Here are my thoughts on auto-merging the version increment PRs.

  • Coming from opensearch-build#4225: automatically run a min distribution build when a version increment is merged into OpenSearch core.

  • Start with a discussion on moving the code freeze for core earlier than the one for plugins. That way the core code freeze is honored and everything is ready before plugins finalize their changes. This should eliminate last-minute breaking changes, and plugins can use the finalized core artifact for their version increment CI runs.

  • Today we have automation that creates the version increment PRs. It's up to the plugin teams to take care of the CI for those PRs and get them merged, and a release manager follows up until the PRs are merged (this needs to be automated).

  • For the automation, we can have a workflow that ensures the plugin dependencies are built first:

    • Create a dependency tree that records each plugin's dependencies.
    • Every plugin depends on core; the previous step (opensearch-build#4225) solves that.
    • Get the version increment PRs of the default plugins such as job-scheduler, common-utils, and security, and ensure they are auto-merged once CI passes. These default plugins depend on core, which is solved in the previous step.
    • Next, for the other plugins identified in the dependency tree, obtain the version increment PRs for their upstream plugins, auto-merge them, and start building the artifacts. Once the upstream plugins are clean, retry the plugin's own CI, ensure it passes, and then merge its version increment PR.

A rough workflow explaining the above points:

             Create Dependency Tree
                      ↓
              Core Dependency Solved (Issue 4225)
                      ↓
   ┌──────────────────────────────────────────┐
   ↓                                          ↓
Default Plugins                              Other Plugins
(Dependent only on core)                     (Dependent on core + upstream plugins)
   ↓                                          ↓
Fetch Version Increment PRs                Check Dependencies
   ↓                                          ↓
Run CI                                     Fetch Version Increment PRs for the Dependencies
   ↓                                          ↓
┌───────────────┐                          Ensure CI Passes and Build Dependencies (if needed)
↓               ↓                                  ↓
CI Passed    CI Failed                    ┌───────────────┐
   ↓               ↓                    ↓                 ↓
Auto-Merge PRs   Investigate & Fix     CI Passed       CI Failed
                    ↓                   ↓               ↓
                 Retry CI            Auto-Merge     Investigate & Fix → Retry CI → CI Passed → Auto-Merge PRs
                    ↓                   PRs             ↓
                 CI Passed              ↓               ↓
               Auto-Merge PRs       Once merged, build the artifacts
                                           ↓
                                  Re-run the (current) plugin CI's since Dependencies are merged and built
                                           ↓
                                      ┌───────────────┐
                                      ↓               ↓
                                    CI Passed       CI Failed
                                      ↓               ↓
                                  Auto-Merge       Investigate & Fix
                                      ↓               ↓
                                     End           Retry CI
                                                    ↓
                                                 CI Passed
                                                    ↓
                                             Auto-Merge PRs
                                                    ↓
                                                   End
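
The ordering in the workflow above (core first, then plugins that depend only on core, then everything else) amounts to a topological sort of the dependency tree. Here is a rough sketch using Python's standard `graphlib`, under a few assumptions: the `DEPENDENCIES` map is hand-written for illustration, `example-plugin` is a hypothetical plugin, and the edges shown are not taken from any real manifest.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: component -> set of upstream components it needs.
# Component names other than "example-plugin" appear in the thread; edges are illustrative.
DEPENDENCIES = {
    "core": set(),
    "job-scheduler": {"core"},
    "common-utils": {"core"},
    "security": {"core"},
    "example-plugin": {"core", "job-scheduler", "common-utils"},
}


def merge_waves(dependencies):
    """Group components into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose upstreams are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(merge_waves(DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Each wave can be processed in parallel: merge and build every version increment PR in one wave, then re-run CI for the next.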

Adding @gaiksaya @dblock @getsaurabh02 @dbwiddis @peterzhuamazon

@peterzhuamazon
Member

peterzhuamazon commented Sep 9, 2024

Thanks @prudhvigodithi for this detailed graph.

The automation app should come in handy for bringing all the information and decisions together in one place, with admin access to all the repos at once.

We can use the input manifest to find the dependency tree between plugins and decide which plugins to take care of before others. If a plugin is a dependency of another and it fails its checks, that should be a hard block before moving on to the next plugin.
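
A minimal sketch of that hard-block rule, assuming the dependency map has already been derived from the input manifest (the map shape, the function name `plan_merges`, and the plugin `example-plugin` are hypothetical): components are visited in dependency order, and anything downstream of a failed or blocked component is blocked as well.

```python
from graphlib import TopologicalSorter


def plan_merges(dependencies, ci_passed):
    """Return (mergeable, blocked) given per-component CI results.

    dependencies: component -> set of upstream components (hypothetical shape).
    ci_passed:    component -> bool, whether its version bump PR has green CI.
    """
    mergeable, blocked = [], {}
    # static_order() yields every component after all of its upstreams.
    for component in TopologicalSorter(dependencies).static_order():
        failed_upstream = sorted(
            dep for dep in dependencies.get(component, ()) if dep in blocked
        )
        if failed_upstream:
            blocked[component] = "blocked by " + ", ".join(failed_upstream)
        elif not ci_passed.get(component, False):
            blocked[component] = "CI not passing"
        else:
            mergeable.append(component)
    return mergeable, blocked
```

The `blocked` reasons could feed a release dashboard, so a maintainer sees at a glance which upstream repo is holding up their version bump.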

We can then use the metrics cluster and release dashboards to know where we are, and even compose a dependency tree as a prerequisite in the entry criteria.

Please let me know what you think.

Thanks.

@gaiksaya
Member

gaiksaya commented Sep 9, 2024

All we need are passing CIs. From the bot's perspective, that means either something like opensearch-project/opensearch-migrations#940 (comment) for hard merges (which we should not do), or adding an auto-merge workflow like the one mentioned above.

For CIs whose runs have expired (I believe after 30 days), closing and reopening the PR should help.
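
To make the retry choice concrete, here is a small sketch that picks between re-running an existing workflow run and closing and reopening the PR. The 30-day window is the figure quoted in this thread and is treated as an assumption; the function name and return values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The 30-day window is the figure mentioned in the thread; treat it as an assumption.
RERUN_WINDOW = timedelta(days=30)


def retry_action(last_run_at, now=None):
    """Pick how to retrigger CI for a stale version bump PR."""
    now = now or datetime.now(timezone.utc)
    if now - last_run_at < RERUN_WINDOW:
        return "rerun"  # the existing workflow run can still be re-run
    return "close-and-reopen"  # run expired: reopening the PR starts a fresh run
```

Both timestamps are expected to be timezone-aware; mixing aware and naive `datetime` values raises a `TypeError` in Python.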

@prudhvigodithi
Member

Thanks @gaiksaya and @peterzhuamazon for your inputs.

@gaiksaya, regarding "All we need are passing CIs": we need the dependencies to be built first, and for that the dependencies' version increments must be merged before we can build them. Once we have that, yes, I'm fine with the retry-and-merge flow. What @peterzhuamazon added, leveraging the bot and metrics cluster, is also a good idea if it makes the automation easier :).

@peterzhuamazon
Member

Thanks Both.

I feel GitHub Actions suits individual workflows but not really combined actions across multiple repos. Especially now that we have so many plugins, it is not easy to update all the repos when something changes, or to make a centralized call on what to do next.

That is why I raised using the automation app combined with the metrics cluster to do the hard work; it should also be easier to maintain over time.

Thanks.

@gaiksaya
Member

I don't believe the app would have permission to re-run CIs.
@prudhvigodithi I agree we need dependent components to build first. But opening and closing the PR once a day until they get merged is one solution for re-running the CIs. If the dependent components are ready by then, great! Else try again tomorrow.

@peterzhuamazon
Member

> I don't believe the app would have permission to re-run CIs. @prudhvigodithi I agree we need dependent components to build first. But opening and closing the PR once a day until they get merged is one solution for re-running the CIs. If the dependent components are ready by then, great! Else try again tomorrow.

The app does have full access to re-run and trigger workflows:
Workflows, workflow runs and artifacts

@minalsha

Thank you @dbwiddis for the proposal.

Hi @peterzhuamazon, @gaiksaya, @prudhvigodithi, @getsaurabh02: how should we move forward with this? We saw the same thing with the 2.18 version bump: until upstream repos have taken care of their version bumps, the plugins that depend on them are unable to do theirs.

@gaiksaya
Member

Hi @peterzhuamazon Does it make sense to move this issue to https://github.com/opensearch-project/automation-app ?
We can start with merging PRs that have passing CIs and then iterate on the logic for re-running the CI's timely, etc.
WDYT?

@peterzhuamazon
Member

> Hi @peterzhuamazon Does it make sense to move this issue to https://github.com/opensearch-project/automation-app ? We can start with merging PRs that have passing CIs and then iterate on the logic for re-running the CI's timely, etc. WDYT?

I think it does make sense tho we need to understand the scale of the problem.
We might need to create an additional app just to handle all PR related activities.

Thanks.

@gaiksaya
Member

gaiksaya commented Nov 14, 2024

> > Hi @peterzhuamazon Does it make sense to move this issue to https://github.com/opensearch-project/automation-app ? We can start with merging PRs that have passing CIs and then iterate on the logic for re-running the CI's timely, etc. WDYT?
>
> I think it does make sense tho we need to understand the scale of the problem. We might need to create an additional app just to handle all PR related activities.
>
> Thanks.

Thanks! We can use this issue as the problem statement for that. Coming from opensearch-project/opensearch-build#5171, auto-merging of version bump PRs also applies to the core repos. See this comment. Instead of implementing an individual solution for both core repos, I think having a generic one would be great.
Moving this issue to the automation-app repo to discuss the in-scope and out-of-scope requirements, the approach, and the implementation.

@gaiksaya gaiksaya transferred this issue from opensearch-project/opensearch-build Nov 14, 2024
@dblock dblock removed the untriaged label Nov 18, 2024
@dblock
Member

dblock commented Nov 18, 2024

[Catch All Triage - 1, 2, 3, 4, 5]

Projects
Status: 📦 Backlog
Status: Action items ✍
Development

No branches or pull requests

7 participants