Fully-automate merges for the simplest class of package publications PRs #26106

shonfeder · 2024-06-17T19:08:02Z

Even in the imperfect state of our current CI results, there are some package updates that are straightforward enough that we could fully automate their publication, provided we organize our CI sufficiently. We should do so.

Benefits

I proposed this in a maintainer meeting, and the sentiment was that the packages that currently fall into this bucket are so easy to review, that it should not be a high-priority. I think I disagree here: it is easy to underestimate the benefits of automating away small friction points, and it sets us up the processes for something we are going to need before long anyhow.

Main benefits I have in mind:

Better publication experience for package authors making these kinds of updates.
Less busywork for maintainers, allowing us to focus on the packages that actually need human review.
Lays the basis of a fully automated publication pipeline, upon which we can gradually expand.

Criteria

The current requirements for a fully automated publication would be, at least:

The PR upates an existing package.
All CI checks pass
The author of the PR is a known maintainer/owner of the existing package

(3) Could be addressed via something like a .codeowners file in the package dir, or a linting check that compares the package data with the author of the PR triggering the CI process. Either of these would amount to piggybacking on github's auth, and using the repo contents as our source for permissioning configuration.

Required

Add linting check for sensitive updates to existing packages ocurrent/opam-repo-ci#309
A way to verify that the author of the PR is an owner/maintainer of the package being updated
A workflow that will auto-merge PRs meeting the criteria.

My though is: has opam-repo-ci been reviewed and developed in a style to fail on errors? I remember there have been CI issues where it resulted in "ok" but didn't do anything. With the perspective of an attacker, if I manage to figure out something that leads to such CI confusion, and auto-merge is enabled -- I could easily take over the opam-repository. It may help to have a separate CI system, e.g. using GitHub action or likewise, that does a very small amount of checks (such as maintainer metadata, the diff is straightfotward (doesn't touch too many packages, checksums aren't changed, ...). You can see that I have some reservations with "ocurrent" and "opam-repo-ci" -- which is great to have, but at the same time a pretty big chunk of code - and I'm not aware of any design documents, neither a full system specification.

With the idea of "author/maintainer" of a package, this is still very much undefined. I worked years ago to find out "who's actually maintaining package X" by using the metadata from git. I guess this is a much more resilient approach than trying to decode the "maintainer" and "author" fields of opam -- which are very inconsistent. Turns out, in the long run, for conex we need that information about packages in any case.

shonfeder · 2024-06-20T02:50:33Z

Thank you for the considerations :)

I'm not sure why #23789 is related.

I guess you have a narrower topic in mind with the question in the title, "how does opam-repository scale", but the amount of manual work required is a bottleneck to scaling for a more active repo.

opam-repo-ci been reviewed and developed in a style to fail on errors?

That is certainly the hope! I'm sure there is room for improvement.

It may help to have a separate CI system, e.g. using GitHub action or likewise, that does a very small amount of checks

I agree that it would make sense for this kind of check to be in a simple, dedicated CI job, that was very easy to reason about.

With the idea of "author/maintainer" of a package, this is still very much undefined.

I don't think I get this point. We have a maintainer field, no? But using something like a .codeowners file may be better for the reasons you mention. One way of thinking about this effort is as a way of giving established package authors more autonomy over package publication, when everything is on the happy path. So I think it is worth considering adding support for that in way that may be orthogonal to opam's package meta data.

shonfeder changed the title ~~Fully-automate merges for the simpelest class of package publications PRs~~ Fully-automate merges for the simplest class of package publications PRs Jun 24, 2024

shonfeder mentioned this issue Oct 2, 2024

Add a way to semi-automate support for first-time package authors ocurrent/opam-repo-ci#372

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fully-automate merges for the simplest class of package publications PRs #26106

Fully-automate merges for the simplest class of package publications PRs #26106

shonfeder commented Jun 17, 2024

hannesm commented Jun 19, 2024

shonfeder commented Jun 20, 2024 •

edited

Loading

Fully-automate merges for the simplest class of package publications PRs #26106

Fully-automate merges for the simplest class of package publications PRs #26106

Comments

shonfeder commented Jun 17, 2024

Benefits

Criteria

Required

Related

hannesm commented Jun 19, 2024

shonfeder commented Jun 20, 2024 • edited Loading

shonfeder commented Jun 20, 2024 •

edited

Loading