Setup the release process #130
I believe one thing we have to do under the Apache-related steps is copy the sources (not the generated binaries, i.e. the jars in our case) to an SVN repo. In Apache lingo I think this is called a "source package", see https://www.apache.org/legal/release-policy.html#source-packages. The sources also need to be signed. In our case it would just be the contents of the git repo. |
Just creating the source archive shouldn't be a big problem. One of the main questions is whether we can trust GHA enough to make releases there and also sign those releases (it probably makes sense to do both on one machine). Also, even if Apache considers binaries only a convenience, the reality is that > 99% of all users will use the binaries from Maven Central, which, after all, carry a much bigger risk than the distributed source files. If we are not allowed to release from GHA or decide against it, we would have to set up something that allows running safe and reproducible releases from the release manager's machine (e.g. using docker). |
If by release you are talking about a Maven release, I don't think there is any issue with GHA doing this. Many other Apache projects do this, and I think that everything aside from signing has already been set up with #129. For context, in order to get the snapshot deploys working I had to make an INFRA ticket to get credentials added as GitHub secrets, and in general I have seen that Apache is facilitating doing as much as possible via GHA; judging from the fact that Nexus usernames/passwords are being stored as secrets, it would be surprising if keys are treated any differently. What I predict could be annoying is the official Apache release (even though 99% of users won't use it). I have spoken to some people involved in Apache and apparently the proper way to do this is to manually sign it on a machine with a key that is supposed to be stored externally (i.e. via USB) and then upload it. Such info can be outdated though (and it has been in the past). |
I would say that this issue can work as a general epic/meta issue where we can track the other related issues. What would be handy is if we can update the original checklist and reference these specific issues. @jrudolph Do you want to do this? I can also just edit your post. |
Please edit it yourself, everyone. :) |
Of course, we can just follow the rules as given, but let's at least note how paradoxical the situation is: we would go to great lengths to securely sign source code which no one will use or look at (and which might be signed much more easily, e.g. by signing the release tag in git). On the other hand, the binaries which everyone will be running directly, and which will be much harder to verify, will be released on third-party machines in a process which can much more easily be manipulated to tamper with the binaries or leak the secrets... |
Oh, the irony is definitely not lost on me, especially considering that Pekko is a library and not an application. |
Hi,
Users need to be pointed to the official releases hosted by the ASF. These may help [1][2][3], and [4] is of historical interest.
Kind Regards,
Justin
1. https://infra.apache.org/release-distribution#channels
2. https://infra.apache.org/release-distribution#public-distribution
3. https://infra.apache.org/release-distribution#download-links
4. https://cwiki.apache.org/confluence/display/INCUBATOR/Distribution+Guidelines
|
Hi,
I should point out that the reason the ASF does this is that it provides you with legal protection and means you are covered by the insurance the ASF has. Go outside these boundaries and you may not have that legal protection.
Kind Regards,
Justin
|
@justinmclean As you stated (and I suspected), such policies are likely in place for legal reasons, but as @jrudolph said, especially in the case of Pekko and its modules there is an extremely strong disconnect between the policy and what happens in reality/practice 99% of the time (I can confirm that almost no Pekko user is going to download/test the raw source package; they will add it as a dependency to their build tool, which will be resolved via Apache's Nexus repo, and if they are going to get the source it's going to be via git on GitHub). Of course we are going to follow this rule, this isn't up for debate; however, is there a general avenue where this can be discussed/raised? |
Hi,
In that case the project might have some work to do to change user perceptions of where they obtain the software from. Even if they obtain it from elsewhere, it must be based on an official ASF release.
Kind Regards,
Justin
|
If "based on" includes users downloading generated JVM artifacts from the same source package as the official ASF release (which will also be the same as the git repo at the same checksum of the tagged release) then like almost every other Apache JVM project that publishes JVM artifacts then yes that will be the case. I think the point being raised is that the Apache software package particularly for libraries that are using git is practically ceremonial/checkboxing. As pointed out earlier, pushing a signed git tag to signify a release (which then triggers a pipeline to upload artifacts to repositories generated from that exact source code for that release) technically achieves the exactly same goal, especially with github repo's being synced with gitbox. |
Hi,
Releases are not based off "pushing a signed git tag"; releases need to be manually voted on by the (P)PMC and placed in the official ASF distribution area. Please read the links I posted earlier.
Justin
|
I am aware that releases need to be voted on by the (P)PMC; I am talking about the steps after a release is voted on (which currently do require placing software in the Apache distribution area). To clarify, I am talking about a hypothetical alternative for distribution after a (P)PMC vote, but I don't think this thread is a productive place for that conversation, so I will leave it. |
We will need something that essentially splits the release into 2 parts.
With the Nexus part of the release, we can publish to Nexus staging and then, after the vote, either abandon the staged release or complete its release to Maven Central using the Nexus Repository Manager. sbt plugins like sbt-release, sbt-ci-release, sbt-sonatype, etc. can be configured not to complete the releases - just to put them in staging. For the source and binary distributions, there are repositories where the files can be shared. If and when the release is approved, they can be uploaded to https://dlcdn.apache.org |
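As a rough illustration of that staging-then-promote flow (a sketch, assuming sbt-sonatype's staging commands together with sbt-pgp's publishSigned; exact task names depend on the plugin version and build configuration):

```bash
# Publish signed artifacts to a Nexus staging repository and close it for voting
sbt clean publishSigned sonatypeClose

# After a successful vote: promote the staged repository to Maven Central
sbt sonatypeRelease

# After a failed vote: drop the staged repository instead
sbt sonatypeDrop
```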
If this is the process that we use (and it's a perfectly reasonable one), then, assuming we only want to push a git tag for an actual release that has been successfully voted on by the (P)PMC, it becomes more complicated to decide which of these plugins to use and in what order. For example one way of doing things would be to just use If using This alternative method I think is cleanest and probably closest to the "Apache way", i.e. The main issue I foresee with these method/s is that we would likely have to resort to having a static value for the version of the project in |
ASF projects tend to use the concept of a release manager - an actual person who can do some documented manual steps. We can start with having a few manual steps and automate more later. It's more important to define a process than to tailor a process to the way that a particular sbt plugin works. The artifacts that are voted on should be signed, and we need to provide a KEYS file with the public key parts of any keys that have ever been used to sign our artifacts. From an sbt perspective, that means we need These keys are typically keys associated with actual people, so signing the artifacts is more likely to be done on the release manager's computer than to be automated. I don't see any mechanism by which the release manager's signing key can be made available to a GitHub Actions workflow. |
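For reference, the public part of a signing key is typically appended to a KEYS file with plain gpg commands along these lines (the key id below is a placeholder):

```bash
# Append the release manager's public key (with its signatures) to the KEYS file.
# "0x1234ABCD" is a placeholder for the actual key id.
gpg --list-sigs 0x1234ABCD >> KEYS
gpg --armor --export 0x1234ABCD >> KEYS
```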
This is why I prefer sbt-release: it is manual, i.e. it has to be manually triggered on a machine by the release manager (i.e. via sbt-release also has some nice quality-of-life features, for example if your current git status is unclean (i.e. you have unstaged/uncommitted changes) it will immediately halt the release process. I think that some of these preconditions can also be configured. The issue with sbt-ci-release is that it's triggered by git tag pushes and not manually, which means that in addition to not being interactive (meaning it's quite limited in what we can do), if we want the git tags to be in sync with actual approved Apache releases then sbt-ci-release could only be used to promote a staging repo to release, which is kind of overkill; it would be better to just use
I already started asking these questions in |
Adding this as another comment as you edited yours, but in case it's not clear, sbt-release works by defining a set of already existing steps, so nothing stops it from calling The more important point that we may be missing is that sbt-pgp (which is where |
👍 Yep, figuring out how we want and need to do the process is the most important part; we can always get the tooling to do what we want afterwards. Note how, in the happy case (release worked, positive vote), the procedure is not so different from what we had for akka-http: https://github.com/apache/incubator-pekko-http/blob/main/scripts/release-train-issue-template.md#cutting-the-release. There we also only automated up to staging and then had some manual testing steps and a manual triggering of promotion to Maven Central. The question is how to deal with the unhappy cases where something goes wrong with a release or the vote fails. A few alternatives come to mind:
In the past, I have usually been quite pragmatic about it. Before a release had been announced, I was ready to just redo the tag in the git repo and restart the whole process after a fix (usually, mutating tags or the main branch only leads to short-term hassles if done in a timely fashion). On the other hand, sometimes enough of a release had already slipped out (e.g. to Maven Central) that a new release version was necessary, in which case the process was just redone. In general, the most principled approach would be to just skip version numbers in case of a problem. That would also have the benefit of reusing most of the past processes. |
We should make sure that we also help people give their approving vote according to https://www.apache.org/legal/release-policy.html#release-approval, which requires
Especially, we should clarify/understand what it means to "test the result on their own platform". |
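As an aside, the kind of verification a voter might run on their own machine could look something like this (a sketch; the archive and directory names are placeholders):

```bash
# Verify the signature and checksum of the staged source archive
gpg --verify apache-pekko-1.0.0-incubating-src.tgz.asc apache-pekko-1.0.0-incubating-src.tgz
sha512sum -c apache-pekko-1.0.0-incubating-src.tgz.sha512

# Unpack the source release and run the build and test suite locally
tar xzf apache-pekko-1.0.0-incubating-src.tgz
cd apache-pekko-1.0.0-incubating-src
sbt +compile +test
```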
On the same note, one thing I want to explore is that if we end up using sbt-release, adding steps to the release process (initiated by There are some things we can automate, i.e. compiling and running the test suite (which can be done with Note that the context behind the suggestions that I am making is two things
Note that regarding the release process, I have helped out a couple of times with Kafka releases, and there are a lot of colleagues on my team who are Apache committers/PPMC members for various Apache TLPs, so I am continuously speaking with them to get a general idea of how the release process works in other projects, because while there is the strict policy at https://www.apache.org/legal/release-policy.html#release-approval, there is some level of bespoke tailoring depending on the TLP as long as it conforms to the ASF release process.
I think that for now what we need to decide on is how we approach git tags, because this will have an effect on how we design the release process, i.e. do we tag immediately when a release vote is started and remove the tag later if the release vote fails, or do we defer the creation of the git tag until a formal Apache release vote is approved? I have a personal preference for deferring the creation of the git tag because it's simpler in the exception case, but it's not a hill that I will die on if we go with other options.
|
Is this because it breaks some part of a fundamental ASF process or because it complicates what you mention afterwards (in the sense that if we do git tag release candidates, then the last release candidate would need to support multiple git tags if we promote it to release)?
My concern with using RCs as the basis for making a release is that a project can make many release candidates before doing an actual release that's voted on, in which case, because of the process, it can be a bit unclear which release candidate number can turn into an actual release while it's happening. We would also have to mandate that we always have a release candidate just before the release, which in some cases can be excessive; think of the case where we have some rc-x that has been fully tested and is likely going to be the last rc before a properly voted release, but someone makes some minor non-breaking changes afterwards, e.g. some basic documentation is added after rc-x. With this process we would have to make another rc-(x+1) and go through the entire hoopla of getting everyone to test rc-(x+1), even though that's kind of pointless because nothing of consequence has changed. Even if we communicate that that specific rc-(x+1) does not need to be tested because it only has minor documentation changes, it then becomes unclear (at least to me; the primary goal of release candidates is to encourage the community to do manual testing on that rc throughout the lifecycle of that release to try and weed out any bugs/concerns). On a similar note, putting git tags onto release candidates can also help communicate the stage of the release lifecycle, i.e. generally speaking during RCs you don't want to merge major changes into the project (that happens after a release), and with git tags it's quite clear from a git log perspective whether the project is in the release stage (you just need to see whether the last git tag is a release candidate). |
Take a look at https://cwiki.apache.org/confluence/display/JENA/Release+Process. This is the release process for Apache Jena. It is a much simpler project and is strictly Java based. However, if you look at what it does with respect to the git repository, you can see the creation of tags, rollback on failure, and other automated steps. I recommend that Pekko adopt something similar and that it also be written out in the wiki. The verification is to verify that the build of the system matches the result built by the release. IMHO if you want to use GHA to build the release you could, but it will be a complex script. The verification of the build will still have to be done on other accounts, etc. |
That would mean that we would still have to rebuild because the version number is also included in the binaries. Is that what you mean? The source distribution could probably stay the same (can it? The sources might also contain the version...), but the binaries would have to be rebuilt. While the ASF requires the source distribution to be validated before voting, we also need to make sure that the staged binaries are valid, so that process could still fail after a positive vote (if it had to be rebuilt for a new version number)... |
Ultimately, we can define the version in the sbt file, we don't necessarily need to derive it from git tags. This might better suit an ASF compliant release process. |
AFAICT the policy says the verification needs to happen on (a) machine(s) owned and controlled by the committer, but the artifact-being-verified might come from another machine. There's some precedent for this approach in https://github.com/apache/logging-log4j-tools/blob/master/RELEASING.adoc though indeed it's early, and it might still be subject to change after this approach has been more formally described on Confluence and discussed further on the members@ list.
It seems that for logging, infra has been able to create a separate keypair for this purpose (https://issues.apache.org/jira/browse/INFRA-23996) and gave the individual PMC members revocation rights.
When the build is reproducible, the RM (and other voters) can independently build&verify the artifact before voting, which I'd say (but IANAL) should close the loop. Anyway, it's up to you which way to go, and I can imagine perhaps waiting until this process is more broadly vetted - just wanted to make you aware of the option ;) |
Thanks for the response; not trying to be abrasive/obtrusive, but to me there is mixed signalling/a disconnect going on here, specifically between what is stated in the docs and what some projects do
This specifically sounds more in line with Apache's policies but is probably not too useful for us because at least for Pekko we would be publishing using
Good to know, I was thinking of doing this for
Many thanks for the help, I think that for now we will probably go with the solution described earlier but as you said there is always room to change especially if alternative solutions become more vetted/accepted. |
Just my view, but I would prefer if the Pekko team didn't try to lead on the release process side. It's much easier for a TLP to innovate on the release process. A podling, like Pekko, does not just need PPMC approval but also IPMC approval for its releases. And sbt also restricts our options. Maven has good tools to support SBOMs and other secure release innovations; the sbt ecosystem is a bit behind the curve on this. Once we get a v1.0.0 release out, maybe we can review the release process. But for now, we can copy the processes used by other podlings. |
This is my view as well; the only exceptions are things that are practically infeasible for technical reasons (i.e. sbt plugins not working with source packages), but even then, until we are a TLP we should be as accommodating as possible. |
Well, this is a word storm if you have to pick it up. (I tried to consider all of the above statements, so apologies if I stupidly missed something in the following reply.) Just want to throw something onto the table, what if:
I know it doesn't naturally fall into the sbt plugin ecosystem, but I can't help but feel that this is how the authors of the Apache process envisioned it? Clearly you should be damn sure about the release before setting the tag, because mutating tags seems really not done. So
@spangaer all Pekko releases for the next while will need to be voted on by the Pekko PPMC but also by the Incubator PMC. The Incubator PMC has a lot of podlings. It is much simpler, for now, for us to follow a similar release process to everyone else. To summarise what a typical ASF release looks like:
Other than the binary zip/tgz files and setting up the access to the staging and release web sites, we have basically everything ready for a release manager - particularly one like me who has done ASF releases before - to do a release. I don't see how adding lots of automation helps. The requirements currently call for a release manager who is going to have to do a lot of manual tasks. Adding automation to replace simple one-liners like gpg signing a file, and having those automations potentially be brittle - when we have loads of documentation to fix, loads of code to repackage, etc. - seems like a low priority to me. After we get a release or two under our belts, we can look at the release process again - but until we become a TLP, I think we are pretty limited in straying from the current ASF norms. |
When I am talking about automation, I am talking about checks that make sure release managers don't do something that is provably incorrect. An example of such a check would be making sure that the private/signing key you are using is registered to an Apache ID and is inside Apache's KEYS file, e.g. https://github.com/apache/kafka-site/blob/asf-site/KEYS. Such a thing would definitely be helpful because accidental releases without the correct signing key (at least for binaries) have happened, even with manual checking. Humans tend to be more fallible than well-written programs (or checks, in this regard). Not saying that such checks are critical, but they are definitely helpful, especially considering how manual and foreign the process is for the current Pekko community. |
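A hypothetical sketch of such a check (the script name, argument, and KEYS-file handling are assumptions, not an existing tool):

```bash
#!/usr/bin/env bash
# check-keys.sh (hypothetical): verify the release manager's signing key is in KEYS
set -euo pipefail
KEY_ID="${1:?usage: check-keys.sh <key-id>}"   # e.g. the release manager's key id

# Resolve the full fingerprint of the key from the local keyring
FPR=$(gpg --with-colons --fingerprint "$KEY_ID" | awk -F: '/^fpr:/ {print $10; exit}')

# List the keys contained in the project's KEYS file and look for that fingerprint
if gpg --show-keys KEYS | tr -d ' ' | grep -q "$FPR"; then
  echo "OK: key $KEY_ID is listed in KEYS"
else
  echo "ERROR: key $KEY_ID is not in KEYS - add it before releasing" >&2
  exit 1
fi
```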
Approx 10 git repos with dozens of modules overall to get redocumented, repackaged, retested. And a clock is ticking with regards to the end of Akka v2.6 support - we really need to have the v1.0.0 releases started in a couple of months because the ecosystem of Akka-based libs is not likely to even look at Pekko until we get that out. |
I know that voters are meant to check the signatures; my point is there is no harm in adding an additional automatic check which is much less likely to fail. Again, reiterating my point about humans making mistakes.
Yes, this is a fair point; I am spending most of my time on getting the package renaming/code changes done. |
Thanks for all that input. I fully agree we should make sure we stay on the critical path to get a first release out and optimize later on. It's quite useful to have multiple alternatives already evaluated here, but let's stay mostly on the well-known path even if that requires a fair bit of manual work. In that regard, I would almost fully support @pjfanning's suggestion. What I would still like to avoid is for the release manager to have to paste any commands. We can have a simple setup using shell scripts that does all the required steps and helps ensure that no steps are missed and that they can easily be repeated. I would not worry about the technicalities of source releases; these require a few steps, but they should be easy to script. I don't think we should even build them into sbt because we just don't have to (it requires absolutely no information that only the sbt build has). The script should use the same environment variables that sbt uses for providing the GPG keys, but other than that it should be easy to do in a script (easier than in sbt). |
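A minimal sketch of what such a script could look like, assuming the source release is just an archive of the tagged git tree; the script name, environment variable, tag scheme, and archive naming are placeholders rather than anything agreed on:

```bash
#!/usr/bin/env bash
# make-source-release.sh (sketch): build, sign, and checksum the source release
set -euo pipefail

VERSION="${1:?usage: make-source-release.sh <version>}"   # e.g. 1.0.0
SIGNING_KEY="${PGP_KEY_ID:?PGP_KEY_ID must be set}"       # placeholder env var name
ARCHIVE="apache-pekko-${VERSION}-incubating-src.tgz"      # placeholder naming scheme

# Create the source archive directly from the release tag (no build output included)
git archive --format=tar.gz --prefix="apache-pekko-${VERSION}-incubating-src/" \
  -o "$ARCHIVE" "v${VERSION}"

# Detached ASCII-armored signature plus a SHA-512 checksum
gpg --local-user "$SIGNING_KEY" --armor --detach-sign "$ARCHIVE"
sha512sum "$ARCHIVE" > "${ARCHIVE}.sha512"
```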
I think the issue here is that the Apache release process expects release managers to manually paste commands. My own view is that I do want to automate this
One reason I wanted to use sbt is that I want to reuse sbt-pgp to sign both the source package and the JVM artifacts. This would enforce that the same (and correct) key is used for both, and would also mean that the key information (i.e. how to look up the private/signing key) only needs to be in one place. This would require some upstream changes to sbt-pgp, but it's actually not that difficult because sbt-pgp is just a wrapper around gpg anyway. I am willing to take this up, but aside from scoping I have decided not to spend time on this before the 1.0.0 release because @pjfanning is correct here: the highest priority is to get that 1.0.0 release out, even if the first release is manual. |
I think the way to proceed here is to put all the bits into open issues, flag the ones critical for release 1, and work on them. (Not that I actually do much work.) My original thought was to put them into a project so that we can see what needs to be done and be very clear about what does not need to be in 1.0. I agree that the build should be manual to start. Only after it has been done a few times will it make sense to try to automate, as only after the first few times will we know where the pain points are. |
Hi,
If that's the case then, as you said, it's kind of arbitrary and doesn't matter, as you would just create your own private key and publish it to some key repo. I just wanted to confirm with the ASF whether that's the case or whether we should use the same key as the source package for Maven releases (which does provide an actual benefit, as you can securely confirm that a Maven release artifact is signed with the same key as Apache's official source package).
The release manager uses their own KEY, see https://infra.apache.org/release-signing.html
Justin
|
Yes, this is clear; we are talking about enforcing the use of that same release manager's key for signing the JVM jar artifacts that will be published to Apache's Maven repository (which is considered a convenience package) |
@mdedetrich I'm not dead set against having a docker image, but I'd prefer to start by documenting what the release manager needs installed. It could be easier to just let the release manager check their computer. My main concern is gpg and the ~/.gnupg folder. In practice, the release manager needs:
I'm not sure that a docker image makes this easier. I know you can mount local dirs when you start a docker container. |
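For what it's worth, mounting the local gpg directory and project checkout into a release container could look roughly like this (a sketch; the image name is hypothetical, and sharing ~/.gnupg with a container has its own security trade-offs):

```bash
# Run a release step inside a container while reusing the host's gpg keys;
# "pekko-release-env" is a hypothetical image name for the release environment.
docker run --rm -it \
  -v "$HOME/.gnupg:/root/.gnupg" \
  -v "$PWD:/workspace" \
  -w /workspace \
  pekko-release-env sbt publishSigned
```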
Yeah, this is the issue: not everyone uses sdkman (for example I don't; I instead use jenv to switch between JDKs). Release managers have different OSes, some of which can handle having multiple JDKs installed at once and others not. At least having a basic A
This we can defer to the sbt installation documentation. Thankfully nowadays sbt is pretty ergonomic, i.e. it will automatically update and use the correct sbt version as per project
Can also defer to official Apache GPG documentation (i.e. https://infra.apache.org/openpgp.html)
I have done this before and it's a lot easier than it sounds, and Apache Daffodil already does this. The |
Hi there, it looks like we've got a clearer picture of the organizational/Maven/Apache-related steps required here, so the original description checklist should be updated. It also seems that discussions here may resolve some of the questions in #78, so that card could do with a status update. Echoing @Claudenw, could outstanding subtasks be drawn up as issues in the 1.0.0 milestone?
|
This task is done by a PMC member. We have mentors and PMC members who have release-managed Apache projects before. I'm not sure if this needs to be finalised before the first release. The release manager can readily write up what they do. Releasing is not that complicated if you've done it before. |
One technical thing we can do is integrate the docker image that @jrudolph helped set up in #188. We can work more on his branch to clean it up/make it more professional. I would say this docker image is a requirement because of how complex the release process of Pekko core will be (i.e. requiring multiple JDKs, etc.). We also need to test that the signing works properly with gpg (I am in the process of setting up an Apache master key for releases, but it will be stored on a YubiKey). |
I don't understand 'master' key. The releases are signed by the release manager's personal key. The public parts of the keys that are used for signing have to be added to a KEYS file that we make accessible from our download page. The keys file is usually also checked into the main git repo for the project. Examples: |
That's what I meant. |
Hi,
Just a reminder that the Incubator PMC will need to vote on your release. They will also likely use their own tools and methods of checking rather than any automation/scripts that you provide. In my experience automation can be helpful, but people can put too much faith in it and it can easily miss issues in the release.
Kind Regards,
Justin
|
So the specific automation we are talking about right now is just about creating a reproducible environment so we can make deterministic builds for a release, which I would argue is necessary for us considering how complex the setup for a Pekko build is (if we don't do this, at best we waste a lot of the release manager's time and at worst we will create builds that differ in subtle ways depending on whose machine is making the release). As you pointed out, however, any additional automation is likely not necessary, at least when it comes to the Incubator PMC voting on our release. |
@jrudolph @mdedetrich I've set these up for the RCs and releases, respectively.
You can use Have a look at the https://dist.apache.org/repos/dist/dev/incubator and https://dist.apache.org/repos/dist/release/incubator pages to look at other incubator projects and see what they have published. I still need to look into what, if anything, else needs to be done to link our release dir above so that everything that gets published to it gets properly loaded up to the Apache download and archive CDNs. It may be enough to have the dirs set up like this, or I might need to find some config setting somewhere that I need to update to have the URL I've set up. |
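Since the dev and release dist areas are plain SVN, working with them could look something like this (a sketch; the pekko subdirectories, version numbers, and file names are assumptions about how the layout might end up):

```bash
# Stage release-candidate artifacts in the dev dist area for the vote
svn checkout https://dist.apache.org/repos/dist/dev/incubator/pekko pekko-dist-dev
mkdir -p pekko-dist-dev/1.0.0-RC1
cp apache-pekko-1.0.0-incubating-src.tgz* pekko-dist-dev/1.0.0-RC1/
(cd pekko-dist-dev && svn add 1.0.0-RC1 && svn commit -m "Stage Apache Pekko 1.0.0-incubating RC1")

# After a successful vote, promote the artifacts to the release dist area (server-side move)
svn move -m "Release Apache Pekko 1.0.0-incubating" \
  https://dist.apache.org/repos/dist/dev/incubator/pekko/1.0.0-RC1 \
  https://dist.apache.org/repos/dist/release/incubator/pekko/1.0.0
```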
I've created https://github.com/apache/incubator-pekko-site/wiki/Pekko-Release-Process (initially started in my fork but moved on request, to facilitate collaboration). There is a lot more work and detail needed. Initially, I'm focusing on the non-technical pieces like the sequence of events. Building the release artifacts is by far the easiest bit. |
@pjfanning Can you put it on https://github.com/apache/incubator-pekko-site so that others can edit it? |
sure |
Apache has certain requirements for releasing a public version. We should strive for as much automation as possible. This ticket should be an overview of all necessary steps and the progress on those items. Please add items as needed.
Let's only consider public releases like RCs and GAs here (but not snapshots).
(Please add your name and/or PRs and issues to the items as needed)
References:
Organizational steps
Maven-related steps
Apache-related steps