Bootstrap reusable workflows for initial release #1

nwiltsie · 2024-08-01T22:25:59Z

Description

This PR adds reusable workflows to manage the release process for our pipelines and software packages (see https://github.com/uclahs-cds/group-software-best-practices/discussions/43).

There are two workflows:

wf-prepare-release.yaml is triggered manually (via a workflow_dispatch) and takes the following actions:

1. Compute the target version number based on existing tags and user input for major/minor/patch/prerelease.
2. Re-write the CHANGELOG.md file to move unreleased changes into a new dated release section.
3. Open a PR listing the target version number and release tag.
   - See https://github.com/uclahs-cds/user-nwiltsie-pipeline/pull/37 and all of the other recent PRs in that repo.

wf-finalize-release.yaml, triggered when a release PR is merged, takes the following actions:

1. Create a new release with auto-generated notes and the target tag.
   - By default the new release is a draft, so no public release or tag is created without user intervention.
2. Comment on the release PR with a link to the new release.
   - Draft: https://github.com/uclahs-cds/user-nwiltsie-pipeline/pull/37#issuecomment-2263983248
   - Release: https://github.com/uclahs-cds/user-nwiltsie-pipeline/pull/36#issuecomment-2263972963

Building blocks

Version definition

In all cases, semver and not, I'm assuming that git release tags begin with v[0-9]. The corresponding version number does not have the leading v (e.g. tag v1.2.3 and version 1.2.3).

Version computation

I'm using the semver python package to compute the next semantic version. That's based on the most recent ancestor tag of the main branch and user input (major/minor/patch/prerelease) for which bump type to perform.

Non-semver repos can use the bump type exact and supply the version with exact_version, in which case no version validation is done (other than asserting that the version does begin with a digit and does not begin with a v).
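As a rough illustration of the bump logic described above, here is a sketch that uses only the standard library rather than the semver package the PR actually relies on. The function name `next_version` and the error messages are hypothetical, and prerelease bumps are left out for brevity.

```python
import re

# Release tags are assumed to look like v1.2.3 (leading "v", then the version).
_TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def next_version(last_tag: str, bump_type: str, exact_version: str = "") -> str:
    """Return the next version (without a leading 'v') for a bump type."""
    if bump_type == "exact":
        # Non-semver repos: only check that the version starts with a digit,
        # which also rules out a leading "v".
        if not re.match(r"^\d", exact_version):
            raise ValueError(f"Invalid exact version: {exact_version!r}")
        return exact_version

    match = _TAG_RE.match(last_tag)
    if match is None:
        raise ValueError(f"Not a semantic version tag: {last_tag!r}")
    major, minor, patch = (int(part) for part in match.groups())

    if bump_type == "major":
        return f"{major + 1}.0.0"
    if bump_type == "minor":
        return f"{major}.{minor + 1}.0"
    if bump_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unsupported bump type: {bump_type!r}")
```

For example, `next_version("v1.2.3", "minor")` gives `1.3.0`, while `next_version("v1.2.3", "exact", "v2")` raises a `ValueError` because of the leading `v`.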

Changelog manipulation

I'm using markdown-it-py and mdformat to parse/standardize the CHANGELOG files. I've got some additional parsing logic built on top of that to massage everything into the keep-a-changelog format. Aside from the obvious renaming of the unreleased section and standardization, I perform the following adjustments:

Version numbers in H2 headings are linked (as footnotes) to the GitHub changes since the prior release.
Any H2 section that begins with "Add", "Fix", "Change", or "Remove" (case-insensitive) is turned into an H3 section.
Any H1 section that begins with [<digit> is turned into an H2 section.
Any non-list-items under an H2 are grouped and turned into common changelog notices. Common changelog is a different standard than keep-a-changelog, but I like the notices.
H3 sections within a version are reordered as "Added", "Changed", "Deprecated", "Removed", "Fixed", and "Security".
Repeated H3 sections (e.g. multiple "Added" sections) within a version are merged.
Non-list-items not under an H3 are moved to the "Changed" section.

The above manipulations should all be idempotent. That means a badly formatted CHANGELOG will have many changes the first time it runs through this workflow but very few on each subsequent release.
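To make the idempotency claim concrete, here is a minimal sketch of two of the adjustments (merging repeated H3 sections and reordering them). It works on plain Python tuples instead of markdown-it-py tokens, and the name `normalize_sections` and the handling of unknown headings are assumptions made for illustration.

```python
SECTION_ORDER = ["Added", "Changed", "Deprecated", "Removed", "Fixed", "Security"]


def normalize_sections(sections):
    """Merge repeated H3 sections and sort them into the standard order.

    `sections` is a list of (heading, [list items]) pairs for one version.
    """
    merged = {}
    for heading, items in sections:
        # Repeated headings (e.g. two "Added" sections) are concatenated.
        merged.setdefault(heading, []).extend(items)

    def sort_key(heading):
        # Unknown headings sort after the standard ones (an assumption here).
        if heading in SECTION_ORDER:
            return SECTION_ORDER.index(heading)
        return len(SECTION_ORDER)

    return [(heading, merged[heading]) for heading in sorted(merged, key=sort_key)]


messy = [
    ("Fixed", ["Typo in README"]),
    ("Added", ["New workflow"]),
    ("Added", ["Draft releases"]),
]
once = normalize_sections(messy)
twice = normalize_sections(once)
```

Running the function a second time leaves the result unchanged (`twice == once`), which is the property that keeps later releases from producing noisy diffs.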

Workflow data handoff

There's not a great way to pass data between two arbitrary workflows in a repository, so I settled on encoding the target version number into the pull request's branch name (e.g. automation-create-release-<version>). Unlike the PR title or comment, I do not believe the branch can be renamed (intentionally or inadvertently), so that should be safe enough.
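The branch-name handoff can be sketched in a few lines. The prefix comes from the example above; the helper names `branch_for_version` and `version_from_branch` are hypothetical and are not taken from the PR's code.

```python
BRANCH_PREFIX = "automation-create-release-"


def branch_for_version(version: str) -> str:
    """Build the release branch name that carries the target version."""
    return f"{BRANCH_PREFIX}{version}"


def version_from_branch(branch: str):
    """Return the encoded version, or None for non-release branches."""
    if not branch.startswith(BRANCH_PREFIX):
        return None
    version = branch[len(BRANCH_PREFIX):]
    # Versions begin with a digit and never with a "v".
    if not version[:1].isdigit():
        return None
    return version
```

The second workflow would recover the version with something like `version_from_branch("automation-create-release-1.2.3")`, and any other branch name yields `None`.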

I built off of GitHub's recommendations for triggering the second workflow when the release PR is merged. GitHub's example would run whenever any pull request was merged, so I added more conditional checks in both the example calling and the reusable workflows. This has the effect that a "skipped" workflow run will appear under the Actions tab whenever other pull requests are merged.

Event	Workflow run appears?	Conclusion
Pull request opened	No	--
Pull request closed (no merge)	No	--
Non-release pull request merged	Yes	Skipped
Release pull request merged	Yes	Success / Failure

Reusable workflow hackery

Unlike Composite Actions, a reusable workflow called from another repository brings along only the specific workflow file and no sidecar scripts. To get those you have to fetch information about the current run and parse the reusable workflow's SHA out of the referenced_workflows section. You can then pass that SHA to actions/checkout to get the appropriate scripts. Yes, that's frustratingly difficult; there are multiple open issues about it.
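The parsing step might look roughly like the sketch below. It assumes the workflow-run JSON exposes `referenced_workflows` as a list of objects with `path` (ending in `@<ref>`) and `sha` fields, which is my understanding of the GitHub REST API response and should be checked against a real payload. The function name and the sample values are made up, and the PR itself does this with jq inside the workflow rather than in Python.

```python
import json


def reusable_workflow_sha(run_json: str, workflow_path: str) -> str:
    """Find the commit SHA of a reusable workflow referenced by a run.

    `run_json` is the JSON body describing the current workflow run and
    `workflow_path` is "<owner>/<repo>/.github/workflows/<file>" without a ref.
    """
    run = json.loads(run_json)
    for workflow in run.get("referenced_workflows", []):
        # Each path is assumed to have the form "<owner>/<repo>/<file>@<ref>".
        path, _, _ref = workflow["path"].partition("@")
        if path == workflow_path:
            return workflow["sha"]
    raise LookupError(f"{workflow_path} is not referenced by this run")


# Trimmed-down sample payload with made-up values.
sample = json.dumps(
    {
        "id": 1,
        "referenced_workflows": [
            {
                "path": "example-org/example-tool/.github/workflows/wf-prepare-release.yaml@v1",
                "sha": "0123456789abcdef0123456789abcdef01234567",
                "ref": "refs/tags/v1",
            }
        ],
    }
)
sha = reusable_workflow_sha(
    sample, "example-org/example-tool/.github/workflows/wf-prepare-release.yaml"
)
```

The resulting SHA is what would be handed to actions/checkout as the ref, so that the sidecar scripts match the exact version of the reusable workflow that is running.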

Testing

I've been kicking the tires with my testing repository:

I've made a number of releases, but please feel free to make more! It's currently set up to create draft releases, so if you want them to "take" you'll need to find the release (follow the link posted in each release PR) and publish it.

Next steps

The majority of our things-to-be-released fall into one of three categories:

Nextflow pipelines
Docker images
Python packages

Taking Nextflow as an example, each pipeline stores its current version inside the nextflow.config file. These workflows should update that file automatically, but I haven't gotten there yet.

Checklist

This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
Disclosing PHI is a major problem¹ - Even a small leak can be costly².
This PR does NOT contain germline genetic data³, RNA-Seq, DNA methylation, microbiome or other molecular data⁴.

This PR does NOT contain other non-plain text files, such as: compressed files, images (e.g. .png, .jpeg), .pdf, .RData, .xlsx, .doc, .ppt, or other output files.

To automatically exclude such files using a .gitignore file, see here for example.

I have read the code review guidelines and the code review best practice on GitHub check-list.
I have set up or verified the main branch protection rule following the github standards before opening this pull request.
The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have added the major changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

Footnotes

¹ UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records
² The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records.
³ Genetic information is considered PHI. Forensic assays can identify patients with as few as 21 SNPs
⁴ RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.

actions/checkout#1781

nwiltsie · 2024-08-01T22:27:59Z

Oh, the "tool-create-release" name and everything else about the branding is up for review too.

j2salmingo

Overall well done. The only major question, which can be directed to the rest of the group, is the usage of asserts. The other questions can spark discussion but should not otherwise hold up the PR.

j2salmingo · 2024-08-02T00:45:13Z

bumpchanges/bump.py

+    if bump_type == "exact":
+        body_values["Exact version"] = exact_version
+
+    # Write the PR body into a temporary file


suggestion: non-blocking Is there a better way of doing this, with either a template or a different file type like JSON/YAML/etc?

"This" being the use of a temporary file, or the templating?

The temporary file is useful because multiline string outputs get complicated in Actions.

I'm open to suggestions about better ways of templating - this felt too trivial for something like Jinja2 (plus I'm not super familiar with it).

There is the string.Template class, but I am not sure if it just complicates it for no real readability benefit in this case, and I would agree that bringing in an entire templating engine like Jinja2 for something like this is too heavy for my tastes.

https://docs.python.org/3/library/string.html
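For comparison, this is roughly what the string.Template option would look like. The template wording and the `render_body` helper are invented for illustration; only the `body_values` dictionary and its "Exact version" key come from the diff above.

```python
from string import Template

# Hypothetical PR body; the real wording lives in bumpchanges/bump.py.
PR_BODY = Template(
    "Update CHANGELOG in preparation for release **$version**.\n"
    "\n"
    "$details\n"
)


def render_body(version: str, body_values: dict) -> str:
    """Fill the template with the version and one bullet per key/value pair."""
    details = "\n".join(f"- {key}: {value}" for key, value in body_values.items())
    # substitute() raises KeyError if a placeholder is left unfilled.
    return PR_BODY.substitute(version=version, details=details)


body = render_body("1.2.3", {"Bump type": "exact", "Exact version": "1.2.3"})
```

The rendered string could still be written to a temporary file afterwards, so this changes only how the text is assembled, not how it is handed to the workflow.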

j2salmingo · 2024-08-02T00:50:07Z

bumpchanges/changelog.py

+class Version:
+    "Class to help manage individual releases within CHANGELOG.md files."
+
+    link_heading_re: ClassVar = re.compile(


suggestion: non-blocking: Is there an alternative to regex for this problem or are we forced into it because we're parsing unknown markdown? I'm always a little apprehensive about using regex because it requires a bit of effort to understand when it's not my code.

Yeah, it's the unknown-ness of it that's the problem. I'm basically trying to extract a version name and a date from a string that should look like [1.2.3] - 2024-01-01. I don't know of an easier way of doing that than using regexes.
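For readers less comfortable with regexes, here is a commented sketch of the kind of pattern involved. It is not the `link_heading_re` from the PR (which is truncated in the diff above); the pattern, the group names, and the `parse_heading` helper are assumptions.

```python
import datetime
import re

HEADING_RE = re.compile(
    r"^\[(?P<version>[^\]]+)\]"      # version inside square brackets
    r"\s*-\s*"                       # dash separator with optional spaces
    r"(?P<date>\d{4}-\d{2}-\d{2})$"  # ISO 8601 date
)


def parse_heading(text: str):
    """Split a heading like "[1.2.3] - 2024-01-01" into (version, date)."""
    match = HEADING_RE.match(text.strip())
    if match is None:
        raise ValueError(f"Unrecognized release heading: {text!r}")
    return match["version"], datetime.date.fromisoformat(match["date"])
```

Named groups and inline comments go some way toward making the pattern readable for reviewers who did not write it.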

j2salmingo · 2024-08-02T00:53:46Z

bumpchanges/bump.py

+
+
+def update_changelog(changelog_file: Path, repo_url: str, version: str, date: datetime.date):
+    "Rewrite a CHANGELOG file for a new release."


chore, non-blocking: Not to be that guy again, but docstrings should be triple-quoted. I would also suggest that non-obvious variables be described in the docstring, but I'll leave that up to you. In this instance I think it's fine.

Ahh, I really hoped that either black or ruff would handle this for me. You're right that PEP-257 says to always use triple-quotes, but I'm leery of entertaining stylistic changes that can't be handled automatically.

I am displeased with this lack in our tooling - I'll fix it manually.

astral-sh/ruff#12637

Fixed with f66c0b3.
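For reference, a triple-quoted version of that docstring with the arguments described could look like the sketch below. The signature and summary line are from the diff; the argument descriptions are guesses based on the PR description, and the body is deliberately omitted.

```python
import datetime
from pathlib import Path


def update_changelog(
    changelog_file: Path, repo_url: str, version: str, date: datetime.date
) -> None:
    """Rewrite a CHANGELOG file for a new release.

    Args:
        changelog_file: Path to the CHANGELOG.md file to rewrite (assumed).
        repo_url: Repository URL used when linking versions (assumed).
        version: Version number of the new release, without a leading "v".
        date: Release date recorded in the new section heading (assumed).
    """
    # Body omitted: only the docstring layout is being illustrated here.
    raise NotImplementedError
```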

j2salmingo · 2024-08-02T00:55:54Z

bumpchanges/changelog.py

+            changelog_file.read_text(encoding="utf-8")
+        )
+
+        for token, nexttoken in itertools.pairwise(


praise: I am always impressed when people use itertools. There are so many good things in there that would easily solve problems for me, but I can never remember to check what it offers.

sorelfitzgibbon

This is great!

I created a draft release PR. It was easy to do and easy for me to understand. I can't merge it myself to see the next step, but it seems it will be just as clear.

nwiltsie added 30 commits July 29, 2024 14:53

Add initial reusable workflow

a0df3d6

Add token to environment

449e2ae

Debug

83daec5

Actually reference the variables

7d40caf

Do it in one step again

4fb3029

Debug jq syntax

851d758

More debugging

1f24b4c

Fix bug in jq filter

b509d5f

Use the PAT to get the reusable repository

75ae17d

Mark the full JSON as debug output

e255274

Get the next version

f25cbd6

fetch-tags does not work correctly

3187e7f

actions/checkout#1781

Apparently both are needed

29acc5c

Properly handle the leading v

7ecdb6c

Include exact in the same workflow

546e865

Handle the first release

cb34d70

Add in python code

ceb9aa7

Update the workflow to install the python package

1e3461f

Remove python3.11 feature

21ce1c3

Bugfix

aaf608b

Bump to requiring 3.8

ee9de62

Update the CHANGELOG

b78a93d

Actually re-write the file

a572a6e

Create a pull request with the changes

829dfbc

Add logging levels for GitHub Actions

8d7e43e

Log more things

21178b8

Bugfix

beb5ca0

Logging bugfixes

9e2335a

The filter has to be set on the handler

bfc0078

Capture expected 'FATAL' output from tag check

10362ba

nwiltsie added 10 commits July 30, 2024 17:09

Rename scripts and workflows

3b9058a

Add create-draft argument, post comment with link

ac07fff

Bugfix

76c59f5

See if the dash is the problem

a804356

Use correct URL

ec0bb30

Log the release data

29cccce

Bugfix

73e602b

Get the proper URL

6b6aa0c

Update CHANGELOG

85feb1c

Update README

7719e91

nwiltsie requested a review from a team August 1, 2024 22:25

Fix lint

7bd3032

yashpatel6 assigned aholmes, sorelfitzgibbon, dan-knight, Faizal-Eeman, yashpatel6, j2salmingo and kiarod Aug 1, 2024

j2salmingo requested changes Aug 2, 2024

nwiltsie added 4 commits August 2, 2024 09:32

Run code through ruff

644ce8e

Format docstrings to use triple-quotes

f66c0b3

Replace asserts with exceptions

8fd9cc9

Silence warning about extra branches from asserts

eca4a83

sorelfitzgibbon approved these changes Aug 2, 2024

j2salmingo approved these changes Aug 2, 2024

nwiltsie merged commit 7c83849 into main Aug 2, 2024
9 checks passed

nwiltsie deleted the nwiltsie-reusable-workflows branch August 2, 2024 23:19

nwiltsie mentioned this pull request Aug 2, 2024

Manage releases with this repo #2

Merged

7 tasks
