-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bootstrap reusable workflows for initial release #1
Conversation
Oh, the "tool-create-release" name and everything else about the branding is up for review too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall well done, the only major question which can be directed to the rest of the group is just the usage of asserts. The other questions can spark discussion but should not otherwise hold up the PR.
if bump_type == "exact": | ||
body_values["Exact version"] = exact_version | ||
|
||
# Write the PR body into a temporary file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: non-blocking Is there a better way of doing this, with either a template or a different file type like JSON/YAML/etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This" being the use of a temporary file, or the templating?
The temporary file is useful because multiline string outputs get complicated in Actions.
I'm open to suggestions about better ways of templating - this felt too trivial for something like Jinja2 (plus I'm not super familiar with it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is the string.Template
class, but I am not sure if it just complicates it for no real readability benefit in this case, and I would agree that bringing in an entire templating engine like Jinja2 for something like this is too heavy for my tastes.
class Version: | ||
"Class to help manage individual releases within CHANGELOG.md files." | ||
|
||
link_heading_re: ClassVar = re.compile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: non-blocking: Is there an alternative to regex for this problem or are we forced into it because we're parsing unknown markdown? I'm always a little apprehensive about using regex because it requires a bit of effort to understand when it's not my code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's the unknown-ness of it that's the problem. I'm basically trying to extract a version name and a date from a string that should look like [1.2.3] - 2024-01-01
. I don't know of an easier way of doing that using regexes.
|
||
|
||
def update_changelog(changelog_file: Path, repo_url: str, version: str, date: datetime.date): | ||
"Rewrite a CHANGELOG file for a new release." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chore, non-blocking: Not to be that guy again but docstrings should be triple-quoted. I would also suggest that for non-obvious variables that they are included in the docstring, but I'll leave that up to you if you want to do that. In this instance I think it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I really hoped that either black or ruff would handle this for me. You're right that PEP-257 says to always use triple-quotes, but I'm leery to entertain stylistic changes that can't be handled automatically.
I am displeased with this lack in our tooling - I'll fix it manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed with f66c0b3.
changelog_file.read_text(encoding="utf-8") | ||
) | ||
|
||
for token, nexttoken in itertools.pairwise( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: I am always impressed when people use itertools
, I feel like there's so many good things in there that easily solve problems whenever I need it, but I can never remember to see what I can do with itertools
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
I created draft release PR. It was easy to do and easy for me to understand. I can't merge it myself to see the next step but seems it will be just as clear.
Description
This PR adds reusable workflows to manage the release process for our pipelines and software packages (see https://github.com/uclahs-cds/group-software-best-practices/discussions/43).
There are two workflows:
wf-prepare-release.yaml
is triggered manually (via aworkflow_dispatch
) and takes the following actions:major
/minor
/patch
/prerelease
.CHANGELOG.md
file to move unreleased changes into a new dated release section.wf-finalize-release.yaml
, triggered when a release PR is merged, takes the following actions:Building blocks
Version definition
In all cases, semver and not, I'm assuming that git release tags begin with
v[0-9]
. The corresponding version number does not have the leadingv
(e.g. tagv1.2.3
and version1.2.3
).Version computation
I'm using the semver python package to compute the next semantic version. That's based on the most recent ancestor tag of the
main
branch and user input (major
/minor
/patch
/prerelease
) for which bump type to perform.Non-semver repos can use the bump type
exact
and supply the version withexact_version
, in which case no version validation is done (other than asserting that the version does begin with a digit and does not begin with av
).Changelog manipulation
I'm using markdown-it-py and mdformat to parse/standardize the CHANGELOG files. I've got some additional parsing logic built on top of that to massage everything into the keep-a-changelog format. Aside from the obvious renaming of the unreleased section and standardization, I perform the following adjustments:
[<digit>
is turned into an H2 section.The above manipulations should all be idempotent. That means a badly formatted CHANGELOG will have many changes the first time it runs through this workflow but very few on each subsequent release.
Workflow data handoff
There's not a great way to pass data between two arbitrary workflows in a repository, so I settled on encoding the target version number into the pull request's branch name (e.g.
automation-create-release-<version>
. Unlike the PR title or comment, I do not believe the branch can be renamed (intentionally or inadvertently), so that should be safe enough.I built off of GitHub's recommendations for triggering the second workflow when the release PR is merged. GitHub's example would run whenever any pull request was merged, so I added more conditional checks in both the example calling and the reusable workflows. This has the effect that a "skipped" workflow run will appear under the Actions tab whenever other pull requests are merged.
Reusable workflow hackery
Unlike Composite Actions, when a reusable workflow is called from another repository it only brings along the specific workflow file and no sidecar scripts. In order to get those you have to fetch information about the current run and parse out the reusable workflow SHA from the
referenced_workflows
section. You can then use that withactions/checkout
to get the appropriate scripts. Yes, that's frustratingly difficult, there are multiple open issues about it.Testing
I've been kicking the tires with my testing repository:
I've made a number of releases, but please feel free to make more! It's currently setup to create draft releases, so if you want them to "take" you'll need to find the release (follow the link posted in each release PR) and publish them.
Next steps
The majority of our things-to-be-released fall into one of three categories:
Taking Nextflow as an example, each pipeline stores its current version inside the
nextflow.config
file. These workflows should update that file automatically, but I haven't gotten there yet.Checklist
This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
Disclosing PHI is a major problem1 - Even a small leak can be costly2.
This PR does NOT contain germline genetic data3, RNA-Seq, DNA methylation, microbiome or other molecular data4.
.png
, .jpeg
),.pdf
,.RData
,.xlsx
,.doc
,.ppt
, or other output files.To automatically exclude such files using a .gitignore file, see here for example.
I have read the code review guidelines and the code review best practice on GitHub check-list.
I have set up or verified the
main
branch protection rule following the github standards before opening this pull request.The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have added the major changes included in this pull request to the
CHANGELOG.md
under the next release version or unreleased, and updated the date.Footnotes
UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records ↩
The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records. ↩
Genetic information is considered PHI.
Forensic assays can identify patients with as few as 21 SNPs ↩
RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity. ↩