Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a tool to create and render stageable changelog entries #2462

Merged
merged 2 commits into from
Nov 21, 2024

Conversation

JordonPhillips
Copy link
Contributor

@JordonPhillips JordonPhillips commented Nov 13, 2024

Writing changelog entries is not super fun. Writing them after the fact, especially when you didn't write the original feature, is even less fun. And merge conflicts because of synchronizing a changelog file are SUPER not-fun.

This PR solves those problems. It introduces a changelog tool similar to what many of the SDK teams use (and indeed cribs some code from them).

Workflow

  • When creating a pull request, you create a changelog entry by running .changes/new-change. This creates a JSON file in .changes/next-release/ containing the changelog entry and its metadata (e.g. whether it's a documentation change, feature change, etc.)
  • When the pull request is opened a GitHub action will automatically add a PR comment with a suggestion that you can click on to add the PR number.
  • During a release, the changelog is updated by running .changes/render --release-version 1.2.3. This compiles all pending changes in .changes/next-release/ into a single file in the .changes/releases/ directory. It then renders all of the releases in .changes/releases/ into a markdown file that can be written to CHANGELOG.md. If for some reason the changelog needs to be re-rendered out of band, ./change/render can be called without a version to just do the rendering.

In effect, the canonical source for the changelog becomes the changelog JSON files, and the actual markdown representation is just a derived artifact.

Why?

Using distinct files for changelog entries solves a major problem: merge conflicts. These are horribly annoying to deal with during development, especially on a file that is essentially documentation and semantically has no reason to ever conflict. This problem is less bad for Smithy itself than it is for SDKs that have daily releases, but it's a problem for us too with our relatively large team.

What's different?

This differs from what other SDKs do in a few ways. Firstly, and most notably, we aren't deriving a version number from these. We could! It would not be hard! But I elected not to do that in this PR since it's not as big a problem for us.

We also have a habit of linking the PRs for all of our changes in the changelog. This is actually kinda annoying because it creates a bit of a chicken-egg scenario. This is why I added the github action to automatically add it. If the action is annoying we can always get rid of it and just stop relying on the links. But that would be a shame.

Mechanically there's also a change in that I used modern python to write this, and updated any borrowed code to also be modern python.

Why not use jmeslog?

jmeslog is a standalone tool derived from the python team's changelog tool (which started it all). It is essentially exactly what they do and want, but there's a few things that prevent us from using it. Notably, we need the current date and pr links. jmeslog doesn't have that and doesn't currently allow for easy extension in that way.

It also has things we don't necessarily want, like a category. That's really mostly used in the sdks to indicate changes made for a particular service. With hundreds of services, that's an issue worthy of calling out. But smithy itself doesn't need it.

But also, it's a dependency, and it has dependencies. I'd rather not have dependencies if I can avoid it. And what we need is ultimately not that complex. And doing it ourselves gives us some flexibility.

All that said, there's definitely a world where jmeslog is updated to a place where we can more comfortably use it.

Why not derive changelogs from commits?

The audience for changelogs and the audience for commit messages are two separate (though sometimes intersecting) audiences. Commits may contain conext about technical junk that only mattters to people working on the code itself.

There is also not necessarily a 1-1 relationship between a changelog entry and a commit. Large features are often broken up into logical commits that make reviewing and code diving easier, but aren't important to the changelog. You might also have commits that make changes unimportant to the changelog audience, such as formatting changes or changes to CI configurations.

Commits are also, critically, immutable. Did you make a typo in your commit? Now it's preserved forever in your changelog. With JSON files, you can just change them. Those changes are now part of the commit record!

Also, on a more subjective level, have you seen repos that have semantic commits? They're ugly. Having more than half your commits starting with chore just makes it look like your project is in maintenance mode and/or like you hate your job. And if you're following best practices for commit messages, that's eating into your 50 character limit for the title.

Any unfortunate bits?

There's no line wrapping, so you better believe the changelog is gonna have some long lines. It doesn't really matter though, because it'll all be natural when you're seeing it fully rendered. The line boundary thing these days mostly helps with readability when you're editing a file, but you will never manually edit the file.

Also the automated commit will hide the status checks in GitHub's UI for some reason. That's why it's only added after all the other checks and isn't run if said checks fail. Also why you can do it manually.

What's next?

Provided this goes through, the next step will be to backfill all the current changelog entries. That'll certainly be a task.

A github action to create the version bump pr would be sweet. We'd need to add the ability to derive a version number from the feature list, which isn't hard. Then the script could just aslo perform the version bump by writing that one file. Then it could be just a button press away in github. Maybe as a fully automated release?


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

.changes/tool/render.py Show resolved Hide resolved
.changes/amend Show resolved Hide resolved
.changes/new-change Show resolved Hide resolved
Writing changelog entries is not super fun. Writing them after the
fact, especially when you didn't write the original feature, is even
less fun. And merge conflicts because of synchronizing a changelog
file are SUPER not-fun.

This change solves those problems. It introduces a changelog tool
similar to what many of the SDK teams use (and indeed cribs some code
from them).

> Why?

Using distinct files for changelog entries solves a major problem:
merge conflicts. These are horribly annoying to deal with during
development, especially on a file that is essentially documentation
and semantically has no reason to ever conflict. This problem is
less bad for Smithy itself than it is for SDKs that have daily
releases, but it's a problem for us too with our relatively large
team.

> What's different?

This differs from what other SDKs do in a few ways. Firstly, and
most notably, we aren't deriving a version number from these. We
could! It would not be hard! But I elected not to do that in this
PR since it's not as big a problem for us.

We also have a habit of linking the PRs for all of our changes in
the changelog. This is actually kinda annoying because it creates
a bit of a chicken-egg scenario. This is why I added the github
action to automatically add it. If the action is annoying we can
always get rid of it and just stop relying on the links. But that
would be a shame.

Mechanically there's also a change in that I used modern python to
write this, and updated any borrowed code to also be modern python.

> Why not use jmeslog?

jmeslog is a standalone tool derived from the python team's
changelog tool (which started it all). It is essentially exactly what
they do and want, but there's a few things that prevent us from using
it. Notably, we need the current date and pr links. jmeslog doesn't
have that and doesn't currently allow for easy extension in that way.

It also has things we don't necessarily want, like a category. That's
really mostly used in the sdks to indicate changes made for a
particular service. With hundreds of services, that's an issue worthy
of calling out. But smithy itself doesn't need it.

But also, it's a dependency, and it has dependencies. I'd rather not
have dependencies if I can avoid it. And what we need is ultimately
not that complex. And doing it ourselves gives us some flexibility.

> Why not derive changelogs from commits?

The audience for changelogs and the audience for commit messages are
two separate (though sometimes intersecting) audiences. Commits may
contain conext about technical junk that only mattters to people
working on the code itself.

There is also not necessarily a 1-1 relationship between a changelog
entry and a commit. Large features are often broken up into logical
commits that make reviewing and code diving easier, but aren't
important to the changelog. You might also have commits that make
changes unimportant to the changelog audience, such as formatting
changes or changes to CI configurations.

Commits are also, critically, ***immutable***. Did you make a typo
in your commit? Now it's preserved forever in your changelog. With
JSON files, you can just change them. Those changes are now part of
the commit record!

Also, on a more subjective level, have you seen repos that have
semantic commits? They're ugly. Having more than half your commits
starting with `chore` just makes it look like your project is in
maintenance mode and/or like you hate your job. And if you're
following best practices for commit messages, that's eating into
your 50 character limit for the title.

> Any unfortunate bits?

There's no line wrapping, so you better believe the changelog is
gonna have some long lines. It doesn't really matter though, because
it'll all be natural when you're seeing it fully rendered. The line
boundary thing these days mostly helps with readability when you're
editing a file, but you will never manually edit the file.

> What's next?

Provided this goes through, the next step will be to backfill all
the current changelog entries. That'll certainly be a task.

A github action to create the version bump pr would be sweet. We'd
need to add the ability to derive a version number from the feature
list, which isn't hard. Then the script could just aslo perform the
version bump by writing that one file. Then it could be just a button
press away in github. Maybe as a fully automated release?
This adds a GitHub action that checks for staged changelogs. If they
aren't present, it posts a reminder comment to add them. It does
*not* fail the build because there are valid cases where a changelog
is not desired.

If there are changelog entries present, it ensures that they have a
pull request associated with them. If they don't, it posts a pr
comment with a change suggestion that adds it in. It still does not
fail the build because maybe there's a reason not to add it.
@JordonPhillips
Copy link
Contributor Author

JordonPhillips commented Nov 20, 2024

Since initially opening this PR I had to change the way the GitHub action worked. Previously when it detected an introduced changelog without an associated PR it would just amend it and push a commit. It turns out that this is a permissions nightmare on a PR from a fork that isn't easily resolveable as it would need permissions from each of the forks. A bit annoying, to say the least.

So I changed it to instead post a review comment. The genius of this being that whoever opens the PR is doing the committing, so no permission issues! Except there were permission issues. Solveable this time though. The new problem was related to the old - when handling a PR from a fork, posting comments would fail. In this case, I lacked permissions for the main repo. I lacked those permissions because the maximum permissions level for a pull_request GitHub action triggered by a fork is read.

The workaround is to instead use pull_request_target, which runs the action from the context of the target (i.e. main) instead of the PR. With this, we can have those elevated permissions. The downside is that the action won't run on this PR because it doesn't exist on main yet. You can however see how it works in this closed testing pr.

I split the PR into two commits. One of those commits is just the action, so that it can be reverted if it ends up being a menace without having to revert the entire tool.

Since I was posting comments anyway, I went ahead and added the reminder comment. It includes a copyable snippet to add a changelog entry with the correct argument for the pr number and with the description defaulting to the PR's title. This only gets posted if the PR doesn't have a changelog entry. Example here.

These comments are never duplicated. The reminder comment will only ever be posted once per PR, and the amend comments will only ever be posted once per file.

@JordonPhillips JordonPhillips merged commit 50fa20e into smithy-lang:main Nov 21, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants