Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(github)!: support integration with GitHub repos #363

Merged
merged 68 commits into from
Dec 26, 2023

Conversation

orhun
Copy link
Owner

@orhun orhun commented Dec 2, 2023

Resolves #119 #132 #112 #12

Description

This PR adds GitHub integration to git-cliff. The basic goal is to support generating the same changelog as the built-in GitHub changelog generator (with PR links and contributors). This means that we need to query the GitHub API for certain information and incorporate them in the changelog context.

Usage

There is a new section in the config file for setting the GitHub remote:

[remote.github]
owner = "orhun"
repo = "git-cliff"
token = ""

These values are also passed to the template (via additional_context param of Changelog::render function):

**Full Changelog**: https://github.com/{{ remote.github.owner }}/{{ remote.github.repo }}/compare/{{ previous.version }}...{{ version }}\

There are new values in the changelog context which you can use to add GitHub usernames / pull request numbers to the changelog:

{{ commit.github.username }}
{{ commit.github.pr_number }}

Also, there is a new variable for contributors (github.contributors):

{% for contributor in github.contributors | filter(attribute="is_first_time", value=true) %}
  * @{{ contributor.username }} made their first contribution in #{{ contributor.pr_number }}
{%- endfor -%}

One cool thing, no network requests are made if no GitHub-related variables are used in the template.

Example

cargo run -- --unreleased --config examples/github.toml

## What's Changed

- feat(github): support integration with GitHub repos
- fix(changelog): set the correct previous tag when a custom tag is given by @orhun
- docs(website): add installation instructions for Homebrew by @sh-cho in #357
- feat(args): set `CHANGELOG.md` as default missing value for output option by @sh-cho in #354
- refactor(config): remove unnecessary newline from configs by @orhun

## New Contributors

- @sh-cho made their first contribution in #357

<!-- generated by git-cliff -->

Implementation

There seems to be 2 ways of interacting with GitHub programmatically:

  • REST API: works without a token but there is a rate limit. Has pagination.
  • GraphQL: only works with a repository token. Has pagination.

I'm using the REST API for the current implementation and I'm fetching all the commits and pull requests. This sounds suboptimal but unfortunately we need all the commits to generate the required release context. For example:

git cliff -u: only generates the changelog for the unreleased changes. If we want to have a section there as "first contributors" then we need the whole commit/PR history to check if the user has contributed before. Unfortunately GitHub contributors API does not give us the SHA1 of the commits that belong to the contributors so I'm using the pull requests API.

On top of that, there is pagination (max 100 entries in a page) so we need to send multiple requests to fetch all the history.

TL;DR: This is bad and it vastly slows down the changelog generation. But I don't know how to improve it. I have experimented with the GraphQL API but it also has pagination although you can filter stuff before hand. (also it requires a token at all times as mentioned before)

Ping top contributors @MarcoIeni @kenji-miyake @joshka @alerque for opinions/discussion.

TODO List

  • Find out upstream remote from the Git repository (no configuration)
  • Add progress bar for network requests
  • Add caching to network requests (?)
  • Gate the implementation behind github feature (?)
  • Add new examples
    • github.toml: same format as GitHub
    • Grouped GitHub changelog, something like Pixi's changelog
  • Add command-line arguments for configuring the remote
    • --github-repo
    • --github-token
  • Add tests (unit/integration/fixtures)
  • Update documentation

@codecov-commenter

This comment was marked as off-topic.

@alerque
Copy link
Contributor

alerque commented Dec 3, 2023

I like the idea of this being available. I've wanted contributor and PR callouts in changelogs before and not having them has kept me manually writing changelogs for some projects.

Have you looked into testing whether gh can be used for some of the queries, or at least help with Auth issues when available to be able to use the GraphQL API? I just wonder how hard it would be to use those credentials if found and fall back to the Rest API if not. Besides that thought some sort of caching for the network results sounds like it might be in order, perhaps using Git's object storage?

If the features are not used if not configured my thought would be that it doesn't need gating, but then again for use in CI sometimes it's nice to be able to compile as trimmed down a binary as possible for the specific job at hand.

@orhun orhun mentioned this pull request Dec 3, 2023
@orhun orhun force-pushed the feat/github_integration branch from ab2cef9 to 41801dd Compare December 7, 2023 23:28
Copy link
Contributor

@kenji-miyake kenji-miyake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I noticed my comments added a few days ago are still pending. I'm submitting them now.)

Great feature! I added several comments that I found with a quick look.

examples/github.toml Show resolved Hide resolved
git-cliff-core/src/lib.rs Outdated Show resolved Hide resolved
git-cliff-core/src/release.rs Show resolved Hide resolved
@orhun
Copy link
Owner Author

orhun commented Dec 9, 2023

Hey @alerque

Have you looked into testing whether gh can be used for some of the queries, or at least help with Auth issues when available to be able to use the GraphQL API? I just wonder how hard it would be to use those credentials if found and fall back to the Rest API if not.

Just tried that and gh also has pagination which means it does not really help with anything. Fallback to gh means additional command execution so I would rather do that directly with Rust. Good idea though, thanks for sharing it.

At this point I start to doubt if I'm doing things efficiently because it takes around 10 seconds to complete the both /commits and /pulls requests. I will also need to send requests to /pulls/<number>/commits which means additional N seconds.

I need a Rust guru to tell me if I'm properly parallelizing the requests or give me some feedback on this 💀

@alerque
Copy link
Contributor

alerque commented Dec 9, 2023

Just tried that and gh also has pagination which means it does not really help with anything. Fallback to gh means additional command execution so I would rather do that directly with Rust. Good idea though, thanks for sharing it.

Makes sense to not add external dependencies if it doesn't actually gain you anything functionally or performance wise.

@orhun
Copy link
Owner Author

orhun commented Dec 20, 2023

@alerque
Copy link
Contributor

alerque commented Dec 25, 2023

Off topic of this PR but on the topic of conventional commits and generated changelogs: my understanding of fix: is that it is only relevant if the commit fixes something in the previous release version, not if it fixes some feature addition that hasn't been released yet. As an example I would have used chore: for 32858b7 so that when I generate a change log for the next release it does not show up (I also typically don't even list chore commits in the changelog). This is almost certainly a personal preference to have a whole class of commits not listed at all, but not having things in the "bug fixes" section that are not bugs in the previous release(s) seems to make sense to me. Is your usage in this PR deliberate to include fixes of the in-progress feature or no?

@orhun
Copy link
Owner Author

orhun commented Dec 25, 2023

Is your usage in this PR deliberate to include fixes of the in-progress feature or no?

Yes, the changes that are in this PR is only scoped this feature. I think I'm going to squash the commits with a message such as "feat!: add github integration" or something.

@orhun
Copy link
Owner Author

orhun commented Dec 26, 2023

alright, here we go

@orhun orhun changed the title Add GitHub integration feat(github)!: support integration with GitHub repos Dec 26, 2023
@orhun orhun merged commit 5238326 into main Dec 26, 2023
40 of 41 checks passed
@orhun orhun deleted the feat/github_integration branch December 26, 2023 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add author/contributor name to the changelogs
6 participants