-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
should Renovate rewrite / sanitize full github links from a changelog ? #14804
Comments
Yes, we should try to sanitize all. Can you create a reproduction public repo which generates a PR with non-sanitized changelog? |
Hi there, Help us by making a minimal reproduction repository. Before we can start work on your issue we first need to know exactly what's causing the current behavior. A minimal reproduction helps us with this. To get started, please read our guide on creating a minimal reproduction to understand what is needed. We may close the issue if you (or someone else) have not provided a minimal reproduction within two weeks. If you need more time, or are stuck, please ask for help or more time in a comment. Good luck, The Renovate team |
@rarkins although we use Renovate ourselves, it's an issue that manifests in other projects using Renovate. I cannot really perform a release just to reproduce the issue, and I have edited the referenced release on GitHub to avoid future back links piling up. The issue is not as clear cut as I initially thought. From what I can see the PRs that link back to us are either from a self-hosted instance, or a commit that was made with a message mirroring the PR body (the rendered body, not the source 😧). (note: in this issue I'm either linking to PRs/issues using togithub links, or escaping snippets to avoid back linking myself). We can take the following PR as an example of one of our issues that get spammy back links: https://togithub.com/reactor/reactor-core/pull/2934 The release note excerpt I'm using for reference below is originally the following markdown (generated by github): ### :sparkles: New features and improvements
* Improve BoundedElasticScheduler to be less blocking by @simonbasle in https://github.com/reactor/reactor-core/pull/2909 I'm looking at 3 cases: case 1https://togithub.com/mulesoft/apikit-rest-module/pull/545 This one seems to have been edited / updated to reflect my own edit of our release notes. Last edit as of this writing 11hrs ago. Excerpt from the old PR body:
Probably caused the back link: see how the second link is a raw github one. In the last edit, it was changed to:
Now rendered as a case 2https://togithub.com/mulesoft/mule-validation-module/pull/191 Also from Mulesoft (running self hosted I'm guessing). That one has no edit unlike the first one (at least as of this writing). Excerpt is :
Again, plain raw github link for the PR. case 3That one is a little bit out of scope I think, after looking closer at it. It looks like the commit was made with a copy of Renovate PR's body, as in "the rendered body" (and not the markdown source. The PR behind this commit is https://togithub.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/pull/778 If we look at it it was last edited 7 days ago (before I "fixed" the release notes) and contains the following excerpt:
See how the rendered text is a github URL but the underlying link is a togithub URL? since the commit used the rendered text, it contains github URLs :( For that one I guess there's nothing that can be done. Hope this helps discovering if there is a regression or room for improved sanitizing, especially knowing that the "Generate changelog" feature from GitHub produces full github.com links. |
I'm on mobile so can't do this myself just yet, but next step would be to check for other public renovate PRs other than mulesoft's which link to the same changelog. It would be useful to confirm if the hosted service was sanitising correctly before you made the edit. |
mmh yeah: I've found at least one in https://togithub.com/Ezzahhh/teku/pull/78 Note that the PR body has since been updated by the hosted service to reflect my release note edits, but the previous version was displaying textual ✨ New features and improvements
Improve BoundedElasticScheduler to be less blocking by [@simonbasle](https://togithub.com/simonbasle) in [https://github.com/reactor/reactor-core/pull/2909](https://togithub.com/reactor/reactor-core/pull/2909) |
@rarkins looks like the hosted service behaves correctly, sorry for the noise. I have reached out to people having approved the problematic PRs to see if they can improve the situation on their side. Maybe this could be turned into a Discussion if said people want to chime in? |
There's likely still something we can do or fix with Renovate code - once we can work out why it's going wrong in certain cases - so I'm happy to keep this open. In general Renovate has a test of "am I on GitHub.com? If so then sanitize all github.com links" so it's either the "am I on github.com?" test which is malfunctioning or it's the sanitizing. |
Here's the relevant entry point: renovate/lib/modules/platform/github/index.ts Lines 1663 to 1676 in 25d793f
So for self-hosted users I'm wondering if the "isGhe" is somehow wrong, if they're actually on github.com. renovate/lib/modules/platform/github/index.ts Lines 101 to 102 in 25d793f
|
When a bug has been marked as needing a reproduction, it means nobody can work on it until one is provided. In cases where no reproduction is possible, or the issue creator does not have the time to reproduce, we unfortunately need to close such issues as they are non-actionable and serve no benefit by remaining open. This issue will be closed after 7 days of inactivity. |
about the |
Here is another public reproduction of this issue in the wild, pertaining to improper sanitization of the release changelog over at Rich release v12.3.0. You can see that every issue/PR linked in the changelog gets mention spammed by Renovate, for example see the mention spam at the bottom of Textualize/rich#2200 triggered by text in the body of the changelog reproduced by Renovate. Dependabot bypasses this auto-mention behavior by changing any https://github.com link to a https://github-redirect.dependabot.com link. A similar approach could be taken for Renovate. For example, Dependabot can bump the very same release mentioned above without accidentally mentioning the issues/PRs in the changelog, see blakeNaccarato/gradedoc#19 (comment) for example. You can see how the PR link in Rich's changelog is changed to https://github-redirect.dependabot.com/Textualize/rich/pull/2200. |
@blakeNaccarato 99% of Renovate's links to GitHub are already rewritten to togithub.com so it's not a matter of what to do but just an edge case of some links being missed. |
I see. I wonder what the edge-case is here? It looks like it happens again back in Rich release v9.13.0 as well. In the issues/PRs linked there, we can see mention spam from both Dependabot and Renovate, so there must be something in this particular dev's release workflow that occasionally tricks dependency managers into noisy backlinking. |
we still need a reproduction repo to debug what's going on. so please don't post new public repos which aren't minimal. if you're hit by this issue vote via Emoji and subscribe but don't post me too comments. thank you 🤗 |
The challenge is that GitHub do some "transforming" of markdown to embellish it with links. E.g. when you type In the case you linked to I see only 2 links from Renovate so I would say:
So we need to predict all of GitHub's embellishments/transforms and replace them with explicit togithub.com links so that GitHub doesn't insert github.com links. If there's any format or new transform we've missed, then the direct linking happens, causing backlinks. As @viceice mentioned, the most important thing would be a reproduction repo so that a developer can debug it locally. I'll try forking one of the "spamming" repositories and see if it also triggers a backlink from my fork. |
This comment was marked as spam.
This comment was marked as spam.
When a bug has been marked as needing a reproduction, it means nobody can work on it until one is provided. In cases where no reproduction is possible, or the issue creator does not have the time to reproduce, we unfortunately need to close such issues as they are non-actionable and serve no benefit by remaining open. This issue will be closed after 7 days of inactivity. |
Fetched from https://api.github.com/repos/Textualize/rich/releases I see that the body content for 12.3.0 is:
So the URL which seems to trigger the backlink is |
I have reproduced this locally (renovate-reproductions/14804#2). Excerpt from release notes where you can see most are transformed but some are not:
|
🎉 This issue has been resolved in version 32.50.3 🎉 The release is available on:
Your semantic-release bot 📦🚀 |
How are you running Renovate?
WhiteSource Renovate hosted app on github.com
If you're self-hosting Renovate, tell us what version of Renovate you run.
No response
Please select which platform you are using if self-hosting.
No response
If you're self-hosting Renovate, tell us what version of the platform you run.
No response
Was this something which used to work for you, and then stopped?
It used to work, and then stopped
Describe the bug
It has recently come to our attention that projects that use Renovate and depend on our project generate noisy backlinks to Renovate PRs in our issues.
I was under the impression that this was sanitized by Renovate (using the
togithub.com
domain), but there is maybe something that we missed.Namely, isn't Renovate supposed to sanitize GitHub URLs like
https://github.com/reactor/reactor-core/pull/2934
?A couple month ago we switched to the GitHub provided "Generate a changelog" button when publishing release notes, and it uses the aforementioned format. Previously, we were manually curating the release notes and using the
#123
short format, which was sanitized by Renovate.I'm guessing this is not a regression, but would you consider sanitizing full github URLs like above to
togithub.com
URLs?Relevant debug logs
Logs
Have you created a minimal reproduction repository?
No reproduction, but I have linked to a public repo where it occurs
The text was updated successfully, but these errors were encountered: