Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce cross-repository issue reference noise #217

Closed
rpkilby opened this issue Apr 25, 2017 · 8 comments
Closed

Reduce cross-repository issue reference noise #217

rpkilby opened this issue Apr 25, 2017 · 8 comments

Comments

@rpkilby
Copy link

rpkilby commented Apr 25, 2017

Hello. I've noticed that the pyup-bot embeds a dependency's changelog in pull requests. For example, this version pinning PR of Django Rest Framework for the patchwork project. Because DRF's changelog references issues/PRs, your version pinning PR in turn creates a ton of cross-repository issue references. For example, you can see encode/django-rest-framework#3288 which currently has 5 issue references from 5 separate pyup-bot PRs.

This isn't a huge problem yet since usage seems to be limited, but this will scale poorly at even tens or even hundreds of projects. A few possible fixes:

  • don't embed the changelog (after all, the PR already includes a link to the actual changelog).
  • wrap the embedded changelog in triple back ticks so GitHub doesn't create the issue references.
  • wrap the embedded hyperlinks and issue numbers in backticks (this seems like it would be error-prone).
@jayfk
Copy link
Contributor

jayfk commented Apr 25, 2017

Yep, this is a problem. The bot is already stripping away @mentions, but that isn't enough.

I'm inclined to add another workaround:

  • Write your own link autoexpander, redirecting to the issue/pr/commit.

@rpkilby
Copy link
Author

rpkilby commented Apr 25, 2017

Write your own link autoexpander, redirecting to the issue/pr/commit.

What do you mean here? Is this the responsibility of pyup-bot, the monitored project (eg, patchwork), or the dependency (eg, DRF)?

@jayfk
Copy link
Contributor

jayfk commented Apr 25, 2017

This would be the bots responsibility.

Instead of letting GitHub auto expand an issue like #11, the bot could pick it up and rewrite it as markdown link to a page that redirects to the issue (#11)[https://pyup.io/links/user/repo/issues/11].

@rpkilby
Copy link
Author

rpkilby commented Apr 25, 2017

It's possible, but I honestly don't really think it would be really worth the effort, not to mention the potential for error. The safe and simple options are:

  • link to the changelog on pyup.io (as already exists)
  • wrap in triple backticks and strip triple backticks from the changelog text

Given the simplicity and fool-proofness of these options, I don't see a justifiable benefit of a more complex option that just renders the changelog as markdown. The utility of it seems nominal, as it's not something that a lot of developers will ultimately read. And if a developer really does want to read the changelog in markdown, there's the link for that.

Sorry if I'm coming off as skeptical, but what I'm seeing is a potential source of a lot of noise that's going to reduce the readability of old issues. Right now, one project has generated a handful of unnecessary issue references for each of several hundreds issues. Imagine how this would look for a couple hundred projects with this same dependency.


Either way, some thoughts on auto-expanding the issue references...
You'd need to account for all possible cases where GitHub automatically creates an issue reference. Off the top of my head:

  • Issue reference shorthand (#123 - in this case, the dependency would inadvertently reference the repo's own issues)
  • External repository issue reference (encode/django-rest-framework#123)
  • Bare URL issue reference. (https://github.com/encode/django-rest-framework/pull/3288 is converted to encode/django-rest-framework#123 and referenced)

A potential edge case may be that the changelog contains an already expanded URL such as

[#11](https://pyup.io/links/user/repo/issues/11)

The bot would need to make sure it didn't accidentally expand this to

[[#11](https://pyup.io/links/user/repo/issues/11)]([https://pyup.io/links/user/repo/issues/11](https://pyup.io/links/user/repo/issues/11))

Additionally, the bot can't assume the changelog is properly formatted, and would need to gracefully handle incorrect syntax. eg, your example URL is actually not correct.
(#11)[https://pyup.io/links/user/repo/issues/11] should be
[#11](https://pyup.io/links/user/repo/issues/11).

@rpkilby
Copy link
Author

rpkilby commented May 24, 2017

Hi. I'm still noticing this cross-repository chatter. If you can point me to the appropriate module, I'd be willing to submit a PR.

@jayfk
Copy link
Contributor

jayfk commented May 24, 2017

hey @rpkilby. The changlog parsing is done on pyupio/changelogs, here: https://github.com/pyupio/changelogs/blob/master/changelogs/parser.py#L33-L34

It's probably a good idea to split this out in a separate function and "sanitize" it based on the full log instead of going over it line by line.

@jayfk
Copy link
Contributor

jayfk commented May 24, 2017

Additionally, in case you want to test against real world changelogs, there are hundreds of pre recorded changelogs being tested here: https://github.com/pyupio/changelogs/blob/master/tests/test_changelogs.py

And here is the changelog fire hose: https://pyup.io/static/changelogs.json

@jayfk
Copy link
Contributor

jayfk commented Apr 4, 2018

This should be good now.

@jayfk jayfk closed this as completed Apr 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants