Support Generic Grouping Rules #11092

Open
kevincox opened this issue Aug 4, 2021 · 8 comments
Labels
priority-3-medium Default priority, "should be done" but isn't prioritised ahead of others type:feature Feature (new functionality)

Comments

@kevincox

kevincox commented Aug 4, 2021

What would you like Renovate to be able to do?

I would like to be able to tell Renovate to group packages based on rules such as the following:

  1. Group all packages that share a source repo.
  2. Group all packages that share a GitHub user.
  3. Group all packages that share a homepage.
  4. Group all packages in an NPM org.

This would allow obsoleting most of the monorepo rules while supporting more monorepos without further configuration.

Did you already have any implementation ideas?

{
	"packageRules": [{
		"groupName": "{sourceUrl}",
		"groupSlug": "{sourceUrl}",
	}, {
		"groupName": "{1}",
		"groupSlug": "{1}",
		"matchDatasources": ["npm"],
		"matchSourceUrlPrefixes": [
			"^([^/]*)$",
			"^@([^/]*)/",
		],
	}],
}

Something similar to this would work. The key is the groupSlug item: any packages whose groupSlug evaluates to the same key are grouped together. Alternatively, groupName could serve as the key, but it may be useful to keep something more readable there.

It isn't perfectly clear how these combine though. In this example you would have two candidate groupSlugs for each NPM package. One option is to merge on any match, but what if you match different packages with different groupSlugs?
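
To make the idea concrete, here is a minimal TypeScript sketch of grouping by an evaluated slug. It is only an illustration under assumptions: the `Dep` shape, `expandTemplate`, and `groupBySlug` are hypothetical names, and the `{field}` placeholder syntax is taken from the proposal above rather than from any existing Renovate templating. It does not attempt to resolve the open question of conflicting slugs from multiple rules.

```typescript
// Hypothetical dependency shape for illustration; not a Renovate type.
interface Dep {
  name: string;
  [field: string]: string | undefined;
}

// Expand "{field}" placeholders. Returns undefined when a referenced
// field is missing, so the rule simply does not apply to that dependency.
function expandTemplate(template: string, dep: Dep): string | undefined {
  const placeholder = /\{(\w+)\}/g;
  let result = "";
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = placeholder.exec(template)) !== null) {
    const value = dep[m[1]];
    if (value === undefined) {
      return undefined;
    }
    result += template.slice(last, m.index) + value;
    last = m.index + m[0].length;
  }
  return result + template.slice(last);
}

// Bucket dependencies by their evaluated slug. A dependency whose slug
// cannot be evaluated falls back to its own name, i.e. stays ungrouped.
function groupBySlug(deps: Dep[], slugTemplate: string): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const dep of deps) {
    const evaluated = expandTemplate(slugTemplate, dep);
    const key = evaluated === undefined ? dep.name : evaluated;
    if (!groups[key]) {
      groups[key] = [];
    }
    groups[key].push(dep.name);
  }
  return groups;
}
```

With a `"{sourceUrl}"` template, two packages that report the same source URL end up in one bucket, while a package without a source URL keeps its own.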

@kevincox kevincox added priority-5-triage status:requirements Full requirements are not yet known, so implementation should not be started type:feature Feature (new functionality) labels Aug 4, 2021
@HonkingGoose
Collaborator

I've taken the liberty of putting your JSON through Prettier, as that makes it easier to read and understand. 😉

{
  "packageRules": [
    {
      "groupName": "{sourceUrl}",
      "groupSlug": "{sourceUrl}"
    },
    {
      "groupName": "{1}",
      "groupSlug": "{1}",
      "matchDatasources": ["npm"],
      "matchSourceUrlPrefixes": ["^([^/]*)$", "^@([^/]*)/"]
    }
  ]
}

@rarkins
Collaborator

rarkins commented Aug 4, 2021

@kevincox I agree completely with this idea. I've had something similar in my head for a long time but can't find any open issue, so maybe I didn't write it down.

In short, the biggest advantage we could deliver is "automatic" monorepo grouping like you point out, without the need for static definitions. But also there will be times where people want to group by "scope" (e.g. @foo for npm, or groupId for maven). I'm not sure about grouping by home page, but there'd be no harm in that once we do it.

The way I envisioned this was having a new field like groupPreference, which lets you define how you want PRs grouped (via branch names). So right now the way it works is:

  • If a groupName is defined then we use it as the branch topic/key. Typically this is done using package rules
  • If no groupName then we fall back to a prettified/slugified depName value, i.e. grouping per-dependency (meaning "no" grouping in most cases if a dependency is only present in the repo once)

In future it might look like this: "groupPreference": ["groupName", "sourceRepo", "scope", "depName"]

This would mean "if the user has configured a groupName via packageRules, then use that. If not then use the package's source repo to group, or its scope/groupId, but finally fall back to depName"

If we migrate to this approach then it would mean:

  • Any existing groupName rules would be honored, although most of the monorepo names can be dropped
  • If we know the sourceUrl then generate a short sourceRepo value (e.g. https://github.com/foo/bar becomes foo-bar) and use that
  • If no source URL, but the package is scoped (e.g. @foo/bar) then use foo
  • If none of the above, fall back to default behaviour of using the depName
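
The fallback chain described above could be sketched roughly as follows. This is a hedged illustration, not Renovate code: `groupPreference` does not exist yet, and `DepInfo`, `resolveGroupKey`, `toSourceRepo`, `toScope`, and `slugify` are hypothetical helpers whose behaviour is assumed from the bullets in this comment.

```typescript
// Hypothetical enum of grouping strategies, named after the discussion.
type GroupPreference = "groupName" | "sourceRepo" | "scope" | "depName";

// Hypothetical dependency shape for illustration.
interface DepInfo {
  depName: string;
  groupName?: string; // set by the user via packageRules
  sourceUrl?: string;
}

// Lowercase and replace every run of non-alphanumerics with a single dash.
function slugify(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// "https://github.com/foo/bar" -> "foo-bar"; undefined if the URL has no org/repo path.
function toSourceRepo(sourceUrl: string | undefined): string | undefined {
  if (!sourceUrl) {
    return undefined;
  }
  const m = /^https?:\/\/[^/]+\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/.exec(sourceUrl);
  return m ? `${m[1]}-${m[2]}` : undefined;
}

// "@foo/bar" -> "foo"; undefined for unscoped names.
function toScope(depName: string): string | undefined {
  const m = /^@([^/]+)\//.exec(depName);
  return m ? m[1] : undefined;
}

// Walk the preference list and use the first strategy that yields a value.
function resolveGroupKey(dep: DepInfo, preference: GroupPreference[]): string {
  for (const pref of preference) {
    let candidate: string | undefined;
    switch (pref) {
      case "groupName":
        candidate = dep.groupName;
        break;
      case "sourceRepo":
        candidate = toSourceRepo(dep.sourceUrl);
        break;
      case "scope":
        candidate = toScope(dep.depName);
        break;
      case "depName":
        candidate = dep.depName;
        break;
    }
    if (candidate) {
      return slugify(candidate);
    }
  }
  // Default behaviour: one group per dependency.
  return slugify(dep.depName);
}
```

Because the first matching strategy wins, an explicit groupName from packageRules always overrides the automatic grouping, which keeps existing configurations working.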

@rarkins
Collaborator

rarkins commented Aug 4, 2021

The valid values for groupPreference would need to be an "enum", and yes we could include sourceOrg and homepage too if you wish, although I don't anticipate them being heavily used.

@kevincox
Author

kevincox commented Aug 4, 2021

That approach looks good too. It is a bit more high level which is probably a good thing in this case. It also provides a nice preference order which solves the issue I outlined in my original proposal.

I think scope is the same thing I meant when I said org. Scope is the NPM terminology but you get the same idea on the Docker Hub, GitHub...

Homepage was part of making it generic. I don't know how commonly useful it would be compared to the repo, so maybe it could wait until there is demand for it.

One suggestion is just to use the names already used in the configuration. So: sourceUrl instead of sourceRepo, and packageName instead of depName.

@rarkins
Collaborator

rarkins commented Aug 4, 2021

> I think scope is the same thing I meant when I said org. Scope is the NPM terminology but you get the same idea on the Docker Hub, GitHub...

We just need to make sure it's clear which is the scope if e.g. it's an npm package @foo/bar which is hosted on github.com/foojs/foo, i.e. scope is foo and not foojs.

> One suggestion is just to use the names already used in the configuration. So: sourceUrl instead of sourceRepo, and packageName instead of depName.

Using sourceUrl would mean unnecessarily long branch names, e.g. renovate/https-github-com-foojs-foo-2.x. I intend sourceRepo to be the foojs/foo bit, making the branch renovate/foojs-foo-2.x. It would be highly unlikely that someone would have dependencies from different servers (e.g. github.com and gitlab.com) with the exact same org/repo and not want them grouped.
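
A small TypeScript sketch of the difference, for illustration only. The function names are hypothetical, and the `renovate/<topic>-<major>.x` pattern is simply copied from the examples in this comment rather than from the actual branch-name template.

```typescript
// Lowercase and collapse every run of non-alphanumerics into one dash.
function slugifyTopic(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Branch topic built from the full source URL (the long variant).
function branchFromSourceUrl(sourceUrl: string, major: number): string {
  return `renovate/${slugifyTopic(sourceUrl)}-${major}.x`;
}

// Branch topic built from only the org/repo part of the URL (the short variant).
// Falls back to the full URL when no org/repo pair can be found.
function branchFromSourceRepo(sourceUrl: string, major: number): string {
  const m = /^https?:\/\/[^/]+\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/.exec(sourceUrl);
  const topic = m ? `${m[1]}/${m[2]}` : sourceUrl;
  return `renovate/${slugifyTopic(topic)}-${major}.x`;
}

// branchFromSourceUrl("https://github.com/foojs/foo", 2)
//   -> "renovate/https-github-com-foojs-foo-2.x"
// branchFromSourceRepo("https://github.com/foojs/foo", 2)
//   -> "renovate/foojs-foo-2.x"
```

Note that in this sketch the same org/repo on github.com and gitlab.com maps to the same branch, which is the trade-off accepted above.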

We use depName internally even though the corresponding field in packageRules is matchPackageNames. Maybe we should rename that prior to this feature, though.

@kevincox
Author

kevincox commented Aug 4, 2021

> We just need to make sure it's clear which is the scope if e.g. it's an npm package @foo/bar which is hosted on github.com/foojs/foo, i.e. scope is foo and not foojs.

I imagine that that would depend on the package system. For NPM it would mean foo for @foo/bar, for docker it would be the "user" and for GitHub (like golang packages) it would be the repo owner. We could have a separate token for sourceUrlOrg if desired, but I would advise against that because it requires adding heuristics for various source hosts.
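
As a rough sketch of what "depends on the package system" could mean, assuming only three package systems and deliberately naive parsing. `Datasource` and `scopeFor` are hypothetical names for illustration, and the Docker and Go rules below ignore registry prefixes and non-GitHub hosts.

```typescript
// Illustrative subset of package systems; a real list would be longer.
type Datasource = "npm" | "docker" | "go";

// Derive the "scope" from the package name alone, never from the source URL.
function scopeFor(datasource: Datasource, packageName: string): string | undefined {
  const parts = packageName.split("/");
  switch (datasource) {
    case "npm":
      // "@foo/bar" -> "foo"; unscoped packages have no scope.
      return parts.length === 2 && parts[0].charAt(0) === "@"
        ? parts[0].slice(1)
        : undefined;
    case "docker":
      // "user/image" -> "user"; single-segment names have no scope.
      return parts.length === 2 ? parts[0] : undefined;
    case "go":
      // "github.com/owner/repo" -> "owner"; other hosts are not handled here.
      return parts.length >= 3 && parts[0] === "github.com" ? parts[1] : undefined;
  }
}
```

For the example discussed above, `scopeFor("npm", "@foo/bar")` gives `foo` regardless of the package living at github.com/foojs/foo.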

> Using sourceUrl would mean unnecessarily long branch names, e.g. renovate/https-github-com-foojs-foo-2.x. I intend sourceRepo to be the foojs/foo bit, making the branch renovate/foojs-foo-2.x. It would be highly unlikely that someone would have dependencies from different servers (e.g. github.com and gitlab.com) with the exact same org/repo and not want them grouped.

Is the intention to parse out the org/repo from various source hosts? What about ones that don't have an "org" concept? What happens for unknown hosts? It would probably be better to have the grouping rely on the full sourceUrl and then have the default slugification intelligently drop "likely uninteresting" strings such as a https://github.com/ prefix.

@rarkins
Collaborator

rarkins commented Aug 5, 2021

> Is the intention to parse out the org/repo from various source hosts? What about ones that don't have an "org" concept? What happens for unknown hosts? It would probably be better to have the grouping rely on the full sourceUrl and then have the default slugification intelligently drop "likely uninteresting" strings such as a https://github.com/ prefix.

Ultimately, grouping is based on branch name, so your suggestion isn't feasible in our code base. In scenarios where you have an unknown host without an org/repo structure, it's pretty easy to fall back to a slugified version of the URL anyway (e.g. dropping the https:// at least). I don't believe in multiplying the development time of features and the complexity of the code base for edge cases which have a suitable workaround.
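
One way to picture that fallback, as a sketch under assumptions: the list of known hosts and the names `KNOWN_HOSTS`, `slug`, and `groupTopic` are invented for this example and are not part of Renovate.

```typescript
// Hosts assumed (for this example only) to use an org/repo path layout.
const KNOWN_HOSTS: string[] = ["github.com", "gitlab.com"];

// Lowercase and collapse every run of non-alphanumerics into one dash.
function slug(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Known host with an org/repo path: use "org-repo".
// Anything else: drop the scheme and slugify the rest of the URL.
function groupTopic(sourceUrl: string): string {
  const withoutScheme = sourceUrl.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  const segments = withoutScheme.split("/").filter((s) => s.length > 0);
  const host = segments[0];
  const path = segments.slice(1);
  if (host !== undefined && KNOWN_HOSTS.indexOf(host) !== -1 && path.length >= 2) {
    return slug(`${path[0]}/${path[1]}`);
  }
  return slug(withoutScheme);
}
```

Here `groupTopic("https://github.com/foojs/foo")` yields `foojs-foo`, while a URL on an unrecognised host such as `https://code.example.org/projects/foo` yields `code-example-org-projects-foo`.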

@kevincox
Author

kevincox commented Aug 6, 2021

Ah, I didn't realize that grouping and branch name were intrinsically linked. In that case it sounds reasonable to do as you proposed.

@rarkins rarkins added priority-3-medium Default priority, "should be done" but isn't prioritised ahead of others status:ready and removed priority-5-triage status:requirements Full requirements are not yet known, so implementation should not be started labels Nov 25, 2022