solver: tune heuristics for choosing the next dependency to resolve … #8255
… in a way that dependencies that are not required by another unsatisfied dependency are resolved first
Have you thought about the approach being explored in #8191, which simply reverses the preference for "packages with fewer versions"? That is tempting to me because it is a single-character fix: just add a minus sign. Presumably it will also handle the cases described in your footnote. I suppose all heuristics are subject to bad cases in the end; I'm pretty sure the problem is NP-hard. I see the logic in trying to force conflicts sooner rather than later, but perhaps that heuristic optimises for a case that turns out not to matter much in the Python ecosystem, whereas "packages with most versions first" would help in the world as it actually is. My experience is that in most solutions, most packages end up at the latest allowed version anyway. So simply eliminating the known pathological case might be a pretty sensible thing to try.
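To make the "minus sign" remark concrete, here is a minimal sketch of the two orderings. It is not Poetry's actual solver code; the names `candidate_counts`, `fewest_versions_first` and `most_versions_first` are hypothetical, and the version counts are made up for illustration.

```python
# Hypothetical data: number of candidate versions per unsatisfied dependency.
# The counts are invented for this sketch, not measured.
candidate_counts = {"urllib3": 120, "boto3": 1500, "requests": 150}


def fewest_versions_first(name: str) -> int:
    # Smaller key wins: packages with fewer candidate versions come first.
    return candidate_counts[name]


def most_versions_first(name: str) -> int:
    # Same key with a minus sign: packages with more versions come first.
    return -candidate_counts[name]


# The solver would pick the dependency with the smallest preference key.
next_by_fewest = min(candidate_counts, key=fewest_versions_first)
next_by_most = min(candidate_counts, key=most_versions_first)

print(next_by_fewest)  # urllib3
print(next_by_most)    # boto3
```

With these made-up counts, the reversed key resolves boto3 before urllib3, which is the order that avoids the pathological case discussed here.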
No, I haven't really thought about it. I just assumed that there is a good reason for choosing dependencies with fewer versions first; there's even a test for it. Choosing packages with more versions first does, of course, solve the boto3/urllib3 issue completely, as well as the Sphinx/docutils example. Testing the projects from the description, I measure a slight performance regression for the shootout example with the additional urllib3 constraint: choosing packages with fewer versions first takes 8 s, choosing packages with more versions first takes 10 s. Since we have no example yet where it's worse, maybe we should risk it...
The original algorithm describes "fewest versions first": https://github.com/dart-lang/pub/blob/master/doc/solver.md#decision-making
I suspect the test case was written just because that's how the algorithm is described. While that text acknowledges room for improvement, it would be amusing if the improvement turned out to be doing the exact opposite! The way I think about it: in cases where there are real conflicts to resolve, reversing this heuristic probably loses some average performance but helps worst-case performance.
But most of the time there just aren't that many conflicts anyway. As I say, most things end up at the latest version, so my hope is that the damage to the average case isn't much to worry about. However, I have a feeling that I could tell myself a story justifying almost any heuristic! Perhaps there's no way to really know except to ship it and see what the new bad cases are...
… in a way that dependencies that are not required by another unsatisfied dependency are resolved first
Alternative: #8256
Pull Request Check List
Resolves: (partially, see footnote 1) boto3/botocore vs. urllib3 issue
Closes: #8256
Currently, we choose to resolve dependencies with fewer possible versions first. We can improve the heuristic for choosing which dependency to resolve first by considering relations between dependencies:
If dependency A depends on dependency B, we should resolve dependency A first, no matter which dependency has more versions. Of course, different versions of a package can have different dependencies, but most often dependencies are at least similar and don't change much between versions. So I think the new heuristic will not worsen performance in most cases (and will of course improve it in some), at least if the heuristic itself can be calculated fast enough.
Example: If the latest A requires B<2, then older versions of A will typically not allow newer versions of B, and if they do, it's only because the incompatibility was not yet known when they were released.
Some measurements (with warm cache):
[Measurement table not recovered; surviving labels: pyproject.toml, from ... urllib3<2 constraint]
As can be seen, the performance of the shootout example improves dramatically. The performance of the #4870 example is worse; however, we also get some different (probably better) results:
docutils 0.18.1
docutils 0.17.1
The solution with the PR is probably better because an older Sphinx version probably does not work with a newer docutils version. The constraint is just missing because the incompatibility was not known when the older Sphinx version was released. Since that's a typical issue, the new heuristics may even lead to less surprising results for inexperienced users if there are several possible solutions.
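The ordering described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `Dep`, `preference` and `choose_next` are hypothetical names, each dependency is reduced to a name, a version count and the set of names it directly requires, and the version counts are invented.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Dep:
    """Hypothetical, simplified view of an unsatisfied dependency."""

    name: str
    version_count: int
    requires: FrozenSet[str] = frozenset()


def preference(dep: Dep, unsatisfied: List[Dep]) -> Tuple[bool, int]:
    # True if some *other* unsatisfied dependency directly requires `dep`.
    required_by_other = any(
        dep.name in other.requires
        for other in unsatisfied
        if other.name != dep.name
    )
    # Tuples compare element-wise and False sorts before True, so
    # dependencies nobody else is waiting on come first; the version count
    # (fewest first) only breaks ties.
    return (required_by_other, dep.version_count)


def choose_next(unsatisfied: List[Dep]) -> Dep:
    return min(unsatisfied, key=lambda dep: preference(dep, unsatisfied))


# A requires B, and A has more versions than B (counts are made up).
a = Dep("A", version_count=50, requires=frozenset({"B"}))
b = Dep("B", version_count=5)

print(choose_next([a, b]).name)  # A
```

A plain "fewest versions first" rule would pick B here; with the relation taken into account, A is resolved first and its constraint on B is known before B is chosen.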
Footnote 1: This PR can solve the boto3/botocore vs. urllib3 issue in some cases (like the shootout example), but not in all. That's because boto3 does not depend directly on urllib3 but on botocore, which in turn depends on urllib3. With this PR we only look ahead one level. So it doesn't help if we have to choose between boto3 and urllib3 while no other dependency that depends on urllib3 is in the list of unsatisfied dependencies yet, as in the example in #7950 with boto3 and urllib3 as the only top-level dependencies.
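The one-level limitation can be illustrated with a small sketch. The dependency edges below are deliberately simplified to the chain named in the footnote, and `DIRECT_REQUIRES` and `required_by_unsatisfied` are hypothetical names, not part of Poetry.

```python
# Hypothetical, simplified direct requirements (only the chain from the footnote).
DIRECT_REQUIRES = {
    "boto3": {"botocore"},
    "botocore": {"urllib3"},
    "urllib3": set(),
}


def required_by_unsatisfied(name: str, unsatisfied: set) -> bool:
    # One-level check: is `name` a *direct* requirement of another
    # dependency that is currently in the unsatisfied set?
    return any(
        name in DIRECT_REQUIRES[other]
        for other in unsatisfied
        if other != name
    )


# Only the two top-level dependencies are unsatisfied so far.
top_level_only = {"boto3", "urllib3"}
# botocore has already been added to the unsatisfied set.
with_botocore = {"boto3", "botocore", "urllib3"}

print(required_by_unsatisfied("urllib3", top_level_only))  # False
print(required_by_unsatisfied("urllib3", with_botocore))   # True
```

In the first case nothing in the set directly requires urllib3, so a one-level check gives no reason to postpone it; once botocore is in the set, the relation becomes visible.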