For n packages there are O(n^2) calls to `_is_current_pin_satisfying` #147

notatallshaw · 2023-12-02T19:37:06Z

As the number of packages to resolve increases, there appear to be O(n²) calls to _is_current_pin_satisfying, and I think that, at least in some circumstances, it's possible to optimize this away.

I am using the steps to reproduce from pypa/pip#12320, together with the branch from PR pypa/pip#12327, which avoids network calls in the call graph and avoids O(n²) calls to Pip collecting packages from a local directory. In the future I plan to be able to show this performance issue using https://github.com/pradyunsg/pip-resolver-benchmarks

In this example there are ~1300 packages to install and _is_current_pin_satisfying is called ~2.5 million times. I generated the call graph using cProfile and gprof2dot.
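A minimal sketch of how such call counts can be collected with cProfile and pstats. The resolver loop below is a toy model, not resolvelib code: `is_pin_satisfying`, `toy_resolve` and `count_calls` are hypothetical names, and the loop simply assumes that one name gets pinned per round while every name is re-checked each round.

```python
import cProfile
import pstats


def is_pin_satisfying(name, pinned):
    # Hypothetical stand-in for _is_current_pin_satisfying.
    return name in pinned


def toy_resolve(n):
    """Pin one name per round, re-checking every name each round."""
    names = [f"pkg{i}" for i in range(n)]
    pinned = set()
    while True:
        unsatisfied = [x for x in names if not is_pin_satisfying(x, pinned)]
        if not unsatisfied:
            return pinned
        pinned.add(unsatisfied[0])


def count_calls(func_name, n):
    """Profile toy_resolve(n) and return how often func_name was called."""
    profiler = cProfile.Profile()
    profiler.runcall(toy_resolve, n)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, function name) to
    # (primitive calls, total calls, total time, cumulative time, callers).
    return sum(
        total_calls
        for (_file, _line, name), (_prim, total_calls, *_rest) in stats.stats.items()
        if name == func_name
    )
```

In this toy model the count grows as n * (n + 1), so doubling the number of names roughly quadruples the calls. For a real run, the profile can be written to a file with `python -m cProfile -o out.pstats ...` and rendered with `gprof2dot -f pstats out.pstats | dot -Tpng -o out.png`.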

I have not yet investigated how one might optimize this.

This is motivated by a real-world usage report: pypa/pip#12314.


notatallshaw · 2023-12-02T21:00:52Z

An idea I will investigate if no one else chimes in: maybe it's possible to keep track of satisfied and unsatisfied names in the state, rather than needing to recalculate them each time.

notatallshaw · 2023-12-03T17:09:30Z

An initial observation, unsatisfied_names:

unsatisfied_names = [
    key
    for key, criterion in self.state.criteria.items()
    if not self._is_current_pin_satisfying(key, criterion)
]

Is almost always equal to:

self.state.criteria.keys() - self.state.mapping.keys()

Very rarely there is an additional name due to this part of _is_current_pin_satisfying:

all(
    self._p.is_satisfied_by(requirement=r, candidate=current_pin)
    for r in criterion.iter_requirement()
)

And this is the expensive part of the call. A possible trick is to just assume unsatisfied_names = self.state.criteria.keys() - self.state.mapping.keys(); at completion this approximate set of unsatisfied names will equal the real one (i.e. it will be empty).
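The gap between the exact and the approximate calculation can be illustrated with a toy sketch. The data model here is an assumption made for illustration only, not resolvelib's real types: `criteria` maps a name to a list of minimum versions, `mapping` maps a name to its pinned version, and `is_satisfied_by` is a hypothetical stand-in for the provider method.

```python
def is_satisfied_by(requirement, candidate):
    # Hypothetical check: the pinned version must be at least the minimum.
    return candidate >= requirement


def exact_unsatisfied(criteria, mapping):
    """Mirror the list comprehension: every pinned name is re-validated."""
    names = []
    for name, requirements in criteria.items():
        if name not in mapping:
            names.append(name)  # no pin yet
        elif not all(is_satisfied_by(r, mapping[name]) for r in requirements):
            names.append(name)  # pinned, but the pin no longer satisfies
    return names


def approximate_unsatisfied(criteria, mapping):
    """Cheap approximation: names that have criteria but no pin."""
    return criteria.keys() - mapping.keys()


criteria = {"a": [1], "b": [2, 3], "c": [1]}
mapping = {"a": 1, "b": 2}  # "b" is pinned at 2 but also requires >= 3

exact = exact_unsatisfied(criteria, mapping)
approx = approximate_unsatisfied(criteria, mapping)
```

Here `exact` contains both "b" and "c" while `approx` contains only "c", which is the rare additional name described above. Once every name is pinned with a satisfying candidate, both results are empty.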

notatallshaw · 2023-12-15T16:34:15Z

FYI, I have not made any meaningful progress on this and don't expect to any time soon.

I tried avoiding calls to _is_current_pin_satisfying in most cases when calculating unsatisfied_names, but I got lots of test failures that I believe to be valid. It feels like it should be possible, so possibly there was just an unidentified bug in my code. I will take a look at this again when I have a chance.

notatallshaw · 2024-01-12T14:19:34Z

FYI, a workaround is for the provider to cache the calls. A PR has been raised on the Pip side to do that: pypa/pip#12453
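A rough sketch of what provider-side caching could look like. This is not the implementation from the Pip PR: `CountingProvider` and `CachingProvider` are hypothetical classes, and the sketch assumes that requirements and candidates are hashable and that the result for a given pair never changes during a resolution.

```python
class CountingProvider:
    """Hypothetical provider whose is_satisfied_by counts its invocations."""

    def __init__(self):
        self.calls = 0

    def is_satisfied_by(self, requirement, candidate):
        self.calls += 1
        return candidate >= requirement  # toy check on plain integers


class CachingProvider:
    """Wrap a provider and memoize is_satisfied_by per (requirement, candidate)."""

    def __init__(self, inner):
        self._inner = inner
        self._cache = {}

    def is_satisfied_by(self, requirement, candidate):
        key = (requirement, candidate)
        try:
            return self._cache[key]
        except KeyError:
            result = self._inner.is_satisfied_by(requirement, candidate)
            self._cache[key] = result
            return result


inner = CountingProvider()
provider = CachingProvider(inner)

# Repeat the same check many times, as the resolver does across rounds.
results = [provider.is_satisfied_by(2, 3) for _ in range(1000)]
```

The number of calls stays quadratic, but each repeated check becomes a dictionary lookup instead of a full evaluation, so `inner.calls` is 1 after the loop above.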

notatallshaw changed the title ~~For n packages there are O(n²) calls to _is_current_pin_satisfying~~ For n packages there are O(n^2) calls to _is_current_pin_satisfying Dec 2, 2023

notatallshaw mentioned this issue Dec 2, 2023

New resolver takes 1-2 hours to install a large requirements file pypa/pip#12314

Closed

1 task

notatallshaw mentioned this issue Dec 3, 2023

Speed up resolution by approximating unsatisfied names #148

Closed

sbidoul mentioned this issue Dec 28, 2023

Cache is_satisfied_by pypa/pip#12453

Merged

notatallshaw mentioned this issue Dec 28, 2023

path_to_url called millions of times for ~1000 offline wheel installs pypa/pip#12320

Closed

1 task
