PERF: 5% faster pip install. algorithmic bug at the end of pip install in the code to display installed packages. O(n^2) to enumerate installed packages due to accidental loop in loop.

get_distribution(package_name) loops over all installed packages on every call, which is surprising and unexpected. Do not call it inside a loop.

On a pip install run that takes 11 seconds:
the loop takes 0.735960 seconds on the main branch.
the loop takes 0.064672 seconds with this fix.
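
The cost pattern described above can be reproduced outside pip with a minimal sketch. `FakeEnv` and its helper functions are hypothetical stand-ins that mimic one linear scan per lookup; they are not pip's classes. The `visited` counter records how many distributions each approach walks through.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Dist = Tuple[str, str]  # (name, version), a stand-in for a distribution object


class FakeEnv:
    """Hypothetical environment used only for this illustration."""

    def __init__(self, dists: List[Dist]) -> None:
        self._dists = dists
        self.visited = 0  # number of distributions yielded so far

    def iter_all_distributions(self) -> Iterator[Dist]:
        for dist in self._dists:
            self.visited += 1
            yield dist

    def get_distribution(self, name: str) -> Optional[Dist]:
        # Linear scan: each lookup walks the environment until it finds a match.
        return next((d for d in self.iter_all_distributions() if d[0] == name), None)


def versions_by_lookup(env: FakeEnv, names: List[str]) -> Dict[str, str]:
    # One scan per requested name: quadratic when names covers the environment.
    found = {}
    for name in names:
        dist = env.get_distribution(name)
        if dist is not None:
            found[name] = dist[1]
    return found


def versions_by_single_pass(env: FakeEnv, names: List[str]) -> Dict[str, str]:
    # One scan of the environment with constant-time membership tests.
    wanted = set(names)
    return {n: v for n, v in env.iter_all_distributions() if n in wanted}


if __name__ == "__main__":
    dists = [(f"pkg{i}", "1.0") for i in range(200)]
    names = [name for name, _ in dists]
    slow_env, fast_env = FakeEnv(dists), FakeEnv(dists)
    assert versions_by_lookup(slow_env, names) == versions_by_single_pass(fast_env, names)
    print(slow_env.visited, fast_env.visited)
```

Both functions return the same mapping; only the number of distributions visited differs, and the gap widens as the environment grows.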
rmmancom committed Jun 24, 2024
1 parent 00c75c4 commit f023907
Showing 3 changed files with 15 additions and 10 deletions.
2 changes: 2 additions & 0 deletions news/bugfix.12791.rst
@@ -0,0 +1,2 @@
+Improve performance of pip install. Fix code to display installed packages at the end of pip install,
+was O(n^2) to enumerate packages due to accidental loop in loop.
20 changes: 11 additions & 9 deletions src/pip/_internal/commands/install.py
@@ -7,6 +7,7 @@
 from optparse import SUPPRESS_HELP, Values
 from typing import List, Optional
 
+from pip._vendor.packaging.utils import canonicalize_name
 from pip._vendor.rich import print_json
 
 from pip._internal.cache import WheelCache
@@ -472,17 +473,18 @@ def run(self, options: Values, args: List[str]) -> int:
             )
             env = get_environment(lib_locations)
 
+            # Display a summary of installed packages, with extra care to
+            # display a package name as it was requested by the user.
             installed.sort(key=operator.attrgetter("name"))
             items = []
-            for result in installed:
-                item = result.name
-                try:
-                    installed_dist = env.get_distribution(item)
-                    if installed_dist is not None:
-                        item = f"{item}-{installed_dist.version}"
-                except Exception:
-                    pass
-                items.append(item)
+            expected_items = {
+                canonicalize_name(item.name): item.name for item in installed
+            }
+            for distribution in env.iter_all_distributions():
+                display_name = expected_items.get(distribution.canonical_name, None)
+                if display_name is not None:
+                    item = f"{display_name}-{distribution.version}"
+                    items.append(item)
 
             if conflicts is not None:
                 self._warn_about_conflicts(
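
The replacement loop in this hunk can be sketched in isolation. This is an illustration under stated assumptions, not pip's code: `normalize` is a local stand-in that follows the PEP 503 name normalization rule (pip itself imports `canonicalize_name` from its vendored `packaging`), the `(name, version)` tuples stand in for distribution objects, and the package versions are made up. The dictionary maps each canonical name back to the spelling the user requested, so a single pass over the environment is enough.

```python
import re
from typing import Dict, Iterable, List, Tuple


def normalize(name: str) -> str:
    # PEP 503 rule: runs of "-", "_" and "." collapse to "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def summarize(requested: List[str], environment: Iterable[Tuple[str, str]]) -> List[str]:
    # canonical name -> the name exactly as the user typed it
    expected: Dict[str, str] = {normalize(name): name for name in requested}
    items = []
    for dist_name, version in environment:
        display_name = expected.get(normalize(dist_name))
        if display_name is not None:
            items.append(f"{display_name}-{version}")
    return items


if __name__ == "__main__":
    # Illustrative data only; names and versions are invented for the sketch.
    env = [("zope.interface", "6.4"), ("requests", "2.32.3"), ("Flask", "3.0.3")]
    print(summarize(["Zope_Interface", "flask"], env))
```

In this sketch the result follows the environment's iteration order rather than the order of `requested`, and a requested name with no matching distribution is left out of the summary. A caller that needs a sorted or complete summary would have to handle both cases separately.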
3 changes: 2 additions & 1 deletion src/pip/_internal/metadata/importlib/_envs.py
@@ -181,9 +181,10 @@ def _iter_distributions(self) -> Iterator[BaseDistribution]:
             yield from finder.find_linked(location)
 
     def get_distribution(self, name: str) -> Optional[BaseDistribution]:
+        canonical_name = canonicalize_name(name)
         matches = (
             distribution
             for distribution in self.iter_all_distributions()
-            if distribution.canonical_name == canonicalize_name(name)
+            if distribution.canonical_name == canonical_name
         )
         return next(matches, None)
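
The effect of hoisting the canonicalization out of the generator expression can be seen with a small counter. In this sketch `CountingCanonicalizer` is a hypothetical stand-in that counts its own calls and uses a simplified normalization; it is not the `packaging` function. A condition inside a generator expression is re-evaluated for every element scanned, so computing the canonical form once beforehand saves one call per element.

```python
from typing import Callable, List, Optional


class CountingCanonicalizer:
    """Hypothetical normalizer that records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, name: str) -> str:
        self.calls += 1
        # Simplified normalization, for illustration only.
        return name.lower().replace("_", "-")


def find_inline(names: List[str], target: str, canon: Callable[[str], str]) -> Optional[str]:
    # canon(target) runs again for every element the generator examines.
    return next((n for n in names if n == canon(target)), None)


def find_hoisted(names: List[str], target: str, canon: Callable[[str], str]) -> Optional[str]:
    # canon(target) runs exactly once, before the scan starts.
    canonical = canon(target)
    return next((n for n in names if n == canonical), None)


if __name__ == "__main__":
    names = [f"pkg-{i}" for i in range(100)]
    inline, hoisted = CountingCanonicalizer(), CountingCanonicalizer()
    assert find_inline(names, "PKG_99", inline) == find_hoisted(names, "PKG_99", hoisted)
    print(inline.calls, hoisted.calls)
```

The scan itself is still linear in both versions; hoisting only removes the repeated normalization work inside it.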
