PERF: faster pip install. Fix an algorithmic bug at the end of pip install in the code that displays installed packages: enumerating installed packages was O(n^2) due to an accidental loop inside a loop.

get_distribution(package_name) loops over all installed packages, which is surprising and unexpected. Do not call it inside a loop.

On a pip install run that takes 11 seconds:
The loop takes 0.735 seconds on the main branch.
The loop takes 0.064 seconds with this fix.
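
The loop-in-loop pattern described above can be sketched outside of pip as follows. This is a minimal illustration, not pip's code: `Dist`, `canonicalize`, `summarize_quadratic`, and `summarize_linear` are hypothetical names, and `canonicalize` is only a stand-in for `canonicalize_name`. Unlike the diff in this commit, the sketch falls back to the bare name when no version is found.

```python
import re
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


def canonicalize(name: str) -> str:
    # Stand-in for canonicalize_name: collapse runs of "-", "_", "." and lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


@dataclass(frozen=True)
class Dist:
    # Hypothetical minimal distribution record, not pip's BaseDistribution.
    name: str
    version: str

    @property
    def canonical_name(self) -> str:
        return canonicalize(self.name)


def get_distribution(dists: Iterable[Dist], name: str) -> Optional[Dist]:
    # Linear scan: each call walks the installed set until it finds a match.
    wanted = canonicalize(name)
    return next((d for d in dists if d.canonical_name == wanted), None)


def summarize_quadratic(dists: List[Dist], requested: List[str]) -> List[str]:
    # One linear scan per requested package: O(n * m) overall.
    items = []
    for name in requested:
        dist = get_distribution(dists, name)
        items.append(f"{name}-{dist.version}" if dist is not None else name)
    return items


def summarize_linear(dists: List[Dist], requested: List[str]) -> List[str]:
    # Build the name -> version index once, then do dict lookups: O(n + m).
    versions: Dict[str, str] = {d.canonical_name: d.version for d in dists}
    items = []
    for name in requested:
        version = versions.get(canonicalize(name))
        items.append(f"{name}-{version}" if version is not None else name)
    return items
```

Both functions return the same summary when canonical names are unique. If two distributions shared a canonical name, the scan would report the first match and the dict the last, so the two approaches are only interchangeable under that uniqueness assumption.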
rmmancom committed Jun 25, 2024
1 parent 00c75c4 commit ea5a1ee
Showing 3 changed files with 14 additions and 9 deletions.
2 changes: 2 additions & 0 deletions news/12791.bugfix.rst
@@ -0,0 +1,2 @@
+Improve performance of pip install. Fix code to display installed packages at the end of pip install,
+was O(n^2) to enumerate packages due to accidental loop in loop.
18 changes: 10 additions & 8 deletions src/pip/_internal/commands/install.py
@@ -7,6 +7,7 @@
 from optparse import SUPPRESS_HELP, Values
 from typing import List, Optional

+from pip._vendor.packaging.utils import canonicalize_name
 from pip._vendor.rich import print_json

 from pip._internal.cache import WheelCache
@@ -472,16 +473,17 @@ def run(self, options: Values, args: List[str]) -> int:
             )
             env = get_environment(lib_locations)

+            # Display a summary of installed packages, with extra care to
+            # display a package name as it was requested by the user.
             installed.sort(key=operator.attrgetter("name"))
             items = []
-            for result in installed:
-                item = result.name
-                try:
-                    installed_dist = env.get_distribution(item)
-                    if installed_dist is not None:
-                        item = f"{item}-{installed_dist.version}"
-                except Exception:
-                    pass
+            installed_versions = {}
+            for distribution in env.iter_all_distributions():
+                installed_versions[distribution.canonical_name] = distribution.version
+            for package in installed:
+                display_name = package.name
+                version = installed_versions.get(canonicalize_name(display_name), None)
+                item = f"{display_name}-{version}"
                 items.append(item)

             if conflicts is not None:
3 changes: 2 additions & 1 deletion src/pip/_internal/metadata/importlib/_envs.py
@@ -181,9 +181,10 @@ def _iter_distributions(self) -> Iterator[BaseDistribution]:
             yield from finder.find_linked(location)

     def get_distribution(self, name: str) -> Optional[BaseDistribution]:
+        canonical_name = canonicalize_name(name)
         matches = (
             distribution
             for distribution in self.iter_all_distributions()
-            if distribution.canonical_name == canonicalize_name(name)
+            if distribution.canonical_name == canonical_name
         )
         return next(matches, None)
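
The change to get_distribution above hoists a loop-invariant call out of a generator expression. The sketch below illustrates why that matters, under stated assumptions: `CountingCanonicalizer`, `find_recomputing`, and `find_hoisted` are hypothetical names, and the candidates are plain strings that are assumed to be already canonical, not pip distribution objects.

```python
import re
from typing import Callable, Iterable, Optional


class CountingCanonicalizer:
    """Stand-in for canonicalize_name that records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, name: str) -> str:
        self.calls += 1
        return re.sub(r"[-_.]+", "-", name).lower()


def find_recomputing(
    canonical_names: Iterable[str], name: str, canonicalize: Callable[[str], str]
) -> Optional[str]:
    # The filter condition is re-evaluated for every candidate visited,
    # so canonicalize(name) runs once per element until a match is found.
    matches = (c for c in canonical_names if c == canonicalize(name))
    return next(matches, None)


def find_hoisted(
    canonical_names: Iterable[str], name: str, canonicalize: Callable[[str], str]
) -> Optional[str]:
    # The invariant value is computed once, before the generator is built.
    wanted = canonicalize(name)
    matches = (c for c in canonical_names if c == wanted)
    return next(matches, None)
```

Passing a fresh `CountingCanonicalizer` to each function and comparing its `calls` attribute afterwards shows the difference: the first variant calls it once per candidate visited, the second exactly once per lookup.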
