Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: do not copy libraries in site-packages #392

Closed
wants to merge 11 commits into from

Conversation

XuehaiPan
Copy link

@XuehaiPan XuehaiPan commented Aug 6, 2022

Add a new option --no-copy-site-libs to skip libraries in site-packages for auditwheel repair.

This feature will significantly reduce the wheel file size if the referenced libraries are shipped with other PyPI packages.

NOTE: The developers may have their own risks in using this feature (e.g. they need to add the skipped library as an install dependency in setup.py).

Resolves #391

@mattip
Copy link

mattip commented Aug 10, 2022

@mayeut any thoughts?

@mattip
Copy link

mattip commented Aug 15, 2022

Could a maintainer allow running the workflow for this? NumPy and SciPy would like to create an openblas wheel, and this is a necessary component to not vendor the shared object into the NumPy/SciPy wheels.

@auvipy
Copy link
Contributor

auvipy commented Aug 15, 2022

started

@auvipy auvipy self-requested a review August 15, 2022 15:56
@auvipy
Copy link
Contributor

auvipy commented Aug 16, 2022

builds are red sadly

@codecov
Copy link

codecov bot commented Aug 16, 2022

Codecov Report

Base: 92.42% // Head: 91.62% // Decreases project coverage by -0.79% ⚠️

Coverage data is based on head (a691a55) compared to base (ef280f2).
Patch coverage: 65.71% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #392      +/-   ##
==========================================
- Coverage   92.42%   91.62%   -0.80%     
==========================================
  Files          23       23              
  Lines        1267     1290      +23     
  Branches      311      318       +7     
==========================================
+ Hits         1171     1182      +11     
- Misses         55       63       +8     
- Partials       41       45       +4     
Impacted Files Coverage Δ
src/auditwheel/main_addtag.py 89.47% <ø> (ø)
src/auditwheel/repair.py 80.41% <64.51%> (-5.88%) ⬇️
src/auditwheel/patcher.py 97.22% <66.66%> (-2.78%) ⬇️
src/auditwheel/main_repair.py 90.76% <100.00%> (+0.14%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

rpath_set = OrderedDict([(old_rpath, "") for old_rpath in rpaths])
if rpath is not None:
rpath_set[rpath] = ""
rpaths = list(OrderedDict.fromkeys(rpaths))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just use dict instead here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Python 3.6, the dict keys are kept in insertion order. But I think explicitly using OrderedDict improves readability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for now

@mattip
Copy link

mattip commented Aug 17, 2022

Tests are needed to show this works. Perhaps you could parameterize the failing tests in test_manylinux.py:

  • test_build_wheel_depending_on_library_with_rpath`
  • test_build_repair_multiple_top_level_modules_wheel`

@XuehaiPan

This comment was marked as abuse.

@XuehaiPan XuehaiPan closed this Aug 17, 2022
@XuehaiPan XuehaiPan reopened this Aug 17, 2022
@XuehaiPan
Copy link
Author

XuehaiPan commented Aug 17, 2022

The tests passed for commit 6f6e921 (#392): https://github.com/pypa/auditwheel/runs/7854383503?check_suite_focus=true.

But failed after commit: 80b7152 (#392). This commits removes duplicate libs with the same soname. Only one lib is copied into the wheel file if there are multiple libs with the same soname.

@XuehaiPan
Copy link
Author

The tests failed for grouping duplicate libraries by real soname. I removed the grouping mechanism. Tests passed on my local machine. However, there may be duplicate libraries copied into the wheel file.

@mattip
Copy link

mattip commented Oct 3, 2022

It has been a while @mayeut can we get some eyes on this? It would help move openblas packaging as a wheel forward.

@@ -79,16 +84,35 @@ def repair_wheel(
% soname
)

if not copy_site_libs and "site-packages" in str(src_path).split(
Copy link
Member

@mayeut mayeut Oct 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee for the site-packages folder to have that name (in fact, it doesn't on debian python, its name is dist-packages).

action="store_false",
help=(
"Do not copy libraries located in the site-packages directory from "
"other packages. Just update the `rpath` only. The developer will "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately, just updating the rpath is not enough.

@mayeut
Copy link
Member

mayeut commented Oct 8, 2022

As mentioned in #391 (comment), it's not as easy as just updating the rpath because both packages might not be in the same platlib folder.

Given rpath is not an option, I think we're kind of in the same situation as #368 with the difference that the dependency is not an arbitrary shared object but one provided by another package. It's thus the responsibility of the package requiring the dependency to ensure proper loading of the shared object it depends on. It's not enough to declare the package as a dependency in metadata (& this must be done with care regarding ABI compatibility between different versions).

Given specific steps are required to ensure proper loading of shared object, I'm not sure we want to silently filter out all shared objects in site-packages and thus we're a bit more like in #368 were exceptions are explicitly passed in one-by-one.

@lkollar, any thoughts ?

@mayeut
Copy link
Member

mayeut commented Oct 15, 2022

Closing this PR in favor of #368
Thanks for your efforts @XuehaiPan, despite the PR not getting merged, this allowed a broader discussion to get things moving forward I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow not copying .so files provided by other PyPI packages (e.g. libtorch*.so from torch)
4 participants