ENH: Add parameter to select images to be removed #2214
@MartinThoma
Discovered while working on #2214
pypdf/constants.py
Outdated
INLINE_IMAGES = auto()
DRAWING_IMAGES = auto()
ALL = XOBJECT_IMAGES | INLINE_IMAGES | DRAWING_IMAGES
IMAGES = ALL # for consistancy with ObjectDeletionFlag
Typo: consistency
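The flag layout in the diff above can be sketched as a small standalone enum. This is only an illustration: the class name `ImageType`, the `NONE` member, and the definition of `XOBJECT_IMAGES` via `auto()` are assumptions, since the diff excerpt shows only the four assignments.

```python
from enum import IntFlag, auto


class ImageType(IntFlag):
    """Illustrative flag enum; class name and NONE member are assumptions."""

    NONE = 0
    XOBJECT_IMAGES = auto()  # assumed to be defined just above the excerpt
    INLINE_IMAGES = auto()
    DRAWING_IMAGES = auto()
    # Combined value covering every image kind.
    ALL = XOBJECT_IMAGES | INLINE_IMAGES | DRAWING_IMAGES
    # Alias of ALL, mirroring the naming used by ObjectDeletionFlag.
    IMAGES = ALL


# Flags combine with `|` and can be checked with `in`.
selection = ImageType.XOBJECT_IMAGES | ImageType.INLINE_IMAGES
wants_inline = ImageType.INLINE_IMAGES in selection
wants_drawings = ImageType.DRAWING_IMAGES in selection
```

With a flag-based design like this, a caller picks whole categories of images rather than individual ones, which is the trade-off discussed in the rest of the thread.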
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2214      +/-   ##
==========================================
+ Coverage   94.44%   94.46%   +0.01%
==========================================
  Files          43       43
  Lines        7638     7641       +3
  Branches     1511     1509       -2
==========================================
+ Hits         7214     7218       +4
  Misses        262      262
+ Partials      162      161       -1
☔ View full report in Codecov by Sentry.
@MartinThoma
I've just noticed that you have now gone with the flag-based design. Also, the code looks more complex than I expected. I'll try to find an index-based approach plus a simple way for users to get image data (including this flag) so that they can easily decide which images (by index) they want to remove. The reasons why I prefer the index-based approach are:
I'm sorry that I communicated that plan / idea poorly :-/
We could even implement
so keys can be an integer, a range object, or an iterable.
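The snippet that originally accompanied this comment is not preserved here. As a rough sketch of the idea only, a container could accept an integer, a `range`, or any iterable of integers in `__delitem__`. The class `ImageList` and its contents are hypothetical and are not part of pypdf.

```python
from typing import Iterable, Iterator, List, Union

Key = Union[int, range, Iterable[int]]


class ImageList:
    """Hypothetical stand-in for a page's image collection."""

    def __init__(self, names: Iterable[str]) -> None:
        self._names: List[str] = list(names)

    def __len__(self) -> int:
        return len(self._names)

    def __iter__(self) -> Iterator[str]:
        return iter(self._names)

    def __delitem__(self, key: Key) -> None:
        # Normalise every accepted key type to a list of indices.
        indices = [key] if isinstance(key, int) else list(key)
        size = len(self._names)
        resolved = set()
        for index in indices:
            position = index + size if index < 0 else index
            if not 0 <= position < size:
                raise IndexError(f"image index {index} out of range")
            resolved.add(position)
        # Delete from the back so earlier positions stay valid.
        for position in sorted(resolved, reverse=True):
            del self._names[position]


images = ImageList(["a", "b", "c", "d", "e"])
del images[0]            # single integer
del images[range(1, 3)]  # range object
remaining_after_range = list(images)
del images[[0, -1]]      # any iterable of integers
```

All indices in one call are resolved against the list as it was before that call, so deleting several images at once does not shift the positions the caller had in mind.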
I would suggest doing this in at least two PRs:
The current solution can delete drawings, which the images/index approach will not be able to do. It also handles deleting all images (on all pages) at once, which would otherwise be more complex for users to do.
For me, the index is not the only way to access the data: we will also have to consider strings as indexes.
I do not see what you mean by "exposes the type". If you want to distinguish between inline images and XObjects, that is already possible through the file name
except adding
This is more complex: a str can be an index, and images within forms have to be addressed.
@MartinThoma
@pubpub-zz Yes, I've seen the comment. I was just trying to write the feature as I imagine it, but that takes quite some time. I think I first need to refactor
What do you mean by that?
The images virtual table can return data either by number (position in the image list) or by str (name of the object). There is also my comment about the drawings.
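To make the point about mixed keys concrete, here is a minimal sketch of a lookup that accepts either a position or an object name. The class `ImageTable` and the sample names are hypothetical; this is not the pypdf implementation.

```python
from typing import Dict, List, Union


class ImageTable:
    """Hypothetical image lookup keyed by position (int) or name (str)."""

    def __init__(self, images: Dict[str, bytes]) -> None:
        self._names: List[str] = list(images)  # preserves insertion order
        self._data: Dict[str, bytes] = dict(images)

    def keys(self) -> List[str]:
        return list(self._names)

    def __getitem__(self, key: Union[int, str]) -> bytes:
        if isinstance(key, str):
            return self._data[key]  # lookup by object name
        if isinstance(key, int):
            return self._data[self._names[key]]  # lookup by position
        raise TypeError(f"unsupported key type: {type(key).__name__}")


table = ImageTable({"/Im0": b"first", "/Im1": b"second"})
by_position = table[1]
by_name = table["/Im1"]
```

An index-only deletion API would need to handle both key kinds in the same way, which is part of why the string case is raised above.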
@pubpub-zz Please give me another week to think about how to proceed here. I fixed the merge conflicts so that I have an easier time merging. Currently, I lean towards the following:
For those two reasons, I am inclined to merge it and write the clean solution whenever I have time (which could also be never). The main point I still want to check is whether this adds too much maintenance complexity.
Sorry that it took me so long to merge this 🙈 And thank you for your patience with me / for solving this issue 🤗
## What's new

### Security (SEC)
- Infinite recursion when using PdfWriter(clone_from=reader) (#2264) by @Alexhuszagh

### New Features (ENH)
- Add parameter to select images to be removed (#2214) by @pubpub-zz

### Bug Fixes (BUG)
- Correctly handle image mode 1 with FlateDecode (#2249) by @stefan6419846
- Error when filling a value with parentheses #2268 (#2269) by @KanorUbu
- Handle empty root outline (#2239) by @pubpub-zz

### Documentation (DOC)
- Improve merging docs (#2247) by @stefan6419846

### Developer Experience (DEV)
- Test Python 3.7 with cryptography provider as well (#2276) by @stefan6419846
- Run CI with windows-latest (#2258) by @MartinThoma
- Use pytest-xdist (#2254) by @MartinThoma
- Attribute correct authors in the release notes (#2246) by @stefan6419846

### Maintenance (MAINT)
- Apply pre-commit hooks (#2277) by @MartinThoma
- Update requirements + mypy fixes (#2275) by @MartinThoma
- Explicitly provide Any for IO generic argument (#2272) by @nilehmann

### Testing (TST)
- Fix test_image_without_pillow in windows environment (#2257) by @pubpub-zz

### Code Style (STY)
- Remove unused import by @MartinThoma

[Full Changelog](3.16.4...3.17.0)
closes #2208