Fix PDF scanner + support image extraction #1

Merged: 6 commits into master from cd.images-from-pdf on Oct 20, 2021

cameron-dunn-sublime commented Oct 14, 2021

xref_length was erroring on the old version of PyMuPDF.

mupdf_display_errors errored with the new version, so it was removed.

Image extraction from PDFs is a new feature.

I don't know why GitHub says that I'm merging 6 commits. Local git log:

nothing added to commit but untracked files present (use "git add" to track)
➜  strelka git:(cd.images-from-pdf) ✗  glg
commit cda2495b24c2661f251288ea3ec1f191ade39bc9 (HEAD -> cd.images-from-pdf, sublime/cd.images-from-pdf)
Author: Cameron Dunn <[email protected]>
Date:   Wed Oct 13 18:57:47 2021 -0700

    Fix PDF scanner + support image extraction

    xref_length was erroring on the old version of PyMuPDF.

    mupdf_display_errors errored with the new version, so it was removed.

    Image extraction from PDFs is a new feature.

 build/python/backend/requirements.txt   |  2 +-
 src/python/strelka/scanners/scan_pdf.py | 19 ++++++++++++++++---
 2 files changed, 17 insertions(+), 4 deletions(-)

commit d9086f35d709592733ff690ed3a9ddeff5bbb433 (sublime/master, origin/master, origin/HEAD, master)
Author: Paul Hutelmyer <[email protected]>
Date:   Tue Oct 12 08:12:36 2021 -0400

It shouldn't matter though.
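
For readers unfamiliar with the feature, the image extraction mentioned in the description can be sketched roughly as below. This is a minimal illustration and not the code from this PR: it assumes a PyMuPDF release recent enough to expose `Page.get_images()` and `Document.extract_image()`, and the helper name `iter_pdf_images` is hypothetical. The function relies only on those two calls, so it accepts any document-like object that provides them.

```python
from typing import Any, Dict, Iterator


def iter_pdf_images(doc: Any) -> Iterator[Dict[str, Any]]:
    """Yield each embedded image of a PDF once (hypothetical helper).

    `doc` is assumed to behave like a PyMuPDF Document: iterating it
    yields pages, and it offers extract_image(xref).
    """
    seen = set()
    for page in doc:
        # Each entry is a tuple whose first element is the image's xref number.
        for entry in page.get_images(full=True):
            xref = entry[0]
            if xref in seen:
                # The same image object can be referenced from several pages.
                continue
            seen.add(xref)
            # extract_image returns a dict with keys such as "image" (raw bytes)
            # and "ext" (suggested file extension).
            info = doc.extract_image(xref)
            if info:
                yield {
                    "xref": xref,
                    "ext": info.get("ext"),
                    "data": info.get("image"),
                }
```

With PyMuPDF installed, the document could be opened with something like `fitz.open(stream=data, filetype="pdf")` and passed to the helper. Tracking seen xrefs avoids emitting an image more than once when several pages share it.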

cameron-dunn-sublime and others added 6 commits October 4, 2021 14:00
Backend reported errors parsing previously.
cameron-dunn-sublime changed the title from "Cd.images from pdf" to "Fix PDF scanner + support image extraction" on Oct 14, 2021
cameron-dunn-sublime (author) commented

@jkamdjou this is the change I made while we were pairing the other day. I'll probably try to clean it up further and open a PR against target/strelka, but in the meantime we can commit this to our [public] fork.

cameron-dunn-sublime marked this pull request as ready for review October 18, 2021 19:13
cameron-dunn-sublime merged commit 3abbf1d into master Oct 20, 2021
cameron-dunn-sublime deleted the cd.images-from-pdf branch October 20, 2021 19:26
3 participants