-
Notifications
You must be signed in to change notification settings - Fork 943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split out IO logic from high level functions (#392) #393
Split out IO logic from high level functions (#392) #393
Conversation
a0aee18
to
eeb272a
Compare
I thought you were going to do something like this |
71ffd07
to
080df1a
Compare
@pietermarsman PR updated to keep the same function name and simply allow either a string or a file-like object. I've added an I've essentially followed advice from here about how to do this... https://stackoverflow.com/questions/7268353/how-to-accept-both-filenames-and-file-like-objects-in-python-functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks perfect!
I suggested 2 minor changes.
f1545ec
to
abd72c4
Compare
@pietermarsman thanks. I've updated the first commit to resolve contributing being out of date, and added a new commit moving the Please take another look when you get chance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄
…er#393) * Allow file-like inputs to high level functions (pdfminer#392) * PR Review - move open_filename to utils (cherry picked from commit 1a4a06d)
Pull request
Adds two new functions (
extract_text_from_io
andextract_pages_from_io
) tohigh_level.py
, allowing you to callextract_text
andextract_pages
without having the file locally.Closes #392
Notes
*_from_io
to avoid breaking changes to the original functions. This is not very consistent with theextract_text_to_fp
at the top of the file, perhaps. Happy to implement alternatives.inf
as the new IO parameter, even though I don't really like the name. This is to be consistent withextract_text_to_fp
. Other places in the code usefp
. Happy to change.test_highlevel_extracttext.py
. Is this correct, or should this all be in onetest_high_level.py
?How Has This Been Tested?
I've added tests for
extract_text_from_io
, but couldn't find any existing tests forextract_pages
?Checklist
works
version
is not necessary (not necessary)
verified that this is not necessary (not necessary as the read the docs page on this is auto generated from docstrings)
CHANGELOG.md