Text Extraction: pdf --> txt #14

grahamsack · 2014-02-28T22:05:31Z

There are a few pre-existing Python packages for this:

pypdf
slate
pdfminer

jonahsmith · 2014-03-01T17:38:31Z

FYI, I can't get Slate to work either. I might be missing something, but here is the error I'm getting:

  File "slateTest.py", line 1, in <module>
    import slate
  File "/Library/Python/2.7/site-packages/slate/__init__.py", line 48, in <module>
    from slate import PDF
  File "/Library/Python/2.7/site-packages/slate/slate.py", line 3, in <module>
    from pdfminer.pdfparser import PDFParser, PDFDocument
ImportError: cannot import name PDFDocument

Looks like there's a problem calling something in pdfminer? Graham, is this the issue you were having yesterday?

grahamsack · 2014-03-01T18:36:54Z

Yes. Same issue. I'm using pdfminer from the command line now.


aburkh · 2014-06-11T15:49:46Z

The problem is that slate tries to import PDFDocument from pdfminer.pdfparser.
The correct module is pdfminer.pdfdocument.
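For code you control, one way to tolerate both layouts is to look the class up in each candidate module in turn. This is only a sketch: the helper name `import_first` is made up for illustration, and it does not patch slate itself.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that has it.

    `import_first` is a hypothetical helper, not part of slate or pdfminer.
    """
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # this module does not exist in the installed version
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError("cannot import name %s from any of %s" % (name, module_paths))


# Try the module named as correct in this thread first, then the one slate uses.
try:
    PDFDocument = import_first(
        "PDFDocument", ["pdfminer.pdfdocument", "pdfminer.pdfparser"]
    )
except ImportError:
    PDFDocument = None  # pdfminer is not installed, or neither module has the class
```

If `PDFDocument` ends up as `None`, pdfminer is either missing or laid out differently from both cases discussed here.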

daryltucker · 2014-08-26T16:38:24Z

I still see this issue.

I was able to work around it by running sudo pip install --upgrade --ignore-installed slate==0.3 pdfminer==20110515, which pins compatible versions.

The slate devs are aware.

KurtOstergaard · 2015-11-02T20:54:30Z

I tried the slate==0.3 and pdfminer==20110515 line and I still get an error.
Any other workarounds?

tobiasmcnulty · 2016-01-26T04:08:06Z

Works with slate==0.3 and pdfminer==20110515 for me

tobiasmcnulty · 2016-01-26T04:08:25Z

If you're inside a virtualenv make sure not to use sudo
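A rough sketch of the same advice in script form: it builds the pinned install command from earlier in the thread and leaves off sudo when it detects a virtualenv. The function names are made up for illustration, and the script only prints the command rather than running it.

```python
import sys

# Version pins reported as compatible in this thread.
PINS = ["slate==0.3", "pdfminer==20110515"]


def in_virtualenv():
    """Return True when the interpreter appears to run inside a virtualenv or venv."""
    # Legacy virtualenv sets sys.real_prefix; venv makes base_prefix differ from prefix.
    return hasattr(sys, "real_prefix") or (
        getattr(sys, "base_prefix", sys.prefix) != sys.prefix
    )


def pip_install_command(pins, use_sudo):
    """Build the pip command as an argument list; this does not execute anything."""
    cmd = [sys.executable, "-m", "pip", "install", "--upgrade", "--ignore-installed"]
    cmd.extend(pins)
    return ["sudo"] + cmd if use_sudo else cmd


if __name__ == "__main__":
    # Assumption: sudo is only wanted for a system-wide install, never in a virtualenv.
    print(" ".join(pip_install_command(PINS, use_sudo=not in_virtualenv())))
```

Using `sys.executable -m pip` instead of a bare `pip` ties the install to the interpreter that runs the script, which is the one the virtualenv activated.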

arderyp · 2016-03-10T03:05:06Z

@tobiasmcnulty's suggestion works for me too. Thanks!

grahamsack added the enhancement label Feb 28, 2014

grahamsack self-assigned this Feb 28, 2014

astreylabs assigned astreylabs and unassigned grahamsack Mar 16, 2014

daryltucker mentioned this issue Aug 26, 2014

Import error timClicks/slate#5

Open

jackfischer mentioned this issue Jan 20, 2015

fixed import issue. ourresearch/cv-parser#2

Merged

grahamsack unassigned astreylabs Apr 8, 2016
