heussd/pdftotext-go


pdftotext-go

Extracts text from PDF files together with the corresponding page numbers. Wraps the command-line tool pdftotext (part of poppler-utils).

Usage

  1. poppler-utils (version >= 22.05.0) must be installed, and pdftotext must be available on the PATH.
  2. go get github.com/heussd/pdftotext-go
  3. See the tests for code examples.
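
The prerequisite in step 1 can be checked before the library is used. The following is a minimal sketch, not part of this library: the helper names parseVersion and atLeast are hypothetical, and it assumes that `pdftotext -v` prints a line containing a version such as "22.05.0".

```go
package main

import (
	"fmt"
	"os/exec"
	"regexp"
	"strconv"
)

// versionRe matches the first major.minor.patch triple in a string.
var versionRe = regexp.MustCompile(`(\d+)\.(\d+)\.(\d+)`)

// parseVersion (hypothetical helper) extracts a version triple from text
// such as "pdftotext version 22.05.0".
func parseVersion(s string) ([3]int, bool) {
	m := versionRe.FindStringSubmatch(s)
	if m == nil {
		return [3]int{}, false
	}
	var v [3]int
	for i := range v {
		n, err := strconv.Atoi(m[i+1])
		if err != nil {
			return [3]int{}, false
		}
		v[i] = n
	}
	return v, true
}

// atLeast (hypothetical helper) reports whether v >= want, comparing
// major, then minor, then patch.
func atLeast(v, want [3]int) bool {
	for i := range v {
		if v[i] != want[i] {
			return v[i] > want[i]
		}
	}
	return true
}

func main() {
	path, err := exec.LookPath("pdftotext")
	if err != nil {
		fmt.Println("pdftotext not found on PATH:", err)
		return
	}
	// Assumption: the version banner may go to stderr, so both streams
	// are captured; the exit status is deliberately ignored.
	out, _ := exec.Command(path, "-v").CombinedOutput()
	v, ok := parseVersion(string(out))
	if !ok {
		fmt.Println("could not determine the pdftotext version")
		return
	}
	if !atLeast(v, [3]int{22, 5, 0}) {
		fmt.Printf("pdftotext %d.%d.%d is too old, need >= 22.05.0\n", v[0], v[1], v[2])
		return
	}
	fmt.Println("pdftotext is recent enough:", path)
}
```

Running such a check at startup gives a clearer error than a failed extraction later on.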

Why poppler version >=22.05.0

Version 22.05.0 of poppler introduced the -tsv parameter, which extracts PDF content together with metadata as TSV (tab-separated values). This library depends on that output, so older versions of poppler will not work.
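
To illustrate why the TSV output is useful, here is a rough sketch of grouping words by page. It is not this library's implementation. The function textByPage is hypothetical, the sample input is hand-written rather than real pdftotext output, and the column names page_num and text as well as the "###" marker rows are assumptions about the TSV layout. Real input could be produced with something like `pdftotext -tsv input.pdf -`.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// textByPage (hypothetical helper) groups the words of a TSV document by
// page number. It locates the columns by header name (assumed to be
// "page_num" and "text") instead of relying on fixed positions.
func textByPage(tsv string) (map[int]string, error) {
	sc := bufio.NewScanner(strings.NewReader(tsv))
	if !sc.Scan() {
		return nil, fmt.Errorf("empty TSV input")
	}
	pageCol, textCol := -1, -1
	for i, name := range strings.Split(sc.Text(), "\t") {
		switch name {
		case "page_num":
			pageCol = i
		case "text":
			textCol = i
		}
	}
	if pageCol < 0 || textCol < 0 {
		return nil, fmt.Errorf("TSV header lacks page_num or text column")
	}

	words := map[int][]string{}
	for sc.Scan() {
		f := strings.Split(sc.Text(), "\t")
		if len(f) <= pageCol || len(f) <= textCol {
			continue // skip short or malformed rows
		}
		// Assumption: structural rows carry markers such as ###PAGE###.
		if f[textCol] == "" || strings.HasPrefix(f[textCol], "###") {
			continue
		}
		page, err := strconv.Atoi(f[pageCol])
		if err != nil {
			continue
		}
		words[page] = append(words[page], f[textCol])
	}
	if err := sc.Err(); err != nil {
		return nil, err
	}

	pages := make(map[int]string, len(words))
	for p, w := range words {
		pages[p] = strings.Join(w, " ")
	}
	return pages, nil
}

func main() {
	// Hand-written sample, reduced to a few columns for readability.
	sample := "level\tpage_num\tconf\ttext\n" +
		"1\t1\t-1\t###PAGE###\n" +
		"5\t1\t100\tHello\n" +
		"5\t1\t100\tworld\n" +
		"1\t2\t-1\t###PAGE###\n" +
		"5\t2\t100\tNext\n"

	pages, err := textByPage(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for p := 1; p <= len(pages); p++ {
		fmt.Printf("page %d: %s\n", p, pages[p])
	}
}
```

Because every row carries its own page number, text can be attributed to a page without guessing at page breaks in plain-text output.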

Thanks to
