Paperwork is a tool to make papers easily searchable.
The basic idea behind Paperwork is "scan & forget": you should be able to just scan a new document and forget about it until the day you need it again. Let the machine do most of the work.
Paperwork also supports importing PDFs and images.
Papers are organized into documents. Each document contains pages.
It mainly relies on four other pieces of software:
- Sane: To scan the pages
- Tesseract: To extract the words from the pages (OCR)
- GTK/Glade: For the user interface
- Whoosh: To index and search documents, and provide keyword suggestions
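To give a rough idea of what "index and search documents, and provide keyword suggestions" means, here is a minimal sketch. It deliberately does not use Whoosh: it is a toy inverted index written with the Python standard library only, and the class name `TinyIndex`, its methods, and the sample documents are all hypothetical, not part of Paperwork.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps each word to the ids of documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add_document(self, doc_id, text):
        # Index every lowercase word of the (OCR'd) text.
        for word in re.findall(r"\w+", text.lower()):
            self._postings[word].add(doc_id)

    def search(self, query):
        # Return the ids of documents containing ALL the query words.
        words = re.findall(r"\w+", query.lower())
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result

    def suggest(self, prefix, limit=5):
        # Keyword suggestions: indexed words starting with the prefix, in alphabetical order.
        prefix = prefix.lower()
        return sorted(w for w in self._postings if w.startswith(prefix))[:limit]


# Hypothetical sample data standing in for OCR output.
index = TinyIndex()
index.add_document("doc-1", "Electricity invoice March 2013")
index.add_document("doc-2", "Insurance contract, car")
index.add_document("doc-3", "Invoice for car repair")

print(index.search("invoice car"))  # documents containing both words
print(index.suggest("in"))          # candidate keywords for a partial query
```

A real search library adds ranking, stemming and on-disk storage on top of this idea.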
Page orientation is automatically guessed using OCR.
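One plausible way to guess orientation with OCR is to run it on the page at each of the four possible rotations and keep the rotation whose output looks most like real text. The sketch below illustrates that idea only; it is an assumption, not a description of how Paperwork actually does it. The names `guess_orientation`, `score_text` and `ocr_at_angle` are hypothetical, and the OCR step is passed in as a callback so the example runs without Tesseract.

```python
from typing import Callable, Iterable

ANGLES = (0, 90, 180, 270)


def score_text(text: str, vocabulary: Iterable[str]) -> int:
    # Count how many recognized tokens are known words (punctuation stripped).
    known = {w.lower() for w in vocabulary}
    return sum(1 for token in text.lower().split() if token.strip(".,;:!?") in known)


def guess_orientation(ocr_at_angle: Callable[[int], str], vocabulary: Iterable[str]) -> int:
    # Run OCR once per candidate rotation and keep the angle with the best score.
    vocab = list(vocabulary)
    return max(ANGLES, key=lambda angle: score_text(ocr_at_angle(angle), vocab))


def fake_ocr(angle: int) -> str:
    # Stand-in for a real OCR call: only the 180 degree rotation yields readable text.
    outputs = {0: "lnv0ice t0tal", 90: "|| ~ ..", 180: "Invoice total due.", 270: ""}
    return outputs[angle]


print(guess_orientation(fake_ocr, ["invoice", "total", "due"]))
```

With a real OCR engine, `ocr_at_angle` would rotate the scanned image by the given angle and return the recognized text.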
Since OCR is not perfect, and since some documents don't contain useful keywords, Paperwork also lets you put labels on each document.
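The organization described above (documents made of pages, plus labels on each document) could be modeled roughly as follows. This is only an illustrative sketch: the class names, fields and the `find_by_label` helper are hypothetical and do not reflect Paperwork's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    image_path: str  # scanned or imported image for this page
    text: str = ""   # words extracted by OCR, possibly empty


@dataclass
class Document:
    doc_id: str
    pages: list = field(default_factory=list)
    labels: set = field(default_factory=set)


def find_by_label(documents, label):
    # Labels let a document be found even when OCR produced no useful keywords.
    return [doc for doc in documents if label in doc.labels]


# Hypothetical sample data: the receipt has no OCR text but is still findable by label.
receipt = Document("doc-1", [Page("p1.jpg", text="")], {"taxes"})
letter = Document("doc-2", [Page("p1.jpg", text="Dear customer")], {"bank"})

print([doc.doc_id for doc in find_by_label([receipt, letter], "taxes")])
```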
GPLv3 or later. See COPYING.
GitHub can automatically provide .tar.gz and .zip files if required. However, they are not required to install Paperwork. They are listed here as a convenience for package maintainers.
All the information can be found on the wiki.