When applying fancy new AI models to documents, the first step in an NLP pipeline is often extracting the data out of myriad formats like PDFs and image files.
With the release of the Live Text feature on iOS and macOS, Apple also unveiled VisionKit APIs for extracting text programmatically from documents using the same industry-leading OCR.
Anecdotally, we find it superior to running the same documents through Tesseract, so it seemed worth wrapping in a little CLI tool.
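For context, here's a minimal sketch of what that programmatic route looks like, using Vision's VNRecognizeTextRequest (the request type underpinning this kind of OCR). This is an illustration, not this tool's actual source, and it covers images only, not PDFs:

import Foundation
import Vision

// Run Apple's OCR over an image file and collect the top candidate
// string for each region of text it detects.
func recognizeText(at url: URL) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate    // favor accuracy over speed
    request.usesLanguageCorrection = true   // apply the built-in language model
    let handler = VNImageRequestHandler(url: url, options: [:])
    try handler.perform([request])
    return (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }
}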
Grab the latest Universal macOS binary from the Releases page.
From there, you can run it on any document. Examples:
$ OCRTool us_passport.jpg
$ OCRTool invoice.pdf
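Assuming the recognized text is printed to stdout (not confirmed here), you can redirect it straight into a file for the rest of your pipeline:

$ OCRTool invoice.pdf > invoice.txt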
Either open the project in Xcode and click Build, or check out the repo and run xcodebuild from the root. The output files will be generated in the build/Release/OCRTool directory.
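From the command line, that amounts to the following (run from the repo root; exact artifact names depend on the project):

$ xcodebuild
$ ls build/Release/OCRTool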