We can remove the Apache PDFBox dependencies and switch to Tika support only.
The advantage is that we reduce complexity. For this technology stack there is no need to support a setup without a Tika server instance. If someone needs this feature, the project can use a custom implementation based on PDFBox.
So in future the OCRService will throw an exception if no Tika service endpoint is defined. All OCR functionality is handed over to Tika only!
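The proposed behavior could look roughly like the following minimal sketch. It is not the actual OCRService code: the class name `TikaOnlyOcrSketch`, the exception type `MissingTikaEndpointException`, and the method `requireTikaEndpoint` are hypothetical names chosen for illustration, and how the endpoint is configured in the project is an assumption.

```java
public class TikaOnlyOcrSketch {

    /** Hypothetical exception type; the real OCRService may use a different one. */
    public static class MissingTikaEndpointException extends RuntimeException {
        public MissingTikaEndpointException(String message) {
            super(message);
        }
    }

    /**
     * Returns the trimmed Tika endpoint, or throws if none is configured.
     * There is no fallback to a local PDFBox-based text extraction.
     */
    public static String requireTikaEndpoint(String configuredEndpoint) {
        if (configuredEndpoint == null || configuredEndpoint.trim().isEmpty()) {
            throw new MissingTikaEndpointException(
                    "No Tika service endpoint defined. OCR is only available via Tika.");
        }
        return configuredEndpoint.trim();
    }
}
```

A caller would run this check before any document is processed, so a missing configuration fails fast instead of silently skipping OCR.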
rsoika changed the title from "OCR - determine DPIs for parsing" to "OCR - remove PDFBox dependencies" on Nov 6, 2020