Internal server error - Extracting tables from a PDF file #182
Comments
Thanks @skarampatakis! Do you mind sharing the file that caused this error?
It is this one,
* build(release): bump unstructured-inference
  Related to downstream issue: Unstructured-IO/unstructured-api#182
  And upstream PR: Unstructured-IO/unstructured-inference#165
  ---------
  Co-authored-by: Shreya Nidadavolu <[email protected]>
Related to downstream issue: #182
And upstream PR: Unstructured-IO/unstructured-inference#165
* remove test_parallel_mode_correct_result
* dropped the file_directory field from elements metadata
This is fixed as of 0.0.35! You can now get the latest image from quay, or pull the repo and rebuild.
Thanks a lot for solving that issue so quickly. I no longer get any tesseract errors. The remaining problem is that the tables are not extracted properly, but I think this is already covered in #191; it seems I have the same issue.
Hi, I'm using the following request to the API to extract some tables from a PDF file:
The request fails after 14 minutes, and I see the following in the logs:
If I run the same query with the fast strategy instead, everything works fine, but the results are not acceptable.
I took a look at the tesseract repo but could not find anything relevant. I would be glad for any help.
I am running the API locally through Docker, all on default settings.
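For anyone trying to reproduce this, a request of this kind might look roughly like the sketch below. The exact request from this report is not shown here, so the endpoint path, port, the `hi_res` strategy, the `pdf_infer_table_structure` field, and the helper names are all assumptions for illustration; it also assumes the third-party `requests` package is installed.

```python
from pathlib import Path

# Assumed address of a locally running container on default settings.
API_URL = "http://localhost:8000/general/v0/general"


def build_form_fields(strategy: str = "hi_res", infer_tables: bool = True) -> dict:
    """Build the non-file form fields (hypothetical helper, assumed field names)."""
    return {
        "strategy": strategy,
        "pdf_infer_table_structure": "true" if infer_tables else "false",
    }


def extract_table_html(elements: list) -> list:
    """Collect the HTML of every element typed "Table" (assumed response shape)."""
    return [
        el.get("metadata", {}).get("text_as_html", "")
        for el in elements
        if el.get("type") == "Table"
    ]


def partition_pdf(pdf_path: str, strategy: str = "hi_res", timeout: float = 1800.0) -> list:
    """Send the PDF to the API and return the parsed list of elements."""
    import requests  # imported lazily so the helpers above work without it

    path = Path(pdf_path)
    with path.open("rb") as fh:
        response = requests.post(
            API_URL,
            files={"files": (path.name, fh, "application/pdf")},
            data=build_form_fields(strategy),
            timeout=timeout,  # generous, since the slower strategy can run for minutes
        )
    response.raise_for_status()  # surfaces a 500 as an exception
    return response.json()
```

Calling `partition_pdf("report.pdf", strategy="fast")` and then `partition_pdf("report.pdf")` with the same (hypothetical) file is one way to compare the two strategies side by side.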