bug: too many tesseract errors are ignored #1086

Closed
cragwolfe opened this issue Aug 10, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@cragwolfe
Contributor

cragwolfe commented Aug 10, 2023

Describe the bug
While #1074 introduces an important fix for cases where some PDFs or images would fail to be partitioned, other valid Tesseract errors that should bubble up are now ignored as well.

Definition of Done
test_unstructured/partition/test_image.py::test_partition_image_raises_with_invalid_language is no longer skipped (after #1074 merges)
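
To illustrate the distinction described above, here is a minimal sketch of handling that swallows only known-benign Tesseract failures and re-raises everything else. It is not the code from #1074 or from this repository: `TesseractError` is a local stand-in modelled on pytesseract's exception (a status code plus a message), and `run_ocr`, `ocr_fn`, and the `IGNORABLE_FRAGMENTS` allowlist are hypothetical names invented for the example. Which errors are actually safe to ignore is an assumption here and is not specified in this issue.

```python
class TesseractError(RuntimeError):
    """Local stand-in for pytesseract's TesseractError (status + message)."""

    def __init__(self, status, message):
        super().__init__(status, message)
        self.status = status
        self.message = message


# Hypothetical allowlist of message fragments for failures we choose to skip.
# The real set of ignorable errors is an assumption, not taken from the issue.
IGNORABLE_FRAGMENTS = ("Image too small to scale",)


def run_ocr(ocr_fn, image, lang="eng"):
    """Call ocr_fn(image, lang=...), ignoring only allowlisted Tesseract errors."""
    try:
        return ocr_fn(image, lang=lang)
    except TesseractError as err:
        if any(fragment in err.message for fragment in IGNORABLE_FRAGMENTS):
            # Treat as "no text found" so partitioning can continue.
            return ""
        # Anything else (for example an invalid language) surfaces to the caller.
        raise
```

With handling shaped like this, a test such as `test_partition_image_raises_with_invalid_language` could expect the error to propagate again, while inputs that hit an allowlisted failure would still be partitioned.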

@cragwolfe cragwolfe added the bug Something isn't working label Aug 10, 2023
awalker4 added a commit to Unstructured-IO/unstructured-inference that referenced this issue Aug 22, 2023
We've seen a 500 error in `unstructured-api` due to an uncaught TesseractError in the `entire_page`
path. I can't reproduce it, but we can at least add a try/catch. The last fix was too aggressive,
which we're tracking [here][Unstructured-IO/unstructured#1086], so we may
need to adjust this fix as well.
@awalker4
Contributor

Note: the same adjustment should be made to Unstructured-IO/unstructured-inference#183

awalker4 added a commit to Unstructured-IO/unstructured-inference that referenced this issue Aug 22, 2023
We've seen a 500 error in `unstructured-api` due to an uncaught
TesseractError in the `entire_page` path. I can't reproduce it, but we
can at least add a try/catch. The last fix was too aggressive, which
we're tracking
[here](Unstructured-IO/unstructured#1086), so
we may need to adjust this fix as well.

Closes #179
@orlandounstructured

@cragwolfe @awalker4 can you please confirm whether this issue is good to close?

@orlandounstructured

Closing because it has been longer than 180 days;
reopen with a comment if the issue is still relevant.
