bug/partition-pdf-with-infer_table_structure #3252
Comments
We got the same issue too. Is there any solution? |
@vav1lo For now, I have switched to another reader. Also, can you attach the PDF you are testing? Mine is too confidential to share, and a sample PDF would make it easier for the maintainers to diagnose the error. |
Hi @vav1lo, Can you please attach the pdf that you are testing? |
uber_10q_march_2022.pdf `import os filename = "uber_10q_march_2022.pdf" elements = partition_pdf( |
@christinestraub Here is the PDF that I am testing |
I am also getting an error while partitioning a PDF, and it occurs specifically with the argument infer_table_structure=True:
---> 11 from cv2.typing import MatLike
ModuleNotFoundError: No module named 'cv2.typing'; 'cv2' is not a package |
I think this has to do with the opencv installation |
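One way to check whether the OpenCV installation is the cause is to inspect how the cv2 module resolves. The sketch below is a generic diagnostic, not part of unstructured; the helper name describe_submodule is hypothetical. It distinguishes between a missing module, a module that is not a package (for example a stray local file shadowing the real one), and a package that lacks the requested submodule.

```python
import importlib
import importlib.util


def describe_submodule(parent: str, child: str) -> str:
    """Report whether `parent.child` can be imported, and why not if it cannot."""
    try:
        module = importlib.import_module(parent)
    except ImportError:
        return f"{parent} is not installed"
    # A package exposes __path__; a single-file module (or a local file
    # that shadows the real package) does not.
    if not hasattr(module, "__path__"):
        location = getattr(module, "__file__", "unknown location")
        return f"{parent} is not a package (loaded from {location})"
    if importlib.util.find_spec(f"{parent}.{child}") is None:
        version = getattr(module, "__version__", "unknown")
        return f"{parent} {version} has no submodule {child}"
    return f"{parent}.{child} is importable"
```

Calling describe_submodule("cv2", "typing") in the failing environment should show which of these cases applies and, for the "not a package" case, which file is being picked up as cv2.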
This started happening to me when I upgraded from 0.12.6 to 0.14.6 |
I installed it as well, but what actually needs to change is the import itself. |
Hi @DeepKariaX, @vav1lo, @hackpointt, @Nidhi2497, @nikklavzar Addressed on Unstructured-IO/unstructured-inference#359. You'll need to upgrade |
Closing this since it's assumed to be resolved, but feel free to reopen if you're still having this issue. |
@christinestraub This is resolved, thanks! |
Describe the bug
Partitioning a PDF raises ValueError: max() arg is an empty sequence. The error occurs when infer_table_structure=True is set; after removing this parameter, partitioning works perfectly.
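Until a fix is installed, one possible workaround is to retry without table inference when this error is raised. The sketch below is only an illustration: partition_with_table_fallback is a hypothetical helper, and it takes the partition function as an argument so that it does not depend on any particular version of unstructured.

```python
from typing import Any, Callable, List


def partition_with_table_fallback(
    partition_fn: Callable[..., List[Any]], filename: str, **kwargs: Any
) -> List[Any]:
    """Try table inference first; retry without it if a ValueError is raised."""
    try:
        return partition_fn(filename=filename, infer_table_structure=True, **kwargs)
    except ValueError as exc:
        # Hypothetical workaround: only ValueError (the error in this report)
        # is caught, so unrelated failures still surface.
        print(f"Table inference failed ({exc}); retrying without it.")
        return partition_fn(filename=filename, infer_table_structure=False, **kwargs)
```

Assuming partition_pdf is imported from unstructured.partition.pdf, the call would be partition_with_table_fallback(partition_pdf, "uber_10q_march_2022.pdf"). The trade-off is that tables in the affected file come back without inferred structure.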
File which received bug
unstructured_inference/models/tables.py", line 667, in fill_cells
table_rows_no = max({row for cell in cells for row in cell["row_nums"]})
Expected behavior
Even with infer_table_structure=True, partition_pdf should be able to partition the PDF without any errors. (Maybe add error handling for the case where a None or empty value is received.)
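The failing line calls max() on a set that can be empty, which is what raises the ValueError. A minimal sketch of the kind of guard suggested above follows. This is not the fix that was merged upstream; the function name max_row_number and the choice to return None are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def max_row_number(cells: List[Dict[str, List[int]]]) -> Optional[int]:
    """Return the largest row index across all cells, or None if there is none."""
    row_numbers = {row for cell in cells for row in cell["row_nums"]}
    # max() raises ValueError on an empty iterable unless `default` is given.
    return max(row_numbers, default=None)
```

With this guard, a caller can check for None and skip table reconstruction when no cells were detected, instead of failing the whole partition.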