chore: bump inference to 0.6.6 #1563

Merged
merged 19 commits into from
Sep 29, 2023

Conversation

badGarnet
Collaborator

  • bump unstructured-inference to 0.6.6
  • specify default model name for element detection to be detectron2_onnx to keep current behavior
  • NOTE: the updated inference package uses yolox as the element detection model by default; this will be evaluated and enabled in a separate PR
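The idea of pinning a default so that a dependency bump does not silently change behavior can be sketched as follows. This is a minimal illustration only: `DEFAULT_ELEMENT_DETECTION_MODEL` and `resolve_model_name` are hypothetical names, not the actual code changed in this PR; only the model names `detectron2_onnx` and `yolox` come from the description above.

```python
from typing import Optional

# Hypothetical constant: pins the default named in the PR description so that
# the upstream default (yolox) does not take effect implicitly.
DEFAULT_ELEMENT_DETECTION_MODEL = "detectron2_onnx"


def resolve_model_name(requested: Optional[str] = None) -> str:
    """Return the caller's explicit choice, or the pinned default if none is given."""
    return requested if requested else DEFAULT_ELEMENT_DETECTION_MODEL
```

With this shape, callers that pass nothing keep the current behavior, while a caller can still opt in to another model explicitly, e.g. `resolve_model_name("yolox")`.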

Collaborator Author

the changes all seem to be positive: joining sentences back together where they should be

Collaborator Author

duplicated text element removed

Collaborator Author

better element categories (not sure why since default model is still detectron...); better text joining

Collaborator Author

better joining of text into coherent elements; note that this looks different from the yolox model output we saw before forcing the default to detectron, so it shows the impact of the other improvements in inference alone

Collaborator Author

properly recognizing list items now

@badGarnet
Collaborator Author

In general, the fixture changes are due to better grouping of elements and deduplication of elements containing the same or overlapping text
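To illustrate what deduplicating elements with the same or overlapping text can look like, here is a rough sketch. It is an assumption-laden toy, not the logic in unstructured-inference: it treats elements as plain strings, and it defines "overlap" as one normalized text being contained in another. `normalize` and `dedupe_elements` are hypothetical helper names.

```python
from typing import List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies compare equal."""
    return " ".join(text.lower().split())


def dedupe_elements(texts: List[str]) -> List[str]:
    """Keep only elements whose text is not contained in another kept element."""
    kept: List[str] = []
    for text in texts:
        norm = normalize(text)
        # Skip this element if an already kept element has the same or a superset text.
        if any(norm in normalize(existing) for existing in kept):
            continue
        # Drop earlier elements that this (longer) element fully contains.
        kept = [existing for existing in kept if normalize(existing) not in norm]
        kept.append(text)
    return kept
```

A real layout pipeline would likely also compare bounding boxes rather than text alone; this sketch only shows the text-containment side of the idea.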

@badGarnet badGarnet requested review from qued and benjats07 September 29, 2023 17:41
Contributor

@benjats07 benjats07 left a comment

New outputs look fine

@badGarnet badGarnet marked this pull request as ready for review September 29, 2023 18:39
badGarnet added a commit that referenced this pull request Sep 29, 2023
Occasionally the es test can fail because the index fails to be created
on the first try. Experiments show that adding a timeout doesn't help, but
adding a retry mitigates the issue. See the history of commits in branch:
yao/bump-inference-to-0.6.6
#1563

---------

Co-authored-by: ryannikolaidis <[email protected]>
Co-authored-by: badGarnet <[email protected]>
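The retry mitigation described in the commit message can be sketched generically as below. This is not the code from the commit; `retry`, `attempts`, and `delay_seconds` are hypothetical names, and the operation being retried (for example, an index creation call) is assumed to raise an exception on failure.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, delay_seconds: float = 1.0) -> T:
    """Call `operation` until it succeeds or `attempts` tries have been used."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:  # broad on purpose for this sketch
            last_error = error
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"operation failed after {attempts} attempts") from last_error
```

A test setup step could then be wrapped as `retry(create_index, attempts=3)`, where `create_index` stands in for whatever callable creates the index. Unlike a longer timeout, this re-issues the request, which matches the observation that the first try is what tends to fail.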
@badGarnet badGarnet enabled auto-merge (squash) September 29, 2023 18:43
@badGarnet badGarnet merged commit ad59a87 into main Sep 29, 2023
39 checks passed
@badGarnet badGarnet deleted the yao/bump-inference-to-0.6.6 branch September 29, 2023 19:09
3 participants