refactor: partition_pdf() pass kwargs through fast strategy pipeline #3040

Merged: 11 commits merged into main on May 17, 2024

christinestraub (Collaborator) commented May 16, 2024

This PR passes kwargs through the fast strategy pipeline, which was missed in the previous PR, #3030.
It also includes some code refactoring, so I recommend reviewing it commit by commit.

Summary

  • pass kwargs through the fast strategy pipeline so that users can specify additional parameters such as sort_mode
  • refactor: code reorganization
  • cut a release for 0.14.0
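
The kwargs pass-through in the first bullet can be sketched as follows. This is a minimal illustration of the forwarding pattern only, not the code from this PR: the function names `partition`, `partition_fast`, and `sort_elements`, and the mode values `"basic"` and `"dont"`, are hypothetical stand-ins rather than the library's actual API.

```python
from typing import Any, List


def sort_elements(elements: List[str], sort_mode: str = "basic", **kwargs: Any) -> List[str]:
    """Hypothetical innermost step: the only place that reads sort_mode."""
    if sort_mode == "dont":
        # Keep the extraction order untouched.
        return list(elements)
    # Placeholder for a real layout-aware sort.
    return sorted(elements)


def partition_fast(elements: List[str], **kwargs: Any) -> List[str]:
    """Hypothetical fast-strategy step: forwards kwargs instead of dropping them."""
    return sort_elements(elements, **kwargs)


def partition(elements: List[str], strategy: str = "fast", **kwargs: Any) -> List[str]:
    """Hypothetical entry point: dispatches on strategy and passes kwargs along."""
    if strategy != "fast":
        raise NotImplementedError(f"strategy {strategy!r} is not part of this sketch")
    return partition_fast(elements, **kwargs)


# The caller's sort_mode reaches sort_elements only because every layer forwards **kwargs.
unsorted_result = partition(["b", "a"], sort_mode="dont")
default_result = partition(["b", "a"])
```

If any intermediate layer omitted `**kwargs`, the caller's `sort_mode` would either raise a `TypeError` or be silently replaced by the default, which is the kind of gap the first bullet describes.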

Testing

CI should pass

# Conflicts:
#	CHANGELOG.md
#	unstructured/__version__.py
MthwRobinson (Contributor) left a comment

Updates and refactor look good. Just one quick question, otherwise LGTM if tests are passing!

@MthwRobinson added this pull request to the merge queue May 17, 2024
@MthwRobinson removed this pull request from the merge queue due to a manual request May 17, 2024
# Conflicts:
#	CHANGELOG.md
#	unstructured/__version__.py
@christinestraub added this pull request to the merge queue May 17, 2024
@christinestraub removed this pull request from the merge queue due to a manual request May 17, 2024
# Conflicts:
#	CHANGELOG.md
#	unstructured/__version__.py
#	unstructured/partition/pdf.py
#	unstructured/partition/pdf_image/pdf_image_utils.py
@christinestraub added this pull request to the merge queue May 17, 2024
Merged via the queue into main with commit 76831f1 May 17, 2024
42 checks passed
@christinestraub deleted the refactor/pdf-pass-kwargs-fast-strategy-pipeline branch May 17, 2024 21:26

sentry-io bot commented May 21, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ HTTPException: ["cannot identify image file '/tmp/tmpsier5q54'",["�PNG\r\n\u001a\n\u0000\u0000\u0000\rIHDR\u0000... /general/v0/general View Issue
  • ‼️ AttributeError: 'NoneType' object has no attribute 'x1' /general/v0/general View Issue

