feat(docx): add strategy parameter to partition_docx() #3026

Merged: 1 commit merged into main on May 15, 2024


@scanny scanny commented May 15, 2024

Summary
The behavior of an image sub-partitioner can depend in part on the partitioning strategy, for example whether it is "hi_res" or "fast". Add a strategy parameter to partition_docx() so it can be passed along to DocxPartitionerOptions, which makes it available to any image sub-partitioner.
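
For readers skimming the thread, here is a minimal sketch of the pass-through pattern the summary describes. It does not use the library's code: `SketchOptions`, `image_sub_partitioner`, and `partition_sketch` are hypothetical stand-ins, and the only point illustrated is that the top-level function forwards `strategy` to an options object rather than interpreting it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchOptions:
    """Hypothetical stand-in for an options object such as DocxPartitionerOptions."""

    file_path: str
    strategy: Optional[str] = None


def image_sub_partitioner(opts: SketchOptions) -> str:
    """Hypothetical image sub-partitioner that reads the strategy from the options."""
    return f"image partitioned with strategy {opts.strategy}"


def partition_sketch(file_path: str, strategy: Optional[str] = None) -> str:
    """Hypothetical top-level partitioner.

    It does not interpret `strategy` itself; it only stores it on the options
    object so that downstream sub-partitioners can consult it.
    """
    opts = SketchOptions(file_path=file_path, strategy=strategy)
    return image_sub_partitioner(opts)
```

Calling `partition_sketch("example.docx", strategy="fast")` hands "fast" to the sub-partitioner without the top-level function needing to know what the value means.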

@scanny scanny requested a review from Coniferish May 15, 2024 19:56
One of "hi_res", "fast", and a few others. These are available as class attributes on
`unstructured.partition.utils.constants.PartitionStrategy` but resolve to str values.
"""
return PartitionStrategy.HI_RES if self._strategy is None else self._strategy
Coniferish (Collaborator):

Just want to double-check that it should default to hi_res, otherwise looks good

scanny (Collaborator, Author):

Good call, I'll confirm with Matt :)

scanny (Collaborator, Author):

k, yep, confirmed :)
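
The default discussed in this thread can be illustrated with a small standalone sketch. `OptionsSketch` is a hypothetical class, and the module-level `HI_RES` constant is an assumed plain-string stand-in for `PartitionStrategy.HI_RES`, based on the docstring's note that those attributes resolve to str values.

```python
from typing import Optional

# Assumed str value standing in for PartitionStrategy.HI_RES.
HI_RES = "hi_res"


class OptionsSketch:
    """Hypothetical options object that resolves an unset strategy to "hi_res"."""

    def __init__(self, strategy: Optional[str] = None) -> None:
        self._strategy = strategy

    @property
    def strategy(self) -> str:
        # An explicit value wins; only None falls back to the default.
        return HI_RES if self._strategy is None else self._strategy
```

With this shape, `OptionsSketch().strategy` yields the hi_res default, while `OptionsSketch("fast").strategy` returns the caller's choice unchanged.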

Coniferish (Collaborator) left a comment:

lgtm

@scanny scanny force-pushed the scanny/add-strategy-param-to-docx branch from 2989187 to efe5785 Compare May 15, 2024 20:30
@scanny scanny added this pull request to the merge queue May 15, 2024
Merged via the queue into main with commit 094e354 May 15, 2024
42 checks passed
@scanny scanny deleted the scanny/add-strategy-param-to-docx branch May 15, 2024 21:33
2 participants