Refactoring PDF loaders: 02 PyMuPDF #29063
base: master
Perhaps remove this docstring or improve it?
This docstring is better placed at class level, or at `__init__` level, if the semantics are controlled by parameters in the initializer (e.g., `mode`).
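To illustrate the suggestion, here is a minimal sketch of a class-level docstring that documents behavior selected in the initializer. The class `ExamplePDFParser`, its `parse` method, and the `"single"`/`"page"` values are hypothetical names chosen for illustration; they are not taken from this PR.

```python
from typing import Iterator, Literal


class ExamplePDFParser:
    """Hypothetical parser, for illustration only (not the PR's code).

    Behavior is selected in the initializer:

    - mode="single": yield the whole document as one string.
    - mode="page": yield one string per page.
    """

    def __init__(self, mode: Literal["single", "page"] = "page") -> None:
        # Validate once here so every method can rely on self.mode.
        if mode not in ("single", "page"):
            raise ValueError(f"Unsupported mode: {mode!r}")
        self.mode = mode

    def parse(self, pages: list[str]) -> Iterator[str]:
        """Yield text according to ``self.mode`` (see the class docstring)."""
        if self.mode == "single":
            yield "\n".join(pages)
        else:
            yield from pages
```

With this layout the method docstring stays short and only points back to the class docstring, so the description of each mode lives in one place.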
The important message here is “Insert image, if possible, between two paragraphs.” This is not always the case, so it cannot be stated in `BasePDFLoader`. That's why I've added this information specifically in this implementation. It will also appear in all the other implementations that work this way. But DocumentIntelligence, for example, doesn't respect this.
Would love to make this change, but it's a breaking change due to `kwargs` missing --