Is your feature request related to a problem? Please describe.
I'm always frustrated when a conversion suddenly breaks the whole pipeline, especially when processing thousands of files. Imagine importing around 2,500 PDF files into an object store via the Tika or PDF converter. When one PDF is corrupt or something else goes wrong, an exception is raised that breaks the whole process. Of course, this typically happens after an hour has passed and 75% of the files have already been processed.
Describe the solution you'd like
Don't break the whole pipeline or conversion process as soon as a single exception occurs, for whatever reason.
Version: 2.x
Assuming we are talking about 1.x (the BaseConverter is only available in 1.x), this does not seem difficult to achieve.
We can add a raise_on_failure parameter to the run method of the BaseConverter:
- if True (default), an exception is raised when the conversion of a single file fails (current behavior)
- if False, the file is skipped without failing
Since the run method is present only in the BaseConverter, it would be necessary to modify only the latter and not the specific converters.
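A minimal sketch of the idea, not the actual BaseConverter code: `SketchConverter`, `convert_fn`, and `fake_convert` are hypothetical names used only for illustration, and the real run method has a different signature. It only shows how a raise_on_failure flag could decide between re-raising and skipping a file.

```python
import logging
from pathlib import Path
from typing import Callable, List

logger = logging.getLogger(__name__)


class SketchConverter:
    """Hypothetical stand-in for a base converter (not the library's class)."""

    def __init__(self, convert_fn: Callable[[Path], str]):
        # convert_fn is an assumed per-file conversion callable.
        self.convert_fn = convert_fn

    def run(self, file_paths: List[Path], raise_on_failure: bool = True) -> List[str]:
        documents = []
        for path in file_paths:
            try:
                documents.append(self.convert_fn(path))
            except Exception as exc:
                if raise_on_failure:
                    # Current behavior: one bad file aborts the whole run.
                    raise
                # Proposed behavior: log the failure and move on to the next file.
                logger.warning("Skipping %s: conversion failed (%s)", path, exc)
        return documents


def fake_convert(path: Path) -> str:
    # Toy converter for the example: anything that is not a .pdf counts as corrupt.
    if path.suffix != ".pdf":
        raise ValueError("corrupt file")
    return f"text of {path.name}"


paths = [Path("a.pdf"), Path("b.broken"), Path("c.pdf")]
converted = SketchConverter(fake_convert).run(paths, raise_on_failure=False)
```

With raise_on_failure=False the broken file is logged and skipped, so the other two files are still converted. With the default True, the same call raises on the second file, as it does today.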
I will put the tag "Contributions wanted" on this issue. @dannybusch feel free to open a PR...