Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get underlying error when conversion fails #717

Merged
merged 7 commits into from
Feb 20, 2024
Merged

Get underlying error when conversion fails #717

merged 7 commits into from
Feb 20, 2024

Conversation

apyrgio
Copy link
Contributor

@apyrgio apyrgio commented Feb 15, 2024

When we get an early EOF from the converter process, we should
immediately get the exit code of that process, to find out the actual
underlying error. Currently, the exception we raise masks the underlying
error.

Raise a ConverterProcException, that in turns makes our error handling
code read the exit code of the spawned process, and converts it to a
helpful error message.

Fixes #714

Copy link
Contributor

@deeplow deeplow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with doing away with the InterruptedConversionException and removing the proc from ConverterProcException as it did feel odd when I made it store a proc.

However, it feels a bit weird to me that Document would be concerned with any implementation details. It seems like we're overloading the usage of Document. Unless there is some other argument that I am not considering, I would advocate for keeping any proc-related logic within the isolation_process submodule.

So here are the changes I propose, done on top of your PR. Let me know if you'd agree.

         try:
+            conversion_proc = self.start_doc_to_pixels_proc()
             with tempfile.TemporaryDirectory() as t:
                 Path(f"{t}/pixels").mkdir()
-                self.doc_to_pixels(document, t)
+                self.doc_to_pixels(document, t, conversion_proc)
                 # TODO: validate convert to pixels output
                 self.pixels_to_pdf(document, t, ocr_lang)
             document.mark_as_safe()
             if document.archive_after_conversion:
                 document.archive()
-        except errors.ConverterProcException as e:
-            exception = self.get_proc_exception(document.conversion_proc)
+        except errors.ConverterProcException:
+            exception = self.get_proc_exception(conversion_proc)

and also

@@ -91,11 +93,10 @@ class IsolationProvider(ABC):
             self.print_progress(document, True, str(e), 0)
             document.mark_as_failed()
 
-    def doc_to_pixels(self, document: Document, tempdir: str) -> None:
+    def doc_to_pixels(self, document: Document, tempdir: str, p: subprocess.Popen) -> subprocess.Popen:
         percentage = 0.0
         with open(document.input_filename, "rb") as f:
-            p = self.start_doc_to_pixels_proc()
-            document.conversion_proc = p
+            # document.conversion_proc = p
             try:
                 assert p.stdin is not None
                 p.stdin.write(f.read())
@@ -150,6 +151,7 @@ class IsolationProvider(ABC):
             log.info(
                 f"Conversion output (doc to pixels)\n{self.sanitize_conversion_str(untrusted_log)}"
             )
+        return p

@apyrgio
Copy link
Contributor Author

apyrgio commented Feb 19, 2024

I have updated the PR to use an approach like you suggested. Because the change is significant, I have force-pushed the changes, instead of having FIXUP commits.

@deeplow
Copy link
Contributor

deeplow commented Feb 19, 2024

I have updated the PR to use an approach like you suggested. Because the change is significant, I have force-pushed the changes, instead of having FIXUP commits.

Great! Thanks for implementing it! I have only left one comment and then it's ready to merge.

Copy link
Contributor

@deeplow deeplow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready to merge once the lints have been fixed.

@apyrgio
Copy link
Contributor Author

apyrgio commented Feb 19, 2024

I've tested this branch on Qubes, and added some extra commits which fix some Qubes-related issues I discovered, plus:

  1. An improvement for the Git commit message lint
  2. A fix in our lints, due to a previous bad merge.

Copy link
Contributor

@deeplow deeplow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I took some extra time to approve because this was failing on my Qubes, but it turns out the issue also impacted main. So I am now looking into that now.

Feel free to merge (after squashing the fixups).

Start the conversion process earlier, so that we have a reference to the
Popen object in case of an exception.
When we get an early EOF from the converter process, we should
immediately get the exit code of that process, to find out the actual
underlying error. Currently, the exception we raise masks the underlying
error.

Raise a ConverterProcException, that in turns makes our error handling
code read the exit code of the spawned process, and converts it to a
helpful error message.

Fixes #714
Improve the commit message check, by logging only the commit title, and
doing away with the extra spaces.
Adapt Qubes tests to the addition of the conversion process in
doc_to_pixels() call.
Inform testers that the container code no longer returns "UNTRUSTED >"
strings in its output. Every string is trusted now, and the output will
be similar for container and Qubes isolation providers alike.
Do not throw exceptions for unknown error codes. If
`get_proc_exception()` gets called from within an exception context and
raises an exception itself, then this exception will not get caught, and
it will get lost.

Prefer instead to return an exception class that we have for this
purpose, and show to the user the unknown error code of the converesion
process.
@apyrgio apyrgio merged commit e73f10f into main Feb 20, 2024
15 of 18 checks passed
@apyrgio apyrgio deleted the 714-exit-code branch February 20, 2024 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong error shown when conversion fails
2 participants