Improvements to the logging logic #3137
I see that the last error that the CI test fails with is:
This looks unrelated to my PR, but it's worth checking it locally as well. Thanks for your patience, I'll see that the tests pass locally and I'll update this PR. In the meantime, it would help if you could validate the original assumption of this PR, that logging/warnings should be logged only to stderr. I'm asking because there may be historic reasons why this is not the case, and it's good to know before diving in.
Thanks for this.
Apologies for this, it looks like there's a recent regression in MuPDF master that is breaking PyMuPDF tests. We're investigating things at the moment.
There are some subtleties in the way we output diagnostics with [...]. So I think we should retain the current distinction between [...]. Some other comments:
Cool, thanks for letting me know 🙂
Got it, I had the same suspicion as well.
Oh, is this a conscious decision of the project, or is it because there was never a use case for it? In our case (Dangerzone), we do use PyMuPDF as part of a pipeline. We convert a document to pixels in one container, print those pixels to stdout, and then we reconstruct the document in a second container from these pixels. We do this for PDF sanitization.
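To make the pipeline concern concrete, here is a minimal sketch of why a stray diagnostic on stdout is a problem when stdout carries binary data. The framing used here (2-byte width, 2-byte height, then raw RGB bytes) is purely an assumption for illustration and is not Dangerzone's actual protocol; `write_page` and `read_page` are hypothetical helpers.

```python
import io
import struct


def write_page(stream, width, height, pixels):
    """Write one page as: 2-byte width, 2-byte height, then raw RGB bytes."""
    stream.write(struct.pack(">HH", width, height))
    stream.write(pixels)


def read_page(stream):
    """Read back one page written by write_page()."""
    width, height = struct.unpack(">HH", stream.read(4))
    pixels = stream.read(width * height * 3)
    return width, height, pixels


pixels = bytes([255, 0, 0] * 4)  # a 2x2 all-red page

# Clean stream: only the payload is written.
clean = io.BytesIO()
write_page(clean, 2, 2, pixels)
clean.seek(0)
clean_result = read_page(clean)

# Polluted stream: a diagnostic message lands ahead of the payload,
# as would happen if a library printed to the same stdout.
polluted = io.BytesIO()
polluted.write(b"warning: something\n")
write_page(polluted, 2, 2, pixels)
polluted.seek(0)
polluted_result = read_page(polluted)
```

In the polluted case the reader interprets the first bytes of the message as the page dimensions, so the reconstructed page is wrong even though the payload itself was written correctly.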
I definitely agree with having a single stream for these messages, and I've experienced the buffering problems you describe when this is not the case. That's why, in this PR, I tried to switch everything to one stream. However, I'm not particularly hung up on the way maintainers prefer to print debug logs. The actual reason I'm sending this PR is that a certain print statement is currently interleaved with our own messages: Line 9636 in d20c4e7
So, I can close this PR and send a less opinionated one, that simply:
Would this be preferable?
PyMuPDF 1.23.9 swapped the new fitz implementation (fitz_new) with the fitz module. In the new module there are prints in the code that interfere with our stdout for sending JSON from the container. Pinning the version seems to have no adverse consequences [1], since fitz_old hasn't had significant changes and it gives breathing room for the print-related issue to be tackled in PR [2]. Fixes temporarily #700 [1]: #700 (comment) [2]: pymupdf/PyMuPDF#3137
Kind reminder about the above. Would you prefer to close this PR and send a smaller one, with the above points fixed?
Apologies for the delay in replying, and thanks for the reminder and your understanding. I think your suggestion of a simplified PR would be great at this stage. I'll discuss your use of PyMuPDF in a pipeline internally. If you're doing it, it needs to be supported. Perhaps we could control where logs/diagnostics go using environment variables, as I can't see a single solution that will address the different concerns that we've been discussing here. Thanks.
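As a rough illustration of the environment-variable idea, a helper along these lines could pick the destination at call time. This is only a sketch: the variable name `EXAMPLE_LOG_DEST`, its accepted values, and the functions `resolve_log_stream()` and `log()` are all hypothetical and are not an existing PyMuPDF interface.

```python
import os
import sys


def resolve_log_stream(env=None):
    """Pick a destination from the hypothetical EXAMPLE_LOG_DEST variable.

    Accepted values (an assumption for this sketch):
    "stderr" (default), "stdout", or "none" to discard messages.
    """
    env = os.environ if env is None else env
    dest = env.get("EXAMPLE_LOG_DEST", "stderr")
    if dest == "stderr":
        return sys.stderr
    if dest == "stdout":
        return sys.stdout
    if dest == "none":
        return None
    raise ValueError(f"Unrecognised EXAMPLE_LOG_DEST value: {dest!r}")


def log(text, stream=None, env=None):
    """Write one diagnostic line to the given or configured stream."""
    if stream is None:
        stream = resolve_log_stream(env)
    if stream is None:
        return  # messages are disabled
    print(text, file=stream, flush=True)
```

Defaulting to stderr keeps stdout free for pipeline data, while users who rely on the previous behaviour could opt back in by setting the variable.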
PyMuPDF has some hardcoded log messages that print to stdout [1]. We don't have a way to silence them, because they don't use the Python logging infrastructure. What we can do here is silence a particular call that's been creating debug messages. For a long term solution, we have sent a PR to the PyMuPDF team, and we will follow up there [2]. Fixes #700 [1]: #700 [2]: pymupdf/PyMuPDF#3137
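One way to silence a single call from the caller's side, sketched here with the standard library's `contextlib.redirect_stdout`, is to wrap just that call. `noisy_library_call()` is a hypothetical stand-in for the offending function, and `call_quietly()` is an illustrative helper, not the code from the linked commit.

```python
import contextlib
import sys


def noisy_library_call():
    """Stand-in for a third-party function with a hardcoded print()."""
    print("debug: hardcoded message")
    return 42


def call_quietly(func, *args, sink=None, **kwargs):
    """Run func with Python-level stdout redirected to sink (stderr by default)."""
    sink = sys.stderr if sink is None else sink
    with contextlib.redirect_stdout(sink):
        return func(*args, **kwargs)
```

Note that `redirect_stdout` only affects writes that go through Python's `sys.stdout`; output written directly to file descriptor 1 by C code would not be captured this way.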
This PR makes the following improvements to the logging logic of the fitz module:
- [...] print()/log() call, which are no longer necessary.
- [...] print()/sys.stdout.write()/sys.stderr.write()/PySys_Write* calls with just our log() helper.
- [...] sys.stdout point to sys.stderr.

Fixes #3135