[BUG] Document stuck at "processing" #1278
Comments
I would guess your machine is just running out of resources to process this. It processed fine for me but note the logs, the pages are images, etc:
I'm also able to successfully load the document.
I retested with 1.8.0, and the document still processes for me. No text content is found in this file, so OCR is forced, which means every page image has to be processed, and that takes a lot of time. The default timeout for working on a file is 1800s (30 minutes). If the document doesn't complete by then, it will be marked as failed.
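To illustrate the timeout behaviour described above, here is a minimal sketch of a supervisor that gives a task a fixed time budget and reports it as failed by timeout once the budget runs out. It is not the paperless-ngx or django-q implementation; `run_with_timeout` and the sleeping stand-in task are hypothetical names, and the 1-second limit only stands in for the 1800s default.

```python
import subprocess
import sys


def run_with_timeout(args, timeout_s):
    """Run a command and classify the outcome (hypothetical helper)."""
    try:
        # subprocess.run kills the child and raises once timeout_s elapses.
        result = subprocess.run(args, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        return "timed out"
    return "success" if result.returncode == 0 else "failed"


# Stand-in for a long OCR job: a child process that sleeps for 5 seconds.
slow_task = [sys.executable, "-c", "import time; time.sleep(5)"]

# The 1-second budget stands in for the 1800s default mentioned above.
print(run_with_timeout(slow_task, timeout_s=1))
```

The point of the sketch is that the supervisor knows the outcome as soon as the budget expires; whether a UI shows it depends on that result being passed along.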
You mentioned 'marked as failed'
Ok, I think I see what the issue is. I would bet you'll see in the log a single line like:
That's not much, and it certainly isn't helpful to see the WebUI seemingly still working away when the background task has already given up. I'll need to look into what a dependency does and see if it can be improved. That doesn't help with the document still timing out, but I don't see anything that can be done about that besides increasing the timeout. It does complete; it's just a lot of processing. From within the container, you could run
Anything else to look at here?
For the timeout not being so visible, I'm working on a solution for that. The "Killed" printed above is from the out-of-memory manager killing the process. That also might be the cause of an eternally processing document, and I don't think there's any way to raise that up to a user.
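As a rough illustration of why an out-of-memory kill is hard to surface, the sketch below shows what a parent process sees when a child is killed with SIGKILL: only the signal number, with no reason attached. It assumes a POSIX system, the child kills itself to simulate the event, and `describe_exit` is a hypothetical helper, not something paperless-ngx provides.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short description (hypothetical helper)."""
    if returncode == 0:
        return "completed"
    if returncode < 0:
        # On POSIX, a negative return code means "terminated by signal N".
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


# Simulate the kill: the child sends SIGKILL to itself (POSIX only).
child_code = "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"
result = subprocess.run([sys.executable, "-c", child_code])

print(describe_exit(result.returncode))
```

A SIGKILL from the kernel's OOM killer looks the same to the parent as a SIGKILL from any other source, so the signal alone does not say that memory was the cause.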
Running v1.8.0 on an RPi4 docker swarm and I'm getting this problem too. Edit: It looks like they did eventually get processed.
Our next release will include improvements to how worker timeouts are handled. They will be much more visible (see examples paperless-ngx/django-q#2 (comment)) in the UI. If it's the OOM killer, that still won't be obvious; there just isn't a way to detect that. Hopefully, upcoming improvements in underlying libraries like pikepdf and qpdf will help reduce the occurrences. I'm going to close this out, as I believe what we can do here is now fixed.
For me the timeout works well: it appears in the logs after the default 1800s. However, on the dashboard the task for that document is still shown as processing, in green. It only disappears once the browser is closed and reopened, or when I open the page in another browser. So it looks like the web UI gets no feedback once processing times out.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns.
Description
A certain PDF causes paperless to get stuck at "processing".
I've attached the file:
B1fDMHqLFES.pdf
I only recently started using paperless. I've consumed 20 documents with 0 issues so far, but paperless refuses to consume this file.
It is stuck at "processing" for some minutes, and then if I refresh the page (F5) it is as if it did nothing.
Steps to reproduce
Webserver logs
Paperless-ngx version
1.7.1
Host OS
synology DSM 218+ docker
Installation method
Docker - official image
Browser
Chrome
Configuration changes
No response
Other
No response