Wait for condition before printing #6

Open
beckyconning opened this issue Aug 11, 2017 · 30 comments

@beckyconning
Contributor

One approach would be to add a command line option which takes a javascript expression which returns a boolean. This expression would be repeatedly evaluated until it returns true at which point the PDF would be generated.

I'd love to help make this a thing and will be investigating how to do it now. Any help or guidance through the source would be very appreciated.
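The polling idea proposed above could look roughly like the following. This is only a sketch: `waitForCondition` and its `intervalMs` parameter are hypothetical names, not part of cef-pdf, and the real implementation would evaluate the expression inside the rendered frame rather than in plain JavaScript.

```javascript
// Hypothetical helper (not a cef-pdf API): re-evaluates `predicate` every
// `intervalMs` milliseconds and resolves once it returns exactly true.
function waitForCondition(predicate, intervalMs = 100) {
  return new Promise((resolve, reject) => {
    const check = () => {
      let ready;
      try {
        ready = predicate();
      } catch (err) {
        // A throwing expression ends the wait instead of polling forever.
        reject(err);
        return;
      }
      if (ready === true) {
        resolve();
      } else {
        setTimeout(check, intervalMs);
      }
    };
    check();
  });
}
```

The PDF would then be generated once the returned promise resolves.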

@beckyconning
Contributor Author

So I'm now taking the approach of adding command line and http request options which cause cef-pdf to produce the pdf when it receives the "pdf" ceqQuery request rather than immediately.

When these options are used the frame in question will need to evaluate window.ceqQuery({request: 'pdf'}); at which point the pdf would be produced.
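On the page side, pulling the trigger might be sketched as below. The `ceqQuery` name and the `{request: 'pdf'}` shape are copied verbatim from the comment above. `renderThenTrigger`, `win` and `loadData` are hypothetical names introduced for illustration.

```javascript
// Sketch of the page side of the remote trigger.
// `win` stands in for the page's window object; `loadData` is a hypothetical
// function returning a promise that settles once the application has rendered.
async function renderThenTrigger(win, loadData) {
  await loadData();
  // Request shape taken from the comment above; nothing else is assumed.
  win.ceqQuery({ request: 'pdf' });
}
```

An application would call `renderThenTrigger(window, loadData)` once at startup, so the PDF is requested only after its slow HTTP requests have completed.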

@beckyconning
Contributor Author

Trying to work this out (I don't have C++ experience)

master...beckyconning:Remote-trigger

Is where I am up to right now but when run with --trigger flag it just stalls. Will carry on with this another time : ).

@spajak
Owner

spajak commented Aug 12, 2017

I just can't find the reason behind that feature. What do you need it for?

@beckyconning
Contributor Author

beckyconning commented Aug 12, 2017

To produce PDFs from DOMs produced by javascript applications which use data from http requests of arbitrary duration (could be 0.5 seconds, could be 30 minutes).

Although this could be achieved by transforming that DOM into static HTML (canvas -> img etc.) and then passing a data URI or making a POST request with this HTML in the body, this cannot be done non-interactively (e.g. on a schedule) without yet another automated browser.

This is akin to a selfie taken by the subject via remote trigger rather than a photo which is taken by someone else as soon as the subject is seen (loaded).

@beckyconning
Contributor Author

beckyconning commented Aug 12, 2017

This allows javascript applications to tell cef-pdf when they are ready to be rendered as a PDF.

@spajak
Owner

spajak commented Aug 13, 2017

cef-pdf is meant to print static html documents, not applications. Maybe I will rethink this in the future.

@beckyconning
Contributor Author

cef-pdf works perfectly with HTML snapshots from javascript applications but, as I say, that doesn't work without another browser to produce that snapshot. This is why I'm adding this feature. Obviously you aren't obligated to merge the PR when it's ready, but I think as an optional feature it's very useful. If you don't merge it I will continue to maintain a fork.

Any javascript application can include a print layout via a @media print {} CSS query and then pass its URL to cef-pdf over HTTP with the remote trigger option. When it's ready to be rendered as a PDF it pulls that trigger, and a PDF version of the information presented by the application will be delivered.

Think of all the Javascript applications out there which would benefit from this : ).

@beckyconning
Contributor Author

This is now working, just need to add the http option argument and tidy up. master...beckyconning:trigger-remote

@beckyconning
Contributor Author

Then I will submit a PR.

@spajak
Owner

spajak commented Aug 27, 2017

@beckyconning I have made some adjustments to your code 4d7ee11 but I'm unable to make the trigger work (under Windows). Can you test this on the devel branch? Maybe I missed something important, because my OnQuery method is never executed.

@beckyconning
Contributor Author

beckyconning commented Aug 29, 2017

http://magpcss.org/ceforum/apidocs3/projects/(default)/CefRenderProcessHandler.html#OnRenderThreadCreated(CefRefPtr)

http://magpcss.org/ceforum/apidocs3/projects/(default)/CefClient.html#OnProcessMessageReceived(CefRefPtr,CefProcessId,CefRefPtr)

Both CefClient and CefRenderProcessHandler have OnProcessMessageReceived. OnProcessMessageReceived is overridden in Client. I believe they need to be separate instances.

@beckyconning
Contributor Author

@spajak does the above solve this issue?

@spajak
Owner

spajak commented Sep 1, 2017

But I have CefClient and CefRenderProcessHandler separated. I've just merged CefRequestHandler into CefClient.

@beckyconning
Contributor Author

beckyconning commented Sep 1, 2017 via email

@spajak
Owner

spajak commented Sep 2, 2017

Didn't work. Did you try this feature on Windows?

@spajak
Owner

spajak commented Sep 5, 2017

I wasn't able to compile this on Windows. Besides, this feature needs more work, such as a timeout: without one, if the callback is never called the cef-pdf process runs forever, and this cannot be allowed.

@beckyconning
Contributor Author

beckyconning commented Sep 12, 2017

Why not? If the HTTP connection is closed then it is cancelled; if the process receives SIGINT it is cancelled. Why is it cef-pdf's responsibility to manage timeouts when anything which uses it can easily do this itself? Also, yes, I did test this on Windows.

@spajak
Owner

spajak commented Sep 12, 2017

Closing the HTTP connection does not cause the renderer process to quit. cef-pdf starts an additional process for every job. It is also responsible for closing that process, otherwise the process runs forever.

@beckyconning
Contributor Author

Ah I see. Shouldn't that be the priority then? Surely if the HTTP connection is closed there is no need for the job process?

@beckyconning
Contributor Author

What you describe could happen already if loading the page takes infinite time.

@beckyconning
Contributor Author

Like even without remote-trigger.

@beckyconning
Contributor Author

So the client should be able to abort and cause cleanup by closing the connection.

@beckyconning
Contributor Author

Ah I found the bug. All the trigger stuff was working fine but the convenience function was being put on the window object of about:blank rather than the window object of the given web page.

@beckyconning
Contributor Author

I must have been testing with the full expression rather than the convenience function.

@beckyconning
Contributor Author

beckyconning commented Sep 14, 2017

Regarding timeouts, please consider these scenarios:

Direct connection to PDF generator: A timeout prevents successful PDF generation.
An application user chooses to download a PDF. The application informs the user that the PDF is being produced, that this might take some time, and to leave the application open while it generates. The user switches to another app and continues with other work while the PDF is generating. The PDF they asked for takes 3 minutes to generate. The timeout is set for 2 minutes. The user switches back to the app after 5 minutes and finds that the PDF has "timed out". How frustrating! The user could have received the PDF but the timeout has prevented this.

Direct connection to PDF generator: Successful PDF generation.
An application user chooses to download a PDF. The application informs the user that the PDF is being produced, that this might take some time, and to leave the application open while it generates. The user switches to another app and continues with other work while the PDF is generating. The PDF they asked for takes 3 minutes to generate. There is no timeout. The user switches back to the app after 5 minutes and finds that the PDF has been generated.

Direct connection to PDF generator: User impatience prevents successful PDF generation.
An application user chooses to download a PDF. The application informs the user that the PDF is being produced and this might take some time and to leave the application open while it generates. The user switches to another app and continues with other work while the PDF is generating. The given PDF they asked for takes 3 minutes to generate. There is a 4 minute timeout. The user switches back to the app after 2 minutes to find that the PDF is still being generated. The user closes the application.

If processes and resources dedicated to the generation of this PDF are not cleaned up when the connection is dropped the PDF generation will continue wastefully for 2 minutes despite no-one ever receiving this PDF.

Indirect connection from app to PDF generator: Successful PDF generation
An application user called Saiid chooses to download a PDF. The application informs Saiid that the PDF is being produced, that this might take some time and that he can close the app and come back later to collect it. Saiid closes the app and continues with other work while the PDF is generating. The PDF Saiid requested takes 12 minutes to generate.

The application server is set to allow 1000 concurrent PDF generations. Currently there are 1000 generations in progress. The application server cancels the longest running generation which has been in progress for over 10 minutes to make room for Saiid's PDF (if no generation had been running for more than 10 minutes then Saiid's generation would pend until either a generation finished or lasted more than 10 minutes). This generation was started by a user called Hadil. If Hadil checks the app now she will be informed that her PDF was taking a long time to produce and that it will be reattempted when the service is less busy. The application informs the administrator that this has occurred so the administrator can decide whether to increase the generation capacity of this service or not.

The application then starts generating Saiid's PDF. Whilst this is happening 700 generations finish successfully after which only 10 more are started. The service is now less busy so the server reattempts Hadil's PDF generation.

10 minutes later Saiid launches the app and finds his PDF ready to download. 30 minutes later Hadil launches the app and finds her PDF ready to download. If generation cancellation were based only on time, rather than on server capacity and time, neither of these PDFs would have been successfully generated.
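The capacity-and-age policy in this last scenario can be sketched as a small decision function. Everything here is illustrative: `decideAdmission`, the `{ id, startedAt }` record shape and the returned action names are assumptions made for the example, and the function belongs to the hypothetical application server, not to cef-pdf.

```javascript
// Hypothetical admission policy for an application server in front of cef-pdf.
// `running` is an array of { id, startedAt } records (startedAt in ms).
// Assumes capacity >= 1.
function decideAdmission(running, capacity, maxAgeMs, now) {
  if (running.length < capacity) {
    return { action: 'start' };
  }
  // At capacity: look at the generation that has been running the longest.
  const oldest = running.reduce((a, b) => (a.startedAt <= b.startedAt ? a : b));
  if (now - oldest.startedAt > maxAgeMs) {
    // Cancel it to make room; the caller is expected to requeue it for later.
    return { action: 'evict', victim: oldest.id };
  }
  // Nothing is old enough to cancel, so the new request pends.
  return { action: 'wait' };
}
```

With a capacity of 1000 and a `maxAgeMs` of 10 minutes, Saiid's request would produce an `evict` decision naming Hadil's generation, which the server then reattempts once load drops.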

@beckyconning
Contributor Author

beckyconning commented Sep 14, 2017

cef-pdf could decide to implement a more thoughtful cancellation policy such as the one described in the last scenario. However, it has no obligation to, and applications may wish to implement their own differing policy.

In many cases a cancellation policy based solely on connection is sufficient: "if we're still waiting for the PDF then please keep trying to produce it" is sensible. If it is not sufficient, this policy can be used by other applications to build any other cancellation policy.

Separation of concerns is important and providing even an optional timeout is dangerous as it encourages misuse.

If anyone wants to use a timeout cancellation policy they can implement it themselves trivially.

Via HTTP, curl and wget provide timeouts, and ajax timeouts are easy to implement in javascript and other languages. When the timeout is exceeded the connection is closed, after which cef-pdf should clean up.
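As a rough illustration of a client-side timeout in javascript, the sketch below aborts a request after a deadline using the standard `AbortController`. `withTimeout` and `startRequest` are hypothetical names. The URL of a cef-pdf HTTP endpoint is not given in this thread, so none is assumed here.

```javascript
// Hypothetical wrapper: runs `startRequest(signal)` and aborts it after `ms`.
// `startRequest` must honour the AbortSignal, as fetch() does when passed
// { signal }. Aborting closes the connection from the client side.
function withTimeout(startRequest, ms) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  return Promise.resolve()
    .then(() => startRequest(controller.signal))
    .finally(() => clearTimeout(timer)); // never leave the timer pending
}
```

A caller could write `withTimeout((signal) => fetch(pdfServiceUrl, { signal }), 120000)`, where `pdfServiceUrl` is a placeholder for wherever the PDF service is reachable.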

Via the command line these examples work pretty well.

bash

( pid=$BASHPID; (sleep 120; kill $pid) & exec cef-pdf --remote-trigger --url=https://reporting.example.com/view/0981230)

timeout

timeout 120 cef-pdf --remote-trigger --url=https://reporting.example.com/view/0981230

windows

start cef-pdf.exe --remote-trigger --url=https://reporting.example.com/view/0981230
timeout /t 120
taskkill /im cef-pdf.exe /f
