This repository has been archived by the owner on Oct 14, 2022. It is now read-only.

Add lazy/eager instructions #13

Open
nesk opened this issue Sep 6, 2018 · 3 comments
Labels
enhancement New feature or request

Comments

@nesk
Member

nesk commented Sep 6, 2018

The problem

Currently, it's up to the implementation to determine whether an instruction should be awaited or not. For example, PuPHPeteer always awaits its instructions.

However, awaiting every instruction is a problem. The waitForNavigation() method returns a promise that resolves only once the goto() method has been called. Since PuPHPeteer is already awaiting the first method, the second one can never be called, and Node throws a Navigation Timeout Exceeded error. See rialto-php/puphpeteer#4 for more information about this specific bug.
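The ordering problem described above can be reproduced without a browser. The following is a minimal sketch, not PuPHPeteer's or Puppeteer's actual code: FakePage, awaitEverything() and keepFirstPending() are hypothetical names, and the fake navigation promise simply settles when goto() is called or rejects after a timeout.

```javascript
// Hypothetical stand-in for a Puppeteer page (an assumption for illustration):
// the navigation promise resolves only when goto() is called, and rejects
// if that does not happen within timeoutMs.
class FakePage {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.onNavigate = null;
  }

  waitForNavigation() {
    return new Promise((resolve, reject) => {
      const timer = setTimeout(
        () => reject(new Error('Navigation Timeout Exceeded')),
        this.timeoutMs
      );
      this.onNavigate = (url) => {
        clearTimeout(timer);
        resolve(url);
      };
    });
  }

  async goto(url) {
    if (this.onNavigate) this.onNavigate(url);
    return url;
  }
}

// Awaiting every instruction in order: goto() is never reached before
// the navigation promise times out.
async function awaitEverything(page, url) {
  await page.waitForNavigation(); // rejects after timeoutMs
  await page.goto(url);
}

// Leaving the first promise pending lets goto() run and resolve it.
async function keepFirstPending(page, url) {
  const navigation = page.waitForNavigation(); // deliberately not awaited yet
  await page.goto(url);
  return navigation;
}
```

Calling awaitEverything() rejects with the timeout error, while keepFirstPending() resolves with the URL, which is the behaviour a lazy instruction is meant to make possible.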

How this can be solved

Instead of letting the implementation decide whether to await all of its instructions, Rialto could let the implementation choose a default resolving strategy (await or not) and let the user override this behaviour for specific instructions.

For example, Rialto could provide a useAwaitByDefault() method to let the implementation define the preferred resolving strategy:

async handleInstruction(instruction, responseHandler, errorHandler)
{
    instruction.useAwaitByDefault(true);

    // ...
}

Then the user could simply execute an instruction:

$resource->someMethodReturningAPromise(); // This will return the value of the resolved Promise

Or they could override the resolving strategy:

$resource->lazy->someMethodReturningAPromise(); // This will return the Promise

Of course, the implementation could also choose not to await by default (instruction.useAwaitByDefault(false)), but the user could override this too:

$resource->someMethodReturningAPromise(); // This will return the Promise
$resource->eager->someMethodReturningAPromise(); // This will return the value of the resolved Promise
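On the Node side, the combination of a default strategy and a per-instruction override could look roughly like the sketch below. Everything here is an assumption for illustration: only useAwaitByDefault() is named in this proposal, while the Instruction constructor, the override field, shouldAwait() and execute() are hypothetical and not Rialto's existing API.

```javascript
// Hypothetical sketch of the proposed resolving strategy.
class Instruction {
  constructor(callback, override = null) {
    this.callback = callback;   // the operation to run; may return a promise
    this.override = override;   // 'lazy', 'eager', or null when the user set nothing
    this.awaitByDefault = true;
  }

  // Called by the implementation to pick its preferred default.
  useAwaitByDefault(value) {
    this.awaitByDefault = value;
  }

  // The user's override wins; otherwise fall back to the default.
  shouldAwait() {
    if (this.override === 'lazy') return false;
    if (this.override === 'eager') return true;
    return this.awaitByDefault;
  }

  // Resolves to { pending: false, value } or { pending: true, promise }.
  async execute() {
    const result = this.callback();
    if (this.shouldAwait()) {
      return { pending: false, value: await result };
    }
    // Wrap the promise in an object so the enclosing async function
    // does not unwrap it on return.
    return { pending: true, promise: result };
  }
}
```

The wrapper object matters because an async function that returns a promise directly would flatten it, so the caller could not tell a lazy result from an eager one.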

Promises

Since an instruction could then return a Promise, we should provide some tools for working with promises.

A promise in PHP should be a BasicResource with a then() method that accepts a PHP callback receiving the resolved value as its first argument.

A PHP equivalent to Promise.all() should also be provided, to allow waiting for multiple promises. Typically, it would enable parallel calls (see #9):

$browser = (new Puppeteer)->launch();

$page1 = $browser->newPage();
$page2 = $browser->newPage();

$request1 = $page1->lazy->goto('https://github.com/nesk/');
$request2 = $page2->lazy->goto('https://github.com/-not--a--real--profile-/');

Promise::all([$request1, $request2])->then(function ($responses) {
    echo $responses[0]->status(); // 200
    echo $responses[1]->status(); // 404
});
@billisonline

billisonline commented Nov 24, 2019

@nesk how feasible do you think it would be for someone to take this on as a first issue? I'm doing heavy PuPHPeteer scraping for a client and would like to have guarantees about when a page is loaded etc. I have strong PHP skills/experience but I'm fair-to-middling when it comes to JS and Node. What do you think?

@defaultpage

At the moment, I have some results with asynchronous invocation of operations. I used a slightly different approach. Next week I will try to show what I did.

@defaultpage

defaultpage commented Apr 4, 2020

$request1 = $page1->lazy->goto('https://github.com/nesk/');
$request2 = $page2->lazy->goto('https://github.com/-not--a--real--profile-/');

Won't adding a property named lazy or eager to each PHP resource lead to conflicts if the JS resource has a property with the same name?
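One way to make such a conflict visible, sketched here only as an illustration, is to check on the Node side whether a resource already exposes one of the reserved names. RESERVED_MODIFIERS and findModifierConflicts() are hypothetical names and not part of Rialto.

```javascript
// Hypothetical helper: reports which reserved modifier names already exist
// on a JS resource, including names inherited through its prototype chain.
const RESERVED_MODIFIERS = ['lazy', 'eager'];

function findModifierConflicts(resource) {
  // Object() makes the `in` check safe for primitives as well.
  return RESERVED_MODIFIERS.filter((name) => name in Object(resource));
}
```

A resource reporting a non-empty list would need another way to reach its own lazy or eager member, since the PHP-side modifier would shadow it.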
