Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task queue for async resolution of promises #20

Closed
joshdifabio opened this issue Nov 16, 2016 · 64 comments
Closed

Task queue for async resolution of promises #20

joshdifabio opened this issue Nov 16, 2016 · 64 comments

Comments

@joshdifabio
Copy link
Contributor

joshdifabio commented Nov 16, 2016

Edit: There is a lot of discussion below. I've summarised the open questions further down.

Guzzle Promises and React Promise (in the master branch) each include task queues[1][2] which can be used to fire promise callbacks asynchronously without creating a dependency on a loop implementation.

This might be useful for three reasons:

  • It would prevent the call stack from getting extremely deep (coroutines also help with this)
  • Code would behave in a more predictable manner
  • It would prevent sync implementations of async interfaces which always return fulfilled promises (e.g. an in-memory implementation of a cache) from blocking the event loop for a long period of time

Do we want to consider something similar for async-interop/promise?

The async-interop promise spec could enforce that promise implementations use the current promise task queue (e.g. Async\Promise\enqueue($callback, $error, $result)) in order to fire promise callbacks. This would give users the power to ensure that promise callbacks are always called asynchronously in their application if they so desire, while we could include a synchronous queue implementation (which simply fires callbacks as they are enqueued) in order to support applications which don't use an event loop.

[1] GuzzleHttp\Promise\TaskQueueInterface
[2] React\Promise\Queue\QueueInterface

Edit: Note that this could also take the responsibility of handling errors thrown by callbacks away from promise implementation libraries.

@bwoebi
Copy link
Member

bwoebi commented Nov 16, 2016

  1. The call stack getting deep is a non-issue actually? In fact it even helps as you can easily see the immediate chain of promise resolutions. At some point it will inevitably shrink again [otherwise you have some infinite recursion which gets easily detected here, but not when when() cannot nest].
  2. "Predictable". It depends on what you consider predictable. The current rule is that you need to clean up your scope before resolving Promises or invoking when(). If you're respecting this, you get no surprises. Also, in particular using Coroutines there's no difference. As this standard is encouraging the usage of Coroutines, this is less important.
  3. I do not understand how this makes a difference in blocking the event loop, could you please illustrate?

Always async has the disadvantage of not being easily able to just fetch the value of a Promise you know to be resolved (e.g. $promise->when(function($e, $v) use (&$ret) { assert(!$e); $ret = $v; }); /* continue using $ret here */) without being forced to return yourself a Promise here as you'll only know the value in the next tick. [That's a weak argument though, please focus on the other disadvantages.] {This is linked to issue 2.}

Note that this will be at the cost of latency. If there's a lot to do, there's an artificial wait time between resolution and the handlers being called. Usually that latency coincides with I/O waiting, where it's expected, but there sometimes being a lot of nested Generators, Promise piping etc. it will end up with an actual longer wait time for responses, i.e. instead of immediately being able to fire up the next I/O task, as we need to idle for as many ticks as there are Promises needing to be resolved up to the next I/O task. And when there's some load on the server, the duration of every single tick is not negligible.

Another disadvantage here is that Promises are no longer self-contained, but depend on something external (e.g. the event loop) to manage dequeuing, leading to the next disadvantage: You now need to make queues part of the event-loop local state (otherwise you get issues when you nest an event-loop, have quit the original event-loop due to register_shutdown_function etc.), which leads to necessary coupling to the loop in the specification if we want to have different Promise implementations interoperable (the identifier used in Loop::setState() needs to be known).
Which is something we wanted to avoid AFAIK?

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Nov 16, 2016

It would prevent the call stack from getting extremely deep (coroutines also help with this)

The call stack getting deep is a non-issue actually? In fact it even helps as you can easily see the immediate chain of promise resolutions. At some point it will inevitably shrink again [otherwise you have some infinite recursion which gets easily detected here, but not when when() cannot nest].

I think coroutines might alleviate this issue but this standard supports PHP 5.4, which means that coroutines won't be available to some users. Presumably in such cases this standard will only be useful when it's implemented by a thenable implementation which supports chaining, in which case the stack depth is an issue as it can easily get very deep and overflow.[1]

In fact it even helps as you can easily see the immediate chain of promise resolutions.

In my experience, the stack isn't generally useful at all for debugging when promises are used, since virtually every frame in the stack simply points to your Promise implementation.[2]

"Predictable". It depends on what you consider predictable. The current rule is that you need to clean up your scope before resolving Promises or invoking when(). If you're respecting this, you get no surprises. Also, in particular using Coroutines there's no difference. As this standard is encouraging the usage of Coroutines, this is less important.

Fair point, although 5.4 users won't have access to coroutines.

It would prevent sync implementations of async interfaces which always return fulfilled promises (e.g. an in-memory implementation of a cache) from blocking the event loop for a long period of time

I do not understand how this makes a difference in blocking the event loop, could you please illustrate?

It'll take me a while to describe this well. I'll try to find the time to do so soon.

Note that this will be at the cost of latency. If there's a lot to do, there's an artificial wait time between resolution and the handlers being called. Usually that latency coincides with I/O waiting, where it's expected, but there sometimes being a lot of nested Generators, Promise piping etc. it will end up with an actual longer wait time for responses, i.e. instead of immediately being able to fire up the next I/O task, as we need to idle for as many ticks as there are Promises needing to be resolved up to the next I/O task. And when there's some load on the server, the duration of every single tick is not negligible.

I don't agree that it will add measurable latency, although I may be missing something. A promise represents a relatively expensive (usually milliseconds or more) operation: a couple of extra foreach iterations and function calls per promise resolution doesn't seem like a significant overhead? If we provide a default implementation of a task queue which synchronously calls callbacks, as React does in master, then that's only one extra function call per callback, while still giving users the ability to override that mechanism and fire all callbacks asynchronously if they want to do so.

Alternatively, we can have lots of lot of different task queues implemented in different libraries as we have now.

Another disadvantage here is that Promises are no longer self-contained, but depend on something external (e.g. the event loop) to manage dequeuing, leading to the next disadvantage: You now need to make queues part of the event-loop local state (otherwise you get issues when you nest an event-loop, have quit the original event-loop due to register_shutdown_function etc.), which leads to necessary coupling to the loop in the specification if we want to have different Promise implementations interoperable (the identifier used in Loop::setState() needs to be known).
Which is something we wanted to avoid AFAIK?

I need to reflect a bit to be able to respond to this.

Thanks for taking the time to read and respond in detail.

[1] reactphp/promise#22
[2] reactphp/promise#22 (comment)

@joshdifabio
Copy link
Contributor Author

Always async has the disadvantage of not being easily able to just fetch the value of a Promise you know to be resolved (e.g. $promise->when(function($e, $v) use (&$ret) { assert(!$e); $ret = $v; }); /* continue using $ret here */) without being forced to return yourself a Promise here as you'll only know the value in the next tick. [That's a weak argument though, please focus on the other disadvantages.] {This is linked to issue 2.}

This seems like an antipattern. Is there any case outside of an automated test that you would do this?

@bwoebi
Copy link
Member

bwoebi commented Nov 16, 2016

Your example of reactphp/promise#22 is bound to then() returning a Promise. A when() implementation like we are proposing does not have this issue. [well, unless you use something like $promise = \Amp\pipe($promise, ...); in a loop, but why would you do that!?]

In my experience, the stack isn't generally useful at all for debugging when promises are used, since virtually every frame in the stack simply points to your Promise implementation.[2]

Again, this is assuming that we nest use then(), where one then() calls the next then() callback. That's not the case with when(), where you get some sort of an actual callback, typically resulting in a trace from the event loop handler all the way up to the actual action.

I don't agree that it will add measurable latency, although I may be missing something. A promise represents a relatively expensive (usually milliseconds or more) operation: a couple of extra foreach iterations and function calls per promise resolution doesn't seem like a significant overhead? If we provide a default implementation of a task queue which synchronously calls callbacks, as React does in master, then that's only one extra function call per callback, while still giving users the ability to override that mechanism and fire all callbacks asynchronously if they want to do so.

I'm not talking about overhead, but latency where nothing happens.

To illustrate, what we currently have:

I/O task
-> n ticks
All Promises resolved at once and next I/O task submitted
-> n ticks

What you propose:

I/O task
-> n ticks
Promise gets resolved
-> tick
next Promise resolved
-> tick
last Promise resolved and next I/O task submitted
-> n ticks

So, there's now 2 extra ticks where nothing happens except promises being resolved. Thus, instead of immediately sending off the next I/O task, there is now an extra latency of 2 ticks.

Alternatively, we can have lots of lot of different task queues implemented in different libraries as we have now.

Having many different task queues is an especially bad idea, you end up with [subconscious!] assumptions which suddenly break when another task queue gets used. This is the best way to hell.

Always async has the disadvantage of not being easily able to just fetch the value of a Promise you know to be resolved (e.g. $promise->when(function($e, $v) use (&$ret) { assert(!$e); $ret = $v; }); /* continue using $ret here */) without being forced to return yourself a Promise here as you'll only know the value in the next tick. [That's a weak argument though, please focus on the other disadvantages.] {This is linked to issue 2.}

This seems like an antipattern. Is there any case outside of an automated test that you would do this?

As said, it's a weak argument and not great code. I've done that outside of tests too out of convenience to not import every local variable. Let's not argue about this as it's a weak argument and more often than not, as you say, an anitpattern.

@joshdifabio
Copy link
Contributor Author

Alternatively, we can have lots of lot of different task queues implemented in different libraries as we have now.

Having many different task queues is an especially bad idea, you end up with [subconscious!] assumptions which suddenly break when another task queue gets used. This is the best way to hell.

I agree, which is why I think it might be worth having one provided by this standard instead, with the ability for it to be overridden globally. Looking at the existing JS and PHP promise libraries, it seems clear that there is an appetite to be able to fire promise callbacks asynchronously.

So, there's now 2 extra ticks where nothing happens except promises being resolved. Thus, instead of immediately sending off the next I/O task, there is now an extra latency of 2 ticks.

But when you say latency, this is, in reality, an extra few function calls and loop iterations, right? I.e., it's all CPU bound? Or am I missing something? In which case, I don't think it's that significant in the grand scheme of things, especially if the functionality is opt-in.

It would prevent sync implementations of async interfaces which always return fulfilled promises (e.g. an in-memory implementation of a cache) from blocking the event loop for a long period of time

I do not understand how this makes a difference in blocking the event loop, could you please illustrate?

interface MySQLClient
{
    function execute($sql, array $bind = []): Promise;
}

function my_coroutine_fn(MySQLClient $dbClient) {
    $lastId = 0;
    while (true) {
        $result = yield $dbClient->execute('select * from bar where id > ? limit 1', [$lastId]);
        if ($result === null) break;
        // do something with result
        $lastId = $result['id'];
    }
}

Assume a synchronous implementation of MySQLClient which returns fulfilled promises. If the bar table contained a large number of records, my_coroutine_fn() would block the loop for a long time, right? This example is quite contrived but I've seen this issue more than once in test environments which use fulfilled promises to provide in-memory or otherwise fast blocking implementations of services which are asynchronous in production.

How would you feel about having a global function for invoking promise callbacks which would, by default, simply invoke callbacks immediately? That would allow people to use either a synchronous or asynchronous approach application-wide, depending on their preference.

@bwoebi
Copy link
Member

bwoebi commented Nov 17, 2016

I agree, which is why I think it might be worth having one provided by this standard instead, with the ability for it to be overridden globally. Looking at the existing JS and PHP promise libraries, it seems clear that there is an appetite to be able to fire promise callbacks asynchronously.

I do not see a particular appetite there, and the reasons in favor of delayed resolution are rather weak, IMHO. With thenables the case is stronger [see your react issues you linked to], but these aren't applicable for when() and thus I'm not persuaded that it's needed here.

But when you say latency, this is, in reality, an extra few function calls and loop iterations, right? I.e., it's all CPU bound? Or am I missing something? In which case, I don't think it's that significant in the grand scheme of things, especially if the functionality is opt-in.

With ticks I mean all the logic happening while that tick is active. I.e. a lot of logic from other handlers is interleaved first before the next I/O task after the last I/O task can be submitted.

This example is quite contrived but I've seen this issue more than once in test environments which use fulfilled promises to provide in-memory or otherwise fast blocking implementations of services which are asynchronous in production.

In production - the environment which really matters here - it's not an issue as you say. Also, if it blocks the loop for that a long time, most likely your code is bad (like doing a SQL query inside a loop) … Your example is that contrived because any comparable example is bad code too, I'd say.
Also, what would you do if there is instead one single query which blocks for so long? It really is just mitigating the issue a bit, but it's still there.
Do you perhaps have actual examples to show off here?

How would you feel about having a global function for invoking promise callbacks which would, by default, simply invoke callbacks immediately? That would allow people to use either a synchronous or asynchronous approach application-wide, depending on their preference.

And have some imported libraries break subtly because nobody has made serious efforts to make it work with delayed resolution? That's one of the worse scenarios I'd say.
If anything, we really should only have one way.


Also I'm interested in your thoughts related to:

Another disadvantage here is that Promises are no longer self-contained, but depend on something external (e.g. the event loop) to manage dequeuing, leading to the next disadvantage:

You now need to make queues part of the event-loop local state (otherwise you get issues when you nest an event-loop, have quit the original event-loop due to register_shutdown_function etc.), which leads to necessary coupling to the loop in the specification if we want to have different Promise implementations interoperable (the identifier used in Loop::setState() needs to be known).
Which is something we wanted to avoid AFAIK?

I need to reflect a bit to be able to respond to this.

@kelunik
Copy link
Member

kelunik commented Nov 17, 2016

That would allow people to use either a synchronous or asynchronous approach application-wide, depending on their preference.

That's one of the worst things that can happen. It might break APIs subtly because it does no longer depend on the promise implementation, but on a global setting instead.

promises to provide in-memory or otherwise fast blocking implementations of services which are asynchronous in production

If you use blocking operations, it will always block. It's the fault of the developer.

Looking at the existing JS and PHP promise libraries, it seems clear that there is an appetite to be able to fire promise callbacks asynchronously.

This is probably due to the fact that they use ->then which can be chained and thus might result in a huge back trace.

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Nov 17, 2016

I'm not convinced by your claims that when() and coroutines eliminate the stack overflow issue. Fundamentally, Amp's coroutine and promise implementation still relies on callbacks to fire downstream code when a promise resolves, so the same issue is there. Presumably this is why Amp's coroutine implementation contains the following:

            if ($this->depth > self::MAX_CONTINUATION_DEPTH) { // Defer continuation to avoid blowing up call stack.
                Loop::defer(function () use ($exception, $value) {
                    ($this->when)($exception, $value);
                });
                return;
            }

The above code means that promise resolution won't necessarily happen synchronously in Amp, right?

Another disadvantage here is that Promises are no longer self-contained, but depend on something external (e.g. the event loop) to manage dequeuing, leading to the next disadvantage:
You now need to make queues part of the event-loop local state (otherwise you get issues when you nest an event-loop, have quit the original event-loop due to register_shutdown_function etc.), which leads to necessary coupling to the loop in the specification if we want to have different Promise implementations interoperable (the identifier used in Loop::setState() needs to be known).
Which is something we wanted to avoid AFAIK?

Why is the Loop::defer within Amp's coroutine implementation (quoted above) not susceptible to this issue?

@joshdifabio
Copy link
Contributor Author

That's one of the worst things that can happen. It might break APIs subtly because it does no longer depend on the promise implementation, but on a global setting instead.

Does this standard currently require promise implementations to fire callbacks synchronously? If not, then clearly that behaviour can't be relied upon.

This is probably due to the fact that they use ->then which can be chained and thus results in a huge back trace.

I'm fairly sure this is an issue with when() as well (why would it not be?). See the Loop::defer() call within Amp's coroutine implementation for example.

@bwoebi
Copy link
Member

bwoebi commented Nov 17, 2016

Why is the Loop::defer within Amp's coroutine implementation (quoted above) not susceptible to this issue?

Because we have no global task queue there. We just tell the loop to continue afterwards.

Does this standard currently require promise implementations to fire callbacks synchronously? If not, then clearly that behaviour can't be relied upon.

The spec says a callback to be invoked *when* the promise is resolved, which implies, for me an immediate invocation upon that event.

I'm fairly sure this is an issue with when() as well (why would it not be?). See the Loop::defer() call within Amp's coroutine implementation for example.

It isn't.
It's just Coroutines calling when() recursively. And with explicit recursion you get bigger stacktraces.
But you usually don't call when() recursively unless you are managing an iterable yielding Promises. And at that point you can easily say "if this same iterable resolves three times immediately, you can continue in the next tick".

Also note that Coroutines are not encompassed by the standard here. Additionally it doesn't make any difference in the context of coroutines as there is nothing executed after the yield. With Promises you may have:

$promise->when(...);
doAction(); // and this may or may not be executed before the $promise resolved

With coroutines however:

yield $promise;
doAction(); // this is only ever executed after the $promise resolved

In coroutines it just doesn't make any difference, while there certainly is one with Promises.

@kelunik
Copy link
Member

kelunik commented Nov 17, 2016

The spec says a callback to be invoked when the promise is resolved, which implies, for me an immediate invocation upon that event.

There's no implication there. It's currently not defined.

Does this standard currently require promise implementations to fire callbacks synchronously? If not, then clearly that behaviour can't be relied upon.

No it doesn't. But you can rely on the specific implementation inside libraries, because they know which implementation they use, just the consumers of the library don't.

Why is the Loop::defer within Amp's coroutine implementation (quoted above) not susceptible to this issue?

It is, but due to recursion. And recursion is only there to minimize the mentioned latency.

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Nov 17, 2016

No it doesn't. But you can rely on the specific implementation inside libraries, because they know which implementation they use, just the consumers of the library don't.

The whole point of this standard is to provide an abstract promise. That means that in many cases we won't know the concrete class of the promise we're working with.

function (Promise $aPromiseWeKnowIsResolved) {
    $aPromiseWeKnowIsResolved->when(function ($e, $r) use (&$foo) {
        $foo = $r;
    });
    // it is not safe to assume that $foo has been set
}

So is this an important guarantee or not?

@kelunik
Copy link
Member

kelunik commented Nov 17, 2016

The whole point of this standard is to provide an abstract promise.

The whole point of this specification is to provide a common interface to consume async placeholder values.

That means that in many cases we won't know the concrete class of the promise we're working with.

Consumers won't, but libraries will, because they create the promise somewhere and return it. Consumers should use coroutines in general, so it doesn't affect them either.

We aim for a low-level interface here, while users generally should use the high-level interface (coroutines). That said, I don't think it's a important guarantee for us. I'm not sure whether we should mandate sync / async calls, but it could help to make things more predictable. However, I'm not sure whether "MUST call all ::when callbacks directly upon resolution without any deferring" is compatible with existing libraries. /cc @jsor

@joshdifabio
Copy link
Contributor Author

Another disadvantage here is that Promises are no longer self-contained, but depend on something external (e.g. the event loop) to manage dequeuing, leading to the next disadvantage:
You now need to make queues part of the event-loop local state (otherwise you get issues when you nest an event-loop, have quit the original event-loop due to register_shutdown_function etc.), which leads to necessary coupling to the loop in the specification if we want to have different Promise implementations interoperable (the identifier used in Loop::setState() needs to be known).
Which is something we wanted to avoid AFAIK?

Why is the Loop::defer within Amp's coroutine implementation (quoted above) not susceptible to this issue?

Because we have no global task queue there. We just tell the loop to continue afterwards.

I'll respond to a couple of specific parts from your original comment below:

(otherwise you get issues when you nest an event-loop, have quit the original event-loop due to register_shutdown_function etc.)

This point equally affects Amp's coroutine implementation, right? What if the active loop changes after Amp\Coroutine calls defer() and before the deferred callback is invoked? I assumed that was your concern?

which leads to necessary coupling to the loop in the specification if we want to have different Promise implementations interoperable (the identifier used in Loop::setState() needs to be known).
Which is something we wanted to avoid AFAIK?

But right now coroutine implementations must be dependent on the loop standard, right? (I'm looking at Amp's implementation). And we want everyone to use coroutines. Presumably the promise libraries will provide coroutine implementations, which means they have to depend on the loop standard anyway? However, if the promise standard instead provides a mechanism for calling promise callbacks asynchronously, coroutine implementations would no longer need to depend on the loop standard, right?

It is, but due to recursion. And recursion is only there to minimize the mentioned latency.

In reality, I think this latency would be very minimal. The amount of stuff which happens each time a promise is resolved within a coroutine is already quite significant. I don't believe that adding a couple more function calls etc. is a significant overhead relative to everything which is already happening.

@bwoebi
Copy link
Member

bwoebi commented Nov 17, 2016

This point equally affects Amp's coroutine implementation, right? What if the active loop changes after Amp\Coroutine calls defer() and before the deferred callback is invoked? I assumed that was your concern?

No, the concern was that a task is run in a different loop.
But with Coroutines there's no issue unless the promise is resolve in a different loop (which would always be a bug in your code).

But right now coroutine implementations must be dependent on the loop standard, right? (I'm looking at Amp's implementation). And we want everyone to use coroutines. Presumably the promise libraries will provide coroutine implementations, which means they have to depend on the loop standard anyway?

Our coroutines don't depend on the loop standard per se, only the "smaller backtraces" feature does - in general we also could live without.

However, if the promise standard instead provides a mechanism for calling promise callbacks asynchronously, coroutine implementations would no longer need to depend on the loop standard, right?

Yeah, but how do you want to manage that without adding a dependency between Promise and Loop standards? There's at the very least an indirect dependency there. (I.e. we depend on the Promise impl taking care of that - if this is specified, then we depend on the specification depending on the Loop standard??)

I don't believe that adding a couple more function calls etc. is a significant overhead relative to everything which is already happening.

This doesn't matter. I'm talking about the amount of time spent idling. .E.g. imagine a server request, normally you have task+dispatch -> waiting -> receive+resolve+resolve+resolve+respond -> finished. With your proposal it's task+dispatch -> waiting -> receive+resolve -> resolve -> resolve -> respond -> finish. During these resolve ticks other requests are being partially handled, which is pushing resolution back...

For example, I just now have benchmarked it (concurrency 1000 for 10s) with Aerys (single core):

Thread Stats Avg Stdev Max +/- Stdev
Latency 3.66ms 13.89ms 402.60ms 99.74%
Req/Sec 5.00k 1.54k 7.99k 62.38%
Latency Distribution
50% 2.66ms
75% 2.77ms
90% 4.23ms
99% 7.76ms
100490 requests in 10.10s, 18.68MB read

Now, with a Loop::defer() each time in Amp\Internal\Placeholder:

Thread Stats Avg Stdev Max +/- Stdev
Latency 4.08ms 15.78ms 436.93ms 99.71%
Req/Sec 4.58k 1.40k 7.12k 58.42%
Latency Distribution
50% 2.88ms
75% 3.20ms
90% 4.47ms
99% 6.21ms
92110 requests in 10.10s, 17.12MB read

I see a significant latency increase of 10-15% for the 75% of reqs/90% of reqs; getting similar numbers in other runs. The 50% case is a bit slower, proportional to the decrease in total requests.
Looking at the general average, I see a 11% latency increase, while just an 7% performance drop.

Now, adding an extra I/O job (reading a file with ext/uv):

Thread Stats Avg Stdev Max +/- Stdev
Latency 5.37ms 20.07ms 595.39ms 99.74%
Req/Sec 3.39k 1.77k 6.73k 59.00%
Latency Distribution
50% 3.79ms
75% 5.01ms
90% 6.08ms
99% 10.55ms
67244 requests in 10.00s, 267.03MB read

And now with Loop::defer():

Thread Stats Avg Stdev Max +/- Stdev
Latency 6.22ms 25.34ms 713.87ms 99.64%
Req/Sec 3.05k 822.36 4.87k 66.50%
Latency Distribution
50% 4.14ms
75% 6.01ms
90% 6.62ms
99% 10.38ms
60741 requests in 10.04s, 241.20MB read

Getting you an 11% perf drop and a 17% latency increase on average.


I do not know what you call minimal, but for me a request needing 17% longer, for reading a single file, is not insignificant.

@kelunik
Copy link
Member

kelunik commented Nov 17, 2016

Our coroutines don't depend on the loop standard per se, only the "smaller backtraces" feature does - in general we also could live without.

@bwoebi Yes, they do. Otherwise our coroutine implementation has the same issues as then chaining and may result in a stack overflow.

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Nov 17, 2016

This doesn't matter. I'm talking about the amount of time spent idling. .E.g. imagine a server request, normally you have task+dispatch -> waiting -> receive+resolve+resolve+resolve+respond -> finished. With your proposal it's task+dispatch -> waiting -> receive+resolve -> resolve -> resolve -> respond -> finish. During these resolve ticks other requests are being partially handled, which is pushing resolution back...

Right, I understand what you mean.

I see a significant latency increase of 10-15% for the 75% of reqs/90% of reqs; getting similar numbers in other runs. The 50% case is a bit slower, proportional to the decrease in total requests.
Looking at the general average, I see a 11% latency increase, while just an 7% performance drop.

I really appreciate the effort you've put into benchmarking that. Truly. Thanks for doing that.

I'm not sure there's any more I can add at this point. I'm not sure what I think the best approach is but the discussion has helped me.

I'm still curious about the possible coroutine/event loop dependency and the fact that Guzzle and React have their own task queues. Perhaps others have some thoughts on the issue.

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Nov 18, 2016

We spent quite a lot of time discussing this. I'd like to summarise where we got to so it's not time wasted.

  1. Should the spec allow both sync and async resolution of promises? This isn't currently explicitly defined by the spec. Always-async resolution of promises would probably hinder performance
  2. If async resolution is allowed, should promise implementors implement task queues and/or should they depend on the Loop package?
  3. Do we need to consider Guzzle, which is used without an event loop in 99% of cases but which has a task queue for async promise resolution?
  4. Do coroutines need to depend on the event loop? Does it matter if they do?
  5. Would it be useful to have an additional package for promise implementors which helps solve implementor problems?

@bwoebi @kelunik What do you think?

@kelunik
Copy link
Member

kelunik commented Nov 18, 2016

@joshdifabio That's a good summary.

  • 3: Definitely. Do you know whom to ping?
  • 4: Yes, they need that dependency. Either directly or indirectly using the proposed task queue.
  • 5: Which problems do you see?

@joshdifabio
Copy link
Contributor Author

joshdifabio commented Nov 18, 2016

Do we need to consider Guzzle, which is used without an event loop in 99% of cases but which has a task queue for async promise resolution?

Definitely. Do you know whom to ping?

I think that would be @mtdowling and @jeskew.

Would it be useful to have an additional package for promise implementors which helps solve implementor problems?

Which problems do you see?

You said something good earlier in this discussion:

The whole point of this specification is to provide a common interface to consume async placeholder values.

If a task queue was defined it would be something for promise implementors (edit: and possibly application bootstrap to actually run or replace the queue), not for consumers of promises, and it would potentially increase the surface area of this spec quite significantly. Maybe it would belong in a separate package which was for implementors to use?

@bwoebi
Copy link
Member

bwoebi commented Nov 18, 2016

  1. Allowing mixing both is weird. It should IIMHO be always-sync and always-async.
  2. The issue of a TaskQueue is that we'd really need a) a new package (yay?!) b) Every Promise impl must depend on the TaskQueue.
  3. Sure, more opinions are always helpful.
  4. As @kelunik said, indirectly or directly.
  5. It must be a separate package anyway.

@joshdifabio
Copy link
Contributor Author

  1. Allowing mixing both is weird. It should IIMHO be always-sync and always-async.

@bwoebi Please will you clarify? Do you mean or?

@bwoebi
Copy link
Member

bwoebi commented Nov 18, 2016

I mean, the only two allowed modes should be always-sync and always-async. (Well, yes… only one of both modes shall be active at once.)

@joshdifabio
Copy link
Contributor Author

I mean, the only two allowed modes should be always-sync and always-async. (Well, yes… only one of both modes shall be active at once.)

Right, I misunderstood. Thanks for clarifying.

@kelunik kelunik changed the title Consider including a task queue Task queue for async resolution of promises Nov 18, 2016
@kelunik
Copy link
Member

kelunik commented Nov 18, 2016

@joshdifabio I just changed the title of the issue to be more accurate, I hope that's fine.

@joshdifabio
Copy link
Contributor Author

@joshdifabio I just changed the title of the issue to be more accurate, I hope that's fine.

Of course.

@kelunik
Copy link
Member

kelunik commented Nov 19, 2016

Just a short note if somebody else is looking at Amp's coroutine implementation: I thought we'd get a problem with async resolution, because that depth thing will no longer work, but if we have async resolution, that recursion doesn't make sense anymore anyway, so we'd just remove the depth.

@trowski
Copy link
Contributor

trowski commented Dec 21, 2016

Just to put in my 2¢, while I initially found the task queue to avoid worrying about state when resolving a promise, this is really just a crutch and unnecessary. The increased latency and performance drop does not seem worth it in my opinion.

@bwoebi
Copy link
Member

bwoebi commented Dec 22, 2016

Hmm, I could be fine with unspecified too…

@trowski
Copy link
Contributor

trowski commented Dec 22, 2016

Deferring until next tick for resolved promises, but executing immediately when resolving a promise would be inconsistent as well, implementations should pick either always-sync or always-async.

I think the specification should indicate that libs using promises should not depend on async resolution, that code should be able to work with synchronous resolution.

@bwoebi
Copy link
Member

bwoebi commented Dec 22, 2016

s/should/must/ ?

But yes, I agree.

@joshdifabio
Copy link
Contributor Author

Deferring until next tick for resolved promises, but executing immediately when resolving a promise would be inconsistent as well, implementations should pick either always-sync or always-async.

I'm not sure I agree.

The important thing is that APIs are predictable. when() always behaving asynchronously would be more predictable than if it was usually asynchronous and sometimes synchronous. Similarly, Amp\Deferred::resolve() always invoking callbacks immediately would be good and could be documented in the Amp package. The two are compatible.

As long as performance is reasonable, I don't think that the internal behaviour of promise implementations is anywhere near as important as the APIs working in a predictable way.

@trowski
Copy link
Contributor

trowski commented Dec 22, 2016

@bwoebi Yes, code must not depend on async resolution. 😃

@joshdifabio I'm confused… you disagreed and then reiterated my point, so perhaps what I said wasn't clear. Implementations should be either async or sync, not both simultaneously. Async or sync resolution can be also configuration dependent, such as using SynchronousQueue vs. the event loop with React promises. Amp on the other hand would always have synchronous resolution.

I think allowing both async and sync resolution is the best solution so long as it's clear lib code should be tested against synchronous resolution.

@kelunik
Copy link
Member

kelunik commented Dec 22, 2016

If we go that route, the tests won't work the way they exist, because they assume ->when callbacks are called immediately once the promise is resolved.

@joshdifabio
Copy link
Contributor Author

@joshdifabio I'm confused… you disagreed and then reiterated my point, so perhaps what I said wasn't clear. Implementations should be either async or sync, not both simultaneously. Async or sync resolution can be also configuration dependent, such as using SynchronousQueue vs. the event loop with React promises. Amp on the other hand would always have synchronous resolution.

I think allowing both async and sync resolution is the best solution so long as it's clear lib code should be tested against synchronous resolution.

Sorry Aaron, I'm not being clear, but I don't think we agree. As @kelunik pointed out previously, the only real point of contention is what happens when when() is called on an already resolved promise. In all other cases, callbacks passed to when() will always be fired asynchronously regardless of promise implementation.

I think it would be good for when() to be always async instead of almost always async. I.e., if when() is called on a resolved promise, the callback should be deferred. I think this would enable a predictable, consistent API, without hindering the performance of Amp or preventing other implementations from using a task queue.

If we go that route, the tests won't work the way they exist, because they assume ->when callbacks are called immediately once the promise is resolved.

Yes, I think we'd probably need a task queue in this package if we required when() to be asynchronous.

@trowski
Copy link
Contributor

trowski commented Dec 22, 2016

Having async (or rather, deferred) invocation on resolved promises, but not on unresolved promises, can still lead to consistency issues. This is better explained with an example:

// Attach callback, then resolve.
$deferred = new Deferred;
$deferred->promise()->when(function () { echo "Hello, world"; });
$deferred->resolve(); // Prints "Hello, world" immediately.

// Resolve, then attach callback.
$deferred = new Deferred;
$deferred->resolve();
// Does not print "Hello, world" until next tick.
$deferred->promise()->when(function () { echo "Hello, world"; });

Obviously there are ordering issues with sync invocation, but at least there is no performance penalty.

If some callbacks may be invoked asynchronously, then all callbacks need to be invoked asynchronously… but this is really only possible by requiring an event loop.

@joshdifabio
Copy link
Contributor Author

Having async (or rather, deferred) invocation on resolved promises, but not on unresolved promises, can still lead to consistency issues. This is better explained with an example:

I see what you're saying, but at least the developer can predict what is going to happen: after when() returns, their callback definitely won't have been fired. If they call some other functions then they might indirectly resolve a promise, but at least this allows them to set up multiple promise listeners in succession without having to code around the possibility that the listeners might get fired as they're being defined. The rest is outside the scope of this spec.

If some callbacks may be invoked asynchronously, then all callbacks need to be invoked asynchronously… but this is really only possible by requiring an event loop.

In an ideal world I think they would all be asynchronous, but performance! However, I think we can have most of the benefit of async callbacks with little of the performance cost if we only make when() asynchronous.

I'm starting to bore myself now! Sorry guys! If everyone else disagrees then that's enough for me and I'll stop debating it.

@kelunik
Copy link
Member

kelunik commented Dec 22, 2016

I really prefer the always async when invocations from a software technical standpoint. In practice, I don't like depending on the loop and don't like adding a global task queue either.

@joshdifabio
Copy link
Contributor Author

In practice, I don't like depending on the loop and don't like adding a global task queue either.

Yeah, I definitely sympathise. If we don't provide a way for React, Guzzle, etc., to defer their callbacks do we expect those libs to defer using the event loop directly instead? Or should they simply not defer callbacks? Or should application developers manage this configuration? (I think that could be a bit messy.)

@WyriHaximus
Copy link
Member

Pinging @jsor as he knows way more about @reactphp's promises than I do, plus is working on a PR to get support for this spec in

@jsor
Copy link

jsor commented Dec 23, 2016

I prefer not adding a task queue and i'm in favor of leaving sync/async undefined in the spec.

When working in an async environment, you should always assume invocations to be async.

$foo = null;
$somePromise->when(function () use (&$foo) {
    $foo = 'bar';
});
// if $somePromise is already resolved, is $foo equal to null or 'bar'?

IMHO, this is just not how you write async code. If you rely on the state of $foo, you must continue from inside the when callback. Or better, use coroutines. So, your code must not depend on sync resolution.

@joshdifabio
Copy link
Contributor Author

i'm in favor of leaving sync/async undefined in the spec.

When working in an async environment, you should always assume invocations to be async.

I must have misunderstood because this sounds like a contradiction. How can we always assume invocations will be asynchronous if the spec allows synchronous invocations?

IMHO, this is just not how you write async code. If you rely on the state of $foo, you must continue from inside the when callback.

Clearly the example is contrived but it was purely intended to demonstrate that sometimes-synchronous invocation means unpredictable behaviour.

So, your code must not depend on sync resolution.

I think we all agree on that point.

Has anyone considered the fact that promises/a+ requires asynchronous resolution? Is that the case with all JS implementations and, if so, are we happy to dismiss all of that collective experience?

@kelunik
Copy link
Member

kelunik commented Dec 23, 2016

I think we all agree on that point.

We didn't agree on that yet, it's the thing we're discussing here.

When working in an async environment, you should always assume invocations to be async.

If we leave it unspecified, you have to assume it might be async, not always async.

IMHO, this is just not how you write async code.

Usually not, right. But in that case I think we can't provide tests then, because they assume sync invocation after the promise has been resolved.

@jsor
Copy link

jsor commented Dec 23, 2016

i'm in favor of leaving sync/async undefined in the spec.
When working in an async environment, you should always assume invocations to be async.

I must have misunderstood because this sounds like a contradiction. How can we always assume invocations will be asynchronous if the spec allows synchronous invocations?

What i meant was, that you must structure your code that it is not based on the assumption that invocations are synchronous.

IMHO, this is just not how you write async code. If you rely on the state of $foo, you must continue from inside the when callback.

Clearly the example is contrived but it was purely intended to demonstrate that sometimes-synchronous invocation means unpredictable behaviour.

Yes, it is contrived, but also just plain wrong, imho :)

So, your code must not depend on sync resolution.

I think we all agree on that point.

Has anyone considered the fact that promises/a+ requires asynchronous resolution? Is that the case with all JS implementations and, if so, are we happy to dismiss all of that collective experience?

Back then, i did research to find out why promises/a+ required async resolution and all i've found where contrived examples like the one you posted above. See this thread for example.

So, afaik the only reason is that code fails consequently when doing it wrong (like in your example).

@jsor
Copy link

jsor commented Dec 23, 2016

IMHO, this is just not how you write async code.

Usually not, right. But in that case I think we can't provide tests then, because they assume sync invocation after the promise has been resolved.

I think the problem is more like that we don't have the right tools to test async code (yet).

@kelunik
Copy link
Member

kelunik commented Dec 23, 2016

@jsor If promises might be resolved async and we don't want to depend on the loop or a task queue, how would a new tool solve that?

@bwoebi
Copy link
Member

bwoebi commented Dec 23, 2016

The spec maybe doesn't depend on the loop, but individual implementations may in this case.

@kelunik
Copy link
Member

kelunik commented Dec 23, 2016

@jsor From reading the source, React should be able to support sync when callbacks if it doesn't base the implementation on done, right?

@jsor
Copy link

jsor commented Dec 23, 2016

@kelunik Yes, the implementation might differ between 2.x and 3.x. The current implementation of async-interop/promise is for 2.x where everything is synchronous.

@kelunik
Copy link
Member

kelunik commented Dec 23, 2016

@jsor Even in 2.x there could be a async queue set, no?

I'd appreciate always sync then to be defined. It doesn't solve the being always predictable issue, that's anyway only solvable with a task queue or event loop dependency, but it solves having a predictable when invocation for already resolved promises and thus allows for the kind of tests we have now.

Note that those tests aren't the main argument, but having as few as possible unspecified behaviors is.

@trowski
Copy link
Contributor

trowski commented Dec 23, 2016

We should note that code must not be structured to assume invocations are asynchronous. This usually is a problem when dealing with object state that is modified by a when() callback.

public function example() {
    if ($this->deferred === null) {
        $this->deferred = new Deferred;
    }

    // Start an async task returning $otherPromise.
    $otherPromise()->when(function ($exception, $value) {
        $this->deferred->resolve(); // Invokes $this->example() before $this->deferred = null.
        $this->deferred = null;
    });
}

In the above code example, if the code invoked by calling $this->deferred->resolve() will invoke the method example() again, it will do so before setting $this->deferred to null. The proper way to do this is to set $this->deferred to a temp variable, set $this->deferred to null, then resolve the Deferred.

@kelunik
Copy link
Member

kelunik commented Dec 23, 2016

@trowski Your example doesn't really make sense, it resolves the $this->deferred again after being resolved. Maybe you meant $this->example()?

@trowski
Copy link
Contributor

trowski commented Dec 23, 2016

@kelunik Whoops, sorry, it wasn't suppose to be $this->deferred->promise(), but rather another promise from another async task. I've updated the code example.

@jsor
Copy link

jsor commented Dec 23, 2016

@kelunik No, react/promise 2.x does not have the task queue. It will be introduced with 3.0.

kelunik added a commit that referenced this issue Dec 23, 2016
It specifies the exact invocation parameter values now and includes a statement about always-sync when callbacks.

Closes #20.
@kelunik
Copy link
Member

kelunik commented Dec 23, 2016

I just added a PR. Could you just put your thumbs up / down there, please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants