Yielding Event Loop #51

Open
MattEWeber opened this issue Apr 6, 2020 · 6 comments
@MattEWeber (Collaborator)

I'd like to revisit the discussion on yielding the event loop from 25d8525. I think it's a very good thing to avoid the overhead of yielding the event loop when it's unnecessary. And I agree that in most cases the natural time to yield to the event loop is when time is advanced, because that's when physical actions are going to have a chance to execute anyway. But I still think it's a serious problem for reactor-ts to block the event loop indefinitely in these cases:

  • Timestamping on blocked event handlers could be indefinitely delayed. That's a problem for PTIDES execution.
  • Sometimes it will take a sequence of messages and replies for an event handler to get the data it needs to create a physical action. For example, if a message has been broken up into chunks. You can't pipeline this process if the event loop is blocked.
  • Any node modules that internally use the event loop (with event emitters for example) would freeze up.
  • Embedding a reactor-ts app in another Node program would block the other program.
  • I had previously been concerned that blocking the event loop would interfere with output coming from a Node program, but it seems this is not a problem. I did an experiment and the http server response method res.end() which sends a message from the Node program to the client flushes the buffer immediately when you call it, so there is no delay even if the server's event loop is blocked.

So I pushed a new version of reactor.ts to the branch "yield" that has a _yieldPeriod parameter. The main change to _next is that, immediately after executing a reaction, it checks whether _yieldPeriod seconds have elapsed since the event loop last had a chance to run, and it yields if that period has been exceeded.

If _yieldPeriod is null, _next reverts to yielding only when time is advanced. That way, a programmer who is willing to sacrifice some CPU performance to avoid blocking the event loop can opt in by setting a period.
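A minimal sketch of the kind of check described above (`makeYieldChecker` and its shape are illustrative, not the actual reactor-ts implementation):

```typescript
// Illustrative sketch only; not reactor-ts code.
// A null period means "never yield on elapsed time" (the old behavior).
function makeYieldChecker(yieldPeriodMs: number | null) {
  let lastYield = Date.now();
  return function shouldYield(): boolean {
    if (yieldPeriodMs === null) return false; // yield only when time advances
    if (Date.now() - lastYield >= yieldPeriodMs) {
      lastYield = Date.now();
      return true; // caller would break out and reschedule _next via setImmediate
    }
    return false;
  };
}
```

In _next, a check like this would be consulted after each reaction; a true result means breaking out of the reaction loop and rescheduling via setImmediate so pending events get a chance to run.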

I didn't implement it yet, but I'd like to make _yieldPeriod an optional argument to the App constructor and a TypeScript target parameter. The additional overhead on PingPong performance relative to the last commit in the repo (ce47460) is 5% (10 trials), but I bet that could be brought lower.

Any objections to merging the branch and implementing the target parameter in LF?

MattEWeber referenced this issue Apr 7, 2020
@lhstrh (Member) commented Apr 7, 2020

Comments below...

I still think it's a serious problem for reactor-ts to block the event loop indefinitely in these cases:

* Timestamping on blocked event handlers could be indefinitely delayed. That's a problem for PTIDES execution.

I don't understand what you mean by this.

* Sometimes it will take a sequence of messages and replies for an event handler to get the data it needs to create a physical action. For example, if a message has been broken up into chunks. You can't pipeline this process if the event loop is blocked.

I don't understand what you mean by "pipelining" in the context of event handling. Pipelining requires parallel processing, which, in JS, can only be realized using workers.

* Any node modules that internally use the event loop (with event emitters for example) would freeze up.

If a module uses the event loop to carry out something asynchronously then this isn't actually a problem; the result is not expected to be witnessed at the current logical time.

* Embedding a reactor-ts app in another Node program would block the other program.

Right. To avoid this, I proposed to run such reactor as a worker.

* I had previously been concerned that blocking the event loop would interfere with output coming from a Node program, but it seems this is not a problem. I did an experiment and the http server response method `res.end()` which sends a message from the Node program to the client flushes the buffer immediately when you call it, so there is no delay even if the server's event loop is blocked.

OK.

So I pushed a new version of reactor.ts to the branch "yield" that has a _yieldPeriod parameter. The main change to _next is that, immediately after executing a reaction, it checks whether _yieldPeriod seconds have elapsed since the event loop last had a chance to run, and it yields if that period has been exceeded.

If _yieldPeriod is null, _next reverts to yielding only when time is advanced. That way, a programmer who is willing to sacrifice some CPU performance to avoid blocking the event loop can opt in by setting a period.

I didn't implement it yet, but I'd like to make _yieldPeriod an optional argument to the App constructor and a TypeScript target parameter. The additional overhead on PingPong performance relative to the last commit in the repo (ce47460) is 5% (10 trials), but I bet that could be brought lower.

With what value of yieldPeriod? I think 5% is a bit much for a simple check like this.

Any objections to merging the branch and implementing the target parameter in LF?

For an application programmer it is rather opaque as to what amounts to a desirable setting of the parameter. And how would the programmer even detect there is a problem to which changing the value of the parameter would offer a solution? I sincerely doubt the practicality of this solution.

That said, my main objection is that this feature introduces nondeterminism. Pending JS events are not only capable of scheduling actions (which is useless, anyway, because they cannot be handled until logical time advances), but can also create side effects: they can change state observed by reactions; they can write to the console; they can send messages over the network; they can drive actuators. The ordering in which these kinds of side effects are observed should not depend on the physical execution time of reactions. In other words, this change breaks the reactor semantics. For that reason, I'd like to veto merging this into master.

I think the right solution to address your concerns is to run an App as worker. This allows the reactor to behave deterministically without blocking anything that it has no business blocking.

@MattEWeber (Collaborator, Author)

I'm very open to convincing on all points so I'd appreciate if you could elaborate. This has been bugging me for weeks :)

I'll give some more concrete examples of these issues.

  • Timestamping on blocked event handlers could be indefinitely delayed. That's a problem for PTIDES execution.

I don't understand what you mean by this.

The app is expecting messages over some network connection, so it sets up an event handler for the messages. The app needs to get an accurate timestamp on when the message is received, so it registers a callback that looks something like this:

(message) => {
    doSomething(message, Date.now()); // or use a better time function
}

The message arrives at time t, but the callback doesn't get invoked until d seconds later because the event loop is blocked. So the timestamp passed into doSomething is actually t + d.

  • Sometimes it will take a sequence of messages and replies for an event handler to get the data it needs to create a physical action. For example, if a message has been broken up into chunks. You can't pipeline this process if the event loop is blocked.

I don't understand what you mean by "pipelining" in the context of event handling. Pipelining requires parallel processing, which, in JS, can only be realized using workers.

In this scenario the app is waiting for a message that comes in two chunks, so it registers handlers that look something like this

// called on receipt of message part 1
(messagePart1) => {
    // save message part 1
    // send a message saying the app is ready for part 2
}

// called on receipt of message part 2
(messagePart2) => {
    // schedule a physical action containing message parts 1 and 2
}

The problem is that if the first handler is delayed by the event loop, it will be delayed in requesting message part 2 by some arbitrarily large amount.
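The handshake above can be sketched concretely (a hedged sketch; `makeChunkReceiver`, `send`, and `onComplete` are hypothetical names for illustration, not reactor-ts API):

```typescript
// Hypothetical sketch of the two-part handshake; not reactor-ts API.
function makeChunkReceiver(
  send: (msg: string) => void,
  onComplete: (parts: [string, string]) => void
) {
  let part1: string | undefined;
  return {
    onPart1(msg: string) {
      part1 = msg;               // save message part 1
      send("ready for part 2");  // this reply is delayed while the loop is blocked
    },
    onPart2(msg: string) {
      if (part1 !== undefined) {
        // here the app would schedule a physical action with both parts
        onComplete([part1, msg]);
      }
    },
  };
}
```

If `onPart1` cannot run because the event loop is blocked, the "ready for part 2" reply, and hence the arrival of part 2, is delayed by the full blocking time.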

  • Any node modules that internally use the event loop (with event emitters for example) would freeze up.

If a module uses the event loop to carry out something asynchronously then this isn't actually a problem; the result is not expected to be witnessed at the current logical time.

Good point, it won't be witnessed until later, so it will have the correct behavior. But it could still be very bad for the module's performance. Consider a module with significant total latency but only a small amount of processing triggered by an event emitter. The module won't be able to begin its high-latency operation until after the event loop clears up.

  • Embedding a reactor-ts app in another Node program would block the other program.

Right. To avoid this, I proposed to run such reactor as a worker.

Yeah, I think this could work, but it's not ideal. There are limitations on the kinds of messages that can be sent from a worker, and the reactor wouldn't be able to invoke any callbacks from the other program. It's also a lot more complicated to use a worker than to just flip on a target parameter that solves the event loop blocking problem.

With what value of yieldPeriod? I think 5% is a bit much for a simple check like this.

That was with _yieldPeriod set to null, so it defaults to the same algorithm as ce47460. I think the 5% came from the additional overhead of extra if statements inside the inner while loop of _next. The overhead goes up to 13% with an additional call to getCurrentLogicalTime when _yieldPeriod is set to a non-null but very large number.

It is rather opaque as to what amounts to a desirable setting of the parameter

I imagine the thought process would go like this: "I'm having this bad performance because my event loop is being blocked. I don't want it to be blocked for more than about 50ms. So I'll pick that."

How would the programmer even detect there is a problem to which changing the value of the parameter would offer a solution?

There's a program that will tell you exactly what the latency is on your event loop https://clinicjs.org/doctor/

That said, my main objection is that this feature introduces nondeterminism... this change breaks the reactor semantics.

I agree we shouldn't break the reactor semantics. But I don't see how this change breaks anything. The only thing it does is periodically hit pause between reactions so other stuff can happen in the background.

the right solution ... is to run an App as worker

Wouldn't running the app as a worker just push all these event loop issues into the worker thread's event loop?

@lhstrh (Member) commented Apr 7, 2020

  • Timestamping on blocked event handlers could be indefinitely delayed. That's a problem for PTIDES execution.

I don't understand what you mean by this.

The app is expecting messages over some network connection, so it sets up an event handler for the messages. The app needs to get an accurate timestamp on when the message is received, so it registers a callback that looks something like this:

(message) => {
    doSomething(message, Date.now()); // or use a better time function
}

The message arrives at time t, but the callback doesn't get invoked until d seconds later because the event loop is blocked. So the timestamp passed into doSomething is actually t + d.

Time stamping is done in schedule, so the timestamp is no less accurate than it otherwise would be. The event-loop latency effectively factors into the network latency.

  • Sometimes it will take a sequence of messages and replies for an event handler to get the data it needs to create a physical action. For example, if a message has been broken up into chunks. You can't pipeline this process if the event loop is blocked.

I don't understand what you mean by "pipelining" in the context of event handling. Pipelining requires parallel processing, which, in JS, can only be realized using workers.

In this scenario the app is waiting for a message that comes in two chunks, so it registers handlers that look something like this

// called on receipt of message part 1
(messagePart1) => {
    // save message part 1
    // send a message saying the app is ready for part 2
}

// called on receipt of message part 2
(messagePart2) => {
    // schedule a physical action containing message parts 1 and 2
}

OK, so you mean "chunking," not "pipelining."

The problem is that if the first handler is delayed by the event loop, it will be delayed in requesting message part 2 by some arbitrarily large amount.

This will always be the case if there are other parts of the program that block the event loop for extended periods of time. Your solution does not actually solve this problem, because, in JS, you cannot preempt on-going computation. The sole issue that you're addressing is that when Zeno behavior occurs, you interleave that Zeno behavior with background tasks. Since the reactor semantics demand that these background tasks are not reacted to until after time advances, there is no point in reducing the latency of said background tasks, unless they are entirely unrelated to the reactor program. If there are such unrelated concurrent tasks that the reactor program should not interfere with, then these tasks (or the reactor) should run in a separate worker.

  • Any node modules that internally use the event loop (with event emitters for example) would freeze up.

If a module uses the event loop to carry out something asynchronously then this isn't actually a problem; the result is not expected to be witnessed at the current logical time.

Good point, it won't be witnessed until later so it will have the correct behavior. But it could still be very bad for the module's performance.

Which is why one should not write such programs. If you want reactive programs, don't create Zeno conditions.

  • Embedding a reactor-ts app in another Node program would block the other program.

Right. To avoid this, I proposed to run such reactor as a worker.

Yeah, I think this could work, but it's not ideal. There are limitations on the kinds of messages that can be sent from a worker, and the reactor wouldn't be able to invoke any callbacks from the other program. It's also a lot more complicated to use a worker than to just flip on a target parameter that solves the event loop blocking problem.

It doesn't solve the problem. If a reactor exhibits this kind of behavior while it is expected to react quickly to outside stimuli then there is something structurally wrong with it.

With what value of yieldPeriod? I think 5% is a bit much for a simple check like this.

That was with _yieldPeriod set to null, so it defaults to the same algorithm as ce47460. I think the 5% came from the additional overhead of extra if statements inside the inner while loop of _next. The overhead goes up to 13% with an additional call to getCurrentLogicalTime when _yieldPeriod is set to a non-null but very large number.

That's pretty bad...

It is rather opaque as to what amounts to a desirable setting of the parameter

I imagine the thought process would go like this: "I'm having this bad performance because my event loop is being blocked. I don't want it to be blocked for more than about 50ms. So I'll pick that."

Like I stated before, this only "helps" when reactions succeed one another in superdense time, which is a pathological behavior in reactive programming, let alone in cooperative multitasking.

How would the programmer even detect there is a problem to which changing the value of the parameter would offer a solution?

There's a program that will tell you exactly what the latency is on your event loop https://clinicjs.org/doctor/

The transparent solution in such case would be to introduce a time delay somewhere in the computation, which the reactor model already allows for.

That said, my main objection is that this feature introduces nondeterminism... this change breaks the reactor semantics.

I agree we shouldn't break the reactor semantics. But I don't see how this change breaks anything. The only thing it does is periodically hit pause between reactions so other stuff can happen in the background.

I already wrote an explanation in my previous response.

the right solution ... is to run an App as worker

Wouldn't running the app as a worker just push all these event loop issues into the worker thread's event loop?

Yes, which is why it provides isolation from other activities that might inadvertently be blocked by a reactor.

If you don't buy my explanations, then feel welcome to come up with a use case where this kind of non-deterministic interleaving between reactions and background tasks is a) necessary and b) not achievable with the current framework. If you find such a use case, we can reopen this discussion.

@MattEWeber (Collaborator, Author)

  • Timestamping on blocked event handlers could be indefinitely delayed. That's a problem for PTIDES execution.

I don't understand what you mean by this.

The app is expecting messages over some network connection, so it sets up an event handler for the messages. The app needs to get an accurate timestamp on when the message is received, so it registers a callback that looks something like this:

(message) => {
    doSomething(message, Date.now()); // or use a better time function
}

The message arrives at time t, but the callback doesn't get invoked until d seconds later because the event loop is blocked. So the timestamp passed into doSomething is actually t + d.

Time stamping is done in schedule, so the timestamp is no less accurate than it otherwise would be. The event-loop latency effectively factors into the network latency.

If we factor it into network latency, doesn't that mean an optimization to limit that latency is a good thing? A reactor that occasionally blocks the event loop for a while would have very bad worst case network latency for PTIDES.

  • Sometimes it will take a sequence of messages and replies for an event handler to get the data it needs to create a physical action. For example, if a message has been broken up into chunks. You can't pipeline this process if the event loop is blocked.

I don't understand what you mean by "pipelining" in the context of event handling. Pipelining requires parallel processing, which, in JS, can only be realized using workers.

In this scenario the app is waiting for a message that comes in two chunks, so it registers handlers that look something like this

// called on receipt of message part 1
(messagePart1) => {
    // save message part 1
    // send a message saying the app is ready for part 2
}

// called on receipt of message part 2
(messagePart2) => {
    // schedule a physical action containing message parts 1 and 2
}

OK, so you mean "chunking," not "pipelining."

The problem is that if the first handler is delayed by the event loop, it will be delayed in requesting message part 2 by some arbitrarily large amount.

This will always be the case if there are other parts of the program that block the event loop for extended periods of time. Your solution does not actually solve this problem, because, in JS, you cannot preempt on-going computation. The sole issue that you're addressing is that when Zeno behavior occurs, you interleave that Zeno behavior with background tasks.

A reactor doesn't have to be Zeno for this issue to come up, it just has to do CPU intensive work inside a couple reactions at the same (or sequential) microsteps. For example a reactor could very legitimately have five reactions triggering at the same time that each take 20ms to execute. Depending on how the app yields, event loop latency could be as low as 20ms or as high as 100ms.

Since the reactor semantics demand that these background tasks are not reacted to until after time advances, there is no point in reducing the latency of said background tasks, unless they are entirely unrelated to the reactor program.

But this is an example where changing the behavior of background tasks does have a direct effect on the arrival time of a physical action within the reactor program. Let's say message1 arrives at (all physical) time a, message2 arrives at time b, the reactor program executes simultaneous reactions from time t to t', and the delay on a message is d. Also assume message1 arrives during the simultaneous reaction execution, so t < a < t'.

If the app does not yield the event loop, the earliest message2 can be received (and a physical action scheduled) is t' + d. But if the app were to immediately yield the event loop when message1 arrives so the handler can immediately respond, message2 can instead be received at a + d.
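To make the comparison concrete, here is the same argument with hypothetical numbers (all values are illustrative, in milliseconds):

```typescript
// Hypothetical timeline illustrating the two schedules described above.
const t = 0;        // reactions start
const tPrime = 100; // reactions finish
const a = 20;       // message1 arrives, inside the reaction window
const d = 10;       // per-message delay
const arrivesDuring = t < a && a < tPrime; // the assumed t < a < t'

// No yielding: the part-1 handler cannot run until reactions finish,
// so the earliest message2 can be received is t' + d.
const earliestNoYield = tPrime + d; // 110

// Yielding when message1 arrives: the handler replies right away,
// so message2 can be received as early as a + d.
const earliestWithYield = a + d; // 30
```

With these numbers, yielding moves the earliest possible receipt of message2 from 110 ms to 30 ms.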

If there are such unrelated concurrent tasks that the reactor program should not interfere with, then these tasks (or the reactor) should run in a separate worker.

They certainly could in this example. But why should we force a programmer to use a worker thread when they could instead write a simple single threaded program and just turn on a target parameter?

Also, worker threads themselves communicate to the main thread through messages handled by the event loop. Worker thread communication would suffer from similar delays.

  • Any node modules that internally use the event loop (with event emitters for example) would freeze up.

If a module uses the event loop to carry out something asynchronously then this isn't actually a problem; the result is not expected to be witnessed at the current logical time.

Good point, it won't be witnessed until later so it will have the correct behavior. But it could still be very bad for the module's performance.

Which is why one should not write such programs. If you want reactive programs, don't create Zeno conditions.

As before, CPU intensive work inside a couple reactions at the same (or sequential) microsteps is not Zeno and would be bad for module performance. I think the "using a module" scenario is still worth discussing.

  • Embedding a reactor-ts app in another Node program would block the other program.

Right. To avoid this, I proposed to run such reactor as a worker.

Yeah, I think this could work, but it's not ideal. There are limitations on the kinds of messages that can be sent from a worker, and the reactor wouldn't be able to invoke any callbacks from the other program. It's also a lot more complicated to use a worker than to just flip on a target parameter that solves the event loop blocking problem.

It doesn't solve the problem. If a reactor exhibits this kind of behavior while it is expected to react quickly to outside stimuli then there is something structurally wrong with it.

With what value of yieldPeriod? I think 5% is a bit much for a simple check like this.

That was with _yieldPeriod set to null, so it defaults to the same algorithm as ce47460. I think the 5% came from the additional overhead of extra if statements inside the inner while loop of _next. The overhead goes up to 13% with an additional call to getCurrentLogicalTime when _yieldPeriod is set to a non-null but very large number.

That's pretty bad...

I agree there's an overhead threshold above which my proposed change should not be implemented. But I didn't really optimize this version yet, and for perspective, performance improved by 60% on PingPong the first time I implemented a version that didn't yield whenever microsteps advanced.

Also we're talking about performance here on the PingPong benchmark which is a Zeno reactor program with no I/O. If we were to benchmark an I/O heavy program with occasional CPU intensive reactions, performance would probably improve. PingPong is sort of an unrealistic worst case for these changes.

It is rather opaque as to what amounts to a desirable setting of the parameter

I imagine the thought process would go like this: "I'm having this bad performance because my event loop is being blocked. I don't want it to be blocked for more than about 50ms. So I'll pick that."

Like I stated before, this only "helps" when reactions succeed one another in superdense time, which is a pathological behavior in reactive programming, let alone in cooperative multitasking.

How would the programmer even detect there is a problem to which changing the value of the parameter would offer a solution?

There's a program that will tell you exactly what the latency is on your event loop https://clinicjs.org/doctor/

The transparent solution in such case would be to introduce a time delay somewhere in the computation, which the reactor model already allows for.

Yeah that would work. But isn't it better for the programmer to just write the reactor program they might write for the C target without having to change their design around a quirk in how the TypeScript runtime never yields the event loop?

That said, my main objection is that this feature introduces nondeterminism... this change breaks the reactor semantics.

I agree we shouldn't break the reactor semantics. But I don't see how this change breaks anything. The only thing it does is periodically hit pause between reactions so other stuff can happen in the background.

I already wrote an explanation in my previous response.

I'm trying to understand it. You wrote that pending JS events can do bad nondeterministic things, and I agree with that. But pending JS events are also necessary to bring physical actions into the reactor model. Specifically you said "this feature introduces nondeterminism", but I don't see how. You can do bad things with exactly the same nondeterministic JS events with or without the feature.

the right solution ... is to run an App as worker

Wouldn't running the app as a worker just push all these event loop issues into the worker thread's event loop?

Yes, which is why it provides isolation from other activities that might inadvertently be blocked by a reactor.

If you don't buy my explanations, then feel welcome to come up with a use case where this kind of non-deterministic interleaving between reactions and background tasks is a) necessary and b) not achievable with the current framework. If you find such a use case, we can reopen this discussion.

The use cases are:

  • Unnecessary "network latency" in PTIDES
  • Using a module that internally uses event emitters
  • The message1, message2 example

Do you think it would help this discussion if I were to implement sketches of these examples in code? We also don't have any examples yet of your "just do it in a worker thread" suggestion and it might be informative for me to try that.

@lhstrh (Member) commented Apr 7, 2020

Time stamping is done in schedule, so the timestamp is no less accurate than it otherwise would be. The event-loop latency effectively factors into the network latency.

If we factor it into network latency, doesn't that mean an optimization to limit that latency is a good thing? A reactor that occasionally blocks the event loop for a while would have very bad worst case network latency for PTIDES.

No. You're not reducing latency in any way by yielding to the event loop amid synchronous reactions.

This will always be the case if there are other parts of the program that block the event loop for extended periods of time. Your solution does not actually solve this problem, because, in JS, you cannot preempt on-going computation. The sole issue that you're addressing is that when Zeno behavior occurs, you interleave that Zeno behavior with background tasks.

A reactor doesn't have to be Zeno for this issue to come up, it just has to do CPU intensive work inside a couple reactions at the same (or sequential) microsteps.

You're just repeating what I wrote above, so I guess we agree.

For example a reactor could very legitimately have five reactions triggering at the same time that each take 20ms to execute. Depending on how the app yields, event loop latency could be as low as 20ms or as high as 100ms.

I don't question this analysis. This is true for any node program.

Since the reactor semantics demand that these background tasks are not reacted to until after time advances, there is no point in reducing the latency of said background tasks, unless they are entirely unrelated to the reactor program.

But this is an example where changing the behavior of background tasks does have a direct effect on the arrival time of a physical action within the reactor program. Let's say message1 arrives at (all physical) time a, message2 arrives at time b, the reactor program executes simultaneous reactions from time t to t', and the delay on a message is d. Also assume message1 arrives during the simultaneous reaction execution, so t < a < t'.

I disagree. In the reactor model, "arrival" is denoted by an invocation of schedule.

If the app does not yield the event loop, the earliest message2 can be received (and a physical action scheduled) is t' + d. But if the app were to immediately yield the event loop when message1 arrives so the handler can immediately respond, message2 can instead be received at a + d.

The only thing this strategy achieves is that there is a greater difference between the timestamp of the message and the logical time at which it is handled. What is the point of this?

If there are such unrelated concurrent tasks that the reactor program should not interfere with, then these tasks (or the reactor) should run in a separate worker.

They certainly could in this example. But why should we force a programmer to use a worker thread when they could instead write a simple single threaded program and just turn on a target parameter?

Who is talking about forcing anything?

Also, worker threads themselves communicate to the main thread through messages handled by the event loop. Worker thread communication would suffer from similar delays.

This comes with the territory of having a single-threaded event loop. In JS, you have two options: 1) write small functions that are quick to execute to completion or 2) dispatch computation to a worker. There is no 3) preempt computation at arbitrary times to decrease latency. Conceptually, a sequence of reactions is atomic in the reactor model -- just like an event handler, it ought to run to completion before doing anything else.
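Option 2 can be sketched with Node's worker_threads module (a hedged sketch; `sumInWorker` and the inline worker script are illustrative, not reactor-ts code):

```typescript
import { Worker } from "node:worker_threads";

// Illustrative only: dispatch a long synchronous computation to a worker
// thread so the main thread's event loop stays responsive.
function sumInWorker(n: number): Promise<number> {
  const worker = new Worker(
    `const { parentPort } = require("node:worker_threads");
     parentPort.once("message", (n) => {
       let sum = 0;                      // stand-in for heavy computation
       for (let i = 0; i < n; i++) sum += i;
       parentPort.postMessage(sum);
     });`,
    { eval: true }
  );
  return new Promise((resolve, reject) => {
    worker.once("message", (v: number) => {
      resolve(v);
      void worker.terminate();
    });
    worker.once("error", reject);
    worker.postMessage(n);
  });
}
```

While the worker grinds through its loop, the main thread is free to service timers, I/O callbacks, and physical actions.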

Which is why one should not write such programs. If you want reactive programs, don't create Zeno conditions.

As before, CPU intensive work inside a couple reactions at the same (or sequential) microsteps is not Zeno and would be bad for module performance. I think the "using a module" scenario is still worth discussing.

Again, this holds true for any JS program. Programs that perform long synchronous computation are considered bad practice. Yes, you can use reactors to write bad JS code; you can do the same without them.

The transparent solution in such case would be to introduce a time delay somewhere in the computation, which the reactor model already allows for.

Yeah that would work. But isn't it better for the programmer to just write the reactor program they might write for the C target without having to change their design around a quirk in how the TypeScript runtime never yields the event loop?

The situation for the C program is not much different. If you want to give physical actions a chance to be reacted to, you cannot overwhelm the reaction queue with rapidly succeeding events.

I'm trying to understand it. You wrote that pending JS events can do bad nondeterministic things, and I agree with that. But pending JS events are also necessary to bring physical actions into the reactor model. Specifically you said "this feature introduces nondeterminism", but I don't see how. You can do bad things with exactly the same nondeterministic JS events with or without the feature.

That was obviously not my point. I'm not discussing determinism in plain JS programs, I'm talking about determinism in reactor programs. The ability to block out asynchronous tasks is an advantage in the sense that it allows us to ensure that reactor state cannot be changed by those activities while reactions are occurring. If we allow them in, they can render the reactor's behavior nondeterministic.

If you don't buy my explanations, then feel welcome to come up with a use case where this kind of non-deterministic interleaving between reactions and background tasks is a) necessary and b) not achievable with the current framework. If you find such a use case, we can reopen this discussion.

The use cases are:

* Unnecessary "network latency" in PTIDES

I think this "problem" is based on a misconception.

* Using a module that internally uses event emitters

Reaction code is synchronous. Event emitters are used for handling asynchronous events. Blocking asynchronous activity while reacting is not a bug, it is a feature.

* The message1, message2 example

If two messages need to be handled simultaneously, you would store the first message and wait for the next to arrive. Once both messages have arrived, you schedule a physical action.

Do you think it would help this discussion if I were to implement sketches of these examples in code?
We also don't have any examples yet of your "just do it in a worker thread" suggestion and it might be informative for me to try that.

Sure -- implemented examples are always useful.

@MattEWeber
Collaborator Author

Time stamping is done in schedule, so the timestamp is no less accurate than it otherwise would be. The event-loop latency effectively factors into the network latency.

If we factor it into network latency, doesn't that mean an optimization to limit that latency is a good thing? A reactor that occasionally blocks the event loop for a while would have very bad worst case network latency for PTIDES.

No. You're not reducing latency in any way by yielding to the event loop amid synchronous reactions.

Could you explain this more? I don't understand what you mean with both "event-loop latency effectively factors into the network latency" and "You're not reducing latency in any way by yielding to the event loop amid synchronous reactions." If event-loop latency factors into network latency, how does reducing event-loop latency not reduce latency in any way?

The more I think about this, the more I think the non-deterministic event loop is going to be an obstacle for PTIDES however we choose to handle this issue on yielding.

A reactor doesn't have to be Zeno for this issue to come up; it just has to do CPU-intensive work inside a couple of reactions at the same (or sequential) microsteps.

You're just repeating what I wrote above, so I guess we agree.

My understanding is Zeno behavior usually means an infinite number of steps without time advancing. If you instead mean any number of steps without time advancing, then yes, we're talking about the same thing.

For example a reactor could very legitimately have five reactions triggering at the same time that each take 20ms to execute. Depending on how the app yields, event loop latency could be as low as 20ms or as high as 100ms.

I don't question this analysis. This is true for any node program.
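The worst case in this analysis can be demonstrated with a standalone Node sketch (this is not reactor-ts code; `busyWait` is a made-up stand-in for a CPU-bound reaction body):

```typescript
// Five synchronous 20 ms "reactions" run back-to-back, so a timer that is
// already due cannot fire until all of them finish (~100 ms of latency).
function busyWait(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: stands in for a CPU-bound reaction body
  }
}

let observedLatency = -1;
const scheduled = Date.now();
setTimeout(() => {
  observedLatency = Date.now() - scheduled;
  console.log(`event-loop latency: ${observedLatency} ms`);
}, 0);

// Five simultaneous reactions, 20 ms each, with no yield in between.
for (let i = 0; i < 5; i++) {
  busyWait(20);
}
```

Yielding between any two of the five calls would let the timer fire correspondingly earlier.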

But this is an example where changing the behavior of background tasks does have a direct effect on the arrival time of a physical action within the reactor program. Let's say message1 arrives at (all physical) time a, message2 arrives at time b, the reactor program executes simultaneous reactions from time t to t', and the delay on a message is d. Also assume message1 arrives during the simultaneous reaction execution, so t < a < t'.

I disagree. In the reactor model, "arrival" is denoted by an invocation of schedule.

In this example I'm discussing the behavior of code that's not part of the reactor model. It's necessary to use non-reactor code from the underlying platform to bring physical actions into the reactor model. In this case blocking the event loop would make that code perform poorly.

If the app does not yield the event loop, the earliest message2 can be received (and a physical action scheduled) is t' + d. But if the app were to immediately yield the event loop when message1 arrives so the handler can immediately respond, message2 can instead be received at a + d.

The only thing this strategy achieves is that there is a greater difference between the timestamp of the message and the logical time at which it is handled. What is the point of this?

I'm assuming here it's desirable to begin processing a reaction to the physical action as quickly as possible and the reactor program became idle after t'. In that case, beginning a reaction at physical time a + d is better than t' + d.

If there are such unrelated concurrent tasks that the reactor program should not interfere with, then these tasks (or the reactor) should run in a separate worker.

They certainly could in this example. But why should we force a programmer to use a worker thread when they could instead write a simple single threaded program and just turn on a target parameter?

Who is talking about forcing anything?

If the only way to prevent a reactor program from interfering with unrelated concurrent tasks is to use a worker thread and you want to achieve that outcome, aren't you forced to use a worker thread?

Also, worker threads themselves communicate to the main thread through messages handled by the event loop. Worker thread communication would suffer from similar delays.

This comes with the territory of having a single-threaded event loop. In JS, you have two options: 1) write small functions that are quick to execute to completion or 2) dispatch computation to a worker. There is no 3) preempt computation at arbitrary times to decrease latency. Conceptually, a sequence of reactions is atomic in the reactor model -- just like an event handler, it ought to run to completion before doing anything else.

With regard to JS I totally agree. But while a sequence of reactions is atomic in the reactor model, I don't see why it has to also be atomic for the underlying platform too. Breaking up a sequence of reactions in Node.js is like turning a long function into several smaller functions. And that should be invisible from the perspective of the reactor program.
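The "several smaller functions" idea can be sketched as draining a queue of reaction closures while returning to the event loop between them. This mirrors the idea only; it is not the actual reactor-ts `_next` implementation, and the names are made up:

```typescript
// Drain a queue of reaction closures, but yield to the event loop between
// them so pending I/O callbacks and timers get a chance to interleave.
const executed: string[] = [];
const reactionQueue: Array<() => void> = [
  () => executed.push("reaction 1"),
  () => executed.push("reaction 2"),
  () => executed.push("reaction 3"),
];

function runNext(): void {
  const reaction = reactionQueue.shift();
  if (reaction === undefined) {
    console.log("all reactions done:", executed.join(", "));
    return;
  }
  reaction();
  // Yield: resume as a fresh macrotask instead of looping synchronously.
  setImmediate(runNext);
}

runNext();
```

From the reactor program's perspective nothing changes: the reactions still execute in order at the same tag; only the underlying platform gets to run callbacks in between.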

As before, CPU-intensive work inside a couple of reactions at the same (or sequential) microsteps is not Zeno, yet it would still be bad for module performance. I think the "using a module" scenario is still worth discussing.

Again, this holds true for any JS program. Programs that perform long synchronous computation are considered bad practice. Yes, you can use reactors to write bad JS code; you can do the same without them.

If we can turn a reactor program into better JS code without changing the reactor program, I don't see why we shouldn't.

The transparent solution in such a case would be to introduce a time delay somewhere in the computation, which the reactor model already allows for.

Yeah that would work. But isn't it better for the programmer to just write the reactor program they might write for the C target without having to change their design around a quirk in how the TypeScript runtime never yields the event loop?

The situation for the C program is not much different. If you want to give physical actions a chance to be reacted to, you cannot overwhelm the reaction queue with rapidly succeeding events.

Then perhaps this would be an improvement over C. You could just write your TS reactor program without worrying about giving physical actions a chance to show up. You could advance time only when it makes sense within the logic of your reactor program.

I'm trying to understand it. You wrote that pending JS events can do bad nondeterministic things, and I agree with that. But pending JS events are also necessary to bring physical actions into the reactor model. Specifically you said "this feature introduces nondeterminism", but I don't see how. You can do bad things with exactly the same nondeterministic JS events with or without the feature.

That was obviously not my point. I'm not discussing determinism in plain JS programs, I'm talking about determinism in reactor programs. The ability to block out asynchronous tasks is an advantage in the sense that it allows us to ensure that reactor state cannot be changed by those activities while reactions are occurring. If we allow them in, they can render the reactor's behavior nondeterministic.

It wasn't so obvious to me. :) I didn't realize you saw blocking activity during reactions as an advantage. Now more of what you wrote makes sense to me. But if some lazy programmer were to write a bad asynchronous function like this in the preamble:

function clobberState(stateThatShouldNotBeModified) { // hypothetical bad callback
    stateThatShouldNotBeModified.set("BAD");
}

and execute it at random times, wouldn't it be just as bad for that function to execute in the middle of a chain of reactions as when the reactor program is idle? Seems like the programmer's determinism is screwed either way to me.

The use cases are:

* Unnecessary "network latency" in PTIDES

I think this "problem" is based on a misconception.

I agree we're probably talking past each other on this. Can you elaborate on what you mean?

* Using a module that internally uses event emitters

Reaction code is synchronous. Event emitters are used for handling asynchronous events. Blocking asynchronous activity while reacting is not a bug, it is a feature.

I don't see poor module performance as a feature. I view access to npm modules as one of the biggest advantages of the TypeScript target, and I think we should do everything we can to make TypeScript reactors play nice with them.
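The stall is easy to reproduce in plain Node (a standalone sketch, not reactor-ts code): `emit()` itself is synchronous, but modules typically emit from I/O callbacks, which cannot run until control returns to the event loop.

```typescript
import { EventEmitter } from "events";

const order: string[] = [];
const emitter = new EventEmitter();
emitter.on("data", () => order.push("data handled"));

// Module-internal behavior: emit from an event-loop callback (a stand-in
// for a socket or file-system callback inside an npm module).
setImmediate(() => emitter.emit("data"));

// A long synchronous stretch (e.g., a chain of reactions) delays the emission.
const end = Date.now() + 50;
while (Date.now() < end) {
  // spin
}
order.push("synchronous work done"); // always recorded before "data handled"
```

However long the synchronous stretch runs, the module's "data" event is deferred until it ends.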

* The message1, message2 example

If two messages need to be handled simultaneously, you would store the first message and wait for the next to arrive. Once both messages have arrived, you schedule a physical action.

The nuance of this scenario is that message2 has to be specifically requested once message1 has been received.
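A hypothetical sketch of that nuance: the second chunk can only be requested after the first arrives, so the event loop must turn between the two before a physical action can be scheduled. `requestChunk` stands in for a network round-trip and `schedulePhysicalAction` for a call to schedule on a physical action; both names are made up for illustration.

```typescript
let scheduledValue = "";

function requestChunk(name: string, callback: (payload: string) => void): void {
  setImmediate(() => callback(`${name}-payload`)); // simulated async reply
}

function schedulePhysicalAction(value: string): void {
  scheduledValue = value;
  console.log("physical action scheduled with:", value);
}

requestChunk("message1", (chunk1) => {
  // Only now do we know enough to ask for the second chunk. If the event
  // loop stays blocked here, this request (and the eventual schedule) stalls.
  requestChunk("message2", (chunk2) => {
    schedulePhysicalAction(`${chunk1}+${chunk2}`);
  });
});
```

If reactions block the event loop the whole time, neither callback runs and the physical action cannot be scheduled until they finish.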

Do you think it would help this discussion if I were to implement sketches of these examples in code?
We also don't have any examples yet of your "just do it in a worker thread" suggestion and it might be informative for me to try that.

Sure -- implemented examples are always useful.

Cool! I was also thinking it would be good to write a custom "http server" benchmark for network performance. A benchmark like that definitely wouldn't be part of the Savina suite. Anyway, I'm going to work on command line arguments for TS first before getting into this.
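As a rough illustration of the "just do it in a worker thread" suggestion discussed above (a standalone sketch using Node's worker_threads module, not reactor-ts code): CPU-intensive work runs in a worker, so the main thread's event loop stays free to time-stamp and react to physical actions. The worker source is inlined with `eval: true` for brevity; a real program would use a separate file.

```typescript
import { Worker } from "worker_threads";

const workerSource = `
  const { parentPort } = require("worker_threads");
  parentPort.on("message", (n) => {
    let sum = 0;
    for (let i = 0; i < n; i++) sum += i; // CPU-bound work, off the main thread
    parentPort.postMessage(sum);
  });
`;

let workerResult = -1;
const worker = new Worker(workerSource, { eval: true });
worker.on("message", (result: number) => {
  workerResult = result;
  console.log("worker result:", result);
  void worker.terminate();
});
worker.postMessage(10_000_000);
// Meanwhile, timers and I/O callbacks on the main thread keep firing normally.
```

The trade-off is the one raised earlier in the thread: the worker's result still comes back as a message handled by the main thread's event loop, so it is subject to the same delays if that loop is blocked.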
