Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workers: Curtail the browser's license to kill #1004

Closed
lars-t-hansen opened this issue Apr 6, 2016 · 10 comments · Fixed by #1457
Closed

Workers: Curtail the browser's license to kill #1004

lars-t-hansen opened this issue Apr 6, 2016 · 10 comments · Fixed by #1457
Labels
needs implementer interest Moving the issue forward requires implementers to express interest normative change topic: workers

Comments

@lars-t-hansen
Copy link

(Background: https://www.w3.org/Bugs/Public/show_bug.cgi?id=29039; #851)

The current "kill a worker" algorithm states that it may be run at any time, and the rest of the workers spec seems to take that into account properly. Thus a correct implementation of workers can pick a random worker in a random domain every second and just gun it down for no good reason at all. This license seriously impacts the utility of workers and is almost certainly wider license than the browser needs.

I expect the need for the /some/ wording comes from the desire to kill runaway workers without a slow-script dialog and also from the need to close tabs and throw things out of the history; I also expect the current wording was simply expedient.

I would like this wording to be narrowed considerably:

In the context of current browsers gunning down workers will lead to pages that don't work but is probably otherwise "benign". But in the context of the evolving spec on shared memory for JavaScript (http://tc39.github.io/ecmascript_sharedmem/shmem.html) gunning down workers will also lead to scenarios where workers that hold locks in shared memory disappear without releasing those locks or cleaning up state, and that will lead to yet other workers deadlocking. (The lack of a termination signaling mechanism does not help here but is a separate bug.)

I don't have any concrete proposal for a new wording, and when workers communicate synchronously through shared memory and don't return to their event loops regularly it will be very difficult to determine if a worker is 'runaway'. Could we at least enumerate in the spec the situations under which the kill a worker algorithm can be run, so that we can discuss specific cases?

@domenic
Copy link
Member

domenic commented Apr 6, 2016

It's not clear to me what this bug is meant to accomplish. If there are times when workers should be prohibited from being killed, then we should enumerate them (pending implementer agreement). But if there are no such times, then the current wording seems to reflect reality well.

I'm happy to evaluate a concrete proposal (or rather, to take a concrete proposal to the worker teams of various browsers and see if they are OK with the normative changes proposed), but I'm not clear what normative changes are being asked for here.

@annevk
Copy link
Member

annevk commented Apr 6, 2016

@domenic I think what @lars-t-hansen is asking for that instead of enumerating when they should not be killed, we should more clearly enumerate when they are allowed to be killed, instead of basically allowing that all the time, because nobody could be bothered to enumerate the reasons (it was more expedient, as lth says, most likely).

Reasons I can think of:

  • OOM (already covered by a general disclaimer at the start of the specification)
  • "Owner" document unloads
  • "Owner" worker is terminated

I'm not sure why slow-script dialog would be a reason. That only seems a concern if the worker shares the event loop with a document, which is not technically allowed.

@lars-t-hansen
Copy link
Author

What @annevk said, which is a nice clarification.

@domenic
Copy link
Member

domenic commented Apr 6, 2016

I guess if implementers are OK with disallowing killing at all other times, then we could change to a safelist instead of the current approach (which is effectively a currently-empty blocklist). I am skeptical this is the case though; in general UAs have preferred the freedom to protect the user's experience and change their heuristics here. See, for example, service workers, where these heuristics are still evolving to my knowledge, and explicitly not interoperable.

Not sure who the implementers are who work on workers at other browsers, but I believe @kinu would be for Chrome. @kinu, would your team be interested in a spec change that explicitly enumerates the places user agents are allowed to kill workers, instead of letting them kill them at any time? Would that be something you could conform to now and going forward?

@domenic domenic added normative change needs implementer interest Moving the issue forward requires implementers to express interest labels Apr 6, 2016
@annevk
Copy link
Member

annevk commented Apr 6, 2016

Note that service workers have a totally different lifecycle. That they are a subclass here and there is a bit of a lie, since the processing requirements are vastly different (though not always written down as such). I would expect this thread to only cover requirements for dedicated and shared workers.

@domenic
Copy link
Member

domenic commented Apr 6, 2016

Well, they still use "kill a worker", I thought.

@annevk
Copy link
Member

annevk commented Apr 6, 2016

Yeah I know, I'm not really sure their reuse makes sense once you start looking at the details, though I have heard we share some code between shared workers and service workers, so who knows. E.g., while workers are based around message channels, service workers do not use those.

@annevk
Copy link
Member

annevk commented Jun 20, 2016

Okay, having discussed this within Mozilla, these are the ways a (dedicated/shared) worker can be shutdown:

  • Document shutdown
  • Worker closes itself (.close())
  • Parent closes worker (.terminate())
  • GC

GC happens if there is nothing (other than network, I suspect) that can observe the worker being terminated. E.g., SharedArrayBuffer would prevent GC.

It seems currently the specification doesn't really address GC. (Perhaps we can leave it since it's generally not supposed to be observable... @bakulf?) And on the other hand, permits way more shutdown than at least Mozilla actually performs.

I think we should move the specification in this direction. We need to get more predictable JavaScript "threads" if we want folks to actually rely on this for their computing needs.

@bakulf
Copy link

bakulf commented Jun 22, 2016

GC behavior is not observable: workers are terminated when there are no APIs keeping them alive for some networking (XHR, WebSocket), cross-thread event listeners (MessagePort, BroadcastChannel, etc) or pending async operations, and, of course, if there is not JS code running.

This could be consider an internal optimization also if not part of the spec.

@duanyao
Copy link

duanyao commented Jun 22, 2016

I think GC of worker should be spec'ed, because it determines whether developers have to always terminate workders they created to prevent resource leak -- this can be tricky on error conditions.

@annevk
Copy link
Member

annevk commented Jun 22, 2016

I see, yes, we should not mention it in that case. So that leaves us with three very simple cases, which should all already be covered.

annevk added a commit that referenced this issue Jun 22, 2016
There’s a new sheriff in town and workers can now no longer be killed.
“Kill” was a poor term to use anyway and also not needed as worker
termination already covered all angles we care about except for one.

Now, workers can solely be terminated due to close(), terminate(), or
disappearing documents. They can of course continued to be GC’d as long
as that is not observable, not even from the network.

Fixes #1004.
annevk added a commit that referenced this issue Apr 19, 2017
There’s a new sheriff in town and workers can now no longer be killed.
“Kill” was a poor term to use anyway and also not needed as worker
termination already covered all angles we care about except for one.

Now, workers can solely be terminated due to close(), terminate(), or
disappearing documents.

Fixes #1004.
inikulin pushed a commit to HTMLParseErrorWG/html that referenced this issue May 9, 2017
There’s a new sheriff in town and workers can now no longer be killed.
“Kill” was a poor term to use anyway and also not needed as worker
termination already covered all angles we care about except for one.

Now, workers can solely be terminated due to close(), terminate(), or
disappearing documents.

Fixes whatwg#1004.
inikulin pushed a commit to HTMLParseErrorWG/html that referenced this issue May 9, 2017
There’s a new sheriff in town and workers can now no longer be killed.
“Kill” was a poor term to use anyway and also not needed as worker
termination already covered all angles we care about except for one.

Now, workers can solely be terminated due to close(), terminate(), or
disappearing documents.

Fixes whatwg#1004.
alice pushed a commit to alice/html that referenced this issue Jan 8, 2019
There’s a new sheriff in town and workers can now no longer be killed.
“Kill” was a poor term to use anyway and also not needed as worker
termination already covered all angles we care about except for one.

Now, workers can solely be terminated due to close(), terminate(), or
disappearing documents.

Fixes whatwg#1004.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs implementer interest Moving the issue forward requires implementers to express interest normative change topic: workers
Development

Successfully merging a pull request may close this issue.

5 participants