Specify ReadableStream.[[Transfer]] #623
Conversation
As [[storedError]] is observable, can be an arbitrary object, and is very likely an uncloneable Error, it can't be sent to a new realm reliably. So just forbid errored streams. Still needs clearer semantics of when structured cloning occurs and how DataCloneErrors are reported. Cloning needs polyfilling somehow too. Related to: whatwg#244, whatwg#276
Lacking an obvious way to actually postMessage a stream in this polyfill, this adds the 'cloning' stream type for testing purposes. Byte streams to come.
I want to apologise for not reviewing this yet. I'm very interested in it, but I've just been busy finishing up stuff for the end of the year. Hopefully someone else will get to it soon.
Overall I'm surprised how reasonable this all is even without whatwg/html#935. If you ignore the question of "so how does this work for Indexed DB's use of structured clone", it looks pretty great. I have some questions of strategy which I've included inline.
In particular, there's an apparent problem with transferring errored streams (namely [[storedError]]), as well as the open question of what even ends up in [[storedError]] if the underlyingSource throws or calls controller.error() from the original realm with something uncloneable.
This seems similar to the question of what happens if the underlyingSource calls controller.enqueue(somethingUncloneable). The answer is to error the destination stream, right? With a TypeError of some kind I guess.
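A minimal sketch of that "error the destination stream" answer, assuming `structuredClone` as a stand-in for the spec's StructuredClone operation (the `cloningEnqueue` helper and the mock controller are hypothetical names, not part of the spec or polyfill):

```javascript
// If a chunk can't be cloned at enqueue time, error the stream instead of
// enqueuing, surfacing a TypeError to readers.
function cloningEnqueue(controller, chunk) {
  let clone;
  try {
    clone = structuredClone(chunk);
  } catch (e) {
    controller.error(new TypeError('chunk could not be cloned'));
    return;
  }
  controller.enqueue(clone);
}

// Demonstrate with a mock controller that just records what happened.
const calls = [];
const mockController = {
  enqueue: v => calls.push(['enqueue', v]),
  error: e => calls.push(['error', e]),
};
cloningEnqueue(mockController, { ok: true });     // clones and enqueues
cloningEnqueue(mockController, { fn: () => {} }); // functions are uncloneable → errors
```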
StructuredClone doesn't seem to be actually possible to truly polyfill as it iterates over objects in a different order than for-in does
Doesn't it iterate over objects in the same order that Reflect.ownKeys does?
Is there any reason StructuredClone isn't/shouldn't be exposed on its own by the JS engine, for admittedly really niche use cases like this?
That's whatwg/html#793. Basically lack of implementer interest, presumably driven by lack of use cases.
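The ordering difference is easy to see in a small sketch: for-in visits only enumerable string keys but walks the prototype chain, while structured clone's property iteration follows own-key ([[OwnPropertyKeys]], i.e. Reflect.ownKeys) order:

```javascript
const proto = { inherited: 1 };
const obj = Object.create(proto);
obj.b = 2;
obj[0] = 3;

// for-in: own integer keys ascending, own string keys in insertion order,
// then keys from the prototype chain.
const forInKeys = [];
for (const k in obj) forInKeys.push(k); // ['0', 'b', 'inherited']

// Reflect.ownKeys: own keys only (and would also include non-enumerables
// and symbols, which for-in never reports).
const ownKeys = Reflect.ownKeys(obj);   // ['0', 'b']
```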
1. Set _that_.[[readableStreamController]] to _controller_.
1. Let _queue_ be _controller_.[[queue]].
1. Repeat for each Record {[[value]], [[size]]} _pair_ that is an element of _queue_,
    1. Set _pair_.[[value]] to ! <a abstract-op>StructuredClone</a>(_pair_.[[value]], _targetRealm_).
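A hypothetical polyfill fragment for these steps, using `structuredClone` as a stand-in for the spec operation (`cloneQueueInto` is an illustrative name):

```javascript
// At transfer time, each queued record's [[value]] is cloned again into the
// target realm. This clone can't throw: every value here is itself the
// result of an earlier StructuredClone performed at enqueue time.
function cloneQueueInto(queue /*, targetRealm */) {
  return queue.map(pair => ({
    value: structuredClone(pair.value),
    size: pair.size,
  }));
}

const queue = [{ value: { a: 1 }, size: 1 }];
const clonedQueue = cloneQueueInto(queue);
```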
This should use ?, not !, as StructuredClone could throw
It actually can't at this point, because every [[value]] here is already the result of a previous StructuredClone call. The only call to StructuredClone that could throw is at enqueue time.
@@ -3589,11 +3634,13 @@ throughout the rest of this standard.
 </emu-alg>

 <h4 id="enqueue-value-with-size" aoid="EnqueueValueWithSize" throws>EnqueueValueWithSize ( <var>queue</var>,
-<var>value</var>, <var>size</var> )</h4>
+<var>value</var>, <var>size</var>, <var>targetRealm</var> )</h4>
It seems nicer to me if EnqueueValueWithSize stays as a naive implementation of the queue-with-sizes data structure, and the structured cloning happens elsewhere. Is doing that much uglier?
Well for one, it would mean reimplementing the same code for WritableStream eventually, so I guess there could be a wrapper both RS and WS use to enqueue? But then there would be...a single user of EnqueueValueWithSize that didn't use the wrapper, WritableStreamDefaultControllerClose.
Oh, right, I also had a preference for throwing the RangeError for invalid strategy return values before potentially throwing the DataCloneError.
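A sketch of how that shared wrapper could look, under the assumptions discussed (names like `CloneAndEnqueueValueWithSize` are hypothetical; `structuredClone` stands in for the spec's StructuredClone operation):

```javascript
// Naive queue-with-sizes operation, kept clone-free as the review suggests.
function EnqueueValueWithSize(container, value, size) {
  size = Number(size);
  if (!Number.isFinite(size) || size < 0) {
    throw new RangeError('Invalid size');
  }
  container.queue.push({ value, size });
  container.queueTotalSize += size;
}

// Hypothetical wrapper that both ReadableStream and WritableStream could
// share. It validates the strategy's size first, so an invalid size throws
// RangeError before an uncloneable chunk can throw a clone error.
function CloneAndEnqueueValueWithSize(container, value, size /*, targetRealm */) {
  if (!Number.isFinite(Number(size)) || Number(size) < 0) {
    throw new RangeError('Invalid size');
  }
  const clone = structuredClone(value); // may throw for uncloneable chunks
  return EnqueueValueWithSize(container, clone, size);
}

const container = { queue: [], queueTotalSize: 0 };
CloneAndEnqueueValueWithSize(container, { a: 1 }, 2);
```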
@@ -12,6 +14,7 @@ const { AcquireWritableStreamDefaultWriter, IsWritableStream, IsWritableStreamLo

 const InternalCancel = Symbol('[[Cancel]]');
 const InternalPull = Symbol('[[Pull]]');
+const InternalTranfer = Symbol('[[Transfer]]');
typo Tran[s]fer
@@ -251,6 +257,33 @@ class ReadableStream {
     const branches = ReadableStreamTee(this, false);
     return createArrayFromList(branches);
   }

+  [InternalTranfer](/* targetRealm */) {
I think in code I would reify the realms as their globals.
This would allow you to write web platform tests using iframes (although they might not do so great in our test runner... but maybe they would!)
      throw new TypeError('Only cloning streams are transferable');
    }
    /* can't exactly polyfill realm-transfer */
    const that = new ReadableStream();
So here this would be new targetRealm.ReadableStream()
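A sketch of the "reify realms as their globals" suggestion (the `transferInto` name is illustrative; in a browser test `targetRealmGlobal` could be an iframe's `contentWindow`, while here `globalThis` stands in just to show the shape):

```javascript
// The transfer routine receives the target realm's global object and
// constructs the transferred stream with that realm's classes, so the new
// stream's prototype chain belongs to the target realm.
function transferInto(targetRealmGlobal) {
  return new targetRealmGlobal.ReadableStream({
    start(controller) {
      controller.close(); // placeholder: real code would wire up the source
    },
  });
}

const that = transferInto(globalThis); // stand-in for a second realm's global
```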
@@ -1447,6 +1478,12 @@ Instances of {{ReadableStreamDefaultController}} are created with the internal s
   </tr>
 </thead>
 <tr>
+  <td>\[[targetRealm]]
This approach, of using the ReadableStreamDefaultController in one realm instead of just creating a new stream + controller pair, is very interesting, but not what I expected. Why did you choose that?
To be specific, I would have expected the algorithm to be something like
const reader = this.getReader();
const that = new ReadableStream({
  pull(c) {
    return reader.read().then(
      ({ value, done }) => {
        if (done) {
          c.close();
          return;
        }
        c.enqueue(StructuredClone(value));
      },
      e => c.error(e)
    );
  },
  // probably more
});
This approach just seemed intuitive to me.
- For one, I wanted to make throw-on-enqueue possible, so the clone call has to occur when a chunk first enters the controller's queue, not when it leaves it, as would happen in your algorithm.
- The underlyingSource still lives in the original realm, and has a reference to the controller object. (Conceptually, 'controller' is sort of a term for two different things that are conflated, a public API object provided to the underlyingSource and the hidden implementation details of the stream. The API object is stuck in the original realm of the underlyingSource, but the 'implementation details' of the stream are in charge of moving things between realms, and I was conceiving that as synonymous with a single controller, I guess, not a controller+a second, entirely invisible controller with no corresponding public API object.)
- I conceived of a transferred stream as a communications channel between two realms, but the destination realm could change when it's transferred repeatedly. You could also represent that as an ever-increasing chain of readablestreams that move messages between realms, and the implementations just do invisible optimizations as if those in between steps never happen, but this seemed a more accurate model of reality. Plus, what if those realms go away, or should go away, except for this one readablestream ferrying messages between two other realms keeping its event loop alive?
- I thought it would also be nice to have a stream that clones its chunks without having to transfer it to a worker and back again, and there's no need for a second readablestream at all in that case.
- It just seems like less code.
Can you elaborate? I'm not familiar with IndexedDB. This PR doesn't make streams cloneable anyway, trying to clone them is still an error as long as ReadableStream lacks a
Not necessarily. Why should a

And 'erroring the stream' still leaves the original question of what even to actually put in that storedError slot at all--
Right, that. Or a generic DataCloneError? We could attempt a StructuredClone of the error object actually, since it may or may not be uncloneable, and just place a DataCloneError in the slot if the clone attempt throws. ...Or we could do something weird and set it to the results of running
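One possible shape for the fallback being discussed, assuming `structuredClone` as a stand-in for the spec operation (the `cloneStoredError` helper name is hypothetical):

```javascript
// Try to clone the original error into the target realm; the error may or
// may not be cloneable, so fall back to a "DataCloneError" DOMException if
// the clone attempt throws.
function cloneStoredError(error) {
  try {
    return structuredClone(error);
  } catch {
    return new DOMException('stored error was not cloneable', 'DataCloneError');
  }
}
```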
That I wasn't aware of, neat.
IndexedDB uses structured clone, kind of. But then it writes to disk, instead of recreating the object in another realm. So the current structure of StructuredClone is kind of broken for it.
Good point. Maybe it is even more fine then.
I said error the destination stream, not the original being-cloned stream.
Yes, we'd definitely try to clone it at first. I guess "DataCloneError" DOMException is better if that fails.
Okay, I'm semantically confused. I'm considering it conceptually the same stream, 'transferred' to a new realm. Same underlyingSource and everything. The original ReadableStream object is 'detached' and an empty shell of its former self. So we're both talking about the exact same stream here and I don't really understand your reply.
No, they're definitely not the same stream. The underlying data is what gets transferred, but there are two different ReadableStream objects, and thus two different streams.
Although I still am unsure on what model we want to pursue here, and I should spend more time thinking about it, I was talking with a coworker today who was very interested in this use case. To demonstrate to him exactly how it would work I produced two gists:
I thought they'd be worth dropping in the thread for the community to see.
I've been thinking about it a bit. The transferred stream ends up with a different strategy. This is weird, but I think it's unavoidable.

For byte streams it's not too bad. The HWM may be different from the original stream's; it's attractive to me to let the browser set it to some "optimal" value which depends on the transfer overhead. For other streams I guess we just end up with the equivalent of CountQueuingStrategy(1). Not great. We could recognise the original values CountQueuingStrategy.prototype.size and ByteLengthQueuingStrategy.prototype.size and treat them specially, but that seems like such a terrible idea that I don't want to pursue it.
I think that might not be so bad... For example, consider a readable stream with HWM = 5 in a worker that gets transferred to the main thread, where it ends up with HWM = 1. The HWM isn't observable from the main thread, since all you have is a reader. Meanwhile, the stream creator over in the worker thread is still getting appropriate backpressure signals, because they still see HWM = 5.

A writable stream is a bit more troubling. Assuming the same setup, the producer in the main thread will always see desiredSize = 1. I wonder if we should consider some kind of asynchronous proxying of desiredSize over to the main thread... that makes things two-directional, which seems weird. This reminds me of when we were discussing whether the caller of getWriter() should determine the HWM, instead of the writable stream creator.

The fact that readables and writables are so different here implies to me I might be missing something. Thoughts welcome.
The underlyingSource of the transferred stream could only see backpressure if its queue isn't being dequeued from, and it fills up to the HWM=5. So for the second case, after writing one chunk, why wouldn't the main thread continue to see

The producer realm has to somehow know when not to dequeue something and send it off to the other side. It can't just send chunks off without some kind of ready signal. Isn't it the same in both cases? When the underlyingSink's queue is full, the extra (transferred) single-chunk queue will fill up with one chunk and stay full until it gets the ready signal.
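A tiny sketch of the desiredSize observability point in this exchange: unlike a readable stream, a writable stream's producer can see the high-water mark directly through the writer, so a transferred stream that ends up with HWM = 1 is visibly different to the producer. (This uses the standard WritableStream API; the never-finishing sink is just a device to keep a chunk counted against the queue.)

```javascript
const ws = new WritableStream(
  { write() { return new Promise(() => {}); } }, // sink never finishes a write
  new CountQueuingStrategy({ highWaterMark: 1 })
);
const writer = ws.getWriter();
const before = writer.desiredSize; // 1: room for one chunk
writer.write('chunk');             // chunk now counted against the queue
const after = writer.desiredSize;  // 0: backpressure is signalled
```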
We've implemented transferable streams in a different way, so I'm closing this. |
See also #276.
As @domenic mentioned in #244 (comment), this effort seems more or less blocked on whatwg/html#935, but this at least gives a general idea of the parts it would need to touch.