
Weighted flows of datagrams and short-lived streams #419

Closed
jan-ivar opened this issue Sep 21, 2022 · 19 comments

@jan-ivar (Member)

TPAC TL;DR: fixing stream/datagram priority requires addressing a category problem: "datagrams take priority" means all datagrams over all streams, whereas "this stream takes priority" means over some other streams and sometimes datagrams.

#62 tries to solve two different things with the latter: 1) bandwidth allocation/starvation between long-lived streams (and datagrams), and 2) strict send-order within a media-flow of short-lived segment/frame streams. How can this work in concert?

Future apps may have multiple data flows (composed of datagrams or short-lived streams), and need a way to manage their relative importance and speed (weight). Sending less works poorly, and setting the priority of every datagram would be silly.

Proposal F from TPAC proposed new high-level objects to address this. With feedback from @wilaw, we've iterated on it:

const streamFlow = await wt.createStreamFlow({weight: 2.0});
for await (const segment of mediaSegments) {
  const sls = await streamFlow.createShortLivedStream();
  if (warp) sls.jumpToHeadOfFlowSendQueue(); // bikeshed
  segment.pipeTo(sls).catch(() => {});
}

const datagramFlow = await wt.createDatagramFlow({weight: 1.0});
const writer = datagramFlow.writable.getWriter();
for (const datagram of datagrams) {
  await writer.ready;
  writer.write(datagram).catch(() => {});
}

The app can rebalance over time: streamFlow.weight = 1.0; datagramFlow.weight = 2.0;
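To picture what weight could mean here, consider a hypothetical sketch in which, each scheduling round, an active flow's share of the send budget is proportional to weight / totalWeight. `allocateRound`, the flow shapes, and the proportional-share rule are all illustrative, not part of any proposed WebTransport API:

```javascript
// Hypothetical sketch of weighted flow scheduling: each round, an
// active flow (one with queued data) is granted a share of the byte
// budget proportional to weight / totalWeight. Illustrative only.
function allocateRound(flows, bytesAvailable) {
  const active = flows.filter(f => f.queuedBytes > 0);
  const totalWeight = active.reduce((sum, f) => sum + f.weight, 0);
  const grants = new Map();
  for (const f of active) {
    const share = Math.floor(bytesAvailable * (f.weight / totalWeight));
    grants.set(f.name, Math.min(share, f.queuedBytes));
  }
  return grants;
}

const grants = allocateRound([
  { name: 'streamFlow',   weight: 2.0, queuedBytes: 10_000 },
  { name: 'datagramFlow', weight: 1.0, queuedBytes: 10_000 },
], 3000);
// streamFlow gets a 2000-byte grant, datagramFlow a 1000-byte grant
```

Rebalancing the weights at runtime would simply change the shares computed in subsequent rounds.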

There's no order property, instead the app calls a method to e.g. keep i-frames ahead of p-frames:

pFrame.jumpToHeadOfFlowSendQueue();
iFrame.jumpToHeadOfFlowSendQueue();

There's a couple of ways to skin this, but the idea is long-lived objects with weight, and short-lived objects with order. Thoughts?

@jan-ivar changed the title from "Weighting flows of datagrams and short-lived streams" to "Weighted flows of datagrams and short-lived streams" Sep 21, 2022
@vasilvv (Contributor) commented Sep 27, 2022

I'm not sure I understand why we would want to have a separate stream flow object, instead of changing priorities on the stream itself.

I'm still not sure I understand the use case for weights -- for all use cases discussed, absolute priorities seemed to be the correct fit.

@jan-ivar (Member, Author) commented Sep 27, 2022

Here's a use-case: I'm streaming two live video feeds to a server over WebTransport (two flows) simultaneously:
A) from my camera, and B) from screen-sharing motion-heavy content (e.g. a first-person game).

The screen-sharing resolution is much higher than the camera, so even at the same frame rate, B has twice the amount of data to send per second compared to A.

To cover the simplest case first: I don't care about head-of-line blocking, and I simply pipe encoded data down two streams A and B. How does the user agent know to dequeue B at a faster rate than A without being told?

In the streams spec, back-pressure means the rate of a stream is determined by how fast it is consumed. A user agent can't easily look upstream to determine where there is more data to send, to send it faster. Having apps build up exaggerated queues and/or chunk sizes, hoping to influence some implementation's send-rate doesn't bode well for web compat.

To cover the advanced case: I care about head-of-line blocking, and I have to manage two flows of thousands of (either datagrams or) short-lived-streams (SLS) I continually create.

Here each SLS competes for bandwidth by default, which means in theory I could create twice as many streams for B vs A, to influence rate. But at significant app cost. It would seem a lot more natural for each SLS to map 1-1 to either a frame or segment. And assuming the frame-rate of my camera and my game are the same, it means I'd likely create the same number of SLS's for the two flows. Which again gets us to: How does the user agent know to dequeue streams from flow B at a faster rate than streams from flow A without being told? Here it is even harder to look "upstream" as the user agent has no context.

Some weight/rate-control seems useful, and I see no practical way to polyfill something like this.

@jan-ivar (Member, Author)

Let's see if we can make do with strict priorities at the stream level:

[diagram: strict per-stream priorities, with short-lived video streams A1, B1, A2, B2 alongside a long-lived file-upload stream F]

With strict priorities alone, there's no way to express wanting to interleave sending of (some) data from long-lived stream F with data from the short-lived streams A1, B1, A2, B2, etc.

The file upload would stall until the user agent has drained all queued stream segments, which, depending on the app, might require it to hold off creating new streams from segments for a period, even when it has data available ahead of time.

E.g. a file upload might work just fine when both videos are real-time, but not if the app switches to a pre-recorded one.

IOW, the app controls the file upload window not by how much data it is streaming simultaneously for other stuff, but by how much ahead it buffers on send, which sounds similar but isn't.

@jan-ivar (Member, Author)

Here's the same setup interacting with a high-priority control-channel instead of a low-priority file-upload:

[diagram: the same setup, with a high-priority control channel in place of the low-priority file upload]

The app would have to know to never send anything significant in size over the control channel, or video will stutter.

@jan-ivar (Member, Author)

Writing the diagrams helped me appreciate the importance of being able to set streams to the same priority, which makes jumpToHeadOfFlowSendQueue() insufficient. Maybe we start order at 0 and decrement by default?

@LPardue commented Sep 29, 2022

Probably repeating myself, sorry, but in extensible priorities the suggestion ends up working out to be something like

  1. when incremental is false, streams of the same urgency are served serially, in order of their creation
  2. when incremental is true, streams of the same urgency are served in an interleaved, round-robin fashion
  3. urgency levels are served in order, from lowest (0) to highest (7)
  4. if an urgency level has both non-incremental and incremental streams, serve the non-incremental ones first, then move on to interleaving the rest. Think of these as two "groups" within an urgency level.
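A minimal sketch of those four rules as a sort comparator (plain objects and illustrative names, not any real API; note a static sort can only group incremental streams of equal urgency together, and a real scheduler would then round-robin within that group):

```javascript
// Hypothetical comparator: lower urgency values are served first
// (rule 3); within an urgency level, non-incremental streams come
// before incremental ones (rule 4); creation order breaks ties
// (rule 1). Incremental streams of equal urgency end up adjacent,
// to be round-robined by the scheduler (rule 2).
function comparePriority(a, b) {
  if (a.urgency !== b.urgency) return a.urgency - b.urgency;
  if (a.incremental !== b.incremental) return a.incremental ? 1 : -1;
  return a.creationIndex - b.creationIndex;
}

const streams = [
  { name: 'A', urgency: 1, incremental: false, creationIndex: 0 },
  { name: 'B', urgency: 0, incremental: true,  creationIndex: 1 },
  { name: 'C', urgency: 0, incremental: false, creationIndex: 2 },
];
streams.sort(comparePriority); // served order: C, B, A
```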

There are use cases for webtransport where order of stream creation is not suitable. This might mean LIFO, or it could just mean FIFO with the odd case of "stick this stream at the start of my urgency level". This could be supported by augmenting extensible priorities model with an order parameter (say a u32 number) such that it creates a new "group" in an urgency level, that might operate something like

  1. serve the "ordered, non-incremental group" first, from lowest order value to highest order value
  2. serve the "ordered, incremental group" next, in round-robin (augmented by order perhaps)
  3. serve the non-incremental group, in stream creation order
  4. serve the incremental group, in round-robin

This way, streams with an explicit order are served deterministically whatever way an application would like. Stream A (urgency=0, order=10) would get served after Stream B (urgency=0, order=1). Explicit orders also work with urgency levels; Stream C (urgency=6, order=10), would get served ahead of Stream D (urgency 7, order=1).

If there are "ordered, non-incremental" streams with the same urgency and same order value, the tie-breaker would become the stream creation order (because it is guaranteed unique).
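The grouped scheme could likewise be sketched as a comparator (again plain stand-in objects and illustrative names; round-robin within the incremental groups is left out):

```javascript
// Sketch of the four groups within an urgency level. A stream is
// { urgency, incremental, order, creationIndex }, where order is
// undefined when no explicit order was set. Illustrative only.
function groupRank(s) {
  const ordered = s.order !== undefined;
  if (ordered) return s.incremental ? 1 : 0; // ordered groups first
  return s.incremental ? 3 : 2;              // then unordered groups
}

function compareOrdered(a, b) {
  if (a.urgency !== b.urgency) return a.urgency - b.urgency; // levels in order
  const ra = groupRank(a), rb = groupRank(b);
  if (ra !== rb) return ra - rb;
  if (ra <= 1 && a.order !== b.order) return a.order - b.order;
  return a.creationIndex - b.creationIndex; // guaranteed-unique tie-breaker
}

// Stream A (urgency=0, order=10) is served after B (urgency=0, order=1):
const A = { urgency: 0, incremental: false, order: 10, creationIndex: 0 };
const B = { urgency: 0, incremental: false, order: 1,  creationIndex: 1 };
// compareOrdered(A, B) > 0

// Stream C (urgency=6, order=10) is served ahead of D (urgency=7, order=1):
const C = { urgency: 6, incremental: false, order: 10, creationIndex: 2 };
const D = { urgency: 7, incremental: false, order: 1,  creationIndex: 3 };
// compareOrdered(C, D) < 0
```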

@kixelated commented Sep 29, 2022

I'm glad this is getting traction. I wanted to start with my train of thoughts without any concrete recommendation:

In an ideal world, the application could choose the stream/datagram contents for every packet so it can implement any prioritization scheme. In theory, the QUIC library could fire a callback for each packet to ask the application which datagram/stream it should send next (I think quiche works like this?). I imagine this would be a performance bottleneck for the web given it's quite a low-level API.

An alternative is to have the application maintain a priority queue (heap) that can be queried by the QUIC library for each packet. This is effectively how Warp works! The stream order is set at creation time but it could certainly be updated dynamically, like the jumpToHeadOfFlowSendQueue() method.

The priority queue will retain the same order until updated by the application. This will starve streams/datagrams at the end of the queue, which is useful when an application is designed with this in mind (Warp). However this could be a problem when the application is not designed with starvation in mind.

To avoid starvation, the application could reorder the priority queue after each packet is sent, or some amount of data is sent, or some amount of time has elapsed. This also sounds like a potential performance bottleneck but it could be done asynchronously with mixed results.

The role of weights is to allow the QUIC library to automatically pop/push the priority queue without this constant feedback to the application. These rules could be expressed in ways other than weights (ex. byte_per_pop). I wonder if there's inspiration to be had from other priority queue APIs.
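As one purely illustrative version of that idea, a deficit-round-robin sketch where a flow's per-round byte quantum scales with its weight, letting the library pop the queue without consulting the application per packet (names and shapes are invented for the example, not any real QUIC library API):

```javascript
// Deficit round robin sketch: each round a flow accrues credit
// proportional to its weight, then sends whole packets while its
// deficit covers the packet at the head of its queue.
function drrSchedule(flows, quantum, rounds) {
  const sent = [];
  const deficits = flows.map(() => 0);
  for (let r = 0; r < rounds; r++) {
    flows.forEach((flow, i) => {
      deficits[i] += quantum * flow.weight;
      while (flow.queue.length && deficits[i] >= flow.queue[0].size) {
        deficits[i] -= flow.queue.shift().size;
        sent.push(flow.name);
      }
    });
  }
  return sent;
}

const sent = drrSchedule([
  { name: 'A', weight: 2, queue: [{ size: 1000 }, { size: 1000 }, { size: 1000 }] },
  { name: 'B', weight: 1, queue: [{ size: 1000 }, { size: 1000 }] },
], 1000, 1);
// → ['A', 'A', 'B']
```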

A separate goal of mine is the ability to write the prioritization scheme to the wire (ex. extensible priorities). That way proxies could maintain the same prioritization regardless of where the congestion occurs. This can be done at the application layer, but maybe it's something that could be sent at the WebTransport layer instead. However, this would rule out any arbitrary prioritization APIs.

@LPardue commented Sep 29, 2022

A separate goal of mine is the ability to write the prioritization scheme to the wire (ex. extensible priorities). That way proxies could maintain the same prioritization regardless of where the congestion occurs. This can be done at the application layer, but maybe it's something that could be sent at the WebTransport layer instead. However, this would rule out any arbitrary prioritization APIs.

Agree. That's a question I asked a while back, but I think we forgot to record it anywhere in WebTransport land. A similar discussion was kicked off in IETF land when discussing another "client upload oriented" protocol: httpwg/http-extensions#2242. The HTTP spec is for bulk uploads; Kazuho highlights that the congestion bottleneck is probably between a user agent and the intermediary, rather than between any intermediary and the backend, so onward prioritization is maybe not that useful. But latency-sensitive workloads like Warp might be different. Would be interesting to know if you have any data on that.

@jan-ivar (Member, Author) commented Oct 11, 2022

I'm glad this is getting traction. ...

Thanks, but that might be optimistic, as I'm not seeing responses from browser implementers (aside from my effort to drive an understanding of the problem). I merely meant to highlight a problem with adding a simple scheme:

With strict priorities alone, there's no way to express wanting to interleave sending of (some) data from long-lived stream F with data from the short-lived streams A1, B1, A2, B2, etc.

Lucas captures that we have two kinds of streams: incremental vs. non-incremental. He's also correct that the property of weighting isn't inherently needed to prioritize between them (if we don't want weighting); it can be any second number.

OT1H, this might conservatively inform a minimal surface needed (using his names, as it seems premature to bikeshed):

const stream = await wt.createUnidirectionalStream({urgency: 0, order: 10});

Since I sense reluctance to adding a lot of new API surface at this point, this might be one avenue to consider.

OTOH, this might be surfacing a class of problems with non-incremental streams, such as:

  1. having .getStats on them seems kinda useless, wrong level of inquiry
  2. a streams abstraction on what are basically large chunks comes at a cost, e.g. chunks aren't transferable (see #420: "Sharing a WebTransport connection in multiple workers?")

I think which avenue folks prefer might come down to how close to the finish line they think we are. I don't sense a lot of interest to bring priorities to the wire, but I could be wrong.

@LPardue commented Oct 11, 2022

Re: wire. We should probably ask the IETF group. I'll try to poke the list this week

@jan-ivar (Member, Author) commented Oct 12, 2022

Meeting:

  • Maybe wait? Hard to get priority APIs right.
  • File an issue on specifying default behavior since there's some signs Chrome and Firefox default behaviors diverge significantly.
  • Some are not super enthusiastic about defining priority signaling for WebTransport.
  • Maybe put together some asks for next IETF115 WebTransport?

@wilaw added the "Discuss at next meeting" label Oct 19, 2022
@jan-ivar (Member, Author)

Meeting:

  • Media over QUIC and Warp seem to rely on having some priority scheme in place.
  • Nothing in QUIC to hang this on. Bringing it up in IETF115 would be good.

@kixelated

Apologies for not replying sooner.

We're combining some of the MoQ efforts: Warp + RUSH + QUICR. I just put out a new base draft. This draft requires WebTransport and strongly recommends prioritization.

One of the concepts is the intended "delivery order" of media segments. I tried to keep this separate from the underlying QUIC prioritization so weighting or resetting streams could be used. However, you're going to get the best results if you can tell the underlying QUIC library the desired delivery order of streams.

@kixelated commented Oct 25, 2022

Some thoughts on the API design.

I do like being able to specify the prioritization of each stream individually. It's simple and could be easily sent over the wire. However, I don't think it works with multiple schemes; it's not clear how a sender should behave if one stream uses weights and another uses delivery order. You either have to enforce a single scheme, or do something else.

If multiple schemes are desired, I think it's necessary to create nested "priority groups" of streams in the application. I'm not a JavaScript API designer but here's my suggestion:

Strict delivery order:

transport.priority = [ audio, video, transport.datagrams ]

Weighted delivery:

transport.priority = [ { weight: 2.0, value: audio }, { weight: 0.5, value: video }, { weight: 1.5, value: transport.datagrams } ]

Strict then weighted delivery:

transport.priority = [ audio, [ { weight: 2.0, value: video }, { weight: 1.0, value: transport.datagrams } ] ]

Weighted then strict delivery:

transport.priority = [ { weight: 2.0, value: audio }, { weight: 1.0, value: [ audio, datagrams ] } ]

Note: audio and video are stream objects.

To avoid misconfiguration and to signal browser support, I would recommend some sort of container like PriorityWeighted that requires pushing weight and priority-group pairs. If none of the streams in the root priority group have pending data, then WebTransport would round-robin all streams as the default.

This API would be easy to extend. In fact you could prioritize based on stream offset with something like:

transport.priority = [ PriorityOffset(video, 0, 100), audio, video ]

That would deliver the first 100 bytes of the video stream (metadata), then audio, and then the rest of video.

The main downside with this API is that it could get cumbersome for the application to maintain. It doesn't seem that annoying to insert each stream after creation, but removing streams once they've been closed would be tedious, though possible to automate.
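To make the semantics concrete, here is one hedged sketch of how such a nested spec could be evaluated per packet. Everything here is invented for illustration: streams are stand-in objects with a `pending` flag, an array whose entries all carry a weight is treated as a weighted group (realized as smoothed weighted round-robin), any other array is a strict group scanned left to right, and weighted entries are assumed to hold streams rather than nested groups to keep the sketch short.

```javascript
// Pick the next stream to send from, given a nested priority spec.
function nextStream(group) {
  if (group.length && group.every(e => e.weight !== undefined)) {
    // Weighted group (smoothed weighted round-robin): grow each
    // pending entry's credit by its weight, pick the largest credit,
    // then charge the winner the total weight added this call.
    let best = null, total = 0;
    for (const e of group) {
      if (!e.value.pending) continue;
      e.credit = (e.credit ?? 0) + e.weight;
      total += e.weight;
      if (!best || e.credit > best.credit) best = e;
    }
    if (!best) return null;
    best.credit -= total;
    return best.value;
  }
  for (const e of group) { // strict group: first entry with data wins
    const s = Array.isArray(e) ? nextStream(e) : (e.pending ? e : null);
    if (s) return s;
  }
  return null;
}

const audio = { name: 'audio', pending: true };
const video = { name: 'video', pending: true };
const spec = [{ weight: 2.0, value: audio }, { weight: 1.0, value: video }];
const picks = Array.from({ length: 6 }, () => nextStream(spec).name);
// over 6 picks, audio is chosen twice as often as video (4 vs 2)
```

A real design would also need to handle nested groups inside weighted entries and the removal of closed streams, which is exactly the maintenance burden noted above.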

@LPardue commented Oct 25, 2022

That sounds like H2 priority, and we learned in H2 that defining Turing-complete machinery to implement different priority schemes is great in theory but meets implementation barriers. :D

I'd be more supportive of a single scheme that is opinionated but extensible within reason, and in future somebody can invent a means to negotiate schemes if they really, really had to diverge.

@kixelated commented Oct 25, 2022

Yeah, writing a full prioritization scheme to the wire just seems impractical, and using data structures is not too different.

The other option is to leverage existing Turing-complete machinery... JavaScript!

// strict order; evaluated left to right
transport.priority = [ audio, video, transport.datagrams ]

// "round-robin" delivery; vary the delay to implement weights
setInterval(() => {
  transport.priority.push(transport.priority.shift())
}, 50)

It's not perfect since JavaScript has an incomplete and infrequent view of the QUIC state, but I think it's passable as a first API. Some callbacks, or the ability to access the current stream offset, would help. It could be extended by adding data structures like I mentioned.

@LPardue commented Oct 25, 2022

Agree this sort of thing could work. I guess it boils down to how much of a declarative vs. imperative style of API people want. (If they want an API at all!)

@kixelated I'm curious if you see the warp "delivery order" as external from the transport prioritization? As in, it acts as an input to data prioritization but in an indirect sort of way? Or in other words, the "delivery order" is an end-to-end property of the media while local data send scheduling is only a property of one transport hop. Or am I misreading?

@kixelated

@kixelated I'm curious if you see the warp "delivery order" as external from the transport prioritization? As in, it acts as an input to data prioritization but in an indirect sort of way? Or in other words, the "delivery order" is an end-to-end property of the media while local data send scheduling is only a property of one transport hop. Or am I misreading?

Pretty much. I don't think it's possible to mandate that every hop prioritize streams. The best we can do is write the delivery order on the wire and hope there's prioritization on the hops with congestion.

For example, for web ingest today I would still write the delivery order on the stream, even though WebTransport doesn't implement prioritization. But maybe a later hop does and can use the delivery order to prioritize.

@wilaw removed the "Discuss at next meeting" label Nov 2, 2022
@jan-ivar (Member, Author) commented Jan 4, 2023

Overtaken by #431.

@jan-ivar closed this as completed Jan 4, 2023
@jan-ivar closed this as not planned Apr 12, 2023