"Automatic" use of scalable video coding? #149

aboba · 2014-09-10T22:07:39Z

Currently within the ORTC API specification, Section 9.9 provides examples of how to set up encoding parameters to support simulcast and/or scalable video coding (SVC).

While some developers might be interested in controlling exactly how simulcast and SVC are used in their applications, other developers would probably be happier if they could leave that to the browser. Looking at the current specification, it appears to me that SVC configuration within an RTCRtpReceiver might be unnecessary for some video codecs, and in addition, it might be possible to dispense with SVC configuration within an RTCRtpSender in some cases as well.

Below is my understanding of how "automatic" use of SVC works on the RTCRtpReceiver and how it might work on an RTCRtpSender. Comments/corrections/suggestions welcome.

In situations where a compliant decoder can decode any valid encoding, it would appear to me that it is not necessary to set up the SVC configuration within RTCRtpReceiver.receive(). Essentially, the decision whether to utilize scalable video coding can be left to the sender. If the receiver can handle anything that the sender can send, there isn't even a need for negotiation, such as an exchange of capabilities.

To give a practical example, if a VP8 decoder can decode any valid VP8 encoding, including temporal scalability, it seems to me that an RTCRtpReceiver would not need an SVC layering configuration within RTCRtpEncodingParameters. In the event that a layering configuration is provided (e.g. two temporal layers are expected), the RTCRtpSender should still be free to send something else (e.g. maybe only 1 temporal layer, or perhaps 3) without a resulting error. So it seems to me that for the RTCRtpReceiver, configuration of SVC layering is somewhat extraneous. Also, I'm not clear about the usefulness of having RTCRtpReceiver.getCapabilities return a value for RTCRtpCodecCapability.maxTemporalLayers. Where maxTemporalLayers is not set, the default interpretation could be "I can handle the maximum temporal layers supported by the codec."
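The default interpretation suggested above could be sketched roughly as follows. This is only an illustration: the `CodecCapability` interface, the `effectiveMaxTemporalLayers` helper and the `codecLimit` argument are hypothetical names, not part of the ORTC API, and the codec's own limit is assumed to be supplied by the caller in the same units as maxTemporalLayers.

```typescript
// Hypothetical shape: only the field under discussion is modeled here.
interface CodecCapability {
  name: string;
  // Left unset to mean "whatever the codec itself supports".
  maxTemporalLayers?: number;
}

// Resolve how many temporal layers a receiver should be assumed to handle.
// codecLimit is the maximum defined by the codec, provided by the caller.
function effectiveMaxTemporalLayers(
  cap: CodecCapability,
  codecLimit: number
): number {
  if (cap.maxTemporalLayers === undefined) {
    // No explicit value: fall back to the codec's own maximum.
    return codecLimit;
  }
  // An explicit value can only narrow the codec's limit, never widen it.
  return Math.min(cap.maxTemporalLayers, codecLimit);
}
```

With this reading, a receiver that never sets maxTemporalLayers imposes no limit of its own, which matches the idea that the layering decision can be left to the sender.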

For an RTCRtpSender, it does seem useful for the developer to be able to indicate whether to use temporal scalability or not. For example, in peer-to-peer communication the overhead of SVC might not make sense, so it might be useful to be able to specify only a single layer in RTCRtpSender.send(). On the other hand, there might be situations where the developer would just as soon leave the decision to use SVC up to the browser. Rather than trying to adjust the number of temporal layers within the application, the browser could decide how many layers might make sense.

Currently within RTCRtpEncodingParameters, it doesn't appear that a developer can indicate to the browser "Send SVC if you think it is useful". For example, within the RTCRtpEncodingParameters dictionary there is no "maxTemporalLayers" attribute. All you have is encodingId and a sequence of dependencyEncodingIds.

It would seem desirable to me to be able to indicate to an RTCRtpSender "utilize temporal scalability if you think it makes sense" without having to specify the encodingId and dependencyEncodingIds.

dictionary RTCRtpEncodingParameters {
    unsigned long       ssrc;
    payloadtype         codecPayloadType;
    RTCRtpFecParameters fec;
    RTCRtpRtxParameters rtx;
    double              priority = 1.0;
    double              maxBitrate;
    double              minQuality = 0;
    double              framerateBias = 0.5;
    double              resolutionScale;
    double              framerateScale;
    boolean             active = true;
    DOMString           encodingId;
    sequence<DOMString> dependencyEncodingIds;
};
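To make concrete what a developer has to spell out today, here is a rough sketch of building a temporal layering by hand from encodingId and dependencyEncodingIds. The `EncodingParameters` interface is a trimmed-down stand-in for the dictionary above, `temporalEncodings` is a hypothetical helper, and having each layer depend on all lower layers is an assumption made for illustration.

```typescript
// Trimmed-down stand-in for RTCRtpEncodingParameters (layering fields only).
interface EncodingParameters {
  encodingId?: string;
  dependencyEncodingIds?: string[];
}

// Build one base layer plus `extensionLayers` temporal extension layers.
// Assumption: each extension layer lists every lower layer as a dependency.
function temporalEncodings(extensionLayers: number): EncodingParameters[] {
  const encodings: EncodingParameters[] = [];
  const ids: string[] = [];
  for (let i = 0; i <= extensionLayers; i++) {
    const id = String(i);
    const enc: EncodingParameters = { encodingId: id };
    if (i > 0) {
      // Copy the ids seen so far; the base layer has no dependencies.
      enc.dependencyEncodingIds = ids.slice();
    }
    encodings.push(enc);
    ids.push(id);
  }
  return encodings;
}

// Two extension layers on top of the base layer:
// ids "0", "1", "2", with "2" depending on "0" and "1".
const threeLayers = temporalEncodings(2);
```

A "maxTemporalLayers"-style hint on the sender would let the browser derive a structure like this itself instead of the application enumerating every layer.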

aboba · 2014-09-13T16:26:50Z

Suggestion from Peter:
http://lists.w3.org/Archives/Public/public-ortc/2014Sep/0012.html

On the receive side, if certain parameters don't need to be known,
then I would say that they don't need to be set. Hopefully someday
the header extensions will have enough information that nothing will
be needed other than some IDs.

But I don't think the sender should send more than one layer unless
the parameters say to. Anything to simplify things for convenience
purposes can be done via a library, so it doesn't seem that important
to make it too automatic for the sender.

aboba · 2014-10-02T21:52:18Z

For the behavior on the receiver side, here is some proposed text for Section 9.8.1:

encodingId of type DOMString
An identifier for the encoding object. This identifier should be unique within the scope of the localized sequence of RTCRtpEncodingParameters for any given RTCRtpParameters object. For a codec (such as VP8) where a compliant decoder is required to be able to decode anything that an encoder can send, it is not necessary to specify the expected scalable video coding configuration on the receiver via use of encodingId (or dependencyEncodingIds). Where specified for a receiver, the expected layering is ignored.

…3c#146
Address questions about RTCRtpCodecCapability.preferredPayloadType, as noted in: Issue w3c#147
Address questions about RTCRtpSender.setTrack() error handling, as noted in: Issue w3c#148
Partially address 'automatic' use of scalable video coding (in RTCRtpReceiver.receive()) as noted in: Issue w3c#149
Renamed RTCIceListener to RTCIceGatherer as noted in: Issue w3c#150
Added text on multiplexing of STUN, TURN, DTLS and RTP/RTCP, as noted in: Issue w3c#151
Address issue with queueing of candidate events within the RTCIceGatherer, as noted in: Issue w3c#152
Clarify behavior of RTCRtpReceiver.getCapabilities(kind), as noted in: Issue w3c#153

aboba · 2014-10-22T01:52:48Z

Based on the discussion at the October 17, 2014 ORTC CG meeting, I believe the agreed direction was to add a clarification that a browser can at its discretion send fewer layers than are specified in RTCRtpEncodingParameters, but never more. So if a developer enables use of SVC by configuring X temporal extension layers beyond the base layer, where 0 < X <= maxTemporalLayers, then the browser can send up to X+1 layers and as few as 1 (only the base layer).
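The rule above can be sketched as a small check. This is a non-normative illustration: `permittedLayerRange` and `clampLayersToSend` are hypothetical helpers, not ORTC API, and they only restate the "up to X+1, as few as 1" bound.

```typescript
// Range of total layers (base + extensions) a sender may emit when the
// developer configured X temporal extension layers, 0 < X <= maxTemporalLayers.
function permittedLayerRange(
  configuredExtensions: number,
  maxTemporalLayers: number
): { min: number; max: number } {
  const isWhole = Math.floor(configuredExtensions) === configuredExtensions;
  if (!isWhole || configuredExtensions <= 0 || configuredExtensions > maxTemporalLayers) {
    throw new RangeError("expected 0 < X <= maxTemporalLayers");
  }
  // The base layer is always sent; at most all X extension layers are added.
  return { min: 1, max: configuredExtensions + 1 };
}

// Clamp the number of layers the browser would like to send into that range.
function clampLayersToSend(
  desired: number,
  configuredExtensions: number,
  maxTemporalLayers: number
): number {
  const range = permittedLayerRange(configuredExtensions, maxTemporalLayers);
  return Math.min(Math.max(desired, range.min), range.max);
}
```

For example, with X = 2 the browser may settle on 1, 2 or 3 layers, and a wish to send 5 layers would be cut back to 3.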

aboba · 2014-11-16T19:11:32Z

Proposed text for Section 9.8.1:

For a codec (such as VP8) where a compliant decoder is required to be able to decode anything that an encoder can send, it is not necessary to specify the expected scalable video coding configuration on the receiver via use of encodingId (or dependencyEncodingIds). Where specified for a receiver, the expected layering is ignored. A sender MAY send fewer layers than what is specified in RTCRtpEncodingParameters, but MUST NOT send more.

robin-raymond added the 1.1 label Sep 25, 2014

robin-raymond mentioned this issue Oct 14, 2014

Latest ORTC specification #156

Merged

robin-raymond closed this as completed Jan 12, 2015
