
Media Capabilities #218

Closed
2 tasks
chcunningham opened this issue Nov 14, 2017 · 30 comments
Assignees
Labels
Missing: security & privacy review Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review Topic: media Topic: privacy Venue: WebRTC WebRTC and media capture Venue: WICG

Comments

@chcunningham

Hello TAG!

I'm requesting a TAG review of:

Further details (optional):

You should also know that...

The API is available in Chrome behind a flag: --enable-blink-features=MediaCapabilities
Implementation bugs are tracked here.

We'd prefer the TAG provide feedback as (please select one):

  • open issues in our Github repo for each point of feedback
  • open a single issue in our Github repo for the entire review
  • [x] leave review feedback as a comment in this issue and @-notify [mounirlamouri, chcunningham]
@plinss plinss added this to the tag-f2f-london-2018-01-31 milestone Feb 2, 2018
@torgo
Member

torgo commented Feb 2, 2018

Discussed at London f2f day 3

@triblondon

Sangwhan to write up this review over dinnertime today.

@triblondon

Issues raised in conversation:

  • Privacy: potential for fingerprinting, private mode is insufficient mitigation

@foolip

foolip commented Feb 27, 2018

FYI, there is now an Intent to Ship: Media Capabilities: decoding on blink-dev.

I note that this review was slower than usual, with 2.5 months passing before there was any activity, and sounds like there's still a write-up to come? Feedback at any stage of the lifecycle of a spec is welcome of course, but I'll suggest in the blink-dev thread to not block waiting for more feedback.

@cynthia
Member

cynthia commented Feb 27, 2018

@foolip apologies for the delay. We try to triage incoming reviews as soon as we see them, but there are times stuff falls through the cracks. I think it’s safe to not consider this a blocker for shipping the feature, I’ll summarize the discussion from the F2F into a write-up shortly.

@cynthia
Member

cynthia commented Mar 6, 2018

Apologies that this took so long. @chcunningham @foolip

As for the privacy issues, thanks for the links related to fingerprinting. The S&P questionnaire link in the original review request seems to be a 404, could you clarify on this? https://github.com/WICG/media-capabilities/blob/master/security-privacy-questionnaire.md

On a first pass of the review I noticed an inconsistency with an API that touches on the same domain: Web Audio. Web Audio defines channels as an unsigned long (which does abstract away the presence of a low-frequency channel), and the sample rate is a float. I don't have a strong opinion on which is better, but types for parameters touching the same concepts should probably be consistent. How to deal with the presence of a low-frequency channel is an open question though, as is whether exposing this detail is actually useful to content authors.

The content mime type would most likely require additional parsing from each application that uses this - would it make sense to provide this in structured data to make it easier to use? It seems like most content authors would do codec checks via regex or substring matching with this approach, which isn't great. A section in the explainer (https://github.com/WICG/media-capabilities/blob/master/explainer.md#why-not-a-list-of-formats) seems to touch on this, but the intention for this review comment is different from the one mentioned here.

A normative definition of what defines a screen change (or a reference back to a normative definition) would be helpful.

Minor question 1: The explainer example code seems to suggest that the screen color depth is a string (the spec is missing a definition for this though) - is there any particular reason for this decision?

Minor question 2: The explainer touches on HDCP, but that isn't in the spec. Wouldn't the approach in the explainer break when a user launches the content on an HDCP-capable screen, starts playback, then drags it onto a non-HDCP-capable screen?

Since it is unclear exactly what from the spec is shipping - would you mind sharing the CL that relates to the I2S?

@mounirlamouri

The S&P questionnaire link in the original review request seems to be a 404, could you clarify on this? https://github.com/WICG/media-capabilities/blob/master/security-privacy-questionnaire.md

I believe the link is working. Maybe GH had troubles when you tried?

Web Audio and channels

As mentioned in the spec, channels is still a placeholder and we do not currently use it in Chrome's implementation. I have filed w3c/media-capabilities#73 to make it clearer.

The content mime type would most likely require additional parsing from each application that uses this - would it make sense to provide this in structured data to make it easier to use?

I'm not entirely sure what you meant by this. Specifically, what you mean by "additional parsing from each application". I would expect web applications to copy-paste the type in their code or read it directly from a manifest of some sort.

A normative definition of what defines a screen change (or a reference back to a normative definition) would be helpful.

As mentioned below, only part 2 of the spec is something we are launching in Chrome. Part 3 is more draft-y and most of it was or will be merged into a CSS spec. This will likely be the case with the change event if it ever happens.

Minor question 1: The explainer example code seems to suggest that the screen color depth is a string (the spec is missing a definition for this though) - is there any particular reason for this decision?

The color depth changes we had were merged into the appropriate CSS spec. I believe 3.3 is a leftover from a removal. I've filed w3c/media-capabilities#74

Minor question 2: The explainer touches on HDCP, but that isn't in the spec. Wouldn't the approach in the explainer break when a user launches the content on an HDCP-capable screen, starts playback, then drags it onto a non-HDCP-capable screen?

HDCP was split into another specification with a brief explainer in the directory. It will be an extension of EME. Your point about EME and screen changes is correct, though I believe the CDM might deal with this. The screen change event would be another way, but the intent of that event is broader and it could be fired when, for example, the screen size has changed.

Since it is unclear exactly what from the spec is shipping - would you mind sharing the CL that relates to the I2S?

That's a very good point. Part 2 is the one that is shipping in Chrome: https://wicg.github.io/media-capabilities/#decoding-encoding-capabilities
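For reference, a decoding-capabilities query of the kind Part 2 defines looks roughly like the sketch below. The configuration values are illustrative, not taken from this thread, and the helper name is hypothetical:

```javascript
// Build a MediaDecodingConfiguration for decodingInfo() (part 2 of the spec).
// All concrete values here are illustrative examples.
function buildDecodingQuery() {
  return {
    type: 'file', // 'media-source' would be used for MSE playback
    video: {
      contentType: 'video/webm; codecs="vp8"',
      width: 1920,
      height: 1080,
      bitrate: 2000000, // bits per second
      framerate: 30,
    },
  };
}

// In a browser that implements the API, the query would be issued like this:
// navigator.mediaCapabilities.decodingInfo(buildDecodingQuery())
//   .then(({ supported, smooth, powerEfficient }) => {
//     // pick a rendition based on the three booleans
//   });
```

The result surfaces the three capability booleans (`supported`, `smooth`, `powerEfficient`) discussed later in this thread.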

@cynthia
Member

cynthia commented Mar 6, 2018

I believe the link is working. Maybe GH had troubles when you tried?

It seems so - I just tried again and it works just fine.

I'm not entirely sure what you meant by this. Specifically, what you mean by "additional parsing from each application". I would expect web applications to copy-paste the type in their code or read it directly from a manifest of some sort.

I imagined a use case where the content author wants to parse out just the codec and not the container information, with a string based format this would require parsing the string.

As mentioned below, only part 2 of the spec is something we are launching in Chrome. Part 3 is more draft-y and most of it was or will be merged into a CSS spec. This will likely be the case with the change event if it ever happens.

It would be great if the spec could be trimmed down to only what is shipping. Stale draft material tends to confuse both implementors and content authors.

@mounirlamouri

Good point about the spec state, I will add a warning on top of section 3 mentioning that it's still WIP.

Regarding the codecs string, we require the container and the codec, such as video/webm;codecs=vp8. I believe that most places in the web platform ask for formats in this form (old APIs would accept container-only).
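The "additional parsing" point above can be illustrated with a sketch: a site that wants only the codec, not the container, has to pull it out of the string itself. The helper below is hypothetical, assuming the `type;codecs=` form described in this comment:

```javascript
// Hypothetical helper: extract the bare codec strings from a content type
// such as 'video/webm;codecs=vp8' or 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"'.
function extractCodecs(contentType) {
  const match = /codecs\s*=\s*"?([^";]+)"?/i.exec(contentType);
  if (!match) return []; // container-only string, no codecs parameter
  return match[1].split(',').map((c) => c.trim());
}
```

This is the kind of ad-hoc string handling the earlier review comment about structured data was trying to avoid.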

@torgo torgo changed the title Review Request: Media Capabilities Media Capabilities Oct 30, 2018
@cynthia
Member

cynthia commented Oct 31, 2018

The framerate is the number of frames used in one second (frames per second). It is represented either as a double or as a fraction.

This is a bit strange, but I'm guessing there is some sort of compatibility or legacy reason? It would be useful to know why it is defined this way (and whether this is the right way forward long term).

@cynthia
Member

cynthia commented Oct 31, 2018

Hey all,

Thanks for filing this issue. We took it up during our Paris F2F. Apologies that it took so long.

I had a question about how this could be used to test capabilities in a multiple media stream context. For example, can we understand whether it's possible to efficiently decode more than one video stream at once? Or get a maximum number of streams/channels/densities at which the client would hit limits? This case could come up in an RTC scenario. There may be cases where decode won't be smooth when you have two decoders, or an encoder and decoder pair, running, for example.

We also had questions about the naming and types of the returned capabilities, specifically smooth and powerEfficient. These names imply a guarantee, despite what you've already written about how they are not. Have there been any alternative names considered? Curious if this can be addressed somehow. (We brainstormed 'janky' or 'powerInefficient', i.e. with the logic inverted, but those seem like poor choices.)

Thanks again, and looking forward to hearing back.

@plinss plinss added Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review and removed Paris2018f2f labels Oct 31, 2018
@chcunningham
Author

Hey Cynthia,

Re: framerate, this stems from the discussion here
w3c/media-capabilities#12

But it's recently been reconsidered
w3c/media-capabilities#96

Happy to have your input.

@chcunningham
Author

For example, can we understand if it's possible to efficiently decode more than one video stream at once?

It's not easy to know with the current API shape, and I haven't thought much about ways the API could be changed for this use case. If you wanted to show 2 videos of the same resolution you could approximate by doing a query that doubles the framerate. For sw codecs this is decent. For hw codecs it will depend on how the hw resources are divided (e.g. by time slice vs by playback). AFAIK platform decoder APIs don't often surface parallel session limits up front, so this would involve some guessing/learning even for implementers (not a deal breaker).
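The doubled-framerate approximation described here can be sketched as follows. The helper name is hypothetical, and this is only the heuristic suggested above, not anything the spec defines:

```javascript
// Approximate "can I smoothly decode two same-resolution streams?" by
// issuing a single query with the framerate doubled (heuristic only;
// hw codec resource partitioning may make this inaccurate).
function doubledQuery(baseVideo) {
  return {
    type: 'media-source',
    video: { ...baseVideo, framerate: baseVideo.framerate * 2 },
  };
}

// In a browser with the API (field values illustrative):
// navigator.mediaCapabilities.decodingInfo(
//   doubledQuery({ contentType: 'video/webm; codecs="vp9"',
//                  width: 1280, height: 720,
//                  bitrate: 1500000, framerate: 30 })
// ).then(({ smooth }) => { /* treat smooth as a rough two-stream signal */ });
```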

These names imply a guarantee...

We could prefix the names with "ideally" or "typically"... I tend to favor the shorter names though. I expect sophisticated users of the API to understand that nothing is ever guaranteed in media ;).

@torgo torgo removed this from the 2018-11-20-telcon milestone Nov 28, 2018
@cynthia
Member

cynthia commented Apr 10, 2019

@chcunningham did you folks have a chance to discuss this internally about the feedback above?

@chcunningham
Author

@cynthia apologies for the delay.

Re: native APIs for concurrent playbacks, I'll have to double check - some changes coming to new Android, but not sure if this particular query is possible. @jernoble for Safari/Mac (and thoughts on the use case).

Re: WebRTC - the API currently doesn't cover WebRTC decoding at all. It's debatable whether it should. There is some ongoing exploration of doing this for WebRTC encoding via encodingInfo()... The alternative approach is to augment WebRTC to better handle capabilities questions in its domain.

@mounirlamouri, thoughts on Remote Playback? What is capability discovery like in that API right now?

@mounirlamouri

Regarding remoting, I think supporting something via the Open Screen Protocol and exposed through the Presentation API would be interesting, but I do not think we should extend the Remote Playback API, which is meant to be a very simple API to play things remotely.

@deanliao

deanliao commented May 8, 2019

Hi TAG,

I'm implementing the MediaCapabilities encodingInfo() API, specifically the "transmission" type.
Intent to Implement
Design doc
I'm requesting privacy / security review, as it adds fingerprinting surface just like decodingInfo(). Other than the above, do I need to provide any information for launching the encodingInfo() method?

@chcunningham
Author

Re: encodingInfo(), I recommend holding off on TAG review of that for a bit longer. We are still discussing the API internally (particularly the WebRTC part), so let's wait for that dust to settle (we should know soon).

@markafoltz

Re: #218 (comment)

@chrisn Sorry I missed this earlier, it got filtered into a folder.

Capability queries for the Open Screen Protocol are being discussed in w3c/openscreenprotocol#123. Depending on where it lands, it should help user agents accurately determine which devices are capable of remote playback of a specific media element. But I didn't see a direct impact on the Media Capabilities API itself, since the idea is the user agent (not the Web application) figures out the best way to play the media remotely.

For the Presentation API, if the user agent rendering the presentation supports the Media Capabilities API, the presentation can use that to make choices about what media to play; but the API should work the same as on any other document.

Happy to discuss further at our upcoming F2F.

@hober
Contributor

hober commented May 23, 2019

Hi, @kenchris and I took a quick look at this during our F2F in Reykjavik. We note that the two pending bits of review that got added to this issue recently—decodingInfo() for encrypted media, and the "transmission" type of encodingInfo()—have fingerprinting implications, but the Security & Privacy Questionnaire assessment hasn't been updated in light of these new features. Could you update your security & privacy assessment accordingly?

Additionally, it's difficult for us to review the "transmission" type for encodingInfo() as your design document is not visible to non-Google employees. We'll hold off on reviewing this aspect until your internal dust settles and you make a design document public.

@mounirlamouri

@hober and @kenchris I do not believe that we need to update the Security & Privacy Questionnaire as the potential fingerprinting concerns for encrypted media and "transmission" type are the same category as the fingerprinting concerns of the rest of the API.

@deanliao can you create a publicly visible version of your design document?

@chcunningham
Author

Hi Tag, Just doing a bit of cleanup.

Re: encodingInfo() - I've created new issues in MediaCapabilities and WebRTC to scrutinize that proposal a bit more and ponder whether WebRTC should instead have its own separate mechanism.

Does anyone have lingering issues to discuss here?

@chcunningham
Author

Re: manifests, just for clarity, these would often be the well-known MPEG-DASH manifests. Some sites roll their own formats (e.g. YouTube), but they probably prefer it that way.

@hadleybeeman
Member

We're looking at this in our TAG face-to-face in Cupertino.

Looks like we are still waiting for a design doc from @deanliao.

@chcunningham @mounirlamouri, I'm guessing you're tracking this too. Could one of the three of you help us with this?

@deanliao

deanliao commented Dec 5, 2019

@hadleybeeman sorry, I transferred to another team and handed over the task to @chcunningham.

@cynthia
Member

cynthia commented Mar 4, 2020

@plinss and I revisited this in our Wellington F2F.

Apologies that we've had this open for so long. We don't think there are significant enough technical issues remaining to justify keeping this issue open, so we would like to close this. Thank you for bringing this to our attention!

If you have any significant design changes that would require a second round of review, let us know and we will re-open.
