
Media Capabilities #218

Closed
2 tasks
chcunningham opened this issue Nov 14, 2017 · 30 comments
Assignees
Labels
Missing: security & privacy review Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review Topic: media Topic: privacy Venue: WebRTC WebRTC and media capture Venue: WICG

Comments

@chcunningham

Hello TAG!

I'm requesting a TAG review of:

Further details (optional):

You should also know that...

The API is available in Chrome behind a flag: --enable-blink-features=MediaCapabilities
Implementation bugs are tracked here.

We'd prefer the TAG provide feedback as (please select one):

  • open issues in our Github repo for each point of feedback
  • open a single issue in our Github repo for the entire review
  • [x] leave review feedback as a comment in this issue and @-notify [mounirlamouri, chcunningham]
@plinss plinss added this to the tag-f2f-london-2018-01-31 milestone Feb 2, 2018
@torgo
Member

torgo commented Feb 2, 2018

Discussed at London f2f day 3

@triblondon

Sangwhan to write up this review over dinnertime today.

@triblondon

Issues raised in conversation:

  • Privacy: potential for fingerprinting, private mode is insufficient mitigation

@foolip

foolip commented Feb 27, 2018

FYI, there is now an Intent to Ship: Media Capabilities: decoding on blink-dev.

I note that this review was slower than usual, with 2.5 months passing before there was any activity, and sounds like there's still a write-up to come? Feedback at any stage of the lifecycle of a spec is welcome of course, but I'll suggest in the blink-dev thread to not block waiting for more feedback.

@cynthia
Member

cynthia commented Feb 27, 2018

@foolip apologies for the delay. We try to triage incoming reviews as soon as we see them, but there are times stuff falls through the cracks. I think it’s safe to not consider this a blocker for shipping the feature, I’ll summarize the discussion from the F2F into a write-up shortly.

@cynthia
Member

cynthia commented Mar 6, 2018

Apologies that this took so long. @chcunningham @foolip

As for the privacy issues, thanks for the links related to fingerprinting. The S&P questionnaire link in the original review request seems to be a 404, could you clarify on this? https://github.com/WICG/media-capabilities/blob/master/security-privacy-questionnaire.md

On a first pass of the review I noticed an inconsistency with an API that touches on the same domain: Web Audio. Web Audio defines channels as an unsigned long (which does abstract away the presence of a low-frequency channel), and the sample rate is a float. I don't have a strong opinion on which is better, but types for parameters touching the same concepts should probably be consistent. How to deal with the presence of a low-frequency channel is an open question though, as is whether exposing this detail is actually useful to content authors.

The content mime type would most likely require additional parsing from each application that uses this - would it make sense to provide this in structured data to make it easier to use? It seems like most content authors would do codec checks via regex or substring matching with this approach, which isn't great. A section in the explainer (https://github.com/WICG/media-capabilities/blob/master/explainer.md#why-not-a-list-of-formats) seems to touch on this, but the intention for this review comment is different from the one mentioned here.

A normative definition of what defines a screen change (or a reference back to a normative definition) would be helpful.

Minor question 1: The explainer example code seems to suggest that the screen color depth is a string (the spec is missing a definition for this though) - is there any particular reason for this decision?

Minor question 2: The explainer touches on HDCP, but that isn't in the spec. Wouldn't the approach in the explainer break when a user launches the content on an HDCP-capable screen, starts playback, then drags it onto a non-HDCP-capable screen?

Since it is unclear exactly what from the spec is shipping - would you mind sharing the CL that relates to the I2S?

@mounirlamouri

The S&P questionnaire link in the original review request seems to be a 404, could you clarify on this? https://github.com/WICG/media-capabilities/blob/master/security-privacy-questionnaire.md

I believe the link is working. Maybe GH had troubles when you tried?

Web Audio and channels

As mentioned in the spec, channels is still a placeholder and we do not currently use it in Chrome's implementation. I have filed w3c/media-capabilities#73 to make it clearer.

The content mime type would most likely require additional parsing from each application that uses this - would it make sense to provide this in structured data to make it easier to use?

I'm not entirely sure what you meant by this. Specifically, what you mean by "additional parsing from each application". I would expect web applications to copy-paste the type in their code or read it directly from a manifest of some sort.

A normative definition of what defines a screen change (or a reference back to a normative definition) would be helpful.

As mentioned below, only part 2 of the spec is something we are launching in Chrome. Part 3 is more draft-y and most of it was or will be merged into a CSS spec. This will likely be the case with the change event if it ever happens.

Minor question 1: The explainer example code seems to suggest that the screen color depth is a string (the spec is missing a definition for this though) - is there any particular reason for this decision?

The color depth changes we had were merged into the appropriate CSS spec. I believe 3.3 is a leftover from a removal. I've filed w3c/media-capabilities#74

Minor question 2: The explainer touches on HDCP, but that isn't in the spec. Wouldn't the approach in the explainer break when a user launches the content on an HDCP-capable screen, starts playback, then drags it onto a non-HDCP-capable screen?

HDCP was split into another specification with a brief explainer in the directory. It will be an extension of EME. Your point about EME and screen changes is correct, though I believe the CDM might deal with this. The screen change event would be another way, but the intent of that event is broader and it could be fired when, for example, the screen size has changed.

Since it is unclear exactly what from the spec is shipping - would you mind sharing the CL that relates to the I2S?

That's a very good point. Part 2 is the one that is shipping in Chrome: https://wicg.github.io/media-capabilities/#decoding-encoding-capabilities
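For reference, a decoding-capabilities query of the kind Part 2 defines looks roughly like the sketch below. The configuration values are illustrative, not taken from this thread, and the helper name is hypothetical:

```javascript
// Build a MediaDecodingConfiguration for decodingInfo() (part 2 of the spec).
// All concrete values here are illustrative examples.
function buildDecodingQuery() {
  return {
    type: 'file', // 'media-source' would be used for MSE playback
    video: {
      contentType: 'video/webm; codecs="vp8"',
      width: 1920,
      height: 1080,
      bitrate: 2000000, // bits per second
      framerate: 30,
    },
  };
}

// In a browser that implements the API, the query would be issued like this:
// navigator.mediaCapabilities.decodingInfo(buildDecodingQuery())
//   .then(({ supported, smooth, powerEfficient }) => {
//     // pick a rendition based on the three booleans
//   });
```

The result surfaces the three capability booleans (`supported`, `smooth`, `powerEfficient`) discussed later in this thread.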

@cynthia
Member

cynthia commented Mar 6, 2018

I believe the link is working. Maybe GH had troubles when you tried?

It seems so - I just tried again and it works just fine.

I'm not entirely sure what you meant by this. Specifically, what you mean by "additional parsing from each application". I would expect web applications to copy-paste the type in their code or read it directly from a manifest of some sort.

I imagined a use case where the content author wants to parse out just the codec and not the container information, with a string based format this would require parsing the string.

As mentioned below, only part 2 of the spec is something we are launching in Chrome. Part 3 is more draft-y and most of it was or will be merged into a CSS spec. This will likely be the case with the change event if it ever happens.

It would be great if the spec could be trimmed down to only what is shipping. Stale draft material tends to confuse both implementors and content authors.

@mounirlamouri

Good point about the spec state, I will add a warning on top of section 3 mentioning that it's still WIP.

Regarding the codecs string, we require the container and the codec, such as video/webm;codecs=vp8. I believe that most places in the web platform ask for formats in this form (old APIs would accept container-only).
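The "additional parsing" point above can be illustrated with a sketch: a site that wants only the codec, not the container, has to pull it out of the string itself. The helper below is hypothetical, assuming the `type;codecs=` form described in this comment:

```javascript
// Hypothetical helper: extract the bare codec strings from a content type
// such as 'video/webm;codecs=vp8' or 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"'.
function extractCodecs(contentType) {
  const match = /codecs\s*=\s*"?([^";]+)"?/i.exec(contentType);
  if (!match) return []; // container-only string, no codecs parameter
  return match[1].split(',').map((c) => c.trim());
}
```

This is the kind of ad-hoc string handling the earlier review comment about structured data was trying to avoid.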

@torgo torgo changed the title Review Request: Media Capabilities Media Capabilities Oct 30, 2018
@cynthia
Member

cynthia commented Oct 31, 2018

The framerate is the number of frames used in one second (frames per second). It is represented either as a double or as a fraction.

This is a bit strange, but I'm guessing there is some sort of compatibility or legacy reason? It would be useful to know why it is defined this way (and whether this is the right way forward long term).

@cynthia
Member

cynthia commented Oct 31, 2018

Hey all,

Thanks for filing this issue. We took it up during our Paris F2F. Apologies that it took so long.

I had a question about how this could be used to test capabilities in a multiple media stream context. For example, can we understand whether it's possible to efficiently decode more than one video stream at once? Or get a maximum number of streams/channels/densities at which the client would hit limits? This case could come up in an RTC scenario. There may be cases where decode won't be smooth when you have two decoders, or an encoder and decoder pair, running, for example.

We also had questions about the naming and types of the returned capabilities, specifically smooth and powerEfficient. These names imply a guarantee, despite what you've already written about how they are not. Have there been any alternative names considered? Curious if this can be addressed somehow. (We brainstormed 'janky' or 'powerInefficient', i.e. with the logic inverted, but those seem like poor choices.)

Thanks again, and looking forward to hearing back.

@plinss plinss added Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review and removed Paris2018f2f labels Oct 31, 2018
@chcunningham
Author

Hey Cynthia,

Re: framerate, this stems from the discussion here
w3c/media-capabilities#12

But it's recently been reconsidered
w3c/media-capabilities#96

Happy to have your input.

@chcunningham
Author

For example, can we understand if it's possible to efficiently decode more than one video stream at once?

It's not easy to know with the current API shape, and I haven't thought much about ways the API could be changed for this use case. If you wanted to show 2 videos of the same resolution you could approximate by doing a query that doubles the framerate. For sw codecs this is decent. For hw codecs it will depend on how the hw resources are divided (e.g. by time slice vs by playback). AFAIK platform decoder APIs don't often surface parallel session limits up front, so this would involve some guessing/learning even for implementers (not a deal breaker).
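The doubled-framerate approximation described here can be sketched as follows. The helper name is hypothetical, and this is only the heuristic suggested above, not anything the spec defines:

```javascript
// Approximate "can I smoothly decode two same-resolution streams?" by
// issuing a single query with the framerate doubled (heuristic only;
// hw codec resource partitioning may make this inaccurate).
function doubledQuery(baseVideo) {
  return {
    type: 'media-source',
    video: { ...baseVideo, framerate: baseVideo.framerate * 2 },
  };
}

// In a browser with the API (field values illustrative):
// navigator.mediaCapabilities.decodingInfo(
//   doubledQuery({ contentType: 'video/webm; codecs="vp9"',
//                  width: 1280, height: 720,
//                  bitrate: 1500000, framerate: 30 })
// ).then(({ smooth }) => { /* treat smooth as a rough two-stream signal */ });
```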

These names imply a guarantee...

We could prefix the names with "ideally" or "typically"... I tend to favor the shorter names though. I expect sophisticated users of the API to understand that nothing is ever guaranteed in media ;).

@torgo torgo removed this from the 2018-11-20-telcon milestone Nov 28, 2018
@cynthia
Member

cynthia commented Apr 10, 2019

@chcunningham did you folks have a chance to discuss this internally about the feedback above?

@chcunningham
Author

@cynthia apologies for the delay.

Re: native APIs for concurrent playbacks, I'll have to double check - some changes coming to new Android, but not sure if this particular query is possible. @jernoble for Safari/Mac (and thoughts on the use case).

Re: WebRTC - the API currently doesn't cover WebRTC decoding at all. It's debatable whether it should. There is some ongoing exploration of doing this for WebRTC encoding via encodingInfo()... The alternative approach is to augment WebRTC to better handle capabilities questions in its domain.

@mounirlamouri, thoughts on Remote Playback? What is capability discovery like in that API right now?

@mounirlamouri

Regarding remoting, I think supporting something via the Open Screen Protocol and exposed through the Presentation API would be interesting, but I do not think we should extend the Remote Playback API, which is meant to be a very simple API to play things remotely.

@deanliao

deanliao commented May 8, 2019

Hi TAG,

I'm implementing the MediaCapabilities encodingInfo() API, specifically the "transmission" type.
Intent to Implement
Design doc
I'm requesting privacy / security review, as it adds fingerprinting surface just like decodingInfo(). Other than the above, do I need to provide any information for launching the encodingInfo() method?

@chcunningham
Author

Re: encodingInfo(), I recommend holding off on TAG review of that for a bit longer. We are still discussing the API internally (particularly the WebRTC part), so let's wait for that dust to settle (we should know soon).

@markafoltz

Re: #218 (comment)

@chrisn Sorry I missed this earlier, it got filtered into a folder.

Capability queries for the Open Screen Protocol are being discussed in w3c/openscreenprotocol#123. Depending on where it lands, it should help user agents accurately determine which devices are capable of remote playback of a specific media element. But I didn't see a direct impact on the Media Capabilities API itself, since the idea is the user agent (not the Web application) figures out the best way to play the media remotely.

For the Presentation API, if the user agent rendering the presentation supports the Media Capabilities API, the presentation can use that to make choices about what media to play; but the API should work the same as on any other document.

Happy to discuss further at our upcoming F2F.

@hober
Contributor

hober commented May 23, 2019

Hi, @kenchris and I took a quick look at this during our F2F in Reykjavik. We note that the two pending bits of review that got added to this issue recently—decodingInfo() for encrypted media, and the "transmission" type of encodingInfo()—have fingerprinting implications, but the Security & Privacy Questionnaire assessment hasn't been updated in light of these new features. Could you update your security & privacy assessment accordingly?

Additionally, it's difficult for us to review the "transmission" type for encodingInfo() as your design document is not visible to non-Google employees. We'll hold off on reviewing this aspect until your internal dust settles and you make a design document public.

@mounirlamouri

@hober and @kenchris I do not believe that we need to update the Security & Privacy Questionnaire as the potential fingerprinting concerns for encrypted media and "transmission" type are the same category as the fingerprinting concerns of the rest of the API.

@deanliao can you create a publicly visible version of your design document?

@chcunningham
Author

Hi Tag, Just doing a bit of cleanup.

Re: encodingInfo() - I've created new issues in MediaCapabilities and WebRTC to scrutinize that proposal a bit more and ponder whether WebRTC should instead have its own separate mechanism.

Does anyone have lingering issues to discuss here?

@chcunningham
Author

Re: manifests, just for clarity, these would often be the well-known MPEG-DASH manifests. Some sites roll their own formats (e.g. YouTube), but they probably prefer it that way.

@hadleybeeman
Member

We're looking at this in our TAG face-to-face in Cupertino.

Looks like we are still waiting for a design doc from @deanliao.

@chcunningham @mounirlamouri, I'm guessing you're tracking this too. Could one of the three of you help us with this?

@deanliao

deanliao commented Dec 5, 2019

@hadleybeeman sorry, I transferred to another team and handed over the task to @chcunningham.

@cynthia
Member

cynthia commented Mar 4, 2020

@plinss and I revisited this in our Wellington F2F.

Apologies that we've had this open for so long. We don't think there are significant enough technical issues remaining to justify keeping this issue open, so we would like to close this. Thank you for bringing this to our attention!

If you have any significant design changes that would require a second round of review, let us know and we will re-open.
