Media Capabilities #218
Discussed at London f2f day 3.
Sangwhan to write up this review over dinnertime today.
Issues raised in conversation:
FYI, there is now an Intent to Ship: Media Capabilities: decoding on blink-dev. I note that this review was slower than usual, with 2.5 months passing before there was any activity, and it sounds like there's still a write-up to come? Feedback at any stage of the lifecycle of a spec is welcome of course, but I'll suggest in the blink-dev thread not to block waiting for more feedback.
@foolip apologies for the delay. We try to triage incoming reviews as soon as we see them, but sometimes things fall through the cracks. I think it's safe to not consider this a blocker for shipping the feature; I'll summarize the discussion from the F2F into a write-up shortly.
Apologies that this took so long. @chcunningham @foolip

As for the privacy issues, thanks for the links related to fingerprinting. The S&P questionnaire link in the original review request seems to be a 404; could you clarify? https://github.com/WICG/media-capabilities/blob/master/security-privacy-questionnaire.md

First pass review

I noticed an inconsistency with an API that touches on the same domain: Web Audio. Web Audio defines channels as an unsigned long (which abstracts away the presence of a low-frequency channel) and the sample rate as a float. I don't have a strong opinion on which is better, but types for parameters touching the same concepts should probably be consistent. How to deal with the presence of a low-frequency channel is an open question though, as is whether exposing this detail is actually useful to content authors.

The content MIME type would most likely require additional parsing by each application that uses this; would it make sense to provide it as structured data to make it easier to use? It seems like most content authors would do codec checks via regex or substring matching with this approach, which isn't great. A section in the explainer (https://github.com/WICG/media-capabilities/blob/master/explainer.md#why-not-a-list-of-formats) seems to touch on this, but the intention of this review comment is different from the one mentioned there.

A normative definition of what constitutes a screen change (or a reference back to a normative definition) would be helpful.

Minor question 1: The explainer example code seems to suggest that the screen color depth is a string (the spec is missing a definition for this though). Is there any particular reason for this decision?

Minor question 2: The explainer touches on HDCP, but that isn't in the spec. Wouldn't the approach in the explainer break when a user launches the content on an HDCP-capable screen, starts playback, then drags it onto a non-HDCP-capable screen?

Since it is unclear exactly what from the spec is shipping, would you mind sharing the CL that relates to the I2S?
I believe the link is working. Maybe GH had troubles when you tried?
As mentioned in the spec,
I'm not entirely sure what you meant by this, specifically by "additional parsing from each application". I would expect web applications to copy-paste the type in their code or read it directly from a manifest of some sort.
As mentioned below, only part 2 of the spec is something we are launching in Chrome. Part 3 is more draft-y and most of it was or will be merged into a CSS spec. This will likely be the case with the change event if it ever happens.
The color depth changes we had were merged into the appropriate CSS spec. I believe 3.3 is a leftover from a removal. I've filed w3c/media-capabilities#74
HDCP was split into another specification with a brief explainer in the directory. It will be an extension of EME. Your point about EME and screen changes is correct, though I believe the CDM might deal with this. The screen change event would be another way, but the intent of that event is broader and it could be fired when the screen size has changed.
That's a very good point. Part 2 is the one that is shipping in Chrome: https://wicg.github.io/media-capabilities/#decoding-encoding-capabilities
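For reference, a query against the shipping Part 2 surface looks roughly like the sketch below. The configuration values are made up for illustration, and the `mc` parameter stands in for `navigator.mediaCapabilities` so the helper can be exercised outside a browser; it is not part of the spec.

```javascript
// Minimal sketch of a Part 2 (decodingInfo) query. `mc` stands in for
// navigator.mediaCapabilities; the configuration values are illustrative.
async function canPlaySmoothly(mc, contentType) {
  const info = await mc.decodingInfo({
    type: "file",
    video: {
      contentType, // full container + codec string, e.g. 'video/webm; codecs="vp9"'
      width: 1920,
      height: 1080,
      bitrate: 2500000,
      framerate: 30,
    },
  });
  // decodingInfo resolves with supported/smooth/powerEfficient booleans.
  return info.supported && info.smooth;
}
```

In a real page this would be called as `canPlaySmoothly(navigator.mediaCapabilities, 'video/webm; codecs="vp9"')`.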
It seems so - I just tried again and it works just fine.
I imagined a use case where the content author wants to parse out just the codec and not the container information; with a string-based format this would require parsing the string.
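To illustrate the point, pulling just the codec out of a full content type today means string work along these lines. The helper name and regex are mine, for illustration only; nothing like this exists in the spec.

```javascript
// Hypothetical helper: extract the codec list from a content type string,
// e.g. 'video/webm; codecs="vp9, opus"' -> ["vp9", "opus"].
// This is the kind of ad-hoc parsing the comment above is concerned about.
function extractCodecs(contentType) {
  const match = /codecs\s*=\s*"?([^";]+)"?/i.exec(contentType);
  if (!match) return [];
  return match[1].split(",").map((c) => c.trim());
}
```

A structured form (separate container and codec fields) would make this parsing unnecessary.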
It would be great if the spec could be trimmed down to only what is shipping. Stale draft material tends to confuse both implementors and content authors.
Good point on the spec state; I will add a warning on top of section 3 mentioning that it's still WIP. Regarding the codecs string, we require the container and the codec, such as
This is a bit strange; I'm guessing there is some sort of legacy compatibility reason? It would be useful to know why it is like this (and whether this is the right way forward long term).
Hey all,

Thanks for filing this issue. We took it up during our Paris F2F. Apologies that it took so long.

I had a question about how this could be used to test capabilities in a multiple media stream context. For example, can we understand whether it's possible to efficiently decode more than one video stream at once? Or get a maximum number of streams/channels/densities at which the client would hit limits? This case could come up in an RTC scenario. There may be cases where decode won't be smooth when you have two decoders, or an encoder and decoder pair, running, for example.

We also had questions about naming and types of the returned capabilities, specifically

Thanks again, and looking forward to hearing back.
Hey Cynthia, Re: framerate, this stems from the discussion here, but it's recently been reconsidered. Happy to have your input.
It's not easy to know with the current API shape, and I haven't thought much about ways the API could be changed for this use case. If you wanted to show two videos of the same resolution, you could approximate by doing a query that doubles the framerate. For software codecs this is decent. For hardware codecs it will depend on how the hardware resources are divided (e.g. by time slice vs. by playback). AFAIK platform decoder APIs don't often surface parallel session limits up front, so this would involve some guessing/learning even for implementers (not a deal breaker).
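The doubled-framerate approximation described above could be sketched as follows. `doubleFramerateQuery` is a hypothetical helper and the configuration values are illustrative; in a real page the result would be passed to `navigator.mediaCapabilities.decodingInfo()`.

```javascript
// Hypothetical helper: approximate "can I decode two of these at once?"
// by asking about a single stream at twice the framerate.
function doubleFramerateQuery(config) {
  return {
    ...config,
    video: { ...config.video, framerate: config.video.framerate * 2 },
  };
}

// Example single-stream query shape (values illustrative):
const single = {
  type: "file",
  video: {
    contentType: 'video/webm; codecs="vp9"',
    width: 1280,
    height: 720,
    bitrate: 2000000,
    framerate: 30,
  },
};

const doubled = doubleFramerateQuery(single);
// In a browser: await navigator.mediaCapabilities.decodingInfo(doubled)
```

As noted above, this heuristic is reasonable for software codecs but can mispredict for hardware codecs, depending on how the hardware divides its resources.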
We could prefix the names with "ideally" or "typically"... I tend to favor the shorter names though. I expect sophisticated users of the API to understand that nothing is ever guaranteed in media ;).
@chcunningham did you folks have a chance to discuss the feedback above internally?
@cynthia apologies for the delay. Re: native APIs for concurrent playbacks, I'll have to double check; there are some changes coming in newer Android, but I'm not sure if this particular query is possible. @jernoble for Safari/Mac (and thoughts on the use case). Re: WebRTC, the API currently doesn't cover WebRTC decoding at all. It's debatable whether it should. There is some ongoing exploration of doing this for WebRTC encoding via encodingInfo()... The alternative approach is to augment WebRTC to better handle capabilities questions in its domain. @mounirlamouri, thoughts on Remote Playback? What is capability discovery like in that API right now?
Regarding remoting, I think supporting something via the Open Screen Protocol and exposing it through the Presentation API would be interesting, but I do not think we should extend the Remote Playback API, which is meant to be a very simple API to play things remotely.
Hi TAG, I'm implementing the MediaCapabilities encodingInfo() API, specifically the "transmission" type.
Re encodingInfo(), I recommend holding off on TAG review of that for a bit longer. We are still discussing the API internally (particularly the WebRTC part), so let's wait for that dust to settle (should know soon).
Re: #218 (comment) @chrisn Sorry I missed this earlier, it got filtered into a folder. Capability queries for the Open Screen Protocol are being discussed in w3c/openscreenprotocol#123. Depending on where it lands, it should help user agents accurately determine which devices are capable of remote playback of a specific media element. But I didn't see a direct impact on the Media Capabilities API itself, since the idea is that the user agent (not the Web application) figures out the best way to play the media remotely. For the Presentation API, if the user agent rendering the presentation supports the Media Capabilities API, the presentation can use it to make choices about what media to play; but the API should work the same as on any other document. Happy to discuss further at our upcoming F2F.
Hi, @kenchris and I took a quick look at this during our F2F in Reykjavik. We note that the two pending bits of review that got added to this issue recently, decodingInfo() for encrypted media and the "transmission" type of encodingInfo(), have fingerprinting implications, but the Security & Privacy Questionnaire assessment hasn't been updated in light of these new features. Could you update your security & privacy assessment accordingly? Additionally, it's difficult for us to review the "transmission" type for encodingInfo() as your design document is not visible to non-Google employees. We'll hold off on reviewing this aspect until your internal dust settles and you make a design document public.
@hober and @kenchris I do not believe that we need to update the Security & Privacy Questionnaire, as the potential fingerprinting concerns for encrypted media and the "transmission" type are in the same category as the fingerprinting concerns of the rest of the API. @deanliao can you create a publicly visible version of your design document?
Hi TAG, Just doing a bit of cleanup. Re: encodingInfo(), I've created new issues in MediaCapabilities and WebRTC to scrutinize that proposal a bit more and ponder whether WebRTC should instead have its own separate mechanism. Does anyone have lingering issues to discuss here?
Re: manifests, just for clarity, these would often be the well-known MPEG-DASH manifests. Some sites roll their own formats (e.g. YouTube), but they probably prefer it that way.
We're looking at this in our TAG face-to-face in Cupertino. It looks like we are still waiting for a design doc from @deanliao. @chcunningham @mounirlamouri, I'm guessing you're tracking this too. Could one of the three of you help us with this?
@hadleybeeman sorry, I transferred to another team and handed the task over to @chcunningham.
@plinss and I revisited this in our Wellington F2F. Apologies that we've had this open for so long. We don't think there are significant enough technical issues remaining to justify keeping this issue open, so we would like to close this. Thank you for bringing this to our attention! If you have any significant design changes that would require a second round of review, let us know and we will re-open.
Hello TAG!
I'm requesting a TAG review of:
Further details (optional):
You should also know that...
The API is available in Chrome behind a flag: --enable-blink-features=MediaCapabilities
Implementation bugs are tracked here.
We'd prefer the TAG provide feedback as (please select one):