Revisit: Let getDisplayMedia() influence the default type choice in the picker #184
Sorry this is not ready for PR. |
This would revisit an existing WG decision. Has new information surfaced since #32 to consider it? cc @martinthomson
As I recall, this was not among the concerns. The concerns, outlined in the Security and Privacy Questionnaire, were the security risks of sharing a web surface under attacker control: that it allows active attacks on the same-origin policy. The only hurdle is socially engineering users to select it.
This stems partly from Chrome already violating the spec's recommendations by neither implementing elevated permissions for web sources specifically nor warning users about their elevated risk. See crbug 920752 for context. |
As I replied on twitter, we'd like to focus on w3c/mediacapture-screen-share-extensions#9: a proposal to give web-pages that meet the security-criteria [agreed on with Chrome Security], preferential placement in the |
When/where has Chrome Security given their blessing to w3c/mediacapture-screen-share-extensions#9? To preferential placement of certain documents in the media picker? AFAIK, Chrome Security has spoken for cross-origin isolation and an opt-in header for getViewportMedia. That's a different topic. |
Yes, this is asking to revisit an existing WG decision. |
The argument is that the WG decision was based on wrong information and an inadequate security evaluation, and that the WG decision has led to a lack of conformance to the WG specification in the market. We're asking to revisit it. |
@eladalon1983 They haven't. I said "Their view on a similar proposal [#155] for easy self-share is that it requires not only site-isolation but opt-in from targets in order to be safe", and then "We've put forth a proposal that would give web-pages that meet these security-criteria preferential placement in the getDisplayMedia() picker."
We're interpreting the scope of their advice differently. They said "this as a larger problem of APIs that might leak data from cross-origin resources at the page-level."
Different API, same topic: how to have webpages capture webpages safely. |
@alvestrand In order to not waste the WG's time, I believe it is customary to introduce new information with such requests, is it not? Simply asking for a re-vote doesn't seem productive, because what would make a different outcome likely?
A discovery that old information is wrong might qualify as new information, if it can be substantiated. Would you be able to point out prose in the spec or its security questionnaire that is wrong? |
Part of the discussion showing developer interest is in https://crbug.com/904831 and bugs duplicated into it. The current API is based on the presumption that in user story flows that involve capturing something, the user story flow is neutral as to what type of surface is to be handled, and that any input is going to be considered valid. This presumption is clearly absurd; in nearly every user story that involves capturing something, the story involves capturing exactly one type of surface, and the idea that it should be impossible for the application to incorporate this information into its user flow is just not logical. The idea that "an application shouldn't push the user towards sharing more dangerous surfaces" is only valid if the value of sharing a more dangerous surface and sharing a less dangerous surface has equal value to the user; this is wrong. The user wants to present what the user wants to present, and that's either a more dangerous surface or a less dangerous surface; putting obstacles in the way of the user for doing what the user needs to do can never be a good UI design. Putting up dialog boxes and confirmation buttons has some value. Forcing the user to consider options that he is not going to choose anyway, because it is not what he wants to do, has none. |
As to "what's in the spec is wrong": this section, "Not accepting constraints for source selection means that getDisplayMedia only provides fingerprinting surface that exposes whether audio, video or audio and video display sources are present. (This is a fingerprinting vector.)", doesn't compute for me. And this statement from the security questionnaire, "The decision of what to share (or whether to share anything at all) rests entirely with the end-user. Websites cannot influence this choice in any way", is listed as an answer to the question "Is this specification exposing the minimum amount of information necessary to power the feature?" It is not at all clear that it answers the question, and again, it does not give the reasoning behind it. |
Today the main complaint I see from vendors and users is the number of clicks needed to get screen sharing done. Letting the application hint at the desired default screen-sharing choice would be a good start to remedy that. |
We at RingCentral think this would be useful to us. |
At Pexip, we think this would be a useful feature too. Giving the opportunity to save historic preferences is one benefit, but the most significant benefit for us would be to guide towards the appropriate option which supports audio sharing (i.e. go straight to tab capture specifically on Mac). |
This looks good and I find it could prove useful to the Jitsi Meet app suite. Thanks for doing the work! |
We'd be interested in this for Webex as well. |
At Atos, we're very much interested in this too. |
@alvestrand and @eladalon1983 suggested some UX mitigations this morning that might let us move forward here. The spec could strongly recommend that user agents:
This would by no means be a catch-all — same-origin documents may lurk in other tabs and tabs' BFCache — but should preserve the social engineering obstacle to basic click-through active attacks. Self-capture use cases typically don't want a picker anyway, and will be best served by |
If we follow the advice in 1. - should this apply to just the requesting tab, or to all tabs with the same origin? |
The same workaround could be applied with tabs that only appear to be cross-origin. Namely:
Because of this, I think the recommendation need not apply to same-origin other tabs. |
It seems valuable to me to provide guidelines that are as precise as possible. As for iframe communication, origin partitioning should help prevent an example.com/evil.com iframe from communicating (through BroadcastChannel, IDB...) with example2.com/evil.com or with evil.com. |
I've made a demo. Please launch these two tabs side by side and wait ~5s: What you see here is that these two cross-origin tabs can talk to each other. So tabs that are not same-origin may nevertheless collude to produce behavior identical to capturing a same-origin other-tab. |
Right, and this is something that Safari prohibits. |
See whatwg/html#5803 for BroadcastChannel specifically. |
Is Safari planning to prohibit cross-origin tabs talking to each other using a shared server exposing a RESTful API designed to facilitate this communication? Because evil.com and collaborator.com can try that, too. |
FWIW, if MUST is not realistic, and MAY too weak, this sounds like SHOULD would be a good representation of our intent while recognizing the reality of the world. |
Applying MAY/SHOULD to promise rejection is not great because of potential browser compat issues. |
Because it minimizes the logic that has to be written. Why would Chrome want to code two parallel codepaths that accomplish virtually the same thing?
Is there a good reason for the spec to mandate that the user agent MUST reject? Can't we mandate MAY ignore? After all, the user can always select a surface type other than the
I think that |
On the Chrome side maybe, though probably not much. Other browsers might have to write code to actually accept but ignore this value. |
I have posed a question to you. ("Is there a good reason for the spec to mandate that the user agent MUST reject? Can't we mandate MAY ignore?") I'd like to remind you of this question.
I think understanding the objection to "MAY ignore" is the core issue we have left. (You and @jan-ivar might have the issue of constraints left, but I think that's orthogonal.) |
When you define a type in WebIDL as an enumeration, rejection happens when the value passed to a method is not valid. What would be nice is an open-ended enumeration: if the value is understood, use it, and if it is not, ignore it. This would also make it easier to extend the enumeration should we want to do so.
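To make the distinction concrete, here is a minimal sketch of the two behaviours in plain JavaScript. The set of understood values and both function names are hypothetical and chosen only for illustration; they are not taken from any spec or implementation.

```javascript
// Hypothetical set of hint values that a given user agent understands.
const KNOWN_SURFACE_HINTS = new Set(["browser", "window"]);

// Open-ended behaviour: an understood value is used, anything else is
// silently ignored (reported here as undefined).
function resolveSurfaceHint(value) {
  if (typeof value !== "string") return undefined;
  return KNOWN_SURFACE_HINTS.has(value) ? value : undefined;
}

// Closed-enum behaviour, for comparison: an unknown value throws a
// TypeError, which is what WebIDL enumeration conversion does.
function resolveSurfaceHintStrict(value) {
  if (!KNOWN_SURFACE_HINTS.has(value)) {
    throw new TypeError(`'${value}' is not a valid surface hint`);
  }
  return value;
}
```

With the open-ended variant, adding a value later cannot break pages that already pass it; the cost is that a page cannot tell whether its hint was understood.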
Insinuated?
This argument cuts both ways: some lines of code to ignore monitor in all UAs vs. some lines of code to add a separate property (that would do nothing other than override the value to monitor in native code) in one UA. |
My apologies for missing those. Please link me to the relevant comment, so that I may re-read it.
I've made an argument as to why that is a non-issue here. It starts with "After all, the user can always select a surface type other than the ideal one..." |
What I meant is that, as a web developer, I set 'monitor', which is a valid value as per spec/WebIDL, but the picker does not default to 'monitor', or it does but only in specific Chrome versions. Or, as a web developer, I set 'monitor' and getDisplayMedia throws. This would be surprising. As a web developer, I might have to add some specific UI to help the user select the monitor picker, and that UI may depend on whether the 'monitor' hint is supported. To summarise the discussions, there are a few options available.
Pros: Easiest to understand. Can be easily extended to additional values (self-tab maybe) in the future. Chrome can extend it to other values, should there be a need.
Pros: rejection makes it clear 'monitor' is not supported.
Pros: No compatibility issue like for 2 (but web page cannot learn the hint is ignored similarly to 1). |
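From the page's side, the compatibility concern above might look like the following sketch. It assumes the hint is passed as a bare displaySurface value on the video constraints and that a user agent which refuses the value rejects with a TypeError; both are assumptions about a design still under discussion here. The helper name captureWithHint is hypothetical, and the mediaDevices object is passed in so the logic can be exercised without a browser.

```javascript
// Hypothetical helper: try the hint first, fall back to a plain request
// if the user agent rejects the hinted call with a TypeError.
async function captureWithHint(mediaDevices, hint) {
  try {
    const stream = await mediaDevices.getDisplayMedia({
      video: { displaySurface: hint },
    });
    // No rejection does not prove the hint was honored; a user agent
    // that silently ignores it looks exactly the same from here.
    return { stream, hintRejected: false };
  } catch (err) {
    if (!(err instanceof TypeError)) throw err; // e.g. the user declined
    const stream = await mediaDevices.getDisplayMedia({ video: true });
    return { stream, hintRejected: true };
  }
}
```

In a browser this would be called as captureWithHint(navigator.mediaDevices, "window") from a click handler. A real page may need a fresh user gesture before the fallback call, which is part of why rejection is awkward for developers.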
So that I may answer succinctly and to the point, could you please clarify explicitly what your objections to "MAY ignore" are? Is it that developers might attempt something and it ends up being no-op? Is it something else? Is it one of multiple objections? I think it's important to understand the answer to the above questions before deep-diving into alternatives. For example, if your chief concern is indeed that developers might incorrectly expect But this is really premature, as I am answering an issue which might be marginal in your eyes. I first ask again - what are your chief objections to my proposed solution, where user agents MAY regard or disregard any hint? |
User agents SHOULD steer the user away from monitor capture, regardless of |
We are not going to specify "the user agent MUST respect the hint." |
I see nothing wrong with stating UAs MAY respect the hint, but SHOULD steer users away from monitor capture in spite of that hint. |
That's a narrowing of the MAY, not an expansion of it. |
My preference:
|
I didn't say we should say it is "dangerous", which is a characterization. I said we should say what the UA SHOULD do when interpreting the hint, which is a prescription. |
This assumes the spec would define hint values that would point the user towards monitor capture. |
I'm assuming we use the existing displaysurface constraint. No need to define a new surface here IMHO. |
If we use constraints, Also, I don't think anyone else would currently accept an approach other than constraints...? Nobody other than Youenn has been positive on that possibility so far. |
Reusing displaysurface has some drawbacks:
Hence why I would go with a new attribute extending DisplayMediaStreamConstraints as an open enum when supported by WebIDL and a USVString in the meantime. On the other hand, I do not really see what displaysurface gains us. Can you express the benefits of reusing displaysurface?
This is only a possibility at this point; who knows what the future will bring. |
As far as I can tell, again, the monitor issue is only a concern if you want to use IDL to enforce a prohibition against preferring the monitor. As Elad has pointed out, this approach is unlikely to gain consensus, given that we don't have consensus that preferring "monitor" as the display surface is never appropriate. (FWIW - personal opinion - I wouldn't want this on the open web, but it is sometimes appropriate in managed deployment contexts.) Youenn, I think you are alone in your taste. getDisplayMedia({ audio: true, video: true, prefer: 'window' }) seems to indicate that the preference affects both audio and video, while getDisplayMedia({ audio: true, video: { displaySurface: 'window' } }) indicates clearly that displaySurface affects the video track only. |
This looks quite minimal to me:
Or even shorter:
Or shorter still:
|
We can throw on
These ideas go further wrt influencing user choice than what I'm comfortable with. They also seem orthogonal to looking at the existing
I don't see how. displaySurface uses DOMString, not enum, so
I disagree. I'm no fan of constraints, but consistency with getUserMedia and MST wins here in my book. |
@HTA:
Constraints are currently used throughout. Deviating from the established course requires stronger consensus IMHO. With Harald and Jan-Ivar in favor of consistency, and me generally agreeing with them, and not a lot of other people participating in the discussion, I think we should proceed under the assumption that we've decided to use constraints to resolve the current issue. That said, if Youenn formulates an independent plan to move us generally away from constraints, I will likely be very interested. |
Sorry in advance for this very long message but there are lots of different points.
@alvestrand, to be clear, I am not proposing we reject based on WebIDL.
I was referring to the issue that we are exposing a value ('monitor') we actually do not want to expose to the open web (discussion has been about potentially using this value for transition).
I think you liked the idea to influence the prompt based on the fact that tabs would be site isolated.
{video: {displaySurface: {exact: "monitor"}}} does not reject right now in Chrome/Firefox AFAIK.
Consistency with getUserMedia is not making it easy to understand getDisplayMedia. Getting back to displaySurface as a constraint, it also makes it possible for web apps to do something like:
There is no plan to move away from constraints, simply to not extend their use over what is implemented in browsers. |
Sites shouldn't be able to query
Yes, for user agents to highlight some tabs over others (no spec changes needed for that), whereas here we're discussing an app signal on a default category (it's up to user agents how to apply this signal to UX).
That's hypothetical, as I'd prefer no more hints from apps.
Seems irrelevant.
Firefox would start rejecting as soon as we fix bug 1732122, so this is unrelated. It's also not really a compatibility issue since it doesn't do anything. By this standard, every new feature is a compatibility issue for people who guessed a future API name expecting it to not be there.
Consistency = similarity in API shape and pattern (which users recognize from examples, MDN, or the explainer.md)
I don't think we have to say anything. The app has declared a preference for tab or window. We don't specify UX.
Specs represent consensus, and right now, displaySurface is in the spec. If you feel it should be taken out, please file a separate issue, with new information worthy of reconsideration by the WG.
That sounds editorial. I don't think it's reasonable to block API progress on editorial cleanup. I'd like to get back to debating this on the merits of the proposed change. |
Based on the 2022-03-15, I believe we converged on:
|
We continue to have a strong demand from Web developers for functionality that lets them influence what kind of display surface the user will capture; this is one of the core differences between the pre-standard "Chrome extension API" and the WG-defined getDisplayMedia() function.
Such functionality is easy to add (allow a constraint on capture surface type). If it does not block the user from picking other things, but merely changes the default capture surface (currently "screen" on both Chrome and Safari), it doesn't seem to be a huge increase in user risk exposure.
Example comment: https://twitter.com/RickByers/status/1403349775387353089?s=19
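As a rough sketch of what this could look like for a page: the app states a preferred surface type, the user remains free to pick something else, and the app reads back what was actually chosen from the track settings. Passing the preference as a displaySurface video constraint is an assumption about the eventual API shape, and the helper name captureAndReport is hypothetical. The mediaDevices object is a parameter so the logic can be run outside a browser.

```javascript
// Hypothetical helper: request capture with a preferred surface type and
// report which surface type the user actually chose.
async function captureAndReport(mediaDevices, preferred) {
  const stream = await mediaDevices.getDisplayMedia({
    video: { displaySurface: preferred },
  });
  const [track] = stream.getVideoTracks();
  // getSettings().displaySurface reflects the user's final choice,
  // which may differ from what the app preferred.
  const actual = track.getSettings().displaySurface;
  return { stream, preferred, actual, matched: actual === preferred };
}
```

Because the preference only changes the default, the app still has to handle every surface type and cannot assume that matched is true.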