New principle: Deployability and Monitoring #368
Comments
Can you say more about the principle you're proposing @yoavweiss? I can't quite picture what that principle would look like. What lines are you considering drawing? What's the backstop / limit?

I want to re-emphasize that these discussions keep conflating two different issues. I've never suggested that it was inappropriate (w/ or w/o scare quotes) for sites to monitor their sites with their own resources and devices, or with devices they've received consent to use, or rented from a monitoring service, etc. A more advanced isitdown that took the measurements you're concerned with sounds great. A "Google Site Monitor" service that monitored sites from a variety of IPs and browsers, running on Google servers, sounds like a wonderful thing. It could even use the wonderful puppeteer tool that Google maintains.

What I do think is inappropriate is to standardize browser functionality specifically designed for sites to collect user data about users' environments, network conditions, browser configurations, etc., for purposes that don't directly benefit users, and without asking consent. I think it's doubly inappropriate because part of the justification that's been expressed for not asking consent is "we don't think enough users would grant consent".

In short, it's all good to improve the ability of sites to monitor how their sites perform; but it doesn't follow (and isn't the case) that it's therefore appropriate for sites to conscript unknowing, unconsenting users for that purpose. Just because there are resources on users' machines that could benefit sites doesn't mean those resources are sites' for the taking.
Unfortunately, lab-based services such as the one you're proposing are not sufficient for sites to, e.g., confidently know that:
While lab data has a lot of advantages, it's not sufficient on its own.
I think this is the crux of our disagreement, so let's tackle it head on:
Any link to that? The justifications I heard are mostly around "there's no need to ask for user consent in order to perform a …".
Friendly ping! :)
Apologies for the delay. I was waiting until the privacy-principles group had its next meeting to respond. That group is discussing many very similar issues (such as w3ctag/privacy-principles#162). @jyasskin is part of those conversations, and I hope he'll correct me if I'm wrong, but I think it's accurate to say that there are folks other than myself who share (or are at least working through) the same concerns expressed here; that consent, opt-in, etc. is an important part of user agents making this kind of information available to sites. It's very much not settled, and I'm not at all trying to say "that group said X, so let's close this issue." I'm just trying to avoid splitting threads too much. I think it'd be good if we could pause this thread until the privacy-principles conversation is resolved, and continue this conversation there (either with you directly, or with the Google members of that group expressing your concerns; though I think @jyasskin is already doing so :) ). How does that sound?
I was thinking of WICG/crash-reporting#1 (comment). I understood "significantly hurts the quality of the data" to mean "less data", i.e., many folks wouldn't consent if we asked them.
I'm calling this user data because it's generated by users, and (directly or otherwise) describes a user's environment, experience, capabilities, choice of software, etc.
I don't think it is accurate to generalize the Reporting API as "…".

If this is just about developer ergonomics, then I suggest y'all create a Reporting-API JS library that provides the already existing data with a nicer API to use; a jQuery for analytics (see the sketch below). In other words, if this is just about exposing already available data in a nicer way, then a nicer JS API over existing capabilities would seem to solve that concern, we can delete the Reporting API and call it a day ;) But I'm pretty sure that's not the case ;)

I think the Reporting API (and the related report types) intends to expose new qualities and new quantities of data to sites. I think users and browsers currently don't provide this data to sites (either at all, or in the amount that Reporting-API supporters expect the Reporting API would make available), and the goal of the Reporting API is to change that. And so, if an API is going to expose new kinds or new quantities of user data to sites,
then the right thing to do is either a) continue not providing that data to sites, or b) ask users if they'd like to share the new kinds and quantities of user data with sites.
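A rough, purely illustrative sketch of the "nicer JS API over existing capabilities" idea: it only repackages data the platform already exposes to pages (Performance Timeline entries and uncaught errors) and ships it out with `sendBeacon`. The function name and collector URL are hypothetical, not an existing library.

```js
// Hypothetical "jQuery for analytics" wrapper: no new browser capability,
// just a nicer API over data pages can already read, flushed via sendBeacon.
function createAnalytics(collectorUrl) {
  const queue = [];

  // Already-available data: navigation and resource timing entries.
  new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      queue.push({ kind: 'perf', name: entry.name, duration: entry.duration });
    }
  }).observe({ entryTypes: ['navigation', 'resource'] });

  // Already-available data: uncaught errors surfaced to the page.
  window.addEventListener('error', (event) => {
    queue.push({ kind: 'error', message: event.message, source: event.filename });
  });

  // Flush with plain sendBeacon when the page is being hidden or unloaded.
  window.addEventListener('pagehide', () => {
    if (queue.length > 0) {
      navigator.sendBeacon(collectorUrl, JSON.stringify(queue));
    }
  });
}

createAnalytics('/hypothetical-collector');
```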
Since Reporting is a general mechanism that doesn't in and of itself expose any information (it just enables other specs to send it), it's not immediately clear to me what "this kind of information" means. Can you elaborate?
I'm fine discussing this either here or on the Privacy Principles repo, but @jyasskin suggested that this repo may be a better fit, since it's a tradeoff between a privacy principle (don't share data) and a design one (apps need to monitor their operations).
OK. FWIW, I disagree with this. In my mind, there are 2 separate questions when we're talking about exposing certain data. There's the question of medium (JS API, Reporting API, etc), and there's the more important question: Can we expose this data from a privacy perspective? The answer to the latter question doesn't change based on the medium. If exposing the data requires a permission prompt, then it should require one regardless of how it's exposed. At the same time, requesting a user permission for any use of the Reporting API is similar to asking for permission for any use of an HTTP request.
So any data is "user data" in your definition? If we e.g. log HTTP status codes on both the server side and the client side and report both, would a 502 error code be considered "server data" when logged on the server and "user data" when logged on the client?
I can't help what you think, but the Reporting API is infrastructure for sending reports in the same way that Fetch is infrastructure for sending HTTP requests. It provides reliability and ergonomic advantages over Fetch, but a lot of the data that is exposed relying on that infrastructure could similarly be exposed by JS APIs that expose that data + fetch. The "reliability" part of Reporting enables it to send those reports in cases where the site failed to load entirely, e.g. due to bad configuration of the site or of the resources it embeds. But it's the relying features that make use of that infrastructure to send potentially new data, in cases where it is warranted.

If there are privacy issues with e.g. HTML reporting to sites that their COOP is misconfigured, it makes sense to file issues against that specific functionality, outline why it violates user privacy, and then tackle that. What you're suggesting is different though. You're suggesting that if the infrastructure can potentially be misused, then any use of it should be held to stricter standards than the equivalent features (i.e. …).
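To make the "infrastructure" framing concrete, here is a minimal sketch (Node.js; the endpoint URL and server setup are placeholders) of the declarative side: the document opts into report delivery via response headers, and the browser later POSTs batched reports, such as report-only COOP violations, to the named endpoint even if no script on the page ever ran.

```js
// Minimal sketch (assumed setup, placeholder URLs): a server that both
// configures report delivery via headers and receives the resulting reports.
const http = require('http');

http.createServer((req, res) => {
  if (req.method === 'POST' && req.url === '/reports') {
    // Collection endpoint: browsers POST batched reports here with
    // Content-Type: application/reports+json. Just log them.
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      console.log('received reports:', body);
      res.writeHead(204);
      res.end();
    });
    return;
  }

  res.writeHead(200, {
    // Name a reporting endpoint for this document.
    'Reporting-Endpoints': 'default="https://example.com/reports"',
    // "Report only" mode: COOP violations are reported, not enforced.
    'Cross-Origin-Opener-Policy-Report-Only': 'same-origin; report-to="default"',
    'Content-Type': 'text/html',
  });
  res.end('<!doctype html><title>reporting sketch</title>');
}).listen(8080);
```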
Hey @yoavweiss! I think that this question might benefit from being split a bit into:
I'm sorry to expand the workload on this (insert …)
Hey @darobin :)
I think we agree here. The privacy considerations for telemetry seem like something that the Privacy Principles TF could answer. They all rely on the assumption that telemetry (in support of deployability and monitoring) is a legitimate use case, which is what I want to establish here in this issue. Does that make sense?
This could reveal somewhat limited information about end-user-specific proxies. (In many cases that information might also be attainable through ….)

I agree that telemetry is useful, but we have to be extremely careful when it goes beyond what can be observed by websites directly. I don't think such care has always been demonstrated, which in part is why there's reluctance around this set of features.

Beyond that, there's the argument that telemetry is wasteful for end users (in terms of computing resources) and therefore has to be opt-in. To me that seems more like a policy decision, as you cannot really enforce the collection of telemetry through technical means. At least not where it does not go beyond what can already be observed. You could choose not to aid it, but that likely results in less overall end-user control.
@yoavweiss shared this with me as I was going over some of my frustrations with browser breaking changes.
This is 100% true. To use Salesforce as one example, there are potentially millions of permutations of the web application, and the spectrum of UAs, form factors, regions, network latency, etc. makes the lab insufficient for truly determining whether something will or won't break. This would be like proposing that browsers or OSes not have the capability of gathering telemetry but should just run labs.
I'm going to split these in two:
The premise that this telemetry does not benefit users is completely unfounded. We have had hundreds of customer cases opened by our customers and end users due to deprecations and functional and non-functional regressions. The entire desire that I have for this API is actually to ensure that our users have a great experience.
I agree with this if the group does feel the information is personal and has the potential of impacting privacy. I highly recommend that if we go down this path, then the specific properties concerned should be identified, and the prompt should be shown upon utilization of those properties; not for the more generic information. To be a bit more specific, Salesforce's primary focus today is the DeprecationReportBody aspect of the API. This is invaluable, as you're able to understand how many users are actually potentially hitting code paths that will break when a deprecation rolls out. Additionally, it enables us to have earlier insight into potential breakages that we may not know about.
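For concreteness, a minimal sketch of observing deprecation reports in page script; the /telemetry endpoint and the selection of fields are illustrative assumptions, not Salesforce's actual setup.

```js
// Observe deprecation reports (DeprecationReportBody) and forward a summary
// to a hypothetical /telemetry endpoint.
const observer = new ReportingObserver(
  (reports) => {
    const summary = reports
      .filter((report) => report.type === 'deprecation')
      .map((report) => ({
        id: report.body.id,                             // which deprecated feature
        anticipatedRemoval: report.body.anticipatedRemoval,
        sourceFile: report.body.sourceFile,
        lineNumber: report.body.lineNumber,
      }));
    if (summary.length > 0) {
      navigator.sendBeacon('/telemetry', JSON.stringify(summary));
    }
  },
  { types: ['deprecation'], buffered: true } // include reports generated before observe()
);
observer.observe();
```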
I know you noted that you're just thinking these things out, but I will ultimately push back against this; I won't dig into it too far until concrete proposals are put forward. Happy to help figure this out with you all to keep this API moving forward and getting into other UAs, as I'd like to increase our knowledge of impact for our users proactively rather than waiting for them to break and then getting it addressed.
In the time since this issue was opened, some new text has been added on this topic to the Privacy Principles document, which we feel covers these cases. Hence we're going to close this. If people feel there is a need for additional text in the Design Principles doc for this issue, please raise a PR.
whatwg/html#6933 was filed by @pes10k against the integration of HTML with the Reporting API, an API whose goal is to enable monitoring and deployability of other web platform features, or of the serving of web applications themselves. The issue claims that these use cases are somehow "inappropriate".
I think it's important to outline as a principle that monitoring and reporting (on the web application's performance, use of deprecated APIs, or its use of new security-related restrictions in "report only" mode, to name a few examples) are an essential part of being able to deploy web applications at scale.