Mike West, November 2018
(Note: This isn't a proposal that's well thought out, and stamped solidly with the Google Seal of Approval. It's a collection of interesting ideas for discussion, nothing more, nothing less.)
User agents specify a set of languages as part of each HTTP request via the Accept-Languages
header.
This header's value is critical for some kinds of content negotiation, ideally allowing servers to give
each user content in a language they prefer. My user agent sends a header along the lines of:
Accept-Language: en-US,en;q=0.9,de;q=0.8
This shows that I prefer content in English, but accept content in German as well. That's great for a few important use cases, but exposes quite a bit of entropy to the web at large, even though a small subset of sites I visit will actually use the information to improve my experience.
This document proposes that we deprecate that header, and move to an opt-in model which will reveal my language preferences to the specific sites that declare they can do something useful with it, while keeping them off the wire for the rest.
In the glorious future, servers will receive no information about the user's language preferences.
Servers can instead opt-into receiving such information via a new Lang
Client Hint. We might
accomplish this as follows:
-
Browsers should deprecate and eventually remove the
Accept-Language
header. In order to quickly realize some privacy gains without a lot of compatibility impact, the header could be locked to something genericly appropriate (perhaps based on typical language preferences for the user's rough, IP-based geolocation) as an intermediate step. -
Browsers should introduce a new
Sec-CH-Lang
header field, representing the users' language preferences. This would be a Structured Header whose value is a flat list of language codes in the users' preferred order. For example:Sec-CH-Lang: "en-US", "en", "de"
-
Browsers should restrict the
navigator.language
andnavigator.languages
APIs to secure contexts, to match Client Hints' scope. -
Servers can opt-into the new
Lang
Client Hint by delivering an appropriateAccept-CH
header in the usual way:Accept-CH: Lang, DPR, Width
Note that
Accept-CH
is only accepted over secure transport: this proposal has the effect of deprecating language availability over plaintext entirely. Note also that only the top-level document's server can opt-into Client Hints. If third-parties require the data, it can be delegated to them via an appropriate Feature Policy.
If the server absolutely requires language information to make a reasonable decision about what
data to send to the user, it can do so by sending an Accept-CH
header along with a 302 response,
causing another navigation request, which will include the Sec-CH-Lang
header.
The Accept-Language
header includes a weighting mechanism (e.g. the q=0.8
bits in the example
above). A potential application of that is expressing equal preference for two or more languages,
but the challenge of exposing such an option to users (compared to an ordered list) seems to make
practical use unlikely. I'll blithely assert that q
weighting can be removed without impact.
Right. There are more than a few open PRs:
- Fetch integration of Accept-CH opt-in: whatwg/fetch#773
- HTML integration of Accept-CH-Lifetime and the ACHL cache: whatwg/HTML#3774
- Adding new CH features to the CH list in Fetch: whatwg/fetch#725
- Other PRs for adding the Feature Policy 3rd party opt-in: whatwg/fetch#811 and wicg/feature-folicy#220
Based on some discussion in w3ctag/design-reviews#320,
it seems reasonable to forbid access to these headers from JavaScript, and demarcate them as
browser-controlled client hints so they can be documented and included in requests without triggering
CORS preflights. A Sec-CH-
prefix seems like a viable approach. This bit might shift as the broader
Client Hints discussions above coalesce into something more solid that lands in specs.