edent
commented
Mar 1, 2018
- Update spec to insist on UTF-8
- Fixes Require UTF-8 #1039
If we are going to make a change requiring UTF-8 from authors, we should ask for very wide review.
Meanwhile, requested a couple of small changes at least to be considered...
In addition, due to a number of restrictions on <{meta}> elements, there can only be one
<code>meta</code>-based character encoding declaration per document.
Authoring tools should default to using <a>UTF-8</a> for newly-created documents. [[!ENCODING]]
If we require utf-8 then this (and similar requirements) have to be a must.
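For illustration only, here is a minimal sketch of how an authoring tool or linter might check the "only one <code>meta</code>-based declaration per document" rule quoted above. The function name `find_meta_charset_labels` and the regular expression are hypothetical, and the `content` attribute parsing is a deliberate simplification, not the extraction algorithm the spec defines.

```python
import re
from html.parser import HTMLParser

# Simplified pattern for "charset=..." inside a content attribute.
# Assumption: this is NOT the spec's extraction algorithm, only an approximation.
_CHARSET_IN_CONTENT = re.compile(r"charset\s*=\s*[\"']?([^\"'\s;]+)", re.IGNORECASE)


class _MetaCharsetCollector(HTMLParser):
    """Collects the charset labels declared through <meta> elements."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lower-cased; valueless attributes have value None.
        attr_map = {name: (value or "") for name, value in attrs}
        if "charset" in attr_map:
            self.labels.append(attr_map["charset"].strip())
        elif attr_map.get("http-equiv", "").strip().lower() == "content-type":
            match = _CHARSET_IN_CONTENT.search(attr_map.get("content", ""))
            if match:
                self.labels.append(match.group(1))


def find_meta_charset_labels(html_text):
    """Return every meta-based charset label found, in document order."""
    collector = _MetaCharsetCollector()
    collector.feed(html_text)
    collector.close()
    return collector.labels
```

A document that returns more than one label from this sketch would carry more than one meta-based declaration; whether each label is acceptable is a separate check.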
@@ -1417,24 +1414,31 @@

A <dfn>character encoding declaration</dfn> is a mechanism by which the <a>character encoding</a>
used to store or transmit a document is specified.

The only acceptable character encoding declaration for the modern web is <a>UTF-8</a>.
Is it possible to rephrase this less didactically?
No :-)
That is, I'm not sure how else to phrase it in order to get the point across. Suggestions welcome.
The problems outlined here go away when exclusively using UTF-8, which is one of the many reasons that is now the mandatory encoding for all things.
https://www.w3.org/TR/2018/CR-encoding-20180327/
yeah, this is hardly a high-order problem.
might end up interpreting supposedly benign plain text content as HTML tags and JavaScript.
</p>
<a state for="http-equiv" lt="content-type">encoding declaration state</a>, then the character
encoding used must be an <a>ASCII-compatible encoding</a>.
why doesn't this just require utf-8?
sections/iana.include
Outdated
The parameter's value must be one of the <a lt="character encoding">labels</a> of the <a>character encoding</a>
used to serialize the file. [[!ENCODING]]
The parameter's value must be an <a>ASCII case-insensitive</a> match for the string
"<code>utf-8</code>". [[!ENCODING]]
If we just enforce a string, there is no normative dependency on the encoding spec here.
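To make the distinction concrete, a minimal sketch of an ASCII case-insensitive match against the fixed string "utf-8" follows. The helper name `is_utf8_charset_value` is hypothetical, and the sketch assumes the parameter value has already been extracted and unquoted.

```python
import string

# Fold only A-Z to a-z; non-ASCII characters are left untouched, which is
# what distinguishes an ASCII case-insensitive match from Unicode case folding.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def is_utf8_charset_value(value: str) -> bool:
    """Return True if value is an ASCII case-insensitive match for "utf-8"."""
    return value.translate(_ASCII_FOLD) == "utf-8"
```

Under this sketch `UTF-8` and `Utf-8` match, while `utf8` does not, even though the Encoding specification lists `utf8` as a label for UTF-8. That is the practical difference between matching a fixed string and matching any label of the encoding.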
Regarding wide review: UTF-8 is now 91% of the web (https://w3techs.com/technologies/overview/character_encoding/all). Happy to hear arguments why it shouldn't be mandated.
This needs to be noted in the changes section.
sections/iana.include
Outdated
<dd>The charset parameter may be provided. The parameter's value must be "<code>utf-8</code>". This parameter serves no purpose; it is only allowed for compatibility with legacy servers.</dd>
<dt><code data-x="">charset</code></dt>
<dd>The charset parameter may be provided. The parameter's value must be "<code>utf-8</code>".
This parameter serves no purpose; it is only allowed for compatibility with legacy servers.
"This parameter is for compatibility with legacy servers", no?
@chaals I have updated the Changes file.
See comments from the i18n WG on this change at #1039