In #276 we asked about the inherent UTF-8 requirement for the text (and, to a far lesser extent, json) methods. The default implementations of these methods assume that the message's bytes are, in fact, encoded as UTF-8 if the message is to be treated as text. The I18N WG is happy that UTF-8 is the default encoding and that it is the only supported encoding. But we note that there is no mention of UTF-8 or Unicode outside of the message data interface. Other data can be sent down the wire and retrieved using arrayBuffer or blob, but there is no mention of character encodings aside from the references to utf-8 decode and utf-8 encode in this section. So our ask is:
Should there be a health warning about using non-UTF-8 encodings?
[Note: this came out of I18N WG reviewing our previous comments in our periodic review cycle]
Should there be a health warning about using non-UTF-8 encodings?
We can probably add a note or something. My reading is that the "utf-8 decode" will just add replacement characters but will always succeed (even with garbage).
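To illustrate the behaviour described above, here is a minimal sketch that uses the standard `TextDecoder` as a stand-in for the spec's "utf-8 decode" step. It assumes a runtime where `TextDecoder` is a global (browsers, service workers, recent Node.js); the sample bytes are made up for illustration and are not taken from the spec.

```javascript
// Bytes for "café" encoded as ISO-8859-1 (Latin-1), i.e. NOT valid UTF-8:
// 0xE9 on its own is an incomplete UTF-8 sequence.
const latin1Bytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);

// Non-fatal UTF-8 decoding (the TextDecoder default) never throws.
// The invalid byte is replaced with U+FFFD REPLACEMENT CHARACTER.
const lenient = new TextDecoder("utf-8").decode(latin1Bytes);

// For contrast, a decoder created with { fatal: true } rejects the same input.
let strictFailed = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(latin1Bytes);
} catch (err) {
  strictFailed = err instanceof TypeError;
}

console.log(JSON.stringify(lenient)); // the last character is U+FFFD
console.log(strictFailed);
```

The lenient result is what a caller of `text()` would be expected to see for such a payload: no error, just a replacement character where the original "é" was.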
Should we add a note just saying something about replacement characters? Or do you mean something else by "health warning about using non-UTF-8 encodings"?
If you have an example from another spec, that would be really helpful!
The problem here is that there is no actual mention of character encoding besides the utf-8 decode. Yes, the decode will succeed regardless of how the bytes are encoded, but this interface can also be used for sending bytes. I would at least mention that failing to use UTF-8 will produce replacement characters or mojibake. Perhaps:
Note that textual content is expected to use the UTF-8 character encoding. Content using a different character encoding needs to be decoded from an arrayBuffer() or blob().
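A rough sketch of what the proposed note implies for a developer whose payload is not UTF-8: take the raw bytes and decode them explicitly. The helper name `decodePushPayload` and the UTF-16LE sample are hypothetical, chosen only for illustration; `TextDecoder` is assumed to be available as a global.

```javascript
// Hypothetical helper: decode raw payload bytes with an explicitly named
// encoding instead of relying on the UTF-8 assumption built into text().
function decodePushPayload(buffer, label = "utf-8") {
  // fatal: true makes an encoding mismatch throw a TypeError rather than
  // silently producing U+FFFD replacement characters.
  return new TextDecoder(label, { fatal: true }).decode(buffer);
}

// Sample payload: "hé" encoded as UTF-16LE (68 00 E9 00).
const utf16Bytes = new Uint8Array([0x68, 0x00, 0xe9, 0x00]);
const decoded = decodePushPayload(utf16Bytes.buffer, "utf-16le");

console.log(decoded);
```

In a service worker's push handler, the buffer would come from `event.data.arrayBuffer()`. Which encoding labels are supported beyond UTF-8 and UTF-16 depends on the runtime, so the label should be treated as something agreed out of band between the application server and the client.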
PushMessageData interface
https://www.w3.org/TR/push-api/#pushmessagedata-interface