From f08fe3bc18b41e4de5fe504eb32586073eb4a6ee Mon Sep 17 00:00:00 2001
From: Anne van Kesteren <annevk@annevk.nl>
Date: Mon, 16 Dec 2024 09:39:51 +0100
Subject: [PATCH] Review Draft Publication: December 2024

---
 encoding.bs              |    2 +-
 review-drafts/2024-12.bs | 3584 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 3585 insertions(+), 1 deletion(-)
 create mode 100644 review-drafts/2024-12.bs
diff --git a/encoding.bs b/encoding.bs
index 812821c..b5a539d 100644
--- a/encoding.bs
+++ b/encoding.bs
@@ -3,7 +3,7 @@ Group: WHATWG
 H1: Encoding
 Shortname: encoding
 Text Macro: TWITTER encodings
-Text Macro: LATESTRD 2023-06
+Text Macro: LATESTRD 2024-12
 Abstract: The Encoding Standard defines encodings and their JavaScript API.
 Translation: ja https://triple-underscore.github.io/Encoding-ja.html
 Markup Shorthands: css off
diff --git a/review-drafts/2024-12.bs b/review-drafts/2024-12.bs
new file mode 100644
index 0000000..0be499d
--- /dev/null
+++ b/review-drafts/2024-12.bs
@@ -0,0 +1,3584 @@
+<pre class=metadata>
+Group: WHATWG
+Status: RD
+Date: 2024-12-16
+H1: Encoding
+Shortname: encoding
+Text Macro: TWITTER encodings
+Text Macro: LATESTRD 2024-12
+Abstract: The Encoding Standard defines encodings and their JavaScript API.
+Translation: ja https://triple-underscore.github.io/Encoding-ja.html
+Markup Shorthands: css off
+Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeoptions textdecodeoptions,index section-index
+</pre>
+
+<link rel=stylesheet href=visualization-colors.css>
+
+
+
+<h2 id=preface>Preface</h2>
+
+<p>The UTF-8 encoding is the most appropriate encoding for interchange of Unicode, the
+universal coded character set. Therefore, for new protocols and formats, as well as
+existing formats deployed in new contexts, this specification requires (and defines) the
+UTF-8 encoding.
+
+<p>The other (legacy) encodings have been defined to some extent in the past. However,
+user agents have not always implemented them in the same way, have not always used the
+same labels, and often differ in dealing with undefined and former proprietary areas of
+encodings. This specification addresses those gaps so that new user agents do not have to
+reverse engineer encoding implementations and existing user agents can converge.
+
+<p>In particular, this specification defines all those encodings, their algorithms to go
+from bytes to scalar values and back, and their canonical names and identifying labels.
+This specification also defines an API to expose part of the encoding algorithms to
+JavaScript.
+
+<p>User agents have also significantly deviated from the labels listed in the
+<a href=https://www.iana.org/assignments/character-sets/character-sets.xhtml>IANA Character Sets registry</a>.
+To stop spreading legacy encodings further, this specification is exhaustive about the
+aforementioned details and therefore has no need for the registry. In particular, this
+specification does not provide a mechanism for extending any aspect of encodings.
+
+
+
+<h2 id=security-background>Security background</h2>
+
+<p>There is a set of encoding security issues when the producer and consumer do not agree on the
+encoding in use, or on the way a given encoding is to be implemented. For instance, an attack was
+reported in 2011 where a <a>Shift_JIS</a> lead byte 0x82 was used to “mask” a 0x22 trail byte in a
+JSON resource of which an attacker could control some field. The producer did not see the problem
+even though this is an illegal byte combination. The consumer decoded it as a single U+FFFD and
+therefore changed the overall interpretation as U+0022 is an important delimiter. Decoders of
+encodings that use multiple bytes for scalar values now require that in case of an illegal byte
+combination, a scalar value in the range U+0000 to U+007F, inclusive, cannot be “masked”. For the
+aforementioned sequence the output would be U+FFFD U+0022. (As an unfortunate exception to this, the
+<a>gb18030 decoder</a> will “mask” up to one such byte at <a>end-of-queue</a>.)
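The non-masking rule can be observed with any conforming multi-byte decoder. The sketch below is a minimal illustration that uses UTF-8 in place of Shift_JIS (an assumption, chosen so the example also runs on runtimes without legacy encoding data): the lead byte 0xE0 is followed by 0x22, which is not a valid continuation byte.

```javascript
// UTF-8 stands in for Shift_JIS here (assumption for portability).
// 0xE0 starts a three-byte sequence, but 0x22 (") cannot continue it.
const bytes = new Uint8Array([0xE0, 0x22]);

// TextDecoder defaults to the "replacement" error mode.
const decoded = new TextDecoder("utf-8").decode(bytes);

// The broken lead byte becomes U+FFFD; the quotation mark is not masked.
const codePoints = Array.from(decoded, (c) => c.codePointAt(0));
console.log(
  codePoints
    .map((cp) => "U+" + cp.toString(16).toUpperCase().padStart(4, "0"))
    .join(" ")
); // U+FFFD U+0022
```

A consumer that treats U+0022 as a delimiter therefore still sees it after decoding.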
+
+<p>This is a larger issue for encodings that map anything that is an <a>ASCII byte</a> to something
+that is not an <a>ASCII code point</a>, when there is no lead byte present. These are
+“ASCII-incompatible” encodings and, other than <a>ISO-2022-JP</a> and <a>UTF-16BE/LE</a>, which are
+unfortunately required due to deployed content, they are not supported. (Investigation is
+<a href=https://github.com/whatwg/encoding/issues/8 lt="Add more labels to the replacement encoding">ongoing</a>
+whether more labels of other such encodings can be mapped to the <a>replacement</a> encoding, rather
+than the unknown encoding fallback.) An example attack is injecting carefully crafted content into a
+resource and then encouraging the user to override the encoding, resulting in, e.g., script
+execution.
+
+<p>Encoders used by URLs found in HTML and HTML's form feature can also result in slight information
+loss when an encoding is used that cannot represent all scalar values. E.g., when a resource uses
+the <a>windows-1252</a> encoding a server will not be able to distinguish between an end user
+entering “💩” and “&amp;#128169;” into a form.
+
+<p>The problems outlined here go away when exclusively using UTF-8, which is one of the many reasons
+it is now the mandatory encoding for all things.
+
+<p class=note>See also the <a href=#browser-ui>Browser UI</a> chapter.
+
+
+
+<h2 id=terminology>Terminology</h2>
+
+<p>This specification depends on the Infra Standard. [[!INFRA]]
+
+<p>Hexadecimal numbers are prefixed with "0x".
+
+<p>In equations, all numbers are integers, addition is represented by "+", subtraction by "&minus;",
+multiplication by "×", integer division by "/" (returns the quotient), modulo by "%" (returns the
+remainder of an integer division), logical left shifts by "&lt;&lt;", logical right shifts by ">>",
+bitwise AND by "&amp;", and bitwise OR by "|".
+
+<p>For logical right shifts, operands must have at least twenty-one bits of precision.
+
+<hr>
+
+<p>An <dfn id=concept-stream export>I/O queue</dfn> is a type of <a for=/>list</a> with
+<a for=list>items</a> of a particular type (i.e., <a>bytes</a> or <a>scalar values</a>).
+<dfn id="end-of-stream" export>End-of-queue</dfn> is a special <a for=list>item</a> that can be
+present in <a for=/>I/O queues</a> of any type and it signifies that there are no more
+<a for=list>items</a> in the queue.
+
+<div class=note>
+ <p>There are two ways to use an <a for=/>I/O queue</a>: in immediate mode, to represent I/O data
+ stored in memory, and in streaming mode, to represent data coming in from the network. Immediate
+ queues have <a>end-of-queue</a> as their last item, whereas streaming queues need not have it, and
+ so their <a for="I/O queue">read</a> operation might block.
+
+ <p>It is expected that streaming <a for=/>I/O queues</a> will be created empty, and that new
+ <a for=list>items</a> will be <a for="I/O queue">pushed</a> to them as data comes in from the
+ network. When the underlying network stream closes, an <a>end-of-queue</a> item is to be
+ <a for="I/O queue">pushed</a> into the queue.
+
+ <p>Since reading from a streaming <a for=/>I/O queue</a> might block, streaming
+ <a for=/>I/O queues</a> are not to be used from an <a for=/>event loop</a>. They are to be used
+ <a>in parallel</a> instead.
+</div>
+
+<p>To <dfn id=concept-stream-read for="I/O queue" export>read</dfn> an <a for=list>item</a> from an
+<a for=/>I/O queue</a> <var>ioQueue</var>, run these steps:
+
+<ol>
+ <li><p>If <var>ioQueue</var> <a for=list>is empty</a>, then wait until its <a for=list>size</a> is
+ at least 1.
+
+ <li><p>If <var>ioQueue</var>[0] is <a>end-of-queue</a>, then return <a>end-of-queue</a>.
+
+ <li><p><a for=list>Remove</a> <var>ioQueue</var>[0] and return it.
+</ol>
+
+<p>To <a for="I/O queue">read</a> a number <var>number</var> of <a for=list>items</a> from
+<var>ioQueue</var>, run these steps:
+
+<ol>
+ <li><p>Let <var>readItems</var> be « ».
+
+ <li>
+  <p>Perform the following step <var>number</var> times:
+
+  <ol>
+   <li><p><a for=list>Append</a> to <var>readItems</var> the result of
+   <a for="I/O queue">reading</a> an item from <var>ioQueue</var>.
+  </ol>
+ </li>
+
+ <li><p><a for=list>Remove</a> <a>end-of-queue</a> from <var>readItems</var>.
+
+ <li><p>Return <var>readItems</var>.
+</ol>
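The two read operations can be sketched for an immediate-mode queue as follows. The array representation, the `END_OF_QUEUE` symbol, and the function names are assumptions made for illustration; a streaming queue would wait where this sketch throws.

```javascript
// Hypothetical marker standing in for the end-of-queue item.
const END_OF_QUEUE = Symbol("end-of-queue");

// Read one item. End-of-queue is returned but never removed.
function readItem(ioQueue) {
  if (ioQueue.length === 0) {
    // A streaming queue would block here until an item arrives.
    throw new Error("immediate-mode sketch: queue is unexpectedly empty");
  }
  if (ioQueue[0] === END_OF_QUEUE) {
    return END_OF_QUEUE;
  }
  return ioQueue.shift();
}

// Read `number` items, then drop any end-of-queue markers from the result.
function readItems(ioQueue, number) {
  const readItemsList = [];
  for (let i = 0; i < number; i++) {
    readItemsList.push(readItem(ioQueue));
  }
  return readItemsList.filter((item) => item !== END_OF_QUEUE);
}

const queue = [0xF0, 0x9F, END_OF_QUEUE];
console.log(readItems(queue, 4)); // [ 240, 159 ]
console.log(queue.length);        // 1, only end-of-queue remains
```

Asking for more items than are available simply yields a shorter list, because the repeated end-of-queue results are filtered out.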
+
+<p>To <dfn for="I/O queue" export>peek</dfn> a number <var>number</var> of <a for=list>items</a>
+from an <a for=/>I/O queue</a> <var>ioQueue</var>, run these steps:
+
+<ol>
+ <li><p>Wait until either <var>ioQueue</var>'s <a for=list>size</a> is equal to or greater than
+ <var>number</var>, or <var>ioQueue</var> <a for=list>contains</a> <a>end-of-queue</a>, whichever
+ comes first.
+
+ <li><p>Let <var>prefix</var> be « ».
+
+ <li>
+  <p><a for=list>For each</a> <var>n</var> in <a>the range</a> 0 to <var>number</var>, exclusive:
+
+  <ol>
+   <li><p>If <var>ioQueue</var>[<var>n</var>] is <a>end-of-queue</a>, <a>break</a>.
+
+   <li><p>Otherwise, <a for=list>append</a> <var>ioQueue</var>[<var>n</var>] to <var>prefix</var>.
+  </ol>
+ </li>
+
+ <li><p>Return <var>prefix</var>.
+</ol>
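A minimal sketch of peeking on an immediate-mode queue, assuming an array representation and a hypothetical `END_OF_QUEUE` symbol. It indexes from 0, as Infra lists do, and omits the waiting step because an immediate queue always ends with end-of-queue.

```javascript
// Hypothetical marker standing in for the end-of-queue item.
const END_OF_QUEUE = Symbol("end-of-queue");

// Return up to `number` leading items without removing them,
// stopping early if end-of-queue is reached.
function peek(ioQueue, number) {
  const prefix = [];
  for (let n = 0; n < number && n < ioQueue.length; n++) {
    if (ioQueue[n] === END_OF_QUEUE) break;
    prefix.push(ioQueue[n]);
  }
  return prefix;
}

const queue = [0xEF, 0xBB, END_OF_QUEUE];
console.log(peek(queue, 3)); // [ 239, 187 ]
console.log(queue.length);   // 3, peeking removes nothing
```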
+
+<p>To <dfn id=concept-stream-push for="I/O queue" export>push</dfn> an <a for=list>item</a>
+<var>item</var> to an <a for=/>I/O queue</a> <var>ioQueue</var>, run these steps:
+
+<ol>
+ <li>
+  <p>If the last <a for=list>item</a> in <var>ioQueue</var> is <a>end-of-queue</a>, then:
+
+  <ol>
+   <li><p>If <var>item</var> is <a>end-of-queue</a>, do nothing.
+
+   <li><p>Otherwise, <a for=list>insert</a> <var>item</var> before the last <a for=list>item</a> in
+   <var>ioQueue</var>.
+  </ol>
+ </li>
+
+ <li><p>Otherwise, <a for=list>append</a> <var>item</var> to <var>ioQueue</var>.
+</ol>
+
+<p>To <a for="I/O queue">push</a> a sequence of items to an <a for=/>I/O queue</a>
+<var>ioQueue</var> is to push each item in the sequence to <var>ioQueue</var>, in the given order.
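The push operation keeps end-of-queue as the final item. A minimal sketch, assuming an array-backed queue and a hypothetical `END_OF_QUEUE` symbol:

```javascript
// Hypothetical marker standing in for the end-of-queue item.
const END_OF_QUEUE = Symbol("end-of-queue");

// Push one item while keeping end-of-queue, if present, last.
function push(ioQueue, item) {
  const last = ioQueue[ioQueue.length - 1];
  if (last === END_OF_QUEUE) {
    if (item === END_OF_QUEUE) return;            // already terminated
    ioQueue.splice(ioQueue.length - 1, 0, item);  // insert before the marker
  } else {
    ioQueue.push(item);
  }
}

// Push a sequence of items in the given order.
function pushAll(ioQueue, items) {
  for (const item of items) push(ioQueue, item);
}

const queue = [];
pushAll(queue, [0x41, END_OF_QUEUE, 0x42, END_OF_QUEUE]);
console.log(queue.length); // 3: 0x41, 0x42, end-of-queue
```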
+
+<p>To <dfn id=concept-stream-prepend for="I/O queue">restore</dfn> an <a for=list>item</a> other
+than <a>end-of-queue</a> to an <a for=/>I/O queue</a>, perform the <a for=/>list</a>
+<a for=list>prepend</a> operation. To <a for="I/O queue">restore</a> a <a for=/>list</a> of
+<a for=list>items</a> excluding <a>end-of-queue</a> to an <a for=/>I/O queue</a>, insert those
+items, in the given order, before the first item in the queue.
+
+<p class=example id=example-tokens>Inserting the bytes « 0xF0, 0x9F » in an I/O queue
+« 0x92, 0xA9, <a>end-of-queue</a> », results in an I/O queue
+« 0xF0, 0x9F, 0x92, 0xA9, <a>end-of-queue</a> ». The next item to be read would be 0xF0. <!-- 💩 -->
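The example above can be reproduced with an array-backed queue. The `restore` helper and `END_OF_QUEUE` symbol are hypothetical names used only for this sketch.

```javascript
// Hypothetical marker standing in for the end-of-queue item.
const END_OF_QUEUE = Symbol("end-of-queue");

// Put items back at the front of the queue, preserving their order.
function restore(ioQueue, items) {
  if (items.includes(END_OF_QUEUE)) {
    throw new Error("end-of-queue cannot be restored");
  }
  ioQueue.unshift(...items);
}

const queue = [0x92, 0xA9, END_OF_QUEUE];
restore(queue, [0xF0, 0x9F]);
// queue now holds 0xF0, 0x9F, 0x92, 0xA9, end-of-queue.

// Decoding the four bytes as UTF-8 yields U+1F4A9.
const bytes = Uint8Array.from(queue.slice(0, -1));
console.log(new TextDecoder().decode(bytes)); // 💩
```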
+
+<p>To <dfn for="from I/O queue">convert</dfn> an <a for=/>I/O queue</a> <var>ioQueue</var> into a
+<a for=/>list</a>, <a>string</a>, or <a>byte sequence</a>, return the result of
+<a for="I/O queue">reading</a> an indefinite number of <a for=list>items</a> from
+<var>ioQueue</var>.
+
+<p>To <dfn for="to I/O queue">convert</dfn> a <a for=/>list</a>, <a>string</a>, or
+<a>byte sequence</a> <var>input</var> into an <a for=/>I/O queue</a>, run these steps:
+
+<ol>
+ <li><p>Assert: if <var>input</var> is a <a for=/>list</a>, then it does not <a for=list>contain</a>
+ <a>end-of-queue</a>.
+
+ <li><p>Return an <a for=/>I/O queue</a> containing the <a for=list>items</a> in <var>input</var>,
+ in order, followed by <a>end-of-queue</a>.
+</ol>
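Both conversions can be sketched for lists and byte sequences as below. The helper names and the array representation are assumptions; strings are left out because the text does not pin down their item type here.

```javascript
// Hypothetical marker standing in for the end-of-queue item.
const END_OF_QUEUE = Symbol("end-of-queue");

// Convert an array or Uint8Array into an immediate-mode I/O queue.
function toIOQueue(input) {
  const items = Array.from(input);
  if (items.includes(END_OF_QUEUE)) {
    throw new Error("input must not contain end-of-queue");
  }
  return [...items, END_OF_QUEUE];
}

// Convert an I/O queue back into a plain array by reading until end-of-queue.
function fromIOQueue(ioQueue) {
  const items = [];
  while (ioQueue.length > 0 && ioQueue[0] !== END_OF_QUEUE) {
    items.push(ioQueue.shift());
  }
  return items;
}

const queue = toIOQueue(new Uint8Array([0x68, 0x69]));
console.log(queue.length);       // 3
console.log(fromIOQueue(queue)); // [ 104, 105 ]
```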
+
+<p class=XXX>The Infra Standard is expected to define some infrastructure around type conversions.
+See <a href="https://github.com/whatwg/infra/issues/319">whatwg/infra issue #319</a>. [[INFRA]]
+
+<p class=note><a for=/>I/O queues</a> are defined as <a for=/>lists</a>, not
+<a spec=infra>queues</a>, because they feature a <a for="I/O queue">restore</a> operation. However,
+this restore operation is an internal detail of the algorithms in this specification, and is not to
+be used by other standards. Implementations are free to find alternative ways to implement such
+algorithms, as detailed in [[#implementation-considerations]].
+
+<hr>
+
+<p>To obtain a <dfn>scalar value from surrogates</dfn>, given a <a for=/>leading surrogate</a>
+<var>leading</var> and a <a for=/>trailing surrogate</a> <var>trailing</var>, return
+0x10000 + ((<var>leading</var> &minus; 0xD800) &lt;&lt; 10) + (<var>trailing</var> &minus; 0xDC00).
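The formula translates directly into code. A minimal sketch, with a hypothetical function name, checked against the surrogate pair that JavaScript stores for U+1F4A9:

```javascript
// Combine a leading and a trailing surrogate into one scalar value.
function scalarValueFromSurrogates(leading, trailing) {
  return 0x10000 + ((leading - 0xD800) << 10) + (trailing - 0xDC00);
}

// "💩" is U+1F4A9, stored in a JavaScript string as 0xD83D 0xDCA9.
const text = "💩";
const value = scalarValueFromSurrogates(text.charCodeAt(0), text.charCodeAt(1));
console.log(value.toString(16)); // 1f4a9
```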
+
+
+
+<h2 id=encodings>Encodings</h2>
+
+<p>An <dfn export>encoding</dfn> defines a mapping from a <a>scalar value</a> sequence to
+a <a>byte</a> sequence (and vice versa). Each <a for=/>encoding</a> has a
+<dfn id=name export for=encoding>name</dfn>, and one or more
+<dfn id=label export for=encoding lt=label>labels</dfn>.
+
+<p class="note no-backref">This specification defines three <a for=/>encodings</a> with the same
+names as <i>encoding schemes</i> defined in the Unicode Standard: <a>UTF-8</a>, <a>UTF-16LE</a>, and
+<a>UTF-16BE</a>. The <a for=/>encodings</a> differ from the <i>encoding schemes</i> by byte order
+mark (also known as BOM) handling not being part of the <a for=/>encodings</a> themselves and
+instead being part of wrapper algorithms in this specification, whereas byte order mark handling is
+part of the definition of the <i>encoding schemes</i> in the Unicode Standard. <a>UTF-8</a> used
+together with the <a>UTF-8 decode</a> algorithm matches the <i>encoding scheme</i> of the same name.
+This specification does not provide wrapper algorithms that would combine with <a>UTF-16LE</a> and
+<a>UTF-16BE</a> to match the similarly-named <i>encoding schemes</i>. [[UNICODE]]
+
+
+<h3 id=encoders-and-decoders>Encoders and decoders</h3>
+
+<p>Each <a for=/>encoding</a> has an associated <dfn>decoder</dfn> and most of them have an
+associated <dfn>encoder</dfn>. Instances of <a for=/>decoders</a> and <a for=/>encoders</a> have a
+<dfn>handler</dfn> algorithm and might also have state. A <a>handler</a> algorithm takes an input
+<a for=/>I/O queue</a> and an <a for=list>item</a>, and returns
+<dfn>finished</dfn>, one or more <a for=list>items</a>, <dfn>error</dfn>
+optionally with a <a>code point</a>, or <dfn>continue</dfn>.
+
+<p class="note no-backref">The <a>replacement</a> and <a>UTF-16BE/LE</a> <a for=/>encodings</a> have
+no <a for=/>encoder</a>.
+
+<p>An <dfn>error mode</dfn> as used below is "<code>replacement</code>" or "<code>fatal</code>" for
+a <a for=/>decoder</a> and "<code>fatal</code>" or "<code>html</code>" for an <a for=/>encoder</a>.
+
+<p class=note>An XML processor would set <a for=/>error mode</a> to "<code>fatal</code>".
+[[XML]]
+
+<p class=note>"<code>html</code>" exists as an <a for=/>error mode</a> due to HTML forms requiring a
+non-terminating legacy <a for=/>encoder</a>. The "<code>html</code>" <a for=/>error mode</a> causes
+a sequence to be emitted that cannot be distinguished from legitimate input and can therefore lead
+to silent data loss. Developers are strongly encouraged to use the <a>UTF-8</a>
+<a for=/>encoding</a> to prevent this from happening. [[HTML]]
+
+<hr>
+
+<p>To <dfn lt="process a queue|processing a queue" id=concept-encoding-run>process a queue</dfn>
+given an <a for=/>encoding</a>'s <a for=/>decoder</a> or <a for=/>encoder</a> instance
+<var>encoderDecoder</var>, <a for=/>I/O queue</a> <var>input</var>, <a for=/>I/O queue</a>
+<var>output</var>, and <a for=/>error mode</a> <var>mode</var>:
+
+<ol>
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>result</var> be the result of <a>processing an item</a> with the result of
+   <a>reading</a> from <var>input</var>, <var>encoderDecoder</var>, <var>input</var>,
+   <var>output</var>, and <var>mode</var>.
+
+   <li><p>If <var>result</var> is not <a>continue</a>, then return <var>result</var>.
+  </ol>
+</ol>
+
+<p>To <dfn lt="process an item|processing an item" id=concept-encoding-process>process an item</dfn>
+given an <a for=list>item</a> <var>item</var>, <a for=/>encoding</a>'s <a for=/>encoder</a> or
+<a for=/>decoder</a> instance <var>encoderDecoder</var>, <a for=/>I/O queue</a> <var>input</var>,
+<a for=/>I/O queue</a> <var>output</var>, and <a for=/>error mode</a> <var>mode</var>:
+
+<ol>
+ <li><p>Assert: if <var>encoderDecoder</var> is an <a for=/>encoder</a> instance, <var>mode</var> is
+ not "<code>replacement</code>".
+
+ <li><p>Assert: if <var>encoderDecoder</var> is a <a for=/>decoder</a> instance, <var>mode</var> is
+ not "<code>html</code>".
+
+ <li><p>Assert: if <var>encoderDecoder</var> is an <a for=/>encoder</a> instance, <var>item</var> is
+ not a <a>surrogate</a>.
+
+ <li><p>Let <var>result</var> be the result of running <var>encoderDecoder</var>'s <a>handler</a> on
+ <var>input</var> and <var>item</var>.
+
+ <li>
+  <p>If <var>result</var> is <a>finished</a>:
+
+  <ol>
+   <li><p><a>Push</a> <a>end-of-queue</a> to <var>output</var>.
+
+   <li><p>Return <var>result</var>.
+  </ol>
+ </li>
+
+ <li>
+  <p>Otherwise, if <var>result</var> is one or more <a for=list>items</a>:
+
+  <ol>
+   <li><p>Assert: if <var>encoderDecoder</var> is a <a for=/>decoder</a> instance, <var>result</var>
+   does not contain any <a>surrogates</a>.
+
+   <li><p><a>Push</a> <var>result</var> to <var>output</var>.
+  </ol>
+
+ <li>
+  <p>Otherwise, if <var>result</var> is an <a>error</a>, switch on <var>mode</var> and run the
+  associated steps:
+
+  <dl class=switch>
+   <dt>"<code>replacement</code>"
+   <dd><a>Push</a> U+FFFD (�) to <var>output</var>.
+
+   <dt>"<code>html</code>"
+   <dd><a>Push</a> 0x26 (&amp;), 0x23 (#), followed by the shortest sequence of 0x30 (0) to
+   0x39 (9), inclusive, representing <var>result</var>'s <a>code point</a>'s
+   <a for="code point">value</a> in base ten, followed by 0x3B (;) to <var>output</var>.
+
+   <dt>"<code>fatal</code>"
+   <dd>Return <var>result</var>.
+  </dl>
+
+ <li><p>Return <a>continue</a>.
+</ol>
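The two algorithms above can be sketched together for immediate-mode queues. Everything named here is an assumption made for illustration: the symbols, the `{ error, codePoint }` result shape, and the ASCII-only handler, which is not one of the encoders defined by this specification. The assertions from the steps are omitted for brevity.

```javascript
const END_OF_QUEUE = Symbol("end-of-queue");
const FINISHED = Symbol("finished");
const CONTINUE = Symbol("continue");

// Hypothetical ASCII-only encoder handler, for illustration only.
function asciiEncoderHandler(input, item) {
  if (item === END_OF_QUEUE) return FINISHED;
  if (item <= 0x7F) return [item];
  return { error: true, codePoint: item };
}

function processItem(item, handler, input, output, mode) {
  const result = handler(input, item);
  if (result === FINISHED) {
    output.push(END_OF_QUEUE);
    return result;
  }
  if (Array.isArray(result)) {
    output.push(...result);
  } else if (result.error) {
    if (mode === "replacement") {
      output.push(0xFFFD);
    } else if (mode === "html") {
      // "&#" + decimal digits of the code point + ";"
      const digits = Array.from(String(result.codePoint), (d) => d.charCodeAt(0));
      output.push(0x26, 0x23, ...digits, 0x3B);
    } else {
      return result; // "fatal"
    }
  }
  return CONTINUE;
}

function processQueue(handler, input, output, mode) {
  while (true) {
    if (input.length === 0) throw new Error("sketch supports immediate queues only");
    const item = input[0] === END_OF_QUEUE ? END_OF_QUEUE : input.shift();
    const result = processItem(item, handler, input, output, mode);
    if (result !== CONTINUE) return result;
  }
}

const input = [0x68, 0x69, 0x1F4A9, END_OF_QUEUE];
const output = [];
processQueue(asciiEncoderHandler, input, output, "html");
console.log(String.fromCharCode(...output.slice(0, -1))); // hi&#128169;
```

The output is byte-for-byte what a user would produce by typing the character reference literally, which is the silent data loss the "html" error mode note warns about.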
+
+
+<h3 id=names-and-labels>Names and labels</h3>
+
+<p>The table below lists all <a for=/>encodings</a>
+and their <a for=encoding>labels</a> user agents must support.
+User agents must not support any other <a for=/>encodings</a>
+or <a for=encoding>labels</a>.
+
+<p class=note>For each encoding, <a lt="ASCII lowercase">ASCII-lowercasing</a> its
+<a for=encoding>name</a> yields one of its <a for=encoding>labels</a>.
+
+<p>Authors must use the <a>UTF-8</a> <a for=/>encoding</a> and must use its
+(<a>ASCII case-insensitive</a>) "<code>utf-8</code>" <a for=encoding>label</a> to identify it.
+
+<p>New protocols and formats, as well as existing formats deployed in new contexts, must use the
+<a>UTF-8</a> <a for=/>encoding</a> exclusively. If these protocols and formats need to expose the
+<a for=/>encoding</a>'s <a for=encoding>name</a> or <a for=encoding>label</a>, they must expose it
+as "<code>utf-8</code>".
+<!-- “UTF-8 or death” — Emil A Eklund -->
+
+<p>To
+<dfn export lt="get an encoding|getting an encoding" id=concept-encoding-get>get an encoding</dfn>
+from a string <var>label</var>, run these steps:
+
+<ol>
+ <li><p>Remove any leading and trailing <a>ASCII whitespace</a> from
+ <var>label</var>.
+
+ <li><p>If <var>label</var> is an <a>ASCII case-insensitive</a> match for any of the labels listed
+ in the table below, then return the corresponding <a for=/>encoding</a>; otherwise return failure.
+</ol>
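A minimal sketch of the lookup, assuming a small hand-picked excerpt of the label table (the full table lists many more labels) and using `null` to model failure. The function and constant names are hypothetical.

```javascript
// Excerpt of the label table: label -> encoding name.
const LABELS = new Map([
  ["unicode-1-1-utf-8", "UTF-8"],
  ["utf-8", "UTF-8"],
  ["utf8", "UTF-8"],
  ["ascii", "windows-1252"],
  ["iso-8859-1", "windows-1252"],
  ["latin1", "windows-1252"],
  ["koi8-r", "KOI8-R"],
  ["sjis", "Shift_JIS"],
  ["utf-16", "UTF-16LE"],
]);

function getEncoding(label) {
  // ASCII whitespace is U+0009, U+000A, U+000C, U+000D, and U+0020.
  const trimmed = label.replace(/^[\t\n\f\r ]+|[\t\n\f\r ]+$/g, "");
  // Lowercase A-Z only; toLowerCase() on the whole string would also
  // fold non-ASCII characters such as U+212A KELVIN SIGN to "k".
  const lowered = trimmed.replace(/[A-Z]/g, (c) => c.toLowerCase());
  return LABELS.get(lowered) ?? null;
}

console.log(getEncoding(" UTF8\n"));  // UTF-8
console.log(getEncoding("Latin1"));   // windows-1252
console.log(getEncoding("utf-7"));    // null
```

Restricting both trimming and case folding to ASCII keeps the match as strict as the steps require; `String.prototype.trim()` and a plain `toLowerCase()` would both accept more inputs.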
+
+<p class=note>This is a more basic and restrictive algorithm for mapping labels to
+<a for=/>encodings</a> than
+<a href=https://www.unicode.org/reports/tr22/tr22-8.html#Charset_Alias_Matching>section 1.4 of Unicode Technical Standard #22</a>
+prescribes, as that is necessary to be compatible with deployed content.
+
+<table>
+ <thead>
+  <tr>
+   <th>Name
+   <th>Labels
+ <tbody>
+  <tr><th colspan=2><a href=#the-encoding>The Encoding</a>
+  <tr>
+   <td rowspan=6><a>UTF-8</a>
+   <td>"<code>unicode-1-1-utf-8</code>"
+  <tr><td>"<code>unicode11utf8</code>"
+  <tr><td>"<code>unicode20utf8</code>"
+  <tr><td>"<code>utf-8</code>"
+  <tr><td>"<code>utf8</code>"
+  <tr><td>"<code>x-unicode20utf8</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-single-byte-encodings>Legacy single-byte encodings</a>
+  <tr>
+   <td rowspan=4><a>IBM866</a>
+   <td>"<code>866</code>"
+  <tr><td>"<code>cp866</code>"
+  <tr><td>"<code>csibm866</code>"
+  <tr><td>"<code>ibm866</code>"
+  <tr>
+   <td rowspan=9><a>ISO-8859-2</a>
+   <td>"<code>csisolatin2</code>"
+  <tr><td>"<code>iso-8859-2</code>"
+  <tr><td>"<code>iso-ir-101</code>"
+  <tr><td>"<code>iso8859-2</code>"
+  <tr><td>"<code>iso88592</code>"
+  <tr><td>"<code>iso_8859-2</code>"
+  <tr><td>"<code>iso_8859-2:1987</code>"
+  <tr><td>"<code>l2</code>"
+  <tr><td>"<code>latin2</code>"
+  <tr>
+   <td rowspan=9><a>ISO-8859-3</a>
+   <td>"<code>csisolatin3</code>"
+  <tr><td>"<code>iso-8859-3</code>"
+  <tr><td>"<code>iso-ir-109</code>"
+  <tr><td>"<code>iso8859-3</code>"
+  <tr><td>"<code>iso88593</code>"
+  <tr><td>"<code>iso_8859-3</code>"
+  <tr><td>"<code>iso_8859-3:1988</code>"
+  <tr><td>"<code>l3</code>"
+  <tr><td>"<code>latin3</code>"
+  <tr>
+   <td rowspan=9><a>ISO-8859-4</a>
+   <td>"<code>csisolatin4</code>"
+  <tr><td>"<code>iso-8859-4</code>"
+  <tr><td>"<code>iso-ir-110</code>"
+  <tr><td>"<code>iso8859-4</code>"
+  <tr><td>"<code>iso88594</code>"
+  <tr><td>"<code>iso_8859-4</code>"
+  <tr><td>"<code>iso_8859-4:1988</code>"
+  <tr><td>"<code>l4</code>"
+  <tr><td>"<code>latin4</code>"
+  <tr>
+   <td rowspan=8><a>ISO-8859-5</a>
+   <td>"<code>csisolatincyrillic</code>"
+  <tr><td>"<code>cyrillic</code>"
+  <tr><td>"<code>iso-8859-5</code>"
+  <tr><td>"<code>iso-ir-144</code>"
+  <tr><td>"<code>iso8859-5</code>"
+  <tr><td>"<code>iso88595</code>"
+  <tr><td>"<code>iso_8859-5</code>"
+  <tr><td>"<code>iso_8859-5:1988</code>"
+  <tr>
+   <td rowspan=14><a>ISO-8859-6</a>
+   <td>"<code>arabic</code>"
+  <tr><td>"<code>asmo-708</code>"
+  <tr><td>"<code>csiso88596e</code>"
+  <tr><td>"<code>csiso88596i</code>"
+  <tr><td>"<code>csisolatinarabic</code>"
+  <tr><td>"<code>ecma-114</code>"
+  <tr><td>"<code>iso-8859-6</code>"
+  <tr><td>"<code>iso-8859-6-e</code>"
+  <tr><td>"<code>iso-8859-6-i</code>"
+  <tr><td>"<code>iso-ir-127</code>"
+  <tr><td>"<code>iso8859-6</code>"
+  <tr><td>"<code>iso88596</code>"
+  <tr><td>"<code>iso_8859-6</code>"
+  <tr><td>"<code>iso_8859-6:1987</code>"
+  <tr>
+   <td rowspan=12><a>ISO-8859-7</a>
+   <td>"<code>csisolatingreek</code>"
+  <tr><td>"<code>ecma-118</code>"
+  <tr><td>"<code>elot_928</code>"
+  <tr><td>"<code>greek</code>"
+  <tr><td>"<code>greek8</code>"
+  <tr><td>"<code>iso-8859-7</code>"
+  <tr><td>"<code>iso-ir-126</code>"
+  <tr><td>"<code>iso8859-7</code>"
+  <tr><td>"<code>iso88597</code>"
+  <tr><td>"<code>iso_8859-7</code>"
+  <tr><td>"<code>iso_8859-7:1987</code>"
+  <tr><td>"<code>sun_eu_greek</code>"
+  <tr>
+   <td rowspan=11><a>ISO-8859-8</a>
+   <td>"<code>csiso88598e</code>"
+  <tr><td>"<code>csisolatinhebrew</code>"
+  <tr><td>"<code>hebrew</code>"
+  <tr><td>"<code>iso-8859-8</code>"
+  <tr><td>"<code>iso-8859-8-e</code>"
+  <tr><td>"<code>iso-ir-138</code>"
+  <tr><td>"<code>iso8859-8</code>"
+  <tr><td>"<code>iso88598</code>"
+  <tr><td>"<code>iso_8859-8</code>"
+  <tr><td>"<code>iso_8859-8:1988</code>"
+  <tr><td>"<code>visual</code>"
+  <tr>
+   <td rowspan=3><a>ISO-8859-8-I</a>
+   <td>"<code>csiso88598i</code>"
+  <tr><td>"<code>iso-8859-8-i</code>"
+  <tr><td>"<code>logical</code>"
+  <tr>
+   <td rowspan=7><a>ISO-8859-10</a>
+   <td>"<code>csisolatin6</code>"
+  <tr><td>"<code>iso-8859-10</code>"
+  <tr><td>"<code>iso-ir-157</code>"
+  <tr><td>"<code>iso8859-10</code>"
+  <tr><td>"<code>iso885910</code>"
+  <tr><td>"<code>l6</code>"
+  <tr><td>"<code>latin6</code>"
+  <tr>
+   <td rowspan=3><a>ISO-8859-13</a>
+   <td>"<code>iso-8859-13</code>"
+  <tr><td>"<code>iso8859-13</code>"
+  <tr><td>"<code>iso885913</code>"
+  <tr>
+   <td rowspan=3><a>ISO-8859-14</a>
+   <td>"<code>iso-8859-14</code>"
+  <tr><td>"<code>iso8859-14</code>"
+  <tr><td>"<code>iso885914</code>"
+  <tr>
+   <td rowspan=6><a>ISO-8859-15</a>
+   <td>"<code>csisolatin9</code>"
+  <tr><td>"<code>iso-8859-15</code>"
+  <tr><td>"<code>iso8859-15</code>"
+  <tr><td>"<code>iso885915</code>"
+  <tr><td>"<code>iso_8859-15</code>"
+  <tr><td>"<code>l9</code>"
+  <tr>
+   <td><a>ISO-8859-16</a>
+   <td>"<code>iso-8859-16</code>"
+  <tr>
+   <td rowspan=5><a>KOI8-R</a>
+   <td>"<code>cskoi8r</code>"
+  <tr><td>"<code>koi</code>"
+  <tr><td>"<code>koi8</code>"
+  <tr><td>"<code>koi8-r</code>"
+  <tr><td>"<code>koi8_r</code>"
+  <tr>
+   <td rowspan=2><a>KOI8-U</a>
+   <td>"<code>koi8-ru</code>"
+  <tr><td>"<code>koi8-u</code>"
+  <tr>
+   <td rowspan=4><a>macintosh</a>
+   <td>"<code>csmacintosh</code>"
+  <tr><td>"<code>mac</code>"
+  <tr><td>"<code>macintosh</code>"
+  <tr><td>"<code>x-mac-roman</code>"
+  <tr>
+   <td rowspan=6><a>windows-874</a>
+   <td>"<code>dos-874</code>"
+  <tr><td>"<code>iso-8859-11</code>"
+  <tr><td>"<code>iso8859-11</code>"
+  <tr><td>"<code>iso885911</code>"
+  <tr><td>"<code>tis-620</code>"
+  <tr><td>"<code>windows-874</code>"
+  <tr>
+   <td rowspan=3><a>windows-1250</a>
+   <td>"<code>cp1250</code>"
+  <tr><td>"<code>windows-1250</code>"
+  <tr><td>"<code>x-cp1250</code>"
+  <tr>
+   <td rowspan=3><a>windows-1251</a>
+   <td>"<code>cp1251</code>"
+  <tr><td>"<code>windows-1251</code>"
+  <tr><td>"<code>x-cp1251</code>"
+  <tr>
+   <td rowspan=17><a>windows-1252</a>
+   <td>"<code>ansi_x3.4-1968</code>"
+  <tr><td>"<code>ascii</code>"
+  <tr><td>"<code>cp1252</code>"
+  <tr><td>"<code>cp819</code>"
+  <tr><td>"<code>csisolatin1</code>"
+  <tr><td>"<code>ibm819</code>"
+  <tr><td>"<code>iso-8859-1</code>"
+  <tr><td>"<code>iso-ir-100</code>"
+  <tr><td>"<code>iso8859-1</code>"
+  <tr><td>"<code>iso88591</code>"
+  <tr><td>"<code>iso_8859-1</code>"
+  <tr><td>"<code>iso_8859-1:1987</code>"
+  <tr><td>"<code>l1</code>"
+  <tr><td>"<code>latin1</code>"
+  <tr><td>"<code>us-ascii</code>"
+  <tr><td>"<code>windows-1252</code>"
+  <tr><td>"<code>x-cp1252</code>"
+  <tr>
+   <td rowspan=3><a>windows-1253</a>
+   <td>"<code>cp1253</code>"
+  <tr><td>"<code>windows-1253</code>"
+  <tr><td>"<code>x-cp1253</code>"
+  <tr>
+   <td rowspan=12><a>windows-1254</a>
+   <td>"<code>cp1254</code>"
+  <tr><td>"<code>csisolatin5</code>"
+  <tr><td>"<code>iso-8859-9</code>"
+  <tr><td>"<code>iso-ir-148</code>"
+  <tr><td>"<code>iso8859-9</code>"
+  <tr><td>"<code>iso88599</code>"
+  <tr><td>"<code>iso_8859-9</code>"
+  <tr><td>"<code>iso_8859-9:1989</code>"
+  <tr><td>"<code>l5</code>"
+  <tr><td>"<code>latin5</code>"
+  <tr><td>"<code>windows-1254</code>"
+  <tr><td>"<code>x-cp1254</code>"
+  <tr>
+   <td rowspan=3><a>windows-1255</a>
+   <td>"<code>cp1255</code>"
+  <tr><td>"<code>windows-1255</code>"
+  <tr><td>"<code>x-cp1255</code>"
+  <tr>
+   <td rowspan=3><a>windows-1256</a>
+   <td>"<code>cp1256</code>"
+  <tr><td>"<code>windows-1256</code>"
+  <tr><td>"<code>x-cp1256</code>"
+  <tr>
+   <td rowspan=3><a>windows-1257</a>
+   <td>"<code>cp1257</code>"
+  <tr><td>"<code>windows-1257</code>"
+  <tr><td>"<code>x-cp1257</code>"
+  <tr>
+   <td rowspan=3><a>windows-1258</a>
+   <td>"<code>cp1258</code>"
+  <tr><td>"<code>windows-1258</code>"
+  <tr><td>"<code>x-cp1258</code>"
+  <tr>
+   <td rowspan=2><a>x-mac-cyrillic</a>
+   <td>"<code>x-mac-cyrillic</code>"
+  <tr><td>"<code>x-mac-ukrainian</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-chinese-(simplified)-encodings>Legacy multi-byte Chinese (simplified) encodings</a>
+  <tr>
+   <td rowspan=9><a>GBK</a>
+   <td>"<code>chinese</code>"
+  <tr><td>"<code>csgb2312</code>"
+  <tr><td>"<code>csiso58gb231280</code>"
+  <tr><td>"<code>gb2312</code>"
+  <tr><td>"<code>gb_2312</code>"
+  <tr><td>"<code>gb_2312-80</code>"
+  <tr><td>"<code>gbk</code>"
+  <tr><td>"<code>iso-ir-58</code>"
+  <tr><td>"<code>x-gbk</code>"
+  <tr>
+   <td><a>gb18030</a>
+   <td>"<code>gb18030</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-chinese-(traditional)-encodings>Legacy multi-byte Chinese (traditional) encodings</a>
+  <tr>
+   <td rowspan=5><a>Big5</a>
+   <td>"<code>big5</code>"
+  <tr><td>"<code>big5-hkscs</code>"
+  <tr><td>"<code>cn-big5</code>"
+  <tr><td>"<code>csbig5</code>"
+  <tr><td>"<code>x-x-big5</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-japanese-encodings>Legacy multi-byte Japanese encodings</a>
+  <tr>
+   <td rowspan=3><a>EUC-JP</a>
+   <td>"<code>cseucpkdfmtjapanese</code>"
+  <tr><td>"<code>euc-jp</code>"
+  <tr><td>"<code>x-euc-jp</code>"
+  <tr>
+   <td rowspan=2><a>ISO-2022-JP</a>
+   <td>"<code>csiso2022jp</code>"
+  <tr><td>"<code>iso-2022-jp</code>"
+  <tr>
+   <td rowspan=8><a>Shift_JIS</a>
+   <td>"<code>csshiftjis</code>"
+  <tr><td>"<code>ms932</code>"
+  <tr><td>"<code>ms_kanji</code>"
+  <tr><td>"<code>shift-jis</code>"
+  <tr><td>"<code>shift_jis</code>"
+  <tr><td>"<code>sjis</code>"
+  <tr><td>"<code>windows-31j</code>"
+  <tr><td>"<code>x-sjis</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-multi-byte-korean-encodings>Legacy multi-byte Korean encodings</a>
+  <tr>
+   <td rowspan=10><a>EUC-KR</a>
+   <td>"<code>cseuckr</code>"
+  <tr><td>"<code>csksc56011987</code>"
+  <tr><td>"<code>euc-kr</code>"
+  <tr><td>"<code>iso-ir-149</code>"
+  <tr><td>"<code>korean</code>"
+  <tr><td>"<code>ks_c_5601-1987</code>"
+  <tr><td>"<code>ks_c_5601-1989</code>"
+  <tr><td>"<code>ksc5601</code>"
+  <tr><td>"<code>ksc_5601</code>"
+  <tr><td>"<code>windows-949</code>"
+ <tbody>
+  <tr><th colspan=2><a href=#legacy-miscellaneous-encodings>Legacy miscellaneous encodings</a>
+  <tr>
+   <td rowspan=6><a>replacement</a>
+   <td>"<code>csiso2022kr</code>"
+  <tr><td>"<code>hz-gb-2312</code>"
+  <tr><td>"<code>iso-2022-cn</code>"
+  <tr><td>"<code>iso-2022-cn-ext</code>"
+  <tr><td>"<code>iso-2022-kr</code>"
+  <tr><td>"<code>replacement</code>"
+  <tr>
+   <td rowspan=2><a>UTF-16BE</a>
+   <td>"<code>unicodefffe</code>"
+  <tr><td>"<code>utf-16be</code>"
+  <tr>
+   <td rowspan=7><a>UTF-16LE</a>
+   <td>"<code>csunicode</code>"
+  <tr><td>"<code>iso-10646-ucs-2</code>"
+  <tr><td>"<code>ucs-2</code>"
+  <tr><td>"<code>unicode</code>"
+  <tr><td>"<code>unicodefeff</code>"
+  <tr><td>"<code>utf-16</code>"
+  <tr><td>"<code>utf-16le</code>"
+  <tr>
+   <td><a>x-user-defined</a>
+   <td>"<code>x-user-defined</code>"
+</table>
+
+<p class=note>All <a for=/>encodings</a> and their <a for=encoding>labels</a> are also available as
+a non-normative <a href=encodings.json>encodings.json</a> resource.
+
+<p class=note id=supported-encodings>The set of supported <a for=/>encodings</a> is primarily based
+on the intersection of the sets supported by major browser engines when the development of this
+standard started, while removing encodings that were rarely used legitimately but that could be used
+in attacks. The inclusion of some encodings is questionable in the light of anecdotal evidence of
+the level of use by existing Web content. That is, while they have been broadly supported by
+browsers, it is unclear if they are broadly used by Web content. However, an effort has not been
+made to eagerly remove <a>single-byte encodings</a> that were broadly supported by browsers or are
+part of the ISO 8859 series. In particular, the necessity of the inclusion of <a>IBM866</a>,
+<a>macintosh</a>, <a>x-mac-cyrillic</a>, <a>ISO-8859-3</a>, <a>ISO-8859-10</a>, <a>ISO-8859-14</a>,
+and <a>ISO-8859-16</a> is doubtful for the purpose of supporting existing content, but there are no
+plans to remove these.</p>
+
+
+<h3 id=output-encodings>Output encodings</h3>
+
+<p>To <dfn export>get an output encoding</dfn> from an <a for=/>encoding</a>
+<var>encoding</var>, run these steps:
+
+<ol>
+ <li><p>If <var>encoding</var> is <a>replacement</a> or <a>UTF-16BE/LE</a>, then return
+ <a>UTF-8</a>.
+
+ <li><p>Return <var>encoding</var>.
+</ol>
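A minimal sketch of these steps, representing encodings by their names (an assumption; the function name is hypothetical):

```javascript
// Encodings without an encoder fall back to UTF-8 for output.
function getOutputEncoding(encodingName) {
  const withoutEncoder = ["replacement", "UTF-16BE", "UTF-16LE"];
  return withoutEncoder.includes(encodingName) ? "UTF-8" : encodingName;
}

console.log(getOutputEncoding("UTF-16LE"));     // UTF-8
console.log(getOutputEncoding("windows-1252")); // windows-1252
```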
+
+<p class=note>The <a>get an output encoding</a> algorithm is useful for URL parsing and HTML
+form submission, which both need exactly this.
+
+
+
+<h2 id=indexes>Indexes</h2>
+
+<p>Most legacy <a for=/>encodings</a> make use of an <dfn id=index>index</dfn>. An
+<a>index</a> is an ordered list of entries, each entry consisting of a pointer and a
+corresponding code point. Within an <a>index</a> pointers are unique and code points can be
+duplicated.
+
+<p class="note no-backref">An efficient implementation likely has two
+<a lt=index>indexes</a> per <a for=/>encoding</a>: one optimized for its
+<a for=/>decoder</a> and one for its <a for=/>encoder</a>.
+
+<p>To find the pointers and their corresponding code points in an <a>index</a>,
+let <var>lines</var> be the result of splitting the resource's contents on U+000A.
+Then remove each item in <var>lines</var> that is the empty string or starts with U+0023.
+Then the pointers and their corresponding code points are found by splitting each item in <var>lines</var> on U+0009.
+The first subitem is the pointer (as a decimal number) and the second is the corresponding code point (as a hexadecimal number).
+Other subitems are not relevant.
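+
+<p class="note no-backref">As a non-normative sketch, the parsing steps above might be written as
+follows. The function name <code>parseIndex</code> and the sample text are hypothetical; the sample
+only imitates the layout of an index resource and is not taken from one.
+
```javascript
// Parses the text of an index resource into [pointer, codePoint] pairs.
function parseIndex(text) {
  const entries = [];
  for (const line of text.split("\n")) {
    // Drop empty lines and lines starting with U+0023 (#).
    if (line === "" || line.startsWith("#")) continue;
    // Split on U+0009; subitems after the second are not relevant.
    const [pointer, codePoint] = line.split("\t");
    // The pointer is decimal, the code point hexadecimal.
    entries.push([parseInt(pointer, 10), parseInt(codePoint, 16)]);
  }
  return entries;
}

// Hypothetical sample in the tab-separated layout described above.
const sampleIndexText = "# Identifier: example\n\n0\t0x4E02\tname\n1\t0x4E04\n";
const sampleEntries = parseIndex(sampleIndexText);
```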
+
+<p class="note no-backref">To signify changes, an <a>index</a> includes an
+<i>Identifier</i> and a <i>Date</i>. If an <i>Identifier</i> has
+changed, so has the <a>index</a>.
+
+<p>The <dfn>index code point</dfn> for <var>pointer</var> in
+<var>index</var> is the code point corresponding to
+<var>pointer</var> in <var>index</var>, or null if
+<var>pointer</var> is not in <var>index</var>.
+
+<p>The <dfn>index pointer</dfn> for <var>code point</var> in
+<var>index</var> is the <em>first</em> pointer corresponding to
+<var>code point</var> in <var>index</var>, or null if
+<var>code point</var> is not in <var>index</var>.
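+
+<p class="note no-backref">The two lookups can be sketched as below, assuming an index is held as
+an array of <code>[pointer, codePoint]</code> pairs in index order. The function names and the toy
+data are hypothetical, and a linear scan is used for clarity rather than speed.
+
```javascript
// Returns the code point for `pointer`, or null if the pointer is absent.
function indexCodePoint(pointer, entries) {
  const entry = entries.find(([p]) => p === pointer);
  return entry ? entry[1] : null;
}

// Returns the first pointer for `codePoint`, or null if it is absent.
// Array.prototype.find returns the first match, so duplicates later in
// the index are ignored.
function indexPointer(codePoint, entries) {
  const entry = entries.find(([, c]) => c === codePoint);
  return entry ? entry[0] : null;
}

// Hypothetical toy index with a duplicated code point (0x41).
const toyIndex = [[0, 0x41], [1, 0x42], [5, 0x41]];
```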
+
+<div class=note id=visualization>
+ <p>There is a non-normative visualization for each <a>index</a> other than
+ <a>index gb18030 ranges</a> and <a>index ISO-2022-JP katakana</a>. <a>index jis0208</a> also has an
+ alternative <a>Shift_JIS</a> visualization. Additionally, there is a visualization of the Basic
+ Multilingual Plane coverage of each index other than <a>index gb18030 ranges</a> and
+ <a>index ISO-2022-JP katakana</a>.
+
+ <p>The legend for the visualizations is:
+
+ <ul class=visualizationlegend>
+  <li class=unmapped>Unmapped
+  <li class=mid>Two bytes in UTF-8
+  <li class="mid contiguous">Two bytes in UTF-8, code point immediately follows the code point of
+  the previous pointer
+  <li class=upper>Three bytes in UTF-8 (non-PUA)
+  <li class="upper contiguous">Three bytes in UTF-8 (non-PUA), code point immediately follows the
+  code point of the previous pointer
+  <li class=pua>Private Use
+  <li class="pua contiguous">Private Use, code point immediately follows the code point of the
+  previous pointer
+  <li class=astral>Four bytes in UTF-8
+  <li class="astral contiguous">Four bytes in UTF-8, code point immediately follows the code point
+  of the previous pointer
+  <li class=duplicate>Duplicate code point already mapped at an earlier index
+  <li class=compatibility>CJK Compatibility Ideograph
+  <li class=ext>CJK Unified Ideographs Extension A
+ </ul>
+</div>
+
+<p>These are the <a lt=index>indexes</a> defined by this
+specification, excluding <a>index single-byte</a>, which have their own table:
+
+<table>
+ <tbody><tr><th colspan=4><a>Index</a><th>Notes
+ <tr>
+  <td><dfn export>index Big5</dfn>
+  <td><a href=index-big5.txt>index-big5.txt</a>
+  <td><a href=big5.html>index Big5 visualization</a>
+  <td><a href=big5-bmp.html>index Big5 BMP coverage</a>
+  <td>This matches the Big5 standard in combination with the
+  Hong Kong Supplementary Character Set and other common extensions.
+ <tr>
+  <td><dfn export>index EUC-KR</dfn>
+  <td><a href=index-euc-kr.txt>index-euc-kr.txt</a>
+  <td><a href=euc-kr.html>index EUC-KR visualization</a>
+  <td><a href=euc-kr-bmp.html>index EUC-KR BMP coverage</a>
+  <td>This matches the KS X 1001 standard and the Unified Hangul Code, more commonly known together
+  as Windows Codepage 949. It covers the Hangul Syllables block of Unicode in its entirety. The
+  Hangul block whose top left corner in the visualization is at pointer 9026 is in the Unicode
+  order. Taken separately, the rest of the Hangul syllables in this index are in the Unicode order,
+  too.
+ <tr>
+  <td><dfn export>index gb18030</dfn>
+  <td><a href=index-gb18030.txt>index-gb18030.txt</a>
+  <td><a href=gb18030.html>index gb18030 visualization</a>
+  <td><a href=gb18030-bmp.html>index gb18030 BMP coverage</a>
+  <td>This matches the GB18030-2022 standard for code points encoded as two bytes, except for
+  0xA3 0xA0 which maps to U+3000 to be compatible with deployed content. This index covers the
+  CJK Unified Ideographs block of Unicode in its entirety. Entries from that block that are above or
+  to the left of (the first) U+3000 in the visualization are in the Unicode order.
+  <!-- https://bugzilla.mozilla.org/show_bug.cgi?id=131837
+       https://bugs.webkit.org/show_bug.cgi?id=17014
+       https://www.w3.org/Bugs/Public/show_bug.cgi?id=25396
+       https://github.com/whatwg/encoding/issues/17 -->
+ <tr>
+  <td><dfn export>index gb18030 ranges</dfn>
+  <td colspan=3><a href=index-gb18030-ranges.txt>index-gb18030-ranges.txt</a>
+  <td>This <a>index</a> works differently from all others. Listing all code points would result
+  in over a million items whereas they can be represented neatly in 207 ranges combined with trivial
+  limit checks. It therefore only superficially matches the GB18030-2000 standard for code points
+  encoded as four bytes. The change for the GB18030-2005 revision is handled inline by the
+  <a>index gb18030 ranges code point</a> and <a>index gb18030 ranges pointer</a> algorithms below
+  that accompany this index. And the changes for the GB18030-2022 revision are handled differently
+  again to not further increase the number of byte sequences mapping to Private Use code points. The
+  relevant Private Use code points are mapped in the <a>gb18030 encoder</a> directly through a side
+  table to preserve compatibility with how they were mapped before.
+ <tr>
+  <td><dfn export>index jis0208</dfn>
+  <td><a href=index-jis0208.txt>index-jis0208.txt</a>
+  <td><a href=jis0208.html>index jis0208 visualization</a>, <a href=shift_jis.html>Shift_JIS visualization</a>
+  <td><a href=jis0208-bmp.html>index jis0208 BMP coverage</a>
+  <td>This is the JIS X 0208 standard including formerly proprietary
+  extensions from IBM and NEC.
+  <!-- NEC = Nippon Electric Company -->
+ <tr>
+  <td><dfn export>index jis0212</dfn>
+  <td><a href=index-jis0212.txt>index-jis0212.txt</a>
+  <td><a href=jis0212.html>index jis0212 visualization</a>
+  <td><a href=jis0212-bmp.html>index jis0212 BMP coverage</a>
+  <td>This is the JIS X 0212 standard. It is only used by the <a>EUC-JP decoder</a>
+  due to lack of widespread support elsewhere.
+  <!--
+   No JIS X 0212 EUC-JP encoder support:
+     https://bugzilla.mozilla.org/show_bug.cgi?id=600715
+     https://code.google.com/p/chromium/issues/detail?id=78847
+
+   No JIS X 0212 ISO-2022-JP support:
+     https://www.w3.org/Bugs/Public/show_bug.cgi?id=26885
+  -->
+ <tr>
+  <td><dfn export>index ISO-2022-JP katakana</dfn>
+  <td colspan=3><a href=index-iso-2022-jp-katakana.txt>index-iso-2022-jp-katakana.txt</a>
+  <td>This maps halfwidth to fullwidth katakana as per Unicode Normalization Form KC, except that
+  U+FF9E and U+FF9F map to U+309B and U+309C rather than U+3099 and U+309A. It is only used by the
+  <a>ISO-2022-JP encoder</a>. [[UNICODE]]
+</table>
+
+<p>The <dfn>index gb18030 ranges code point</dfn> for <var>pointer</var> is
+the return value of these steps:
+
+<ol>
+ <li><p>If <var>pointer</var> is greater than 39419 and less than
+ 189000, or <var>pointer</var> is greater than 1237575, return null.
+
+ <li><p>If <var>pointer</var> is 7457, return code point U+E7C7.
+ <!-- 7457 is 0x81 0x35 0xF4 0x37 -->
+
+ <li><p>Let <var>offset</var> be the last pointer in <a>index gb18030 ranges</a> that is less than
+ or equal to <var>pointer</var> and let <var>code point offset</var> be its corresponding code
+ point.
+
+ <li><p>Return a code point whose value is
+ <var>code point offset</var> + <var>pointer</var> &minus; <var>offset</var>.
+</ol>
+
+<p>The <dfn>index gb18030 ranges pointer</dfn> for <var>code point</var> is
+the return value of these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is U+E7C7, return pointer 7457.
+
+ <li><p>Let <var>offset</var> be the last code point in <a>index gb18030 ranges</a> that is less
+ than or equal to <var>code point</var> and let <var>pointer offset</var> be its corresponding
+ pointer.
+
+ <li><p>Return a pointer whose value is
+ <var>pointer offset</var> + <var>code point</var> &minus; <var>offset</var>.
+</ol>
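+
+<p class="note no-backref">A non-normative sketch of both range algorithms follows. It assumes the
+ranges are supplied as <code>[pointer, codePoint]</code> pairs sorted in ascending order, and that
+the caller only asks for code points the index covers. The function names are hypothetical, and
+<code>sampleRanges</code> is a two-entry illustrative excerpt, not the full set of 207 ranges.
+
```javascript
// Two-entry excerpt for illustration only; a real implementation would
// load every range from index-gb18030-ranges.txt.
const sampleRanges = [[0, 0x0080], [189000, 0x10000]];

function gb18030RangesCodePoint(pointer, ranges) {
  // Limit checks: the gap between the two blocks, and past the end.
  if ((pointer > 39419 && pointer < 189000) || pointer > 1237575) return null;
  // The GB18030-2005 change is handled inline.
  if (pointer === 7457) return 0xE7C7;
  // Find the last range whose pointer is <= the given pointer.
  let offset = 0;
  let codePointOffset = 0;
  for (const [p, c] of ranges) {
    if (p > pointer) break;
    offset = p;
    codePointOffset = c;
  }
  return codePointOffset + pointer - offset;
}

function gb18030RangesPointer(codePoint, ranges) {
  if (codePoint === 0xE7C7) return 7457;
  // Find the last range whose code point is <= the given code point.
  let offset = 0;
  let pointerOffset = 0;
  for (const [p, c] of ranges) {
    if (c > codePoint) break;
    offset = c;
    pointerOffset = p;
  }
  return pointerOffset + codePoint - offset;
}
```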
+
+<p>The <dfn>index Shift_JIS pointer</dfn> for <var>code point</var> is the return value of these
+steps:
+
+<ol>
+ <li>
+  <p>Let <var>index</var> be <a>index jis0208</a> excluding all entries whose pointer is in
+  the range 8272 to 8835, inclusive.
+  <!-- selected NEC duplicates from IBM extensions later in the index; need to use IBM
+       extensions when going back to bytes -->
+
+  <p class=note>The <a>index jis0208</a> contains duplicate code points so the exclusion of
+  these entries causes later code points to be used.
+
+ <li><p>Return the <a>index pointer</a> for <var>code point</var> in
+ <var>index</var>.
+</ol>
+
+<p>The <dfn>index Big5 pointer</dfn> for <var>code point</var> is the return value of
+these steps:
+
+<ol>
+ <li>
+  <p>Let <var>index</var> be <a>index Big5</a> excluding all entries whose pointer is less
+  than (0xA1 - 0x81) × 157.
+
+  <p class=note>This avoids returning Hong Kong Supplementary Character Set extensions literally.
+
+ <li>
+  <p>If <var>code point</var> is U+2550, U+255E, U+2561, U+256A, U+5341, or U+5345,
+  return the <em>last</em> pointer corresponding to <var>code point</var> in
+  <var>index</var>.
+  <!-- https://www.w3.org/Bugs/Public/show_bug.cgi?id=27878 -->
+
+  <p class=note>There are other duplicate code points, but for those the <em>first</em> pointer is
+  to be used.
+
+ <li><p>Return the <a>index pointer</a> for <var>code point</var> in
+ <var>index</var>.
+</ol>
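+
+<p class="note no-backref">The <a>index Big5 pointer</a> steps could be sketched as follows,
+assuming entries are <code>[pointer, codePoint]</code> pairs in ascending pointer order. The
+function name, the constant names, and the toy entries are hypothetical; the toy entries are not
+taken from <a>index Big5</a>.
+
```javascript
// Pointers below this value are excluded: (0xA1 - 0x81) * 157 = 5024.
const BIG5_FIRST_USABLE_POINTER = (0xA1 - 0x81) * 157;

// Code points for which the last matching pointer is used.
const BIG5_USE_LAST_POINTER = new Set([
  0x2550, 0x255E, 0x2561, 0x256A, 0x5341, 0x5345,
]);

function indexBig5Pointer(codePoint, entries) {
  const matches = entries.filter(
    ([p, c]) => p >= BIG5_FIRST_USABLE_POINTER && c === codePoint
  );
  if (matches.length === 0) return null;
  const chosen = BIG5_USE_LAST_POINTER.has(codePoint)
    ? matches[matches.length - 1]
    : matches[0];
  return chosen[0];
}

// Hypothetical entries containing duplicates on both sides of the cutoff.
const toyBig5 = [
  [100, 0x4E00], [6000, 0x4E00], [6001, 0x2550], [7000, 0x2550], [7001, 0x4E00],
];
```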
+
+<hr>
+
+<p class="note no-backref">All <a lt=index>indexes</a> are also available as a non-normative
+<a href=indexes.json>indexes.json</a> resource. (<a>Index gb18030 ranges</a> has a slightly
+different format here, to be able to represent ranges.)
+
+
+
+<h2 id=specification-hooks>Hooks for standards</h2>
+
+<div class=note>
+ <p>The algorithms defined below (<a>UTF-8 decode</a>, <a>UTF-8 decode without BOM</a>,
+ <a>UTF-8 decode without BOM or fail</a>, and <a>UTF-8 encode</a>) are intended for usage by other
+ standards.
+
+ <p>For decoding, <a>UTF-8 decode</a> is to be used by new formats. For identifiers or byte
+ sequences within a format or protocol, use <a>UTF-8 decode without BOM</a> or
+ <a>UTF-8 decode without BOM or fail</a>.
+
+ <p>For encoding, <a>UTF-8 encode</a> is to be used.
+
+ <p>Standards are to ensure that the input I/O queues they pass to <a>UTF-8 encode</a> (as well as
+ the legacy <a>encode</a>) are effectively I/O queues of scalar values, i.e., they contain no
+ <a>surrogates</a>.
+
+ <p>These hooks (as well as <a>decode</a> and <a>encode</a>) will block until the input I/O queue
+ has been consumed in its entirety. In order to use the output tokens as they are pushed into the
+ stream, callers are to invoke the hooks with an empty output I/O queue and read from it
+ <a>in parallel</a>. Note that some care is needed when using
+ <a>UTF-8 decode without BOM or fail</a>, as any error found during decoding will prevent the
+ <a>end-of-queue</a> item from ever being pushed into the output I/O queue.
+</div>
+
+<p>To <dfn export>UTF-8 decode</dfn> an I/O queue of bytes <var>ioQueue</var> given an optional I/O
+queue of scalar values <var>output</var> (default « »), run these steps:
+
+<ol>
+ <li><p>Let <var>buffer</var> be the result of <a for="I/O queue">peeking</a> three bytes from
+ <var>ioQueue</var>, converted to a byte sequence.
+
+ <li><p>If <var>buffer</var> is 0xEF 0xBB 0xBF, then <a for="I/O queue">read</a> three bytes from
+ <var>ioQueue</var>. (Do nothing with those bytes.)
+
+ <li><p><a>Process a queue</a> with an instance of <a>UTF-8</a>'s <a for=/>decoder</a>,
+ <var>ioQueue</var>, <var>output</var>, and "<code>replacement</code>".
+
+ <li><p>Return <var>output</var>.
+</ol>
+
+<p>To <dfn export>UTF-8 decode without BOM</dfn> an I/O queue of bytes <var>ioQueue</var> given an
+optional I/O queue of scalar values <var>output</var> (default « »), run these steps:
+
+<ol>
+ <li><p><a>Process a queue</a> with an instance of <a>UTF-8</a>'s <a for=/>decoder</a>,
+ <var>ioQueue</var>, <var>output</var>, and "<code>replacement</code>".
+
+ <li><p>Return <var>output</var>.
+</ol>
+
+<p>To <dfn export>UTF-8 decode without BOM or fail</dfn> an I/O queue of bytes <var>ioQueue</var>
+given an optional I/O queue of scalar values <var>output</var> (default « »), run these steps:
+<!-- Needed by https://tools.ietf.org/html/rfc6455#section-8.1 and
+     https://webassembly.github.io/spec/js-api/#dom-module-customsections-moduleobject-sectionname
+     -->
+
+<ol>
+ <li><p>Let <var>potentialError</var> be the result of <a>processing a queue</a> with an instance of
+ <a>UTF-8</a>'s <a for=/>decoder</a>, <var>ioQueue</var>, <var>output</var>, and
+ "<code>fatal</code>".
+
+ <li><p>If <var>potentialError</var> is an <a>error</a>, then return failure.
+
+ <li><p>Return <var>output</var>.
+</ol>
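+
+<p class="note no-backref">For readers more familiar with the JavaScript API than with these hooks,
+the three decoding hooks above behave roughly like the following {{TextDecoder}} configurations.
+This is an approximation under stated assumptions: the function names are hypothetical, input is a
+complete {{Uint8Array}} rather than an I/O queue, and <code>null</code> stands in for failure.
+
```javascript
// UTF-8 decode: a leading BOM is dropped, errors become U+FFFD.
function utf8Decode(bytes) {
  return new TextDecoder("utf-8").decode(bytes);
}

// UTF-8 decode without BOM: no BOM handling, so a leading BOM is kept
// in the output as U+FEFF; errors become U+FFFD.
function utf8DecodeWithoutBOM(bytes) {
  return new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);
}

// UTF-8 decode without BOM or fail: as above, but any error is fatal.
function utf8DecodeWithoutBOMOrFail(bytes) {
  try {
    return new TextDecoder("utf-8", { fatal: true, ignoreBOM: true }).decode(bytes);
  } catch {
    return null; // stands in for "failure"
  }
}

const bytesWithBOM = new Uint8Array([0xEF, 0xBB, 0xBF, 0x61]);
const invalidBytes = new Uint8Array([0x61, 0xFF]);
```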
+
+<hr>
+
+<p>To <dfn export>UTF-8 encode</dfn> an I/O queue of scalar values <var>ioQueue</var> given an
+optional I/O queue of bytes <var>output</var> (default « »), return the result of
+<a lt=encode for=/>encoding</a> <var>ioQueue</var> with encoding <a>UTF-8</a> and <var>output</var>.
+
+
+<h3 id=legacy-hooks>Legacy hooks for standards</h3>
+
+<div class=note>
+ <p>Standards are strongly discouraged from using <a>decode</a>, <a>BOM sniff</a>, and
+ <a for=/>encode</a>, except as needed for compatibility. Standards needing these legacy hooks will
+ most likely also need to use <a>get an encoding</a> (to turn a label into an <a for=/>encoding</a>)
+ and <a>get an output encoding</a> (to turn an <a for=/>encoding</a> into another
+ <a for=/>encoding</a> that is suitable to pass into <a>encode</a>).
+
+ <p>For the extremely niche case of URL percent-encoding, custom encoder error handling is needed.
+ The <a>get an encoder</a> and <a>encode or fail</a> algorithms are to be used for that. Other
+ algorithms are not to be used directly.
+</div>
+
+<p>To <dfn export>decode</dfn> an I/O queue of bytes <var>ioQueue</var> given a fallback encoding
+<var>encoding</var> and an optional I/O queue of scalar values <var>output</var> (default « »), run
+these steps:
+
+<ol>
+ <li><p>Let <var>BOMEncoding</var> be the result of <a>BOM sniffing</a> <var>ioQueue</var>.
+
+ <li>
+  <p>If <var>BOMEncoding</var> is non-null:
+
+  <ol>
+   <li><p>Set <var>encoding</var> to <var>BOMEncoding</var>.
+
+   <li><p><a>Read</a> three bytes from <var>ioQueue</var>, if <var>BOMEncoding</var> is
+   <a>UTF-8</a>; otherwise <a>read</a> two bytes. (Do nothing with those bytes.)
+  </ol>
+
+  <p class=note>For compatibility with deployed content, the byte order mark is more authoritative
+  than anything else. In a context where HTTP is used this is in violation of the semantics of the
+  `<code>Content-Type</code>` header.
+
+ <li><p><a>Process a queue</a> with an instance of <var>encoding</var>'s <a for=/>decoder</a>,
+ <var>ioQueue</var>, <var>output</var>, and "<code>replacement</code>".
+
+ <li><p>Return <var>output</var>.
+</ol>
+
+<p>To <dfn export>BOM sniff</dfn> an I/O queue of bytes <var>ioQueue</var>, run these steps:
+
+<ol>
+ <li><p>Let <var>BOM</var> be the result of <a for="I/O queue">peeking</a> three bytes from
+ <var>ioQueue</var>, converted to a byte sequence.
+
+ <li>
+  <p>For each of the rows in the table below, starting with the first one and going down, if
+  <var>BOM</var> <a for="byte sequence">starts with</a> the bytes given in the first column, then
+  return the <a for=/>encoding</a> given in the cell in the second column of that row. Otherwise,
+  return null.
+
+  <table>
+   <tbody><tr><th>Byte order mark<th>Encoding
+   <tr><td>0xEF 0xBB 0xBF<td><a>UTF-8</a>
+   <tr><td>0xFE 0xFF<td><a>UTF-16BE</a>
+   <tr><td>0xFF 0xFE<td><a>UTF-16LE</a>
+  </table>
+</ol>
+
+<p class=note>This hook is a workaround for the fact that <a>decode</a> has no way to communicate
+back to the caller that it has found a byte order mark and is therefore not using the provided
+encoding. The hook is to be invoked before <a>decode</a>, and it will return an encoding
+corresponding to the byte order mark found, or null otherwise.
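+
+<p class="note no-backref">A minimal sketch of the table lookup, assuming the input is available as
+a {{Uint8Array}} and that encodings are represented by their names as strings. The function name
+<code>bomSniff</code> is hypothetical.
+
```javascript
// Returns the name of the encoding indicated by a byte order mark,
// or null if the input does not start with one.
function bomSniff(bytes) {
  // Peek at up to three bytes; missing bytes are undefined.
  const [b0, b1, b2] = bytes;
  // Rows are checked in table order.
  if (b0 === 0xEF && b1 === 0xBB && b2 === 0xBF) return "UTF-8";
  if (b0 === 0xFE && b1 === 0xFF) return "UTF-16BE";
  if (b0 === 0xFF && b1 === 0xFE) return "UTF-16LE";
  return null;
}
```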
+
+<hr>
+
+<p>To <dfn export>encode</dfn> an I/O queue of scalar values <var>ioQueue</var> given an encoding
+<var>encoding</var> and an optional I/O queue of bytes <var>output</var> (default « »), run these
+steps:
+
+<ol>
+ <li><p>Let <var>encoder</var> be the result of <a>getting an encoder</a> from <var>encoding</var>.
+
+ <li><p><a>Process a queue</a> with <var>encoder</var>, <var>ioQueue</var>, <var>output</var>, and
+ "<code>html</code>".
+
+ <li><p>Return <var>output</var>.
+</ol>
+
+<p class="note no-backref">This is a legacy hook for HTML forms. Layering <a>UTF-8 encode</a> on top
+is safe as it never triggers <a>errors</a>. [[HTML]]
+
+<hr>
+
+<p>To <dfn export lt="get an encoder|getting an encoder">get an encoder</dfn> from an
+<a for=/>encoding</a> <var>encoding</var>:
+
+<ol>
+ <li><p>Assert: <var>encoding</var> is not <a>replacement</a> or <a>UTF-16BE/LE</a>.
+
+ <li><p>Return an instance of <var>encoding</var>'s <a for=/>encoder</a>.
+</ol>
+
+<p>To <dfn export>encode or fail</dfn> an I/O queue of scalar values <var>ioQueue</var> given an
+<a for=/>encoder</a> instance <var>encoder</var> and an I/O queue of bytes <var>output</var>, run
+these steps:
+
+<ol>
+ <li><p>Let <var>potentialError</var> be the result of <a>processing a queue</a> with
+ <var>encoder</var>, <var>ioQueue</var>, <var>output</var>, and "<code>fatal</code>".
+
+ <li><p><a for="I/O queue">Push</a> <a>end-of-queue</a> to <var>output</var>.
+
+ <li><p>If <var>potentialError</var> is an <a>error</a>, then return <a>error</a>'s
+ <a>code point</a>'s <a for="code point">value</a>.
+
+ <li><p>Return null.
+</ol>
+
+<div class=note id=pit-of-iso-2022-jp>
+ <p>This is a legacy hook for URL percent-encoding. The caller will have to keep an
+ <a for=/>encoder</a> instance alive as the <a>ISO-2022-JP encoder</a> can be in two different
+ states when returning an <a>error</a>. That also means that if the caller emits bytes to encode the
+ error in some way, these have to be in the range 0x00 to 0x7F, inclusive, excluding 0x0E, 0x0F,
+ 0x1B, 0x5C, and 0x7E. [[URL]]
+
+ <p>In particular, if upon returning an <a>error</a> the <a>ISO-2022-JP encoder</a> is in the
+ <a lt="ISO-2022-JP decoder Roman">Roman</a> state, the caller cannot output 0x5C (\) as it will not
+ decode as U+005C (\). For this reason, applications using <a>encode or fail</a> for unintended
+ purposes ought to take care to prevent the use of the <a>ISO-2022-JP encoder</a> in combination
+ with replacement schemes, such as those of JavaScript and CSS, that use U+005C (\) as part of the
+ replacement syntax (e.g., <code>\u2603</code>) or make sure to pass the replacement syntax through
+ the encoder (in contrast to URL percent-encoding).
+
+ <p>The return value is either the number representing the <a>code point</a> that could not be
+ encoded or null, if there was no <a>error</a>. When it returns non-null the caller will have to
+ invoke it again, supplying the same <a for=/>encoder</a> instance and a new output I/O queue.
+</div>
+
+
+
+<h2 id=api>API</h2>
+
+<p>This section uses terminology from Web IDL. Browser user agents must support this API. JavaScript
+implementations should support this API. Other user agents or programming languages are encouraged
+to use an API suitable to their needs, which might not be this one. [[!WEBIDL]]
+
+<div class=example id=example-textencoder>
+ <p>The following example uses the {{TextEncoder}} object to encode an array of strings into an
+ {{ArrayBuffer}}. The buffer contains the number of strings (as a big-endian 32-bit unsigned
+ integer), followed by the byte length of the first encoded string (as a big-endian 32-bit
+ unsigned integer), the <a>UTF-8</a> encoded string data, the byte length of the second encoded
+ string (again as a big-endian 32-bit unsigned integer), the string data, and so on.
+ <pre><code class=lang-javascript>
+function encodeArrayOfStrings(strings) {
+  var encoder, encoded, len, bytes, view, offset;
+
+  encoder = new TextEncoder();
+  encoded = [];
+
+  len = Uint32Array.BYTES_PER_ELEMENT;
+  for (var i = 0; i &lt; strings.length; i++) {
+    len += Uint32Array.BYTES_PER_ELEMENT;
+    encoded[i] = encoder.encode(strings[i]);
+    len += encoded[i].byteLength;
+  }
+
+  bytes = new Uint8Array(len);
+  view = new DataView(bytes.buffer);
+  offset = 0;
+
+  view.setUint32(offset, strings.length);
+  offset += Uint32Array.BYTES_PER_ELEMENT;
+  for (i = 0; i &lt; encoded.length; i++) {
+    len = encoded[i].byteLength;
+    view.setUint32(offset, len);
+    offset += Uint32Array.BYTES_PER_ELEMENT;
+    bytes.set(encoded[i], offset);
+    offset += len;
+  }
+  return bytes.buffer;
+}</code></pre>
+
+ <p>The following example decodes an {{ArrayBuffer}} containing data encoded in the
+ format produced by the previous example, or an equivalent algorithm for encodings other than
+ <a>UTF-8</a>, back into an array of strings.
+
+ <pre><code class=lang-javascript>
+function decodeArrayOfStrings(buffer, encoding) {
+  var decoder, view, offset, num_strings, strings, len;
+
+  decoder = new TextDecoder(encoding);
+  view = new DataView(buffer);
+  offset = 0;
+  strings = [];
+
+  num_strings = view.getUint32(offset);
+  offset += Uint32Array.BYTES_PER_ELEMENT;
+  for (var i = 0; i &lt; num_strings; i++) {
+    len = view.getUint32(offset);
+    offset += Uint32Array.BYTES_PER_ELEMENT;
+    strings[i] = decoder.decode(
+      new DataView(view.buffer, offset, len));
+    offset += len;
+  }
+  return strings;
+}</code></pre>
+</div>
+
+
+<h3 id=interface-mixin-textdecodercommon>Interface mixin {{TextDecoderCommon}}</h3>
+
+<pre class=idl>
+interface mixin TextDecoderCommon {
+  readonly attribute DOMString encoding;
+  readonly attribute boolean fatal;
+  readonly attribute boolean ignoreBOM;
+};
+</pre>
+
+<p>The {{TextDecoderCommon}} interface mixin defines common getters that are shared between
+{{TextDecoder}} and {{TextDecoderStream}} objects. These objects have an associated:
+
+<dl>
+ <dt><dfn id=textdecoder-encoding for=TextDecoderCommon>encoding</dfn>
+ <dd>An <a for=/>encoding</a>.
+
+ <dt><dfn for=TextDecoderCommon oldids=textdecoder-decoder,textdecoderstream-decoder>decoder</dfn>
+ <dd>A <a for=/>decoder</a> instance.
+
+ <dt><dfn for=TextDecoderCommon oldids=textdecoder-stream,textdecoderstream-stream,textdecodercommon-stream>I/O queue</dfn>
+ <dd>An <a for=/>I/O queue</a> of bytes.
+
+ <dt><dfn id=textdecoder-ignore-bom-flag for=TextDecoderCommon>ignore BOM</dfn>
+ <dd>A boolean, initially false.
+
+ <dt><dfn id=textdecoder-bom-seen-flag for=TextDecoderCommon>BOM seen</dfn>
+ <dd>A boolean, initially false.
+
+ <dt><dfn id=textdecoder-error-mode for=TextDecoderCommon>error mode</dfn>
+ <dd>An <a for=/>error mode</a>, initially "<code>replacement</code>".
+</dl>
+
+<p>The <dfn id=concept-td-serialize>serialize I/O queue</dfn> algorithm, given a
+{{TextDecoderCommon}} <var>decoder</var> and an <a for=/>I/O queue</a> of scalar values
+<var>ioQueue</var>, runs these steps:
+
+<ol>
+ <li><p>Let <var>output</var> be the empty string.
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>item</var> be the result of <a>reading</a> from <var>ioQueue</var>.
+
+   <li><p>If <var>item</var> is <a>end-of-queue</a>, then return <var>output</var>.
+
+   <li>
+    <p>If <var>decoder</var>'s <a for=TextDecoderCommon>encoding</a> is <a>UTF-8</a> or
+    <a>UTF-16BE/LE</a>, and <var>decoder</var>'s <a for=TextDecoderCommon>ignore BOM</a> and
+    <a for=TextDecoderCommon>BOM seen</a> are false, then:
+
+    <ol>
+     <li><p>Set <var>decoder</var>'s <a for=TextDecoderCommon>BOM seen</a> to true.
+
+     <li><p>If <var>item</var> is U+FEFF, then <a for=iteration>continue</a>.
+    </ol>
+
+   <li><p>Append <var>item</var> to <var>output</var>.
+  </ol>
+</ol>
+
+<p class=note>This algorithm is intentionally different with respect to BOM handling from
+the <a for=/>decode</a> algorithm used by the rest of the platform to give API users more
+control.
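+
+<p class="note no-backref">The steps above can be sketched as follows, assuming the I/O queue is a
+plain array of scalar values (the end of the array standing in for end-of-queue) and the decoder
+state is a plain object. The function name and the property names <code>encoding</code>,
+<code>ignoreBOM</code>, and <code>bomSeen</code> are hypothetical.
+
```javascript
function serializeIOQueue(decoder, items) {
  let output = "";
  for (const item of items) {
    const bomCapable = ["UTF-8", "UTF-16BE", "UTF-16LE"].includes(decoder.encoding);
    if (bomCapable && !decoder.ignoreBOM && !decoder.bomSeen) {
      // Only the very first item is ever eligible for BOM removal.
      decoder.bomSeen = true;
      if (item === 0xFEFF) continue;
    }
    output += String.fromCodePoint(item);
  }
  return output;
}

const strippingState = { encoding: "UTF-8", ignoreBOM: false, bomSeen: false };
const keepingState = { encoding: "UTF-8", ignoreBOM: true, bomSeen: false };
const stripped = serializeIOQueue(strippingState, [0xFEFF, 0x61, 0xFEFF]);
const kept = serializeIOQueue(keepingState, [0xFEFF, 0x61]);
```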
+
+<hr>
+
+<p>The <dfn attribute id=dom-textdecoder-encoding for=TextDecoderCommon><code>encoding</code></dfn>
+getter steps are to return <a>this</a>'s <a for=TextDecoderCommon>encoding</a>'s
+<a for=encoding>name</a>, <a>ASCII lowercased</a>.
+
+<p>The <dfn attribute id=dom-textdecoder-fatal for=TextDecoderCommon><code>fatal</code></dfn> getter
+steps are to return true if <a>this</a>'s <a for=TextDecoderCommon>error mode</a> is
+"<code>fatal</code>", otherwise false.
+
+<p>The
+<dfn attribute id=dom-textdecoder-ignorebom for=TextDecoderCommon><code>ignoreBOM</code></dfn>
+getter steps are to return <a>this</a>'s <a for=TextDecoderCommon>ignore BOM</a>.
+
+
+<h3 id=interface-textdecoder>Interface {{TextDecoder}}</h3>
+
+<pre class=idl>
+dictionary TextDecoderOptions {
+  boolean fatal = false;
+  boolean ignoreBOM = false;
+};
+
+dictionary TextDecodeOptions {
+  boolean stream = false;
+};
+
+[Exposed=*]
+interface TextDecoder {
+  constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options = {});
+
+  USVString decode(optional AllowSharedBufferSource input, optional TextDecodeOptions options = {});
+};
+TextDecoder includes TextDecoderCommon;
+</pre>
+
+<p>A {{TextDecoder}} object has an associated
+<dfn for=TextDecoder id=textdecoder-do-not-flush-flag>do not flush</dfn>, which is a boolean,
+initially false.
+
+<dl class=domintro>
+ <dt><code><var>decoder</var> = new <a constructor for=TextDecoder lt=TextDecoder()>TextDecoder([<var>label</var> = "utf-8" [, <var>options</var>]])</a></code>
+ <dd>
+  <p>Returns a new {{TextDecoder}} object.
+  <p>If <var>label</var> is either not a label or is a <a for=encoding>label</a> for
+  <a>replacement</a>, <a>throws</a> a {{RangeError}}.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>encoding</a></code>
+ <dd><p>Returns <a for=TextDecoderCommon>encoding</a>'s <a for=encoding>name</a>, lowercased.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>fatal</a></code>
+ <dd><p>Returns true if <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>", otherwise
+ false.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>ignoreBOM</a></code>
+ <dd><p>Returns the value of <a for=TextDecoderCommon>ignore BOM</a>.
+
+ <dt><code><var>decoder</var> . <a method for=TextDecoder lt=decode()>decode([<var>input</var> [, <var>options</var>]])</a></code>
+ <dd>
+  <p>Returns the result of running <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a>.
+  The method can be invoked zero or more times with <var>options</var>'s <code>stream</code> set to
+  true, and then once without <var>options</var>'s <code>stream</code> (or set to false), to process
+  a fragmented input. If the invocation without <var>options</var>'s <code>stream</code> (or set to
+  false) has no <var>input</var>, it's clearest to omit both arguments.
+
+  <pre class=example id=example-end-of-stream><code class=lang-javascript>
+var string = "", decoder = new TextDecoder(encoding), buffer;
+while(buffer = next_chunk()) {
+  string += decoder.decode(buffer, {stream:true});
+}
+string += decoder.decode(); // end-of-queue</code></pre>
+
+  <p>If the <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>" and
+  <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> returns <a>error</a>,
+  <a>throws</a> a {{TypeError}}.
+</dl>
+
+<p>The
+<dfn constructor for=TextDecoder lt="TextDecoder(label, options)" id=dom-textdecoder><code>new TextDecoder(<var>label</var>, <var>options</var>)</code></dfn>
+constructor steps are:
+
+<ol>
+ <li><p>Let <var>encoding</var> be the result of <a>getting an encoding</a> from <var>label</var>.
+
+ <li><p>If <var>encoding</var> is failure or <a>replacement</a>, then <a>throw</a> a {{RangeError}}.
+
+ <li><p>Set <a>this</a>'s <a for=TextDecoderCommon>encoding</a> to <var>encoding</var>.
+
+ <li><p>If <var>options</var>["{{TextDecoderOptions/fatal}}"] is true, then set <a>this</a>'s
+ <a for=TextDecoderCommon>error mode</a> to "<code>fatal</code>".
+
+ <li><p>Set <a>this</a>'s <a for=TextDecoderCommon>ignore BOM</a> to
+ <var>options</var>["{{TextDecoderOptions/ignoreBOM}}"].
+</ol>
+
+<p>The <dfn method for=TextDecoder><code>decode(<var>input</var>, <var>options</var>)</code></dfn>
+method steps are:
+
+<ol>
+ <li><p>If <a>this</a>'s <a for=TextDecoder>do not flush</a> is false, then set <a>this</a>'s
+ <a for=TextDecoderCommon>decoder</a> to a new instance of <a>this</a>'s
+ <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a>, <a>this</a>'s
+ <a for=TextDecoderCommon>I/O queue</a> to the <a for=/>I/O queue</a> of bytes
+ « <a>end-of-queue</a> », and <a>this</a>'s <a for=TextDecoderCommon>BOM seen</a> to false.
+
+ <li><p>Set <a>this</a>'s <a for=TextDecoder>do not flush</a> to
+ <var>options</var>["{{TextDecodeOptions/stream}}"].
+
+ <li>
+  <p>If <var>input</var> is given, then <a>push</a> a
+  <a lt="get a copy of the buffer source">copy of</a> <var>input</var> to <a>this</a>'s
+  <a for=TextDecoderCommon>I/O queue</a>.
+
+  <p class=note>Implementations are strongly encouraged to use an implementation strategy that
+  avoids this copy. When doing so they will have to make sure that changes to <var>input</var> do
+  not affect future calls to <a method><code>decode()</code></a>.
+
+  <p class=warning id=sharedarraybuffer-warning>The memory exposed by <code>SharedArrayBuffer</code>
+  objects does not adhere to data race freedom properties required by the memory model of
+  programming languages typically used for implementations. When implementing, take care to use the
+  appropriate facilities when accessing memory exposed by <code>SharedArrayBuffer</code> objects.
+
+ <li><p>Let <var>output</var> be the <a for=/>I/O queue</a> of scalar values
+ « <a>end-of-queue</a> ».
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>item</var> be the result of <a>reading</a> from <a>this</a>'s
+   <a for=TextDecoderCommon>I/O queue</a>.
+
+   <li>
+    <p>If <var>item</var> is <a>end-of-queue</a> and <a>this</a>'s
+    <a for=TextDecoder>do not flush</a> is true, then return the result of running
+    <a>serialize I/O queue</a> with <a>this</a> and <var>output</var>.
+
+    <p class=note>The way streaming works is to not handle <a>end-of-queue</a> here when
+    <a>this</a>'s <a for=TextDecoder>do not flush</a> is true and to not set it to false. That way
+    in a subsequent invocation <a>this</a>'s <a for=TextDecoderCommon>decoder</a> is not set anew in
+    the first step of the algorithm and its state is preserved.
+
+   <li>
+    <p>Otherwise:
+
+    <ol>
+     <li><p>Let <var>result</var> be the result of <a>processing an item</a> with <var>item</var>,
+     <a>this</a>'s <a for=TextDecoderCommon>decoder</a>, <a>this</a>'s
+     <a for=TextDecoderCommon>I/O queue</a>, <var>output</var>, and <a>this</a>'s
+     <a for=TextDecoderCommon>error mode</a>.
+
+     <li><p>If <var>result</var> is <a>finished</a>, then return the result of running
+     <a>serialize I/O queue</a> with <a>this</a> and <var>output</var>.
+
+     <li><p>Otherwise, if <var>result</var> is <a>error</a>, <a>throw</a> a {{TypeError}}.
+    </ol>
+  </ol>
+</ol>
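+
+<p class="note no-backref">To illustrate why the decoder state is preserved between streaming
+calls, the sketch below splits the three-byte UTF-8 sequence for U+20AC across two chunks. The
+chunk contents and variable names are chosen for illustration only.
+
```javascript
const streamingDecoder = new TextDecoder("utf-8");
const chunks = [
  new Uint8Array([0xE2, 0x82]), // first two bytes of U+20AC
  new Uint8Array([0xAC, 0x21]), // last byte of U+20AC, then "!"
];

let streamedText = "";
for (const chunk of chunks) {
  // stream: true keeps the partial sequence buffered in the decoder.
  streamedText += streamingDecoder.decode(chunk, { stream: true });
}
streamedText += streamingDecoder.decode(); // flush at end-of-queue

// Without stream: true the same first chunk is flushed immediately,
// so the truncated sequence is reported as an error (here U+FFFD).
const truncatedText = new TextDecoder("utf-8").decode(chunks[0]);
```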
+
+<h3 id=interface-mixin-textencodercommon>Interface mixin {{TextEncoderCommon}}</h3>
+
+<pre class=idl>
+interface mixin TextEncoderCommon {
+  readonly attribute DOMString encoding;
+};
+</pre>
+
+<p>The {{TextEncoderCommon}} interface mixin defines common getters that are shared between
+{{TextEncoder}} and {{TextEncoderStream}} objects.
+
+<p>The <dfn attribute id=dom-textencoder-encoding for=TextEncoderCommon><code>encoding</code></dfn>
+getter steps are to return "<code>utf-8</code>".
+
+
+<h3 id=interface-textencoder>Interface {{TextEncoder}}</h3>
+
+<pre class=idl>
+dictionary TextEncoderEncodeIntoResult {
+  unsigned long long read;
+  unsigned long long written;
+};
+
+[Exposed=*]
+interface TextEncoder {
+  constructor();
+
+  [NewObject] Uint8Array encode(optional USVString input = "");
+  TextEncoderEncodeIntoResult encodeInto(USVString source, [AllowShared] Uint8Array destination);
+};
+TextEncoder includes TextEncoderCommon;
+</pre>
+
+<p class="note no-backref">A {{TextEncoder}} object offers no <var>label</var> argument as it only
+supports <a>UTF-8</a>. It also offers no <code>stream</code> option as no <a for=/>encoder</a>
+requires buffering of scalar values.
+
+<hr>
+
+<dl class=domintro>
+ <dt><code><var>encoder</var> = new <a constructor for=TextEncoder>TextEncoder()</a></code>
+ <dd><p>Returns a new {{TextEncoder}} object.
+
+ <dt><code><var>encoder</var> . <a attribute for=TextEncoderCommon>encoding</a></code>
+ <dd><p>Returns "<code>utf-8</code>".
+
+ <dt><code><var>encoder</var> . <a method for=TextEncoder lt=encode()>encode([<var>input</var> = ""])</a></code>
+ <dd><p>Returns the result of running <a>UTF-8</a>'s <a for=/>encoder</a>.
+
+ <dt><code><var>encoder</var> . <a method for=TextEncoder lt="encodeInto(source, destination)">encodeInto(<var>source</var>, <var>destination</var>)</a></code>
+ <dd><p>Runs the <a>UTF-8 encoder</a> on <var>source</var>, stores the result of that operation into
+ <var>destination</var>, and returns the progress made as an object wherein
+ {{TextEncoderEncodeIntoResult/read}} is the number of converted <a>code units</a> of
+ <var>source</var> and {{TextEncoderEncodeIntoResult/written}} is the number of bytes modified in
+ <var>destination</var>.
+</dl>
+
+<p>The
+<dfn constructor for=TextEncoder lt=TextEncoder() id=dom-textencoder><code>new TextEncoder()</code></dfn>
+constructor steps are to do nothing.
+
+<p>The <dfn method for=TextEncoder><code>encode(<var>input</var>)</code></dfn> method steps are:
+
+<ol>
+ <li><p><a for="to I/O queue">Convert</a> <var>input</var> to an <a for=/>I/O queue</a> of scalar
+ values.
+
+ <li><p>Let <var>output</var> be the <a for=/>I/O queue</a> of bytes « <a>end-of-queue</a> ».
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>item</var> be the result of
+   <a>reading</a> from <var>input</var>.
+
+   <li><p>Let <var>result</var> be the result of <a>processing an item</a> with <var>item</var>, an
+   instance of the <a>UTF-8 encoder</a>, <var>input</var>, <var>output</var>, and
+   "<code>fatal</code>".
+
+   <li>
+    <p>Assert: <var>result</var> is not an <a>error</a>.
+
+    <p class=note>The <a>UTF-8 encoder</a> cannot return <a>error</a>.
+
+   <li><p>If <var>result</var> is <a>finished</a>, then <a for="from I/O queue">convert</a>
+   <var>output</var> into a byte sequence and return a {{Uint8Array}} object wrapping an
+   {{ArrayBuffer}} containing <var>output</var>.
+   <!-- XXX https://www.w3.org/Bugs/Public/show_bug.cgi?id=26966 -->
+  </ol>
+</ol>
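+
+<div class=example>
+ <p>A short usage sketch of {{TextEncoder/encode()}}, assuming an environment that exposes
+ {{TextEncoder}} as a global. The variable names are illustrative only.

```javascript
const encoder = new TextEncoder();

// U+20AC is encoded as three bytes in UTF-8.
const euro = encoder.encode("\u20AC");

// A lone surrogate becomes U+FFFD (0xEF 0xBF 0xBD), because the input is
// converted to a USVString before it reaches the encoder.
const lone = encoder.encode("\uD800");

// Omitting the argument is the same as passing the empty string.
const empty = encoder.encode();

console.log(Array.from(euro), Array.from(lone), empty.length, encoder.encoding);
```

+ <p>Each call returns a new {{Uint8Array}}; no state is carried from one call to the next.
+</div>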
+
+<p>The
+<dfn method for=TextEncoder><code>encodeInto(<var>source</var>, <var>destination</var>)</code></dfn>
+method steps are:
+
+<ol>
+ <li><p>Let <var>read</var> be 0.
+
+ <li><p>Let <var>written</var> be 0.
+
+ <li><p>Let <var>encoder</var> be an instance of the <a>UTF-8 encoder</a>.
+
+ <li>
+  <p>Let <var>unused</var> be the <a for=/>I/O queue</a> of scalar values « <a>end-of-queue</a> ».
+
+  <p class=note>The <a>handler</a> algorithm invoked below requires this argument, but it is not
+  used by the <a>UTF-8 encoder</a>.
+
+ <li><p><a for="to I/O queue">Convert</a> <var>source</var> to an <a for=/>I/O queue</a> of scalar
+ values.
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>item</var> be the result of <a>reading</a> from <var>source</var>.
+
+   <li><p>Let <var>result</var> be the result of running <var>encoder</var>'s <a>handler</a> on
+   <var>unused</var> and <var>item</var>.
+
+   <li><p>If <var>result</var> is <a>finished</a>, then <a for=iteration>break</a>.
+
+   <li>
+    <p>Otherwise:
+
+    <ol>
+     <li>
+      <p>If <var>destination</var>'s <a for="BufferSource">byte length</a> &minus;
+      <var>written</var> is greater than or equal to the number of bytes in <var>result</var>, then:
+
+      <ol>
+       <li><p>If <var>item</var> is greater than U+FFFF, then increment <var>read</var> by 2.
+
+       <li><p>Otherwise, increment <var>read</var> by 1.
+
+       <li>
+        <p><a for="ArrayBufferView">Write</a> the bytes in <var>result</var> into
+        <var>destination</var>, with <a for="ArrayBufferView/write"><i>startingOffset</i></a> set to
+        <var>written</var>.
+
+        <p class=warning>See the
+        <a href=#sharedarraybuffer-warning>warning for <code>SharedArrayBuffer</code> objects</a>
+        above.
+
+       <li><p>Increment <var>written</var> by the number of bytes in <var>result</var>.
+      </ol>
+
+     <li><p>Otherwise, <a for=iteration>break</a>.
+    </ol>
+  </ol>
+
+ <li><p>Return «[ "{{TextEncoderEncodeIntoResult/read}}" → <var>read</var>,
+ "{{TextEncoderEncodeIntoResult/written}}" → <var>written</var> ]».
+</ol>
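+
+<div class=example>
+ <p>A small sketch of how the steps above report progress, assuming a destination that is too
+ small for the whole input. The inputs and variable names are illustrative only.

```javascript
const encoder = new TextEncoder();
const destination = new Uint8Array(5);

// "a" needs 1 byte, U+20AC needs 3 bytes, and U+1F600 needs 4 bytes.
// After the first two characters only 1 byte is left, so encoding stops.
const partial = encoder.encodeInto("a\u20AC\u{1F600}", destination);

// U+1F600 occupies two code units in a string, so read advances by 2.
const astral = encoder.encodeInto("\u{1F600}", new Uint8Array(4));

console.log(partial, Array.from(destination), astral);
```

+ <p>Here <code>partial</code> has <code>read</code> 2 and <code>written</code> 4, and the last
+ byte of <code>destination</code> is left untouched. <code>read</code> counts code units, which is
+ why the existing example can pass it straight to <code>substring()</code>.
+</div>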
+
+<div class=example id=example-textencoder-encodeinto>
+ <p>The <a method for=TextEncoder lt="encodeInto(source, destination)">encodeInto()</a> method can
+ be used to encode a string into an existing {{ArrayBuffer}} object. Various details below are left
+ as an exercise for the reader, but this demonstrates an approach one could take to use this method:
+
+ <pre><code class=lang-javascript>
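+// cachedEncoder is assumed to be a reusable TextEncoder; malloc(), realloc(), and free() are
+// assumed to manage allocations within buffer. None of them are defined here.
+function convertString(buffer, input, callback) {
+  let bufferSize = 256,
+      bufferStart = malloc(buffer, bufferSize),
+      writeOffset = 0,
+      readOffset = 0;
+  while (true) {
+    // Encode the unread remainder of input into the unused tail of the allocation.
+    const view = new Uint8Array(buffer, bufferStart + writeOffset, bufferSize - writeOffset),
+          {read, written} = cachedEncoder.encodeInto(input.substring(readOffset), view);
+    readOffset += read;
+    writeOffset += written;
+    if (readOffset === input.length) {
+      // All code units were consumed; hand over the encoded bytes and release the allocation.
+      callback(bufferStart, writeOffset);
+      free(buffer, bufferStart);
+      return;
+    }
+    // The destination filled up before the input ran out; grow it and continue.
+    bufferSize *= 2;
+    bufferStart = realloc(buffer, bufferStart, bufferSize);
+  }
+}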
+</code></pre>
+</div>
+
+
+<h3 id=interface-textdecoderstream>Interface {{TextDecoderStream}}</h3>
+
+<pre class=idl>
+[Exposed=*]
+interface TextDecoderStream {
+  constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options = {});
+};
+TextDecoderStream includes TextDecoderCommon;
+TextDecoderStream includes GenericTransformStream;
+</pre>
+
+<dl class=domintro>
+ <dt><code><var>decoder</var> = new
+ <a constructor for=TextDecoderStream lt=TextDecoderStream()>TextDecoderStream([<var>label</var> =
+ "utf-8" [, <var>options</var>]])</a></code>
+ <dd>
+  <p>Returns a new {{TextDecoderStream}} object.
+  <p>If <var>label</var> is either not a label or is a <a for=encoding>label</a> for
+  <a>replacement</a>, <a>throws</a> a {{RangeError}}.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>encoding</a></code>
+ <dd><p>Returns <a for=TextDecoderCommon>encoding</a>'s <a for=encoding>name</a>, lowercased.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>fatal</a></code>
+ <dd><p>Returns true if <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>", and
+ false otherwise.
+
+ <dt><code><var>decoder</var> . <a attribute for=TextDecoderCommon>ignoreBOM</a></code>
+ <dd><p>Returns the value of <a for=TextDecoderCommon>ignore BOM</a>.
+
+ <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>readable</a></code>
+ <dd>
+  <p>Returns a <a>readable stream</a> whose <a>chunks</a> are strings resulting from running
+  <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> on the chunks written to
+  {{GenericTransformStream/writable}}.
+
+ <dt><code><var>decoder</var> . <a attribute for=GenericTransformStream>writable</a></code>
+ <dd>
+  <p>Returns a <a>writable stream</a> which accepts
+  <code><a typedef>AllowSharedBufferSource</a></code> chunks and runs
+  them through <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> before making them
+  available to {{GenericTransformStream/readable}}.
+
+  <p>Typically this will be used via the {{ReadableStream/pipeThrough()}} method on a
+  {{ReadableStream}} source.
+
+  <pre class=example id=example-textdecoderstream-writable><code class=lang-javascript>
+const decoder = new TextDecoderStream(encoding);
+byteReadable
+  .pipeThrough(decoder)
+  .pipeTo(textWritable);</code></pre>
+
+  <p>If the <a for=TextDecoderCommon>error mode</a> is "<code>fatal</code>" and
+  <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a> returns <a>error</a>, both
+  {{GenericTransformStream/readable}} and {{GenericTransformStream/writable}} will be errored with a
+  {{TypeError}}.
+</dl>
+
+<p>The
+<dfn constructor for=TextDecoderStream lt="TextDecoderStream(label, options)" id=dom-textdecoderstream><code>new TextDecoderStream(<var>label</var>, <var>options</var>)</code></dfn>
+constructor steps are:
+
+<ol>
+ <li><p>Let <var>encoding</var> be the result of <a>getting an encoding</a> from <var>label</var>.
+
+ <li><p>If <var>encoding</var> is failure or <a>replacement</a>, then <a>throw</a> a {{RangeError}}.
+
+ <li><p>Set <a>this</a>'s <a for=TextDecoderCommon>encoding</a> to <var>encoding</var>.
+
+ <li><p>If <var>options</var>["{{TextDecoderOptions/fatal}}"] is true, then set <a>this</a>'s
+ <a for=TextDecoderCommon>error mode</a> to "<code>fatal</code>".
+
+ <li><p>Set <a>this</a>'s <a for=TextDecoderCommon>ignore BOM</a> to
+ <var>options</var>["{{TextDecoderOptions/ignoreBOM}}"].
+
+ <li><p>Set <a>this</a>'s <a for=TextDecoderCommon>decoder</a> to a new instance of <a>this</a>'s
+ <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a>, and set <a>this</a>'s
+ <a for=TextDecoderCommon>I/O queue</a> to a new <a for=/>I/O queue</a>.
+
+ <li><p>Let <var>transformAlgorithm</var> be an algorithm which takes a <var>chunk</var> argument
+ and runs the <a>decode and enqueue a chunk</a> algorithm with <a>this</a> and <var>chunk</var>.
+
+ <li><p>Let <var>flushAlgorithm</var> be an algorithm which takes no arguments and runs the
+ <a>flush and enqueue</a> algorithm with <a>this</a>.
+
+ <li><p>Let <var>transformStream</var> be a [=new=] {{TransformStream}}.
+
+ <li><p>[=TransformStream/Set up=] <var>transformStream</var> with
+ <a for="TransformStream/set up"><var ignore>transformAlgorithm</var></a> set to
+ <var>transformAlgorithm</var> and
+ <a for="TransformStream/set up"><var ignore>flushAlgorithm</var></a> set to
+ <var>flushAlgorithm</var>.
+
+ <li><p>Set <a>this</a>'s <a for=GenericTransformStream>transform</a> to <var>transformStream</var>.
+</ol>
+
+<p>The <dfn>decode and enqueue a chunk</dfn> algorithm, given a {{TextDecoderStream}} object
+<var>decoder</var> and a <var>chunk</var>, runs these steps:
+
+<ol>
+ <li><p>Let <var>bufferSource</var> be the result of
+ <a lt="converted to an IDL value">converting</a> <var>chunk</var> to an
+ <code><a typedef>AllowSharedBufferSource</a></code>.
+
+ <li>
+  <p><a>Push</a> a <a lt="get a copy of the buffer source">copy of</a> <var>bufferSource</var> to
+  <var>decoder</var>'s <a for=TextDecoderCommon>I/O queue</a>.
+
+  <p class=warning>See the
+  <a href=#sharedarraybuffer-warning>warning for <code>SharedArrayBuffer</code> objects</a> above.
+
+ <li><p>Let <var>output</var> be the <a for=/>I/O queue</a> of scalar values
+ « <a>end-of-queue</a> ».
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>item</var> be the result of <a>reading</a> from <var>decoder</var>'s
+   <a for=TextDecoderCommon>I/O queue</a>.
+
+   <li>
+    <p>If <var>item</var> is <a>end-of-queue</a>, then:
+
+    <ol>
+     <li><p>Let <var>outputChunk</var> be the result of running <a>serialize I/O queue</a> with
+     <var>decoder</var> and <var>output</var>.
+
+     <li><p>If <var>outputChunk</var> is non-empty, then <a for=TransformStream>enqueue</a>
+     <var>outputChunk</var> in <var>decoder</var>'s <a for=GenericTransformStream>transform</a>.
+
+     <li><p>Return.
+    </ol>
+
+   <li><p>Let <var>result</var> be the result of <a>processing an item</a> with <var>item</var>,
+   <var>decoder</var>'s <a for=TextDecoderCommon>decoder</a>, <var>decoder</var>'s
+   <a for=TextDecoderCommon>I/O queue</a>, <var>output</var>, and <var>decoder</var>'s
+   <a for=TextDecoderCommon>error mode</a>.
+
+   <li><p>If <var>result</var> is <a>error</a>, then <a>throw</a> a {{TypeError}}.
+  </ol>
+</ol>
+
+<p>The <dfn>flush and enqueue</dfn> algorithm, which handles the end of data from the input
+{{ReadableStream}} object, given a {{TextDecoderStream}} object <var>decoder</var>, runs these
+steps:
+
+<ol>
+ <li><p>Let <var>output</var> be the <a for=/>I/O queue</a> of scalar values
+ « <a>end-of-queue</a> ».
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>item</var> be the result of <a>reading</a> from <var>decoder</var>'s
+   <a for=TextDecoderCommon>I/O queue</a>.
+
+   <li><p>Let <var>result</var> be the result of <a>processing an item</a> with <var>item</var>,
+   <var>decoder</var>'s <a for=TextDecoderCommon>decoder</a>, <var>decoder</var>'s
+   <a for=TextDecoderCommon>I/O queue</a>, <var>output</var>, and <var>decoder</var>'s
+   <a for=TextDecoderCommon>error mode</a>.
+
+   <li>
+    <p>If <var>result</var> is <a>finished</a>, then:
+
+    <ol>
+     <li><p>Let <var>outputChunk</var> be the result of running <a>serialize I/O queue</a> with
+     <var>decoder</var> and <var>output</var>.
+
+     <li><p>If <var>outputChunk</var> is non-empty, then <a for=TransformStream>enqueue</a>
+     <var>outputChunk</var> in <var>decoder</var>'s <a for=GenericTransformStream>transform</a>.
+
+     <li><p>Return.
+    </ol>
+   </li>
+
+   <li><p>Otherwise, if <var>result</var> is <a>error</a>, <a>throw</a> a {{TypeError}}.
+  </ol>
+ </li>
+</ol>
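+
+<div class=example>
+ <p>The two algorithms above can be modeled, as a rough sketch, by a transformer object that
+ delegates to a streaming {{TextDecoder}}. The helper name
+ <code>createDecoderTransformer</code> and the stand-in controller are hypothetical and exist only
+ for illustration; this is not how the standard defines {{TextDecoderStream}}.

```javascript
// Hypothetical helper: builds a transformer whose methods mirror
// "decode and enqueue a chunk" and "flush and enqueue".
function createDecoderTransformer(label = "utf-8", options = {}) {
  const decoder = new TextDecoder(label, options);
  return {
    transform(chunk, controller) {
      // Decoder state is kept between chunks; in fatal mode an error throws.
      const text = decoder.decode(chunk, { stream: true });
      if (text !== "") controller.enqueue(text);
    },
    flush(controller) {
      // End of input: an incomplete sequence is reported as an error,
      // which yields U+FFFD unless the error mode is "fatal".
      const text = decoder.decode();
      if (text !== "") controller.enqueue(text);
    },
  };
}

// Drive the transformer with a stand-in controller that records its output.
const chunks = [];
const controller = { enqueue: (text) => chunks.push(text) };
const transformer = createDecoderTransformer();
transformer.transform(new Uint8Array([0x68, 0x69, 0xE2]), controller);
transformer.transform(new Uint8Array([0x82, 0xAC, 0xF0]), controller);
transformer.flush(controller);

console.log(JSON.stringify(chunks));
```

+ <p>The trailing 0xF0 byte never completes a sequence, so the flush step emits a single U+FFFD.
+ An object of this shape could also be passed to the {{TransformStream}} constructor.
+</div>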
+
+
+<h3 id=interface-textencoderstream>Interface {{TextEncoderStream}}</h3>
+
+<pre class=idl>
+[Exposed=*]
+interface TextEncoderStream {
+  constructor();
+};
+TextEncoderStream includes TextEncoderCommon;
+TextEncoderStream includes GenericTransformStream;
+</pre>
+
+<p>A {{TextEncoderStream}} object has an associated:
+
+<dl>
+ <dt><dfn for=TextEncoderStream>encoder</dfn>
+ <dd>An <a for=/>encoder</a> instance.
+
+ <dt><dfn for=TextEncoderStream id=textencoderstream-pending-high-surrogate>leading surrogate</dfn>
+ <dd>Null or a <a for=/>leading surrogate</a>, initially null.
+</dl>
+
+<p class="note no-backref">A {{TextEncoderStream}} object offers no <var>label</var> argument as it
+only supports <a>UTF-8</a>.
+
+<dl class=domintro>
+ <dt><code><var>encoder</var> = new <a constructor for=TextEncoderStream>TextEncoderStream()</a></code>
+ <dd><p>Returns a new {{TextEncoderStream}} object.
+
+ <dt><code><var>encoder</var> . <a attribute for=TextEncoderCommon>encoding</a></code>
+ <dd><p>Returns "<code>utf-8</code>".
+
+ <dt><code><var>encoder</var> . <a attribute for=GenericTransformStream>readable</a></code>
+ <dd>
+  <p>Returns a <a>readable stream</a> whose <a>chunks</a> are {{Uint8Array}}s resulting from running
+  <a>UTF-8</a>'s <a for=/>encoder</a> on the chunks written to {{GenericTransformStream/writable}}.
+
+ <dt><code><var>encoder</var> . <a attribute for=GenericTransformStream>writable</a></code>
+ <dd>
+  <p>Returns a <a>writable stream</a> which accepts string chunks and runs them through
+  <a>UTF-8</a>'s <a for=/>encoder</a> before making them available to
+  {{GenericTransformStream/readable}}.
+
+  <p>Typically this will be used via the {{ReadableStream/pipeThrough()}} method on a
+  {{ReadableStream}} source.
+
+  <pre class=example id=example-textencoderstream-writable><code class=lang-javascript>
+textReadable
+  .pipeThrough(new TextEncoderStream())
+  .pipeTo(byteWritable);</code></pre>
+</dl>
+
+<p>The
+<dfn constructor for=TextEncoderStream lt=TextEncoderStream() id=dom-textencoderstream><code>new TextEncoderStream()</code></dfn>
+constructor steps are:
+
+<ol>
+ <li><p>Set <a>this</a>'s <a for=TextEncoderStream>encoder</a> to an instance of the
+ <a>UTF-8 encoder</a>.
+
+ <li><p>Let <var>transformAlgorithm</var> be an algorithm which takes a <var>chunk</var> argument
+ and runs the <a>encode and enqueue a chunk</a> algorithm with <a>this</a> and <var>chunk</var>.
+
+ <li><p>Let <var>flushAlgorithm</var> be an algorithm which runs the <a>encode and flush</a>
+ algorithm with <a>this</a>.
+
+ <li><p>Let <var>transformStream</var> be a [=new=] {{TransformStream}}.
+
+ <li><p>[=TransformStream/Set up=] <var>transformStream</var> with
+ <a for="TransformStream/set up"><var ignore>transformAlgorithm</var></a> set to
+ <var>transformAlgorithm</var> and
+ <a for="TransformStream/set up"><var ignore>flushAlgorithm</var></a> set to
+ <var>flushAlgorithm</var>.
+
+ <li><p>Set <a>this</a>'s <a for=GenericTransformStream>transform</a> to <var>transformStream</var>.
+</ol>
+
+<hr>
+
+<p>The <dfn>encode and enqueue a chunk</dfn> algorithm, given a {{TextEncoderStream}} object
+<var>encoder</var> and <var>chunk</var>, runs these steps:
+
+<ol>
+ <li><p>Let <var>input</var> be the result of <a lt="converted to an IDL value">converting</a>
+ <var>chunk</var> to a {{DOMString}}.
+
+ <li><p><a for="to I/O queue">Convert</a> <var>input</var> to an <a for=/>I/O queue</a> of
+ <a>code units</a>.
+
+ <p class=note>{{DOMString}}, as well as an <a for=/>I/O queue</a> of code units rather than scalar
+ values, are used here so that a surrogate pair that is split between chunks can be reassembled into
+ the appropriate scalar value. The behavior is otherwise identical to {{USVString}}. In particular,
+ lone surrogates will be replaced with U+FFFD.
+
+ <li><p>Let <var>output</var> be the <a for=/>I/O queue</a> of bytes « <a>end-of-queue</a> ».
+
+ <li>
+  <p>While true:
+
+  <ol>
+   <li><p>Let <var>item</var> be the result of <a>reading</a> from <var>input</var>.
+
+   <li>
+    <p>If <var>item</var> is <a>end-of-queue</a>, then:
+
+    <ol>
+     <li><p><a for="from I/O queue">Convert</a> <var>output</var> into a byte sequence.
+
+     <li>
+      <p>If <var>output</var> is non-empty, then:
+
+      <ol>
+       <li><p>Let <var>chunk</var> be a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing
+       <var>output</var>.
+
+       <li><p><a for=TransformStream>Enqueue</a> <var>chunk</var> into <var>encoder</var>'s
+       <a for=GenericTransformStream>transform</a>.
+      </ol>
+
+     <li><p>Return.
+    </ol>
+
+   <li><p>Let <var>result</var> be the result of executing the <a>convert code unit to scalar
+   value</a> algorithm with <var>encoder</var>, <var>item</var>, and <var>input</var>.
+
+   <li><p>If <var>result</var> is not <a>continue</a>, then <a>process an item</a> with
+   <var>result</var>, <var>encoder</var>'s <a for=TextEncoderStream>encoder</a>, <var>input</var>,
+   <var>output</var>, and "<code>fatal</code>".
+  </ol>
+</ol>
+
+<p>The <dfn>convert code unit to scalar value</dfn> algorithm, given a {{TextEncoderStream}} object
+<var>encoder</var>, a <a>code unit</a> <var>item</var>, and an <a for=/>I/O queue</a> of code units
+<var>input</var>, runs these steps:
+
+<ol>
+ <li>
+  <p>If <var>encoder</var>'s <a for=TextEncoderStream>leading surrogate</a> is non-null, then:
+
+  <ol>
+   <li><p>Let <var>leadingSurrogate</var> be <var>encoder</var>'s
+   <a for=TextEncoderStream>leading surrogate</a>.
+
+   <li><p>Set <var>encoder</var>'s <a for=TextEncoderStream>leading surrogate</a> to null.
+
+   <li><p>If <var>item</var> is a <a for=/>trailing surrogate</a>, then return a
+   <a>scalar value from surrogates</a> given <var>leadingSurrogate</var> and <var>item</var>.
+
+   <li><p><a>Restore</a> <var>item</var> to <var>input</var>.
+
+   <li><p>Return U+FFFD.
+  </ol>
+
+ <li><p>If <var>item</var> is a <a for=/>leading surrogate</a>, then set <var>encoder</var>'s
+ <a for=TextEncoderStream>leading surrogate</a> to <var>item</var> and return <a>continue</a>.
+
+ <li><p>If <var>item</var> is a <a for=/>trailing surrogate</a>, then return U+FFFD.
+
+ <li><p>Return <var>item</var>.
+</ol>
+
+<p class=note>This is equivalent to the "<a for=string>convert</a> a <a for=/>string</a> into a
+<a for=/>scalar value string</a>" algorithm from the Infra Standard, but allows for surrogate pairs
+that are split between strings. [[!INFRA]]
+
+<p>The <dfn>encode and flush</dfn> algorithm, given a {{TextEncoderStream}} object
+<var>encoder</var>, runs these steps:
+
+<ol>
+ <li>
+  <p>If <var>encoder</var>'s <a for=TextEncoderStream>leading surrogate</a> is non-null, then:
+
+  <ol>
+   <li>
+    <p>Let <var>chunk</var> be a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing
+    0xEF 0xBF 0xBD.
+
+    <p class=note>This is U+FFFD (�) in <a>UTF-8</a> bytes.
+
+   <li><p><a for=TransformStream>Enqueue</a> <var>chunk</var> into <var>encoder</var>'s
+   <a for=GenericTransformStream>transform</a>.
+  </ol>
+</ol>
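+
+<div class=example>
+ <p>The surrogate handling in <a>convert code unit to scalar value</a> and
+ <a>encode and flush</a> can be sketched as a small state holder. The class
+ <code>SurrogateJoiner</code> and its method names are hypothetical; the sketch returns scalar
+ values rather than bytes to keep the focus on the pending
+ <a for=TextEncoderStream>leading surrogate</a>.

```javascript
// Hypothetical model; not part of any API.
class SurrogateJoiner {
  constructor() {
    this.leadingSurrogate = null; // pending leading surrogate, if any
  }

  // Converts one string chunk into an array of scalar values.
  push(chunk) {
    const scalars = [];
    for (let i = 0; i < chunk.length; i++) {
      const unit = chunk.charCodeAt(i);
      const isLeading = unit >= 0xD800 && unit <= 0xDBFF;
      const isTrailing = unit >= 0xDC00 && unit <= 0xDFFF;
      if (this.leadingSurrogate !== null) {
        const lead = this.leadingSurrogate;
        this.leadingSurrogate = null;
        if (isTrailing) {
          scalars.push(0x10000 + ((lead - 0xD800) << 10) + (unit - 0xDC00));
          continue;
        }
        // Unpaired leading surrogate: emit U+FFFD, then handle this unit
        // again below (the "restore" step of the algorithm).
        scalars.push(0xFFFD);
      }
      if (isLeading) {
        this.leadingSurrogate = unit;
      } else if (isTrailing) {
        scalars.push(0xFFFD);
      } else {
        scalars.push(unit);
      }
    }
    return scalars;
  }

  // End of input: a dangling leading surrogate becomes U+FFFD.
  flush() {
    if (this.leadingSurrogate === null) return [];
    this.leadingSurrogate = null;
    return [0xFFFD];
  }
}

const joiner = new SurrogateJoiner();
const firstChunk = joiner.push("\uD83D");    // leading surrogate only
const secondChunk = joiner.push("\uDE00!");  // trailing surrogate, then "!"
console.log(firstChunk, secondChunk);
```

+ <p>The pair split across the two chunks is reassembled into U+1F600, which a
+ {{USVString}} conversion applied to each chunk separately could not do.
+</div>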
+
+
+
+<h2 id=the-encoding>The encoding</h2>
+
+<h3 id=utf-8 dfn export>UTF-8</h3>
+
+<h4 id=utf-8-decoder dfn export>UTF-8 decoder</h4>
+
+<p class=note>A byte order mark has priority over a label as it has been found to be more accurate
+in deployed content. Therefore it is not part of the <a>UTF-8 decoder</a> algorithm, but rather the
+<a>decode</a> and <a>UTF-8 decode</a> algorithms.
+
+<p><a>UTF-8</a>'s <a for=/>decoder</a> has an associated
+<dfn>UTF-8 code point</dfn>, <dfn>UTF-8 bytes seen</dfn>, and
+<dfn>UTF-8 bytes needed</dfn> (all initially 0), a <dfn>UTF-8 lower boundary</dfn>
+(initially 0x80), and a <dfn>UTF-8 upper boundary</dfn> (initially 0xBF).
+
+<p><a>UTF-8</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>UTF-8 bytes needed</a> is not 0, set
+ <a>UTF-8 bytes needed</a> to 0 and return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li>
+  <p>If <a>UTF-8 bytes needed</a> is 0, based on <var>byte</var>:
+
+  <dl class=switch>
+   <dt>0x00 to 0x7F
+   <dd><p>Return a code point whose value is <var>byte</var>.
+
+   <dt>0xC2 to 0xDF
+   <dd>
+    <ol>
+     <li><p>Set <a>UTF-8 bytes needed</a> to 1.
+
+     <li>
+      <p>Set <a>UTF-8 code point</a> to <var>byte</var> &amp; 0x1F.
+
+      <p class=note>The five least significant bits of <var>byte</var>.
+    </ol>
+
+   <dt>0xE0 to 0xEF
+   <dd>
+    <ol>
+     <li><p>If <var>byte</var> is 0xE0, set
+     <a>UTF-8 lower boundary</a> to 0xA0.
+
+     <li><p>If <var>byte</var> is 0xED, set
+     <a>UTF-8 upper boundary</a> to 0x9F.
+
+     <li><p>Set <a>UTF-8 bytes needed</a> to 2.
+
+     <li>
+      <p>Set <a>UTF-8 code point</a> to <var>byte</var> &amp; 0xF.
+
+      <p class=note>The four least significant bits of <var>byte</var>.
+    </ol>
+
+   <dt>0xF0 to 0xF4
+   <dd>
+    <ol>
+     <li><p>If <var>byte</var> is 0xF0, set
+     <a>UTF-8 lower boundary</a> to 0x90.
+
+     <li><p>If <var>byte</var> is 0xF4, set
+     <a>UTF-8 upper boundary</a> to 0x8F.
+
+     <li><p>Set <a>UTF-8 bytes needed</a> to 3.
+
+     <li>
+      <p>Set <a>UTF-8 code point</a> to <var>byte</var> &amp; 0x7.
+
+      <p class=note>The three least significant bits of <var>byte</var>.
+    </ol>
+
+   <dt>Otherwise
+   <dd><p>Return <a>error</a>.
+  </dl>
+
+  <p>Return <a>continue</a>.
+
+ <li>
+  <p>If <var>byte</var> is not in the range <a>UTF-8 lower boundary</a> to
+  <a>UTF-8 upper boundary</a>, inclusive, then:
+
+  <ol>
+   <li><p>Set <a>UTF-8 code point</a>,
+   <a>UTF-8 bytes needed</a>, and <a>UTF-8 bytes seen</a> to 0,
+   set <a>UTF-8 lower boundary</a> to 0x80, and set
+   <a>UTF-8 upper boundary</a> to 0xBF.
+
+   <li><p><a>Restore</a> <var>byte</var> to <var>ioQueue</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>Set <a>UTF-8 lower boundary</a> to 0x80 and
+ <a>UTF-8 upper boundary</a> to 0xBF.
+
+ <li>
+  <p>Set <a>UTF-8 code point</a> to (<a>UTF-8 code point</a> &lt;&lt; 6) |
+  (<var>byte</var> &amp; 0x3F).
+
+  <p class="note no-backref">Shift the existing bits of <a>UTF-8 code point</a> left by six
+  places and set the newly-vacated six least significant bits to the six least significant bits of
+  <var>byte</var>.
+
+ <li><p>Increase <a>UTF-8 bytes seen</a> by one.
+
+ <li><p>If <a>UTF-8 bytes seen</a> is not equal to
+ <a>UTF-8 bytes needed</a>, return <a>continue</a>.
+
+ <li><p>Let <var>code point</var> be <a>UTF-8 code point</a>.
+
+ <li><p>Set <a>UTF-8 code point</a>,
+ <a>UTF-8 bytes needed</a>, and <a>UTF-8 bytes seen</a> to 0.
+
+ <li><p>Return a code point whose value is <var>code point</var>.
+</ol>
+
+<p class=note>The constraints in the <a>UTF-8 decoder</a> above match
+“Best Practices for Using U+FFFD” from the Unicode standard. No other
+behavior is permitted per the Encoding Standard (other algorithms that
+achieve the same result are fine, even encouraged).
+[[!UNICODE]]
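+
+<div class=example>
+ <p>The handler above can be sketched as a loop over a complete byte array. The function name
+ <code>utf8Decode</code> is hypothetical, and the sketch assumes the
+ "<code>replacement</code>" error mode by emitting U+FFFD for every <a>error</a>; it is one
+ possible rendering of the steps, not a normative implementation.

```javascript
// Hypothetical sketch: decodes an array of bytes into an array of code points.
function utf8Decode(bytes) {
  let codePoint = 0, bytesSeen = 0, bytesNeeded = 0;
  let lower = 0x80, upper = 0xBF;
  const output = [];
  for (let i = 0; i < bytes.length; i++) {
    const byte = bytes[i];
    if (bytesNeeded === 0) {
      if (byte <= 0x7F) {
        output.push(byte);
      } else if (byte >= 0xC2 && byte <= 0xDF) {
        bytesNeeded = 1;
        codePoint = byte & 0x1F;
      } else if (byte >= 0xE0 && byte <= 0xEF) {
        if (byte === 0xE0) lower = 0xA0; // reject overlong forms
        if (byte === 0xED) upper = 0x9F; // reject surrogates
        bytesNeeded = 2;
        codePoint = byte & 0xF;
      } else if (byte >= 0xF0 && byte <= 0xF4) {
        if (byte === 0xF0) lower = 0x90; // reject overlong forms
        if (byte === 0xF4) upper = 0x8F; // reject values above U+10FFFF
        bytesNeeded = 3;
        codePoint = byte & 0x7;
      } else {
        output.push(0xFFFD);
      }
      continue;
    }
    if (byte < lower || byte > upper) {
      codePoint = bytesNeeded = bytesSeen = 0;
      lower = 0x80;
      upper = 0xBF;
      i--; // restore the byte so that it is processed again
      output.push(0xFFFD);
      continue;
    }
    lower = 0x80;
    upper = 0xBF;
    codePoint = (codePoint << 6) | (byte & 0x3F);
    bytesSeen++;
    if (bytesSeen !== bytesNeeded) continue;
    output.push(codePoint);
    codePoint = bytesNeeded = bytesSeen = 0;
  }
  // Reaching the end of the input with an incomplete sequence is an error.
  if (bytesNeeded !== 0) output.push(0xFFFD);
  return output;
}

console.log(utf8Decode([0xE2, 0x82, 0xAC]).map((cp) => cp.toString(16)));
```

+ <p>Note how an out-of-range continuation byte produces one U+FFFD for the truncated sequence and
+ is then processed again as the possible start of a new sequence.
+</div>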
+
+
+<h4 id=utf-8-encoder dfn export>UTF-8 encoder</h4>
+
+<p><a>UTF-8</a>'s <a for=/>encoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li>
+  <p>Set <var>count</var> and <var>offset</var> based on the
+  range <var>code point</var> is in:
+
+  <dl class=switch>
+   <dt>U+0080 to U+07FF, inclusive
+   <dd>1 and 0xC0
+   <dt>U+0800 to U+FFFF, inclusive
+   <dd>2 and 0xE0
+   <dt>U+10000 to U+10FFFF, inclusive
+   <dd>3 and 0xF0
+  </dl>
+
+ <li><p>Let <var>bytes</var> be a byte sequence whose first byte is
+ (<var>code point</var> >> (6 × <var>count</var>)) + <var>offset</var>.
+
+ <li>
+  <p>While <var>count</var> is greater than 0:
+
+  <ol>
+   <li><p>Set <var>temp</var> to
+   <var>code point</var> >> (6 × (<var>count</var> &minus; 1)).
+
+   <li><p>Append to <var>bytes</var> 0x80 | (<var>temp</var> &amp; 0x3F).
+
+   <li><p>Decrease <var>count</var> by one.
+  </ol>
+
+ <li><p>Return bytes <var>bytes</var>, in order.
+</ol>
+
+<p class=note>This algorithm has identical results to the one described in the Unicode standard. It
+is included here for completeness. [[!UNICODE]]
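+
+<div class=example>
+ <p>A minimal sketch of the encoder steps above for a single scalar value. The function name
+ <code>utf8EncodeScalar</code> is hypothetical, and the input is assumed to already be a scalar
+ value (no surrogates), as is the case for the callers in this standard.

```javascript
// Hypothetical sketch: returns the UTF-8 bytes for one scalar value.
function utf8EncodeScalar(codePoint) {
  // ASCII code points map to a single byte of the same value.
  if (codePoint <= 0x7F) return [codePoint];

  // count is the number of continuation bytes; offset marks the lead byte.
  let count, offset;
  if (codePoint <= 0x7FF) {
    count = 1; offset = 0xC0;
  } else if (codePoint <= 0xFFFF) {
    count = 2; offset = 0xE0;
  } else {
    count = 3; offset = 0xF0;
  }

  const bytes = [(codePoint >> (6 * count)) + offset];
  while (count > 0) {
    // Take the next six bits, most significant group first.
    const temp = codePoint >> (6 * (count - 1));
    bytes.push(0x80 | (temp & 0x3F));
    count--;
  }
  return bytes;
}

console.log(utf8EncodeScalar(0x20AC).map((b) => b.toString(16)));
```

+ <p>For U+20AC the lead byte is (0x20AC >> 12) + 0xE0 = 0xE2, followed by the continuation
+ bytes 0x82 and 0xAC.
+</div>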
+
+
+
+<h2 id=legacy-single-byte-encodings>Legacy single-byte encodings</h2>
+
+<p>An <a for=/>encoding</a> where each byte is either a single code point or
+nothing is a <dfn>single-byte encoding</dfn>.
+<a>Single-byte encodings</a> share the
+<a for=/>decoder</a> and <a for=/>encoder</a>. <dfn>Index single-byte</dfn>,
+as referenced by the <a>single-byte decoder</a> and
+<a>single-byte encoder</a>, is defined by the following table, and
+depends on the <a>single-byte encoding</a> in use. All but two
+<a>single-byte encodings</a> have a
+unique <a>index</a>.
+
+<table>
+ <tr><td><dfn export>IBM866</dfn><td><a href=index-ibm866.txt>index-ibm866.txt</a><td><a href=ibm866.html>index IBM866 visualization</a><td><a href=ibm866-bmp.html>index IBM866 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-2</dfn><td><a href=index-iso-8859-2.txt>index-iso-8859-2.txt</a><td><a href=iso-8859-2.html>index ISO-8859-2 visualization</a><td><a href=iso-8859-2-bmp.html>index ISO-8859-2 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-3</dfn><td><a href=index-iso-8859-3.txt>index-iso-8859-3.txt</a><td><a href=iso-8859-3.html>index ISO-8859-3 visualization</a><td><a href=iso-8859-3-bmp.html>index ISO-8859-3 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-4</dfn><td><a href=index-iso-8859-4.txt>index-iso-8859-4.txt</a><td><a href=iso-8859-4.html>index ISO-8859-4 visualization</a><td><a href=iso-8859-4-bmp.html>index ISO-8859-4 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-5</dfn><td><a href=index-iso-8859-5.txt>index-iso-8859-5.txt</a><td><a href=iso-8859-5.html>index ISO-8859-5 visualization</a><td><a href=iso-8859-5-bmp.html>index ISO-8859-5 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-6</dfn><td><a href=index-iso-8859-6.txt>index-iso-8859-6.txt</a><td><a href=iso-8859-6.html>index ISO-8859-6 visualization</a><td><a href=iso-8859-6-bmp.html>index ISO-8859-6 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-7</dfn><td><a href=index-iso-8859-7.txt>index-iso-8859-7.txt</a><td><a href=iso-8859-7.html>index ISO-8859-7 visualization</a><td><a href=iso-8859-7-bmp.html>index ISO-8859-7 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-8</dfn><td rowspan=2><a href=index-iso-8859-8.txt>index-iso-8859-8.txt</a><td rowspan=2><a href=iso-8859-8.html>index ISO-8859-8 visualization</a><td rowspan=2><a href=iso-8859-8-bmp.html>index ISO-8859-8 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-8-I</dfn>
+ <tr><td><dfn export>ISO-8859-10</dfn><td><a href=index-iso-8859-10.txt>index-iso-8859-10.txt</a><td><a href=iso-8859-10.html>index ISO-8859-10 visualization</a><td><a href=iso-8859-10-bmp.html>index ISO-8859-10 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-13</dfn><td><a href=index-iso-8859-13.txt>index-iso-8859-13.txt</a><td><a href=iso-8859-13.html>index ISO-8859-13 visualization</a><td><a href=iso-8859-13-bmp.html>index ISO-8859-13 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-14</dfn><td><a href=index-iso-8859-14.txt>index-iso-8859-14.txt</a><td><a href=iso-8859-14.html>index ISO-8859-14 visualization</a><td><a href=iso-8859-14-bmp.html>index ISO-8859-14 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-15</dfn><td><a href=index-iso-8859-15.txt>index-iso-8859-15.txt</a><td><a href=iso-8859-15.html>index ISO-8859-15 visualization</a><td><a href=iso-8859-15-bmp.html>index ISO-8859-15 BMP coverage</a>
+ <tr><td><dfn export>ISO-8859-16</dfn><td><a href=index-iso-8859-16.txt>index-iso-8859-16.txt</a><td><a href=iso-8859-16.html>index ISO-8859-16 visualization</a><td><a href=iso-8859-16-bmp.html>index ISO-8859-16 BMP coverage</a>
+ <tr><td><dfn export>KOI8-R</dfn><td><a href=index-koi8-r.txt>index-koi8-r.txt</a><td><a href=koi8-r.html>index KOI8-R visualization</a><td><a href=koi8-r-bmp.html>index KOI8-R BMP coverage</a>
+ <tr><td><dfn export>KOI8-U</dfn><td><a href=index-koi8-u.txt>index-koi8-u.txt</a><td><a href=koi8-u.html>index KOI8-U visualization</a><td><a href=koi8-u-bmp.html>index KOI8-U BMP coverage</a>
+ <tr><td><dfn export>macintosh</dfn><td><a href=index-macintosh.txt>index-macintosh.txt</a><td><a href=macintosh.html>index macintosh visualization</a><td><a href=macintosh-bmp.html>index macintosh BMP coverage</a>
+ <tr><td><dfn export>windows-874</dfn><td><a href=index-windows-874.txt>index-windows-874.txt</a><td><a href=windows-874.html>index windows-874 visualization</a><td><a href=windows-874-bmp.html>index windows-874 BMP coverage</a>
+ <tr><td><dfn export>windows-1250</dfn><td><a href=index-windows-1250.txt>index-windows-1250.txt</a><td><a href=windows-1250.html>index windows-1250 visualization</a><td><a href=windows-1250-bmp.html>index windows-1250 BMP coverage</a>
+ <tr><td><dfn export>windows-1251</dfn><td><a href=index-windows-1251.txt>index-windows-1251.txt</a><td><a href=windows-1251.html>index windows-1251 visualization</a><td><a href=windows-1251-bmp.html>index windows-1251 BMP coverage</a>
+ <tr><td><dfn export>windows-1252</dfn><td><a href=index-windows-1252.txt>index-windows-1252.txt</a><td><a href=windows-1252.html>index windows-1252 visualization</a><td><a href=windows-1252-bmp.html>index windows-1252 BMP coverage</a>
+ <tr><td><dfn export>windows-1253</dfn><td><a href=index-windows-1253.txt>index-windows-1253.txt</a><td><a href=windows-1253.html>index windows-1253 visualization</a><td><a href=windows-1253-bmp.html>index windows-1253 BMP coverage</a>
+ <tr><td><dfn export>windows-1254</dfn><td><a href=index-windows-1254.txt>index-windows-1254.txt</a><td><a href=windows-1254.html>index windows-1254 visualization</a><td><a href=windows-1254-bmp.html>index windows-1254 BMP coverage</a>
+ <tr><td><dfn export>windows-1255</dfn><td><a href=index-windows-1255.txt>index-windows-1255.txt</a><td><a href=windows-1255.html>index windows-1255 visualization</a><td><a href=windows-1255-bmp.html>index windows-1255 BMP coverage</a>
+ <tr><td><dfn export>windows-1256</dfn><td><a href=index-windows-1256.txt>index-windows-1256.txt</a><td><a href=windows-1256.html>index windows-1256 visualization</a><td><a href=windows-1256-bmp.html>index windows-1256 BMP coverage</a>
+ <tr><td><dfn export>windows-1257</dfn><td><a href=index-windows-1257.txt>index-windows-1257.txt</a><td><a href=windows-1257.html>index windows-1257 visualization</a><td><a href=windows-1257-bmp.html>index windows-1257 BMP coverage</a>
+ <tr><td><dfn export>windows-1258</dfn><td><a href=index-windows-1258.txt>index-windows-1258.txt</a><td><a href=windows-1258.html>index windows-1258 visualization</a><td><a href=windows-1258-bmp.html>index windows-1258 BMP coverage</a>
+ <tr><td><dfn export>x-mac-cyrillic</dfn><td><a href=index-x-mac-cyrillic.txt>index-x-mac-cyrillic.txt</a><td><a href=x-mac-cyrillic.html>index x-mac-cyrillic visualization</a><td><a href=x-mac-cyrillic-bmp.html>index x-mac-cyrillic BMP coverage</a>
+ </table>
+
+<p class=note><a>ISO-8859-8</a> and <a>ISO-8859-8-I</a> are
+distinct <a for=/>encoding</a> <a for=encoding>names</a>, because
+<a>ISO-8859-8</a> influences the layout direction. Although
+historically this might also have been the case for <a>ISO-8859-6</a> and
+"ISO-8859-6-I", that is no longer true.
+<!-- https://www.w3.org/Bugs/Public/show_bug.cgi?id=19505 -->
+
+<h3 id=single-byte-decoder dfn export>single-byte decoder</h3>
+
+<p><a>Single-byte encodings</a>'s
+<a for=/>decoder</a>'s <a>handler</a>, given <var>ioQueue</var> and
+<var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return a code point whose value
+ is <var>byte</var>.
+
+ <li><p>Let <var>code point</var> be the <a>index code point</a>
+ for <var>byte</var> &minus; 0x80 in <a>index single-byte</a>.
+
+ <li><p>If <var>code point</var> is null, return <a>error</a>.
+
+ <li><p>Return a code point whose value is <var>code point</var>.
+</ol>
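+
+<p class=example>The steps above can be sketched in Python as follows. This is a non-normative
+illustration: <code>TOY_INDEX</code> and <code>decode_single_byte</code> are hypothetical names,
+the index is a made-up three-entry stand-in for a real 128-entry <a>index single-byte</a>, and
+<code>None</code> models both a null index entry and <a>error</a>.

```python
from typing import Optional, Sequence

# Hypothetical three-entry index for illustration only. A real index
# single-byte has 128 entries, loaded from one of the index-*.txt files.
TOY_INDEX: Sequence[Optional[int]] = (0x20AC, None, 0x201A)


def decode_single_byte(byte: int, index: Sequence[Optional[int]]) -> Optional[int]:
    """Return the code point for one byte, or None where the handler returns error."""
    if byte <= 0x7F:
        return byte  # ASCII bytes map to the same code point
    pointer = byte - 0x80
    if pointer >= len(index):
        return None  # only reachable because the toy index is truncated
    return index[pointer]  # a null (None) entry means error
```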
+
+<h3 id=single-byte-encoder export dfn>single-byte encoder</h3>
+
+<p><a>Single-byte encodings</a>'s
+<a for=/>encoder</a>'s <a>handler</a>, given <var>ioQueue</var> and
+<var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>Let <var>pointer</var> be the <a>index pointer</a> for
+ <var>code point</var> in <a>index single-byte</a>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Return a byte whose value is <var>pointer</var> + 0x80.
+</ol>
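+
+<p class=example>A minimal, non-normative Python sketch of the encoder steps above.
+<code>TOY_INDEX</code> and <code>encode_single_byte</code> are hypothetical names, the index is a
+made-up three-entry stand-in for a real <a>index single-byte</a>, and the sketch assumes the
+<a>index pointer</a> is the first pointer whose entry equals the code point. <code>None</code>
+models <a>error</a>.

```python
from typing import Optional, Sequence

# Hypothetical three-entry index for illustration only.
TOY_INDEX: Sequence[Optional[int]] = (0x20AC, None, 0x201A)


def encode_single_byte(code_point: int, index: Sequence[Optional[int]]) -> Optional[int]:
    """Return the byte for one code point, or None where the handler returns error."""
    if code_point <= 0x7F:
        return code_point  # ASCII code points map to the same byte
    for pointer, entry in enumerate(index):
        if entry == code_point:
            return pointer + 0x80  # first matching pointer wins
    return None  # no pointer found: error with code point
```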
+
+
+
+<h2 id=legacy-multi-byte-chinese-(simplified)-encodings>Legacy multi-byte Chinese (simplified) encodings</h2>
+
+<h3 id=gbk dfn export>GBK</h3>
+
+<h4 id=gbk-decoder dfn export>GBK decoder</h4>
+
+<p><a>GBK</a>'s <a for=/>decoder</a> is <a>gb18030</a>'s <a for=/>decoder</a>.
+
+
+<h4 id=gbk-encoder dfn export>GBK encoder</h4>
+
+<p><a>GBK</a>'s <a for=/>encoder</a> is <a>gb18030</a>'s <a for=/>encoder</a>
+with its <a>is GBK</a> set to true.
+
+<p class="note no-backref">Not fully aliasing <a>GBK</a> with <a>gb18030</a>
+is a conservative move to decrease the chances of breaking legacy servers and other
+consumers of content generated with <a>GBK</a>'s <a for=/>encoder</a>.
+
+
+<h3 id=gb18030 dfn export>gb18030</h3>
+
+<h4 id=gb18030-decoder dfn export>gb18030 decoder</h4>
+
+<p><a>gb18030</a>'s <a for=/>decoder</a> has an associated <dfn>gb18030 first</dfn>,
+<dfn>gb18030 second</dfn>, and <dfn>gb18030 third</dfn> (all initially 0x00).
+
+<p><a>gb18030</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>gb18030 first</a>, <a>gb18030 second</a>, and <a>gb18030 third</a>
+ are 0x00, return <a>finished</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-queue</a>, and
+ <a>gb18030 first</a>, <a>gb18030 second</a>, or <a>gb18030 third</a>
+ is not 0x00, set <a>gb18030 first</a>, <a>gb18030 second</a>, and
+ <a>gb18030 third</a> to 0x00, and return <a>error</a>.
+
+ <li>
+  <p>If <a>gb18030 third</a> is not 0x00, then:
+
+  <ol>
+   <li>
+    <p>If <var>byte</var> is not in the range 0x30 to 0x39, inclusive, then:
+
+    <ol>
+     <li><p><a>Restore</a> « <a>gb18030 second</a>, <a>gb18030 third</a>, <var>byte</var> » to
+     <var>ioQueue</var>.
+
+     <li><p>Set <a>gb18030 first</a>, <a>gb18030 second</a>, and <a>gb18030 third</a> to 0x00.
+
+     <li><p>Return <a>error</a>.
+    </ol>
+
+   <li><p>Let <var>code point</var> be the <a>index gb18030 ranges code point</a> for
+   ((<a>gb18030 first</a> &minus; 0x81) × (10 × 126 × 10)) +
+   ((<a>gb18030 second</a> &minus; 0x30) × (10 × 126)) +
+   ((<a>gb18030 third</a> &minus; 0x81) × 10) + <var>byte</var> &minus; 0x30.
+
+   <li><p>Set <a>gb18030 first</a>, <a>gb18030 second</a>, and <a>gb18030 third</a> to 0x00.
+
+   <li><p>If <var>code point</var> is null, return <a>error</a>.
+
+   <li><p>Return a code point whose value is <var>code point</var>.
+  </ol>
+
+ <li>
+  <p>If <a>gb18030 second</a> is not 0x00, then:
+
+  <ol>
+   <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+   <a>gb18030 third</a> to <var>byte</var> and return <a>continue</a>.
+
+   <li><p><a>Restore</a> « <a>gb18030 second</a>, <var>byte</var> » to <var>ioQueue</var>, set
+   <a>gb18030 first</a> and <a>gb18030 second</a> to 0x00, and return <a>error</a>.
+  </ol>
+
+ <li>
+  <p>If <a>gb18030 first</a> is not 0x00, then:
+
+  <ol>
+   <li><p>If <var>byte</var> is in the range 0x30 to 0x39, inclusive, set
+   <a>gb18030 second</a> to <var>byte</var> and return <a>continue</a>.
+
+   <li><p>Let <var>lead</var> be <a>gb18030 first</a>, let
+   <var>pointer</var> be null, and set <a>gb18030 first</a> to 0x00.
+
+   <li><p>Let <var>offset</var> be 0x40 if <var>byte</var> is less than 0x7F, otherwise 0x41.
+
+   <li><p>If <var>byte</var> is in the range 0x40 to 0x7E, inclusive, or
+   0x80 to 0xFE, inclusive, set <var>pointer</var> to
+   (<var>lead</var> &minus; 0x81) × 190 + (<var>byte</var> &minus; <var>offset</var>).
+
+   <li><p>Let <var>code point</var> be null if <var>pointer</var> is null, otherwise the
+   <a>index code point</a> for <var>pointer</var> in <a>index gb18030</a>.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>restore</a> <var>byte</var> to
+   <var>ioQueue</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is 0x80, return code point U+20AC.
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+ <a>gb18030 first</a> to <var>byte</var> and return <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
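+
+<p class=example>The two pointer computations used by the steps above can be sketched in Python
+as follows. This is a non-normative illustration that covers only the arithmetic; the function
+names are hypothetical, the index lookups and the state handling are omitted, and
+<code>None</code> stands for a null <var>pointer</var>.

```python
from typing import Optional


def gb18030_two_byte_pointer(lead: int, byte: int) -> Optional[int]:
    """Pointer into index gb18030 for a lead/trail pair, or None if byte is not a valid trail."""
    if not (0x40 <= byte <= 0x7E or 0x80 <= byte <= 0xFE):
        return None
    offset = 0x40 if byte < 0x7F else 0x41  # skips the unused trail value 0x7F
    return (lead - 0x81) * 190 + (byte - offset)


def gb18030_four_byte_pointer(first: int, second: int, third: int, fourth: int) -> int:
    """Pointer passed to the index gb18030 ranges code point lookup."""
    return (
        (first - 0x81) * (10 * 126 * 10)
        + (second - 0x30) * (10 * 126)
        + (third - 0x81) * 10
        + fourth
        - 0x30
    )
```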
+
+
+<h4 id=gb18030-encoder dfn export>gb18030 encoder</h4>
+
+<p><a>gb18030</a>'s <a for=/>encoder</a> has an associated <dfn id=gbk-flag>is GBK</dfn>
+(initially false).
+
+<p><a>gb18030</a>'s <a for=/>encoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li>
+  <p>If <var>code point</var> is U+E5E5, return <a>error</a> with <var>code point</var>.
+
+  <p class=note><a>Index gb18030</a> maps 0xA3 0xA0 to U+3000 rather than U+E5E5 for
+  compatibility with deployed content. Therefore U+E5E5 cannot roundtrip.
+
+ <li><p>If <a>is GBK</a> is true and <var>code point</var> is
+ U+20AC, return byte 0x80.
+
+ <li>
+  <p>If there is a row in the table below whose first column is <var>code point</var>, then return
+  the two bytes on the same row listed in the second column:
+
+  <table>
+   <tr>
+    <th>Code point
+    <th>Bytes
+   <tr>
+    <td>U+E78D
+    <td>0xA6 0xD9
+   <tr>
+    <td>U+E78E
+    <td>0xA6 0xDA
+   <tr>
+    <td>U+E78F
+    <td>0xA6 0xDB
+   <tr>
+    <td>U+E790
+    <td>0xA6 0xDC
+   <tr>
+    <td>U+E791
+    <td>0xA6 0xDD
+   <tr>
+    <td>U+E792
+    <td>0xA6 0xDE
+   <tr>
+    <td>U+E793
+    <td>0xA6 0xDF
+   <tr>
+    <td>U+E794
+    <td>0xA6 0xEC
+   <tr>
+    <td>U+E795
+    <td>0xA6 0xED
+   <tr>
+    <td>U+E796
+    <td>0xA6 0xF3
+   <tr>
+    <td>U+E81E
+    <td>0xFE 0x59
+   <tr>
+    <td>U+E826
+    <td>0xFE 0x61
+   <tr>
+    <td>U+E82B
+    <td>0xFE 0x66
+   <tr>
+    <td>U+E82C
+    <td>0xFE 0x67
+   <tr>
+    <td>U+E832
+    <td>0xFE 0x6D
+   <tr>
+    <td>U+E843
+    <td>0xFE 0x7E
+   <tr>
+    <td>U+E854
+    <td>0xFE 0x90
+   <tr>
+    <td>U+E864
+    <td>0xFE 0xA0
+  </table>
+
+  <p class=note>This asymmetric encoder table preserves compatibility with the GB18030-2005
+  standard. See also the explanation at <a>index gb18030 ranges</a>.
+
+ <li><p>Let <var>pointer</var> be the <a>index pointer</a> for
+ <var>code point</var> in <a>index gb18030</a>.
+
+ <li>
+  <p>If <var>pointer</var> is non-null, then:
+
+  <ol>
+   <li><p>Let <var>lead</var> be <var>pointer</var> / 190 + 0x81.
+
+   <li><p>Let <var>trail</var> be <var>pointer</var> % 190.
+
+   <li><p>Let <var>offset</var> be 0x40 if <var>trail</var> is less than 0x3F,<!--0x7F-0x40-->
+   otherwise 0x41.
+
+   <li><p>Return two bytes whose values are <var>lead</var> and
+   <var>trail</var> + <var>offset</var>.
+  </ol>
+
+ <li><p>If <a>is GBK</a> is true, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Set <var>pointer</var> to the
+ <a>index gb18030 ranges pointer</a> for <var>code point</var>.
+
+ <li><p>Let <var>byte1</var> be <var>pointer</var> / (10 × 126 × 10).
+
+ <li><p>Set <var>pointer</var> to <var>pointer</var> % (10 × 126 × 10).
+
+ <li><p>Let <var>byte2</var> be <var>pointer</var> / (10 × 126).
+
+ <li><p>Set <var>pointer</var> to <var>pointer</var> % (10 × 126).
+
+ <li><p>Let <var>byte3</var> be <var>pointer</var> / 10.
+
+ <li><p>Let <var>byte4</var> be <var>pointer</var> % 10.
+
+ <li><p>Return four bytes whose values are <var>byte1</var> + 0x81,
+ <var>byte2</var> + 0x30, <var>byte3</var> + 0x81,
+ <var>byte4</var> + 0x30.
+</ol>
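+
+<p class=example>A non-normative Python sketch of how the steps above turn a
+<var>pointer</var> into bytes, reading "/" as integer division and "%" as the remainder. The
+function names are hypothetical, and the index lookups that produce <var>pointer</var> are
+omitted.

```python
def gb18030_two_bytes(pointer: int) -> bytes:
    """Bytes for a non-null pointer found in index gb18030."""
    lead = pointer // 190 + 0x81
    trail = pointer % 190
    offset = 0x40 if trail < 0x3F else 0x41  # skips the unused trail value 0x7F
    return bytes([lead, trail + offset])


def gb18030_four_bytes(pointer: int) -> bytes:
    """Bytes for a pointer obtained from the index gb18030 ranges pointer lookup."""
    byte1, pointer = divmod(pointer, 10 * 126 * 10)
    byte2, pointer = divmod(pointer, 10 * 126)
    byte3, byte4 = divmod(pointer, 10)
    return bytes([byte1 + 0x81, byte2 + 0x30, byte3 + 0x81, byte4 + 0x30])
```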
+
+
+
+<h2 id=legacy-multi-byte-chinese-(traditional)-encodings>Legacy multi-byte Chinese (traditional) encodings</h2>
+
+<!--
+ Lead:  0x81 to 0xFE
+ Trail: 0x40 to 0x7E or 0xA1 to 0xFE
+-->
+
+
+<h3 id=big5 dfn export>Big5</h3>
+
+<h4 id=big5-decoder dfn export>Big5 decoder</h4>
+
+<p><a>Big5</a>'s <a for=/>decoder</a> has an associated
+<dfn>Big5 lead</dfn> (initially 0x00).
+
+<p><a>Big5</a>'s <a for=/>decoder</a>'s <a>handler</a>, given <var>ioQueue</var>
+and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and <a>Big5 lead</a>
+ is not 0x00, set <a>Big5 lead</a> to 0x00 and return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and <a>Big5 lead</a>
+ is 0x00, return <a>finished</a>.
+
+ <li>
+  <p>If <a>Big5 lead</a> is not 0x00, let <var>lead</var> be
+  <a>Big5 lead</a>, let <var>pointer</var> be null, set
+  <a>Big5 lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>Let <var>offset</var> be 0x40 if <var>byte</var> is less than 0x7F, otherwise 0x62.
+   <!-- 0x62 = 0xA1-0x7E+1+0x40 -->
+
+   <li><p>If <var>byte</var> is in the range 0x40 to 0x7E, inclusive, or
+   0xA1 to 0xFE, inclusive, set <var>pointer</var> to
+   (<var>lead</var> &minus; 0x81) × 157 + (<var>byte</var> &minus; <var>offset</var>).
+
+   <li>
+    <p>If there is a row in the table below whose first column is
+    <var>pointer</var>, return the <em>two</em> code points listed in
+    its second column (the third column is irrelevant):
+
+    <table>
+     <tbody><tr><th>Pointer<th>Code points<th>Notes<!-- https://www.unicode.org/Public/UNIDATA/NamedSequences.txt -->
+     <tr><td>1133<!-- 0x88 0x62 --><td>U+00CA U+0304<td>Ê̄ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND MACRON)
+     <tr><td>1135<!-- 0x88 0x64 --><td>U+00CA U+030C<td>Ê̌ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND CARON)
+     <tr><td>1164<!-- 0x88 0xA3 --><td>U+00EA U+0304<td>ê̄ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND MACRON)
+     <tr><td>1166<!-- 0x88 0xA5 --><td>U+00EA U+030C<td>ê̌ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND CARON)
+    </table>
+    <!-- we do this to avoid PUA -->
+
+    <p class=note>Since <a lt=index>indexes</a> are limited to
+    single code points, this table is used for these pointers.
+
+   <li><p>Let <var>code point</var> be null if <var>pointer</var> is null, otherwise the
+   <a>index code point</a> for <var>pointer</var> in <a>index Big5</a>.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>restore</a> <var>byte</var> to
+   <var>ioQueue</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+ <a>Big5 lead</a> to <var>byte</var> and return <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
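+
+<p class=example>The pointer computation in the steps above can be sketched in Python as
+follows. This is a non-normative illustration; <code>big5_pointer</code> and
+<code>BIG5_TWO_CODE_POINTS</code> are hypothetical names, the lookup in <a>index Big5</a> is
+omitted, and <code>None</code> stands for a null <var>pointer</var>. The four byte pairs noted in
+the source comments of the table (0x88 0x62, 0x88 0x64, 0x88 0xA3, and 0x88 0xA5) yield the four
+pointers listed in it.

```python
from typing import Optional

# The four pointers that decode to two code points, copied from the table above.
BIG5_TWO_CODE_POINTS = {
    1133: (0x00CA, 0x0304),
    1135: (0x00CA, 0x030C),
    1164: (0x00EA, 0x0304),
    1166: (0x00EA, 0x030C),
}


def big5_pointer(lead: int, byte: int) -> Optional[int]:
    """Pointer for a lead/trail pair, or None if byte is not a valid trail."""
    if not (0x40 <= byte <= 0x7E or 0xA1 <= byte <= 0xFE):
        return None
    offset = 0x40 if byte < 0x7F else 0x62  # closes the gap between 0x7E and 0xA1
    return (lead - 0x81) * 157 + (byte - offset)
```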
+
+
+<h4 id=big5-encoder dfn export>Big5 encoder</h4>
+
+<p><a>Big5</a>'s <a for=/>encoder</a>'s <a>handler</a>, given <var>ioQueue</var>
+and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>Let <var>pointer</var> be the <a>index Big5 pointer</a> for
+ <var>code point</var>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 157 + 0x81.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 157.
+
+ <li><p>Let <var>offset</var> be 0x40 if <var>trail</var> is less than 0x3F,<!--0x7F-0x40-->
+ otherwise 0x62.<!--0xA1-0x3F-->
+
+ <li><p>Return two bytes whose values are <var>lead</var> and
+ <var>trail</var> + <var>offset</var>.
+</ol>
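+
+<p class=example>A non-normative Python sketch of the final steps above, reading "/" as integer
+division. <code>big5_bytes</code> is a hypothetical name, and the
+<a>index Big5 pointer</a> lookup that produces <var>pointer</var> is omitted.

```python
def big5_bytes(pointer: int) -> bytes:
    """Lead and trail bytes for a non-null pointer."""
    lead = pointer // 157 + 0x81
    trail = pointer % 157
    offset = 0x40 if trail < 0x3F else 0x62  # trails 0x40..0x7E, then 0xA1..0xFE
    return bytes([lead, trail + offset])
```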
+
+
+
+<h2 id=legacy-multi-byte-japanese-encodings>Legacy multi-byte Japanese encodings</h2>
+
+<h3 id=euc-jp dfn export>EUC-JP</h3>
+<!-- https://www.iana.org/assignments/charset-reg/CP51932 -->
+
+<h4 id=euc-jp-decoder dfn export>EUC-JP decoder</h4>
+
+<p><a>EUC-JP</a>'s <a for=/>decoder</a> has an associated
+<dfn id=euc-jp-jis0212-flag>EUC-JP jis0212</dfn> (initially false) and
+<dfn>EUC-JP lead</dfn> (initially 0x00).
+
+<p><a>EUC-JP</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>EUC-JP lead</a> is not 0x00, set <a>EUC-JP lead</a> to 0x00, and return
+ <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>EUC-JP lead</a> is 0x00, return <a>finished</a>.
+
+ <li><p>If <a>EUC-JP lead</a> is 0x8E and <var>byte</var> is
+ in the range 0xA1 to 0xDF, inclusive, set <a>EUC-JP lead</a> to 0x00 and return
+ a code point whose value is 0xFF61 &minus; 0xA1 + <var>byte</var>.
+ <!-- Katakana; subtraction is done first to avoid upsetting compilers -->
+
+ <li><p>If <a>EUC-JP lead</a> is 0x8F and <var>byte</var> is in the range
+ 0xA1 to 0xFE, inclusive, set <a>EUC-JP jis0212</a> to true, set
+ <a>EUC-JP lead</a> to <var>byte</var>, and return <a>continue</a>.
+
+ <li>
+  <p>If <a>EUC-JP lead</a> is not 0x00, let <var>lead</var> be <a>EUC-JP lead</a>, set
+  <a>EUC-JP lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>Let <var>code point</var> be null.
+
+   <li><p>If <var>lead</var> and <var>byte</var> are both in the range 0xA1 to 0xFE, inclusive, then
+   set <var>code point</var> to the <a>index code point</a> for
+   (<var>lead</var> &minus; 0xA1) × 94 + <var>byte</var> &minus; 0xA1
+   in <a>index jis0208</a> if <a>EUC-JP jis0212</a> is false and in
+   <a>index jis0212</a> otherwise.
+
+   <li><p>Set <a>EUC-JP jis0212</a> to false.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>restore</a> <var>byte</var> to
+   <var>ioQueue</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is 0x8E, 0x8F, or in the range 0xA1 to
+ 0xFE, inclusive, set <a>EUC-JP lead</a> to <var>byte</var> and return
+ <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
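+
+<p class=example>The arithmetic in the steps above can be sketched in Python as follows. This is
+a non-normative illustration with hypothetical function names; it omits the
+<a>EUC-JP lead</a> and <a>EUC-JP jis0212</a> state and the index lookups, and <code>None</code>
+means that the corresponding step does not apply.

```python
from typing import Optional


def euc_jp_katakana(byte: int) -> Optional[int]:
    """Code point for the byte that follows a 0x8E lead, or None if out of range."""
    if 0xA1 <= byte <= 0xDF:
        return 0xFF61 - 0xA1 + byte
    return None


def euc_jp_pointer(lead: int, byte: int) -> Optional[int]:
    """Pointer into index jis0208 or index jis0212, or None if either byte is out of range."""
    if 0xA1 <= lead <= 0xFE and 0xA1 <= byte <= 0xFE:
        return (lead - 0xA1) * 94 + byte - 0xA1
    return None
```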
+
+
+<h4 id=euc-jp-encoder dfn export>EUC-JP encoder</h4>
+
+<p><a>EUC-JP</a>'s <a for=/>encoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>If <var>code point</var> is U+00A5, return byte 0x5C.
+
+ <li><p>If <var>code point</var> is U+203E, return byte 0x7E.
+
+ <li><p>If <var>code point</var> is in the range U+FF61 to U+FF9F, inclusive, return
+ two bytes whose values are 0x8E and <var>code point</var> &minus; 0xFF61 + 0xA1.
+
+ <li><p>If <var>code point</var> is U+2212, set it to U+FF0D.
+
+ <li>
+  <p>Let <var>pointer</var> be the <a>index pointer</a> for <var>code point</var> in
+  <a>index jis0208</a>.
+
+  <p class=note>If <var>pointer</var> is non-null, it is less than 8836 due to the nature of
+  <a>index jis0208</a> and the <a>index pointer</a> operation.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0xA1.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0xA1.
+
+ <li><p>Return two bytes whose values are <var>lead</var> and
+ <var>trail</var>.
+</ol>
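+
+<p class=example>A non-normative Python sketch of the encoder steps above, reading "/" as
+integer division. The function names are hypothetical. The first function covers the code points
+that are handled before <a>index jis0208</a> is consulted and returns <code>None</code> when a
+lookup is needed; the lookup itself, including the U+2212 substitution, is omitted.

```python
from typing import Optional


def euc_jp_bytes_without_index(code_point: int) -> Optional[bytes]:
    """Bytes for code points handled before the index lookup, else None."""
    if code_point <= 0x7F:
        return bytes([code_point])
    if code_point == 0x00A5:
        return b"\x5c"
    if code_point == 0x203E:
        return b"\x7e"
    if 0xFF61 <= code_point <= 0xFF9F:
        return bytes([0x8E, code_point - 0xFF61 + 0xA1])
    return None


def euc_jp_bytes_for_pointer(pointer: int) -> bytes:
    """Lead and trail bytes for a non-null pointer into index jis0208."""
    return bytes([pointer // 94 + 0xA1, pointer % 94 + 0xA1])
```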
+
+
+<h3 id=iso-2022-jp dfn export>ISO-2022-JP</h3>
+<!--
+ https://tools.ietf.org/html/rfc1468
+ https://tools.ietf.org/html/rfc2237 (ISO-2022-JP-1; not used)
+ "ESC ) I" is from ISO-2022-JP-3 reportedly
+-->
+
+<h4 id=iso-2022-jp-decoder dfn export>ISO-2022-JP decoder</h4>
+
+<p><a>ISO-2022-JP</a>'s <a for=/>decoder</a> has an associated
+<dfn>ISO-2022-JP decoder state</dfn> (initially
+<a lt="ISO-2022-JP decoder ASCII">ASCII</a>),
+<dfn>ISO-2022-JP decoder output state</dfn> (initially
+<a lt="ISO-2022-JP decoder ASCII">ASCII</a>),
+<dfn>ISO-2022-JP lead</dfn> (initially 0x00), and
+<dfn id=iso-2022-jp-output-flag>ISO-2022-JP output</dfn> (initially false).
+
+<p><a>ISO-2022-JP</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps, switching on
+<a>ISO-2022-JP decoder state</a>:
+
+<dl class=switch>
+ <dt><dfn lt="ISO-2022-JP decoder ASCII">ASCII</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x00 to 0x7F, excluding 0x0E, 0x0F, and 0x1B
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return a code point whose
+   value is <var>byte</var>.
+
+   <dt><a>end-of-queue</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder Roman">Roman</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x5C
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return code point U+00A5.
+
+   <dt>0x7E
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return code point U+203E.
+
+   <dt>0x00 to 0x7F, excluding 0x0E, 0x0F, 0x1B, 0x5C, and 0x7E
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return a code point whose
+   value is <var>byte</var>.
+
+   <dt><a>end-of-queue</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder katakana">katakana</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x21 to 0x5F
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return a code point whose
+   value is 0xFF61 &minus; 0x21 + <var>byte</var>.
+   <!-- Katakana; subtraction is done first to avoid upsetting compilers -->
+
+   <dt><a>end-of-queue</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder lead byte">Lead byte</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>continue</a>.
+
+   <dt>0x21 to 0x7E
+   <dd><p>Set <a>ISO-2022-JP output</a> to false,
+   <a>ISO-2022-JP lead</a> to <var>byte</var>,
+   <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder trail byte">trail byte</a>, and return
+   <a>continue</a>.
+
+   <dt><a>end-of-queue</a>
+   <dd><p>Return <a>finished</a>.
+
+   <dt>Otherwise
+   <dd><p>Set <a>ISO-2022-JP output</a> to false and return <a>error</a>.
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder trail byte">Trail byte</dfn>
+ <dd>
+  <p>Based on <var>byte</var>:
+  <dl class=switch>
+   <dt>0x1B
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape start">escape start</a> and return
+   <a>error</a>.
+   <!-- ISO-2022-JP decoder output state is still lead byte -->
+
+   <dt>0x21 to 0x7E
+   <dd>
+    <ol>
+     <li><p>Set <a>ISO-2022-JP decoder state</a> to
+     <a lt="ISO-2022-JP decoder lead byte">lead byte</a>.
+
+     <li><p>Let <var>pointer</var> be
+     (<a>ISO-2022-JP lead</a> &minus; 0x21) × 94 + <var>byte</var> &minus; 0x21.
+
+     <li><p>Let <var>code point</var> be the <a>index code point</a> for
+     <var>pointer</var> in <a>index jis0208</a>.
+
+     <li><p>If <var>code point</var> is null, return <a>error</a>.
+
+     <li><p>Return a code point whose value is <var>code point</var>.
+    </ol>
+
+   <dt><a>end-of-queue</a>
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder lead byte">lead byte</a> and return <a>error</a>.
+
+   <dt>Otherwise
+   <dd><p>Set <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder lead byte">lead byte</a> and return
+   <a>error</a>.
+   <!-- ISO-2022-JP decoder output state is still lead byte -->
+  </dl>
+
+ <dt><dfn lt="ISO-2022-JP decoder escape start">Escape start</dfn>
+ <dd>
+  <ol>
+   <li><p>If <var>byte</var> is either <!--$-->0x24 or <!--(-->0x28, set
+   <a>ISO-2022-JP lead</a> to <var>byte</var>,
+   <a>ISO-2022-JP decoder state</a> to
+   <a lt="ISO-2022-JP decoder escape">escape</a>, and return
+   <a>continue</a>.
+
+   <li><p>If <var>byte</var> is not <a>end-of-queue</a>, then <a>restore</a>
+   <var>byte</var> to <var>ioQueue</var>.
+
+   <li><p>Set <a>ISO-2022-JP output</a> to false,
+   <a>ISO-2022-JP decoder state</a> to
+   <a>ISO-2022-JP decoder output state</a>, and return <a>error</a>.
+  </ol>
+
+ <dt><dfn lt="ISO-2022-JP decoder escape">Escape</dfn>
+ <dd>
+  <ol>
+   <li><p>Let <var>lead</var> be <a>ISO-2022-JP lead</a> and set
+   <a>ISO-2022-JP lead</a> to 0x00.
+
+   <li><p>Let <var>state</var> be null.
+
+   <li><p>If <var>lead</var> is 0x28 and <var>byte</var> is 0x42<!--B-->, set
+   <var>state</var> to <a lt="ISO-2022-JP decoder ASCII">ASCII</a>.
+
+   <li><p>If <var>lead</var> is 0x28 and <var>byte</var> is 0x4A<!--J-->, set
+   <var>state</var> to <a lt="ISO-2022-JP decoder Roman">Roman</a>.
+
+   <li><p>If <var>lead</var> is 0x28 and <var>byte</var> is 0x49<!--I-->, set
+   <var>state</var> to <a lt="ISO-2022-JP decoder katakana">katakana</a>.
+
+   <li><p>If <var>lead</var> is 0x24 and <var>byte</var> is either
+   0x40<!--@--> or 0x42<!--B-->, set <var>state</var> to
+   <a lt="ISO-2022-JP decoder lead byte">lead byte</a>.
+
+   <li>
+    <p>If <var>state</var> is non-null, then:
+
+    <ol>
+     <li><p>Set <a>ISO-2022-JP decoder state</a> and
+     <a>ISO-2022-JP decoder output state</a> to <var>state</var>.
+
+     <li><p>Let <var>output</var> be the value of <a>ISO-2022-JP output</a>.
+
+     <li><p>Set <a>ISO-2022-JP output</a> to true.
+
+     <li><p>Return <a>continue</a> if <var>output</var> is false; otherwise return
+     <a>error</a>.
+    </ol>
+
+   <li><p>If <var>byte</var> is <a>end-of-queue</a>, then <a>restore</a> <var>lead</var> to
+   <var>ioQueue</var>; otherwise, <a>restore</a> « <var>lead</var>, <var>byte</var> » to
+   <var>ioQueue</var>.
+
+   <li><p>Set <a>ISO-2022-JP output</a> to false,
+   <a>ISO-2022-JP decoder state</a> to <a>ISO-2022-JP decoder output state</a>,
+   and return <a>error</a>.
+  </ol>
+</dl>
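+
+<p class=example>A partial, non-normative Python sketch of the state machine above. It models
+only the ASCII, Roman, escape start, and escape states, and raises
+<code>NotImplementedError</code> for escapes that switch to the katakana or lead byte states,
+which the full decoder supports. The name <code>decode_ascii_roman</code> is hypothetical, a
+<code>deque</code> stands in for <var>ioQueue</var>, <code>None</code> models
+<a>end-of-queue</a>, and every <a>error</a> is assumed to be reported by emitting U+FFFD.

```python
from collections import deque

ESCAPES = {
    (0x28, 0x42): "ASCII",      # ESC ( B
    (0x28, 0x4A): "Roman",      # ESC ( J
    (0x28, 0x49): "katakana",   # ESC ( I
    (0x24, 0x40): "lead byte",  # ESC $ @
    (0x24, 0x42): "lead byte",  # ESC $ B
}


def decode_ascii_roman(data: bytes) -> str:
    queue = deque(data)
    state = output_state = "ASCII"
    lead = 0x00
    output = False  # models ISO-2022-JP output
    result = []
    while True:
        byte = queue.popleft() if queue else None  # None models end-of-queue
        if state in ("ASCII", "Roman"):
            if byte == 0x1B:
                state = "escape start"
            elif byte is None:
                return "".join(result)
            else:
                output = False
                if byte > 0x7F or byte in (0x0E, 0x0F):
                    result.append("\uFFFD")
                elif state == "Roman" and byte == 0x5C:
                    result.append("\u00A5")
                elif state == "Roman" and byte == 0x7E:
                    result.append("\u203E")
                else:
                    result.append(chr(byte))
        elif state == "escape start":
            if byte in (0x24, 0x28):
                lead, state = byte, "escape"
            else:
                if byte is not None:
                    queue.appendleft(byte)  # restore byte
                output, state = False, output_state
                result.append("\uFFFD")
        else:  # state == "escape"
            escape_lead, lead = lead, 0x00
            new_state = ESCAPES.get((escape_lead, byte))
            if new_state in ("katakana", "lead byte"):
                raise NotImplementedError("state not modelled in this sketch")
            if new_state is not None:
                state = output_state = new_state
                had_output, output = output, True
                if had_output:
                    result.append("\uFFFD")  # two escapes with no output in between
            else:
                if byte is not None:
                    queue.appendleft(byte)
                queue.appendleft(escape_lead)  # restored so lead is read before byte
                output, state = False, output_state
                result.append("\uFFFD")
```

+<p class=example>Under these assumptions, decoding 0x1B 0x28 0x4A 0x5C 0x1B 0x28 0x42 twice in
+a row gives U+00A5 U+FFFD U+00A5: the second sequence starts with an escape that immediately
+follows another escape, so <a>ISO-2022-JP output</a> is still true at that point.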
+
+
+<h4 id=iso-2022-jp-encoder dfn export>ISO-2022-JP encoder</h4>
+
+<div class="note no-backref">
+ <p>The <a>ISO-2022-JP encoder</a> is the only <a for=/>encoder</a> for which the concatenation of
+ multiple outputs can result in an <a>error</a> when run through the corresponding
+ <a for=/>decoder</a>.
+
+ <p class=example id=example-iso-2022-jp-encoder-oddity>Encoding U+00A5 gives 0x1B 0x28 0x4A 0x5C
+ 0x1B 0x28 0x42. Doing that twice, concatenating the results, and then decoding yields U+00A5 U+FFFD
+ U+00A5.
+</div>
+
+<p><a>ISO-2022-JP</a>'s <a for=/>encoder</a> has an associated
+<dfn>ISO-2022-JP encoder state</dfn> which is <dfn lt="ISO-2022-JP encoder ASCII">ASCII</dfn>,
+<dfn lt="ISO-2022-JP encoder Roman">Roman</dfn>, or
+<dfn lt="ISO-2022-JP encoder jis0208">jis0208</dfn> (initially
+<a lt="ISO-2022-JP encoder ASCII">ASCII</a>).
+
+<p><a>ISO-2022-JP</a>'s <a for=/>encoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a> and
+ <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, set
+ <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, and return three bytes
+ 0x1B 0x28 0x42.
+
+ <li><p>If <var>code point</var> is <a>end-of-queue</a> and
+ <a>ISO-2022-JP encoder state</a> is
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, return <a>finished</a>.
+
+ <li>
+  <p>If <a>ISO-2022-JP encoder state</a> is
+  <a lt="ISO-2022-JP encoder ASCII">ASCII</a> or
+  <a lt="ISO-2022-JP encoder Roman">Roman</a>, and <var>code point</var> is U+000E, U+000F,
+  or U+001B, return <a>error</a> with U+FFFD.
+
+  <p class=note>This returns U+FFFD rather than <var>code point</var> to prevent attacks.
+  <!-- https://github.com/whatwg/encoding/issues/15 -->
+
+ <li><p>If <a>ISO-2022-JP encoder state</a> is
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a> and <var>code point</var> is an
+ <a>ASCII code point</a>, return a byte whose value is <var>code point</var>.
+
+ <li>
+  <p>If <a>ISO-2022-JP encoder state</a> is <a lt="ISO-2022-JP encoder Roman">Roman</a> and
+  <var>code point</var> is an <a>ASCII code point</a>, excluding U+005C and U+007E, or is U+00A5 or
+  U+203E, then:
+
+  <ol>
+   <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return a byte
+   whose value is <var>code point</var>.
+
+   <li><p>If <var>code point</var> is U+00A5, return byte 0x5C.
+
+   <li><p>If <var>code point</var> is U+203E, return byte 0x7E.
+  </ol>
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, and
+ <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>,
+ <a>restore</a> <var>code point</var> to
+ <var>ioQueue</var>, set <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, and return three bytes
+ 0x1B 0x28 0x42.
+
+ <li><p>If <var>code point</var> is either U+00A5 or U+203E, and
+ <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder Roman">Roman</a>,
+ <a>restore</a> <var>code point</var> to
+ <var>ioQueue</var>, set <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder Roman">Roman</a>, and return three bytes
+ 0x1B 0x28 0x4A.
+
+ <li><p>If <var>code point</var> is U+2212, set it to U+FF0D.
+
+ <li><p>If <var>code point</var> is in the range U+FF61 to U+FF9F, inclusive, set it to the
+ <a>index code point</a> for <var>code point</var> &minus; 0xFF61 in
+ <a>index ISO-2022-JP katakana</a>.
+
+ <li>
+  <p>Let <var>pointer</var> be the <a>index pointer</a> for <var>code point</var> in
+  <a>index jis0208</a>.
+
+  <p class=note>If <var>pointer</var> is non-null, it is less than 8836 due to the nature of
+  <a>index jis0208</a> and the <a>index pointer</a> operation.
+
+ <li>
+  <p>If <var>pointer</var> is null, then:
+
+  <ol>
+   <li><p>If <a>ISO-2022-JP encoder state</a> is <a lt="ISO-2022-JP encoder jis0208">jis0208</a>,
+   then <a>restore</a> <var>code point</var> to <var>ioQueue</var>, set
+   <a>ISO-2022-JP encoder state</a> to <a lt="ISO-2022-JP encoder ASCII">ASCII</a>, and return three
+   bytes 0x1B 0x28 0x42.
+
+   <li><p>Return <a>error</a> with <var>code point</var>.
+  </ol>
+
+ <li><p>If <a>ISO-2022-JP encoder state</a> is not
+ <a lt="ISO-2022-JP encoder jis0208">jis0208</a>,
+ <a>restore</a> <var>code point</var> to
+ <var>ioQueue</var>, set <a>ISO-2022-JP encoder state</a> to
+ <a lt="ISO-2022-JP encoder jis0208">jis0208</a>, and return three bytes
+ 0x1B 0x24 0x42.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 94 + 0x21.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 94 + 0x21.
+
+ <li><p>Return two bytes whose values are <var>lead</var> and
+ <var>trail</var>.
+</ol>
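+
+<p class=example>A partial, non-normative Python sketch of the steps above, restricted to the
+ASCII and Roman states. The name <code>encode_ascii_roman</code> is hypothetical, code points
+that would require <a>index jis0208</a> raise <code>NotImplementedError</code>, and
+<a>error</a> is modelled by raising <code>ValueError</code>. The "restore and return an escape
+sequence" steps are folded into emitting the escape and then the byte.

```python
def encode_ascii_roman(text: str) -> bytes:
    state = "ASCII"  # models ISO-2022-JP encoder state
    out = bytearray()
    for char in text:
        code_point = ord(char)
        if code_point in (0x0E, 0x0F, 0x1B):
            raise ValueError("error with U+FFFD")
        if code_point <= 0x7F:
            if state == "Roman" and code_point not in (0x5C, 0x7E):
                out.append(code_point)  # shared between ASCII and Roman
                continue
            if state != "ASCII":
                out += b"\x1b(B"  # 0x1B 0x28 0x42 switches to ASCII
                state = "ASCII"
            out.append(code_point)
        elif code_point in (0x00A5, 0x203E):
            if state != "Roman":
                out += b"\x1b(J"  # 0x1B 0x28 0x4A switches to Roman
                state = "Roman"
            out.append(0x5C if code_point == 0x00A5 else 0x7E)
        else:
            raise NotImplementedError("index jis0208 is not modelled in this sketch")
    if state != "ASCII":
        out += b"\x1b(B"  # end-of-queue outside ASCII resets the state
    return bytes(out)
```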
+
+
+<h3 id=shift_jis dfn export>Shift_JIS</h3>
+
+<h4 id=shift_jis-decoder dfn export>Shift_JIS decoder</h4>
+
+<p><a>Shift_JIS</a>'s <a for=/>decoder</a> has an associated
+<dfn>Shift_JIS lead</dfn> (initially 0x00).
+
+<p><a>Shift_JIS</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>Shift_JIS lead</a> is not 0x00, set <a>Shift_JIS lead</a> to 0x00 and
+ return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>Shift_JIS lead</a> is 0x00, return <a>finished</a>.
+
+ <li>
+  <p>If <a>Shift_JIS lead</a> is not 0x00, let <var>lead</var> be <a>Shift_JIS lead</a>, let
+  <var>pointer</var> be null, set <a>Shift_JIS lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>Let <var>offset</var> be 0x40 if <var>byte</var> is less than 0x7F, otherwise 0x41.
+
+   <li><p>Let <var>lead offset</var> be 0x81 if <var>lead</var> is less than 0xA0, otherwise 0xC1.
+
+   <li><p>If <var>byte</var> is in the range 0x40 to 0x7E, inclusive, or
+   0x80 to 0xFC, inclusive, set <var>pointer</var> to
+   (<var>lead</var> &minus; <var>lead offset</var>) × 188 + <var>byte</var> &minus; <var>offset</var>.
+
+   <li>
+    <p>If <var>pointer</var> is in the range 8836 to 10715, inclusive, return a code point whose
+    value is 0xE000 &minus; 8836 + <var>pointer</var>.
+    <!-- subtraction is done first to avoid upsetting compilers -->
+
+    <p class=note>This is interoperable legacy from Windows known as EUDC.
+    <!-- PUA -->
+
+   <li><p>Let <var>code point</var> be null if <var>pointer</var> is null, otherwise the
+   <a>index code point</a> for <var>pointer</var> in <a>index jis0208</a>.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>restore</a> <var>byte</var> to
+   <var>ioQueue</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a> or 0x80, return a code point
+ whose value is <var>byte</var>.
+ <!-- Opera has 0x7E -->
+
+ <li><p>If <var>byte</var> is in the range 0xA1 to 0xDF, inclusive, return
+ a code point whose value is 0xFF61 &minus; 0xA1 + <var>byte</var>.
+ <!-- Katakana; subtraction is done first to avoid upsetting compilers -->
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0x9F, inclusive, or 0xE0 to 0xFC,
+ inclusive, set <a>Shift_JIS lead</a> to <var>byte</var> and return
+ <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
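
<p class=note>The arithmetic in the decoder steps above can be sketched as follows. This is a non-normative illustration in Python, not a complete decoder: the function names are hypothetical, the index lookup in <a>index jis0208</a> is left out, and only the pointer computation, the EUDC mapping, and the halfwidth katakana mapping are shown.

```python
from typing import Optional


def shift_jis_pointer(lead: int, byte: int) -> Optional[int]:
    """Compute the pointer for a lead/trail byte pair, or None if the trail is invalid."""
    offset = 0x40 if byte < 0x7F else 0x41
    lead_offset = 0x81 if lead < 0xA0 else 0xC1
    if 0x40 <= byte <= 0x7E or 0x80 <= byte <= 0xFC:
        return (lead - lead_offset) * 188 + byte - offset
    return None


def eudc_code_point(pointer: Optional[int]) -> Optional[int]:
    """Map pointers 8836..10715 to the Private Use Area (the EUDC range)."""
    if pointer is not None and 8836 <= pointer <= 10715:
        return 0xE000 - 8836 + pointer
    return None


def halfwidth_katakana_code_point(byte: int) -> Optional[int]:
    """Map single bytes 0xA1..0xDF to U+FF61..U+FF9F."""
    if 0xA1 <= byte <= 0xDF:
        return 0xFF61 - 0xA1 + byte
    return None


# Bytes 0x82 0xA0 give pointer 283; bytes 0xF0 0x40 are the first EUDC pair.
print(shift_jis_pointer(0x82, 0xA0), hex(eudc_code_point(shift_jis_pointer(0xF0, 0x40))))
```

<p class=note>The two offsets exist because 0x7F is skipped as a trail byte and because lead bytes 0xA0 to 0xDF are not lead bytes at all (0xA1 to 0xDF are the single-byte katakana).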
+
+
+<h4 id=shift_jis-encoder dfn export>Shift_JIS encoder</h4>
+
+<p><a>Shift_JIS</a>'s <a for=/>encoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a> or U+0080, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>If <var>code point</var> is U+00A5, return byte 0x5C.
+
+ <li><p>If <var>code point</var> is U+203E, return byte 0x7E.
+
+ <li><p>If <var>code point</var> is in the range U+FF61 to U+FF9F, inclusive, return
+ a byte whose value is <var>code point</var> &minus; 0xFF61 + 0xA1.
+
+ <li><p>If <var>code point</var> is U+2212, set it to U+FF0D.
+
+ <li><p>Let <var>pointer</var> be the <a>index Shift_JIS pointer</a> for
+ <var>code point</var>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 188.
+
+ <li><p>Let <var>lead offset</var> be 0x81 if <var>lead</var> is less than 0x1F, otherwise 0xC1.
+ <!-- 0xA0-0x81 -->
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 188.
+
+ <li><p>Let <var>offset</var> be 0x40 if <var>trail</var> is less than 0x3F, otherwise 0x41.
+
+ <li><p>Return two bytes whose values are
+ <var>lead</var> + <var>lead offset</var> and
+ <var>trail</var> + <var>offset</var>.
+</ol>
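
<p class=note>The encoder steps above split into a set of single-byte special cases and a pointer-to-bytes computation. A minimal, non-normative sketch in Python follows. The function names are hypothetical, "/" is taken to mean integer division, and the <a>index Shift_JIS pointer</a> lookup is assumed to have happened elsewhere.

```python
from typing import Optional


def shift_jis_single_byte(code_point: int) -> Optional[int]:
    """Return the single byte for a code point, or None if it needs the index."""
    if code_point <= 0x80:            # ASCII code point or U+0080
        return code_point
    if code_point == 0x00A5:          # YEN SIGN
        return 0x5C
    if code_point == 0x203E:          # OVERLINE
        return 0x7E
    if 0xFF61 <= code_point <= 0xFF9F:  # halfwidth katakana
        return code_point - 0xFF61 + 0xA1
    return None


def shift_jis_pointer_to_bytes(pointer: int) -> bytes:
    """Turn a pointer into the lead and trail bytes the encoder returns."""
    lead = pointer // 188
    lead_offset = 0x81 if lead < 0x1F else 0xC1
    trail = pointer % 188
    offset = 0x40 if trail < 0x3F else 0x41
    return bytes([lead + lead_offset, trail + offset])


print(shift_jis_pointer_to_bytes(283).hex())  # lead 0x82, trail 0xA0
```

<p class=note>The thresholds 0x1F and 0x3F are the decoder's 0xA0 and 0x7F with the respective base offsets (0x81 and 0x40) subtracted, so the two directions stay in step.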
+
+
+
+<h2 id=legacy-multi-byte-korean-encodings>Legacy multi-byte Korean encodings</h2>
+
+<h3 id=euc-kr dfn export>EUC-KR</h3>
+
+<h4 id=euc-kr-decoder dfn export>EUC-KR decoder</h4>
+
+<p><a>EUC-KR</a>'s <a for=/>decoder</a> has an associated
+<dfn>EUC-KR lead</dfn> (initially 0x00).
+
+<p><a>EUC-KR</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>EUC-KR lead</a> is not 0x00, set <a>EUC-KR lead</a> to 0x00
+ and return <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>EUC-KR lead</a> is 0x00, return <a>finished</a>.
+
+ <li>
+  <p>If <a>EUC-KR lead</a> is not 0x00, let <var>lead</var> be <a>EUC-KR lead</a>, let
+  <var>pointer</var> be null, set <a>EUC-KR lead</a> to 0x00, and then:
+
+  <ol>
+   <li><p>If <var>byte</var> is in the range 0x41 to 0xFE, inclusive, set
+   <var>pointer</var> to
+   (<var>lead</var> &minus; 0x81) × 190 + (<var>byte</var> &minus; 0x41).
+
+   <li><p>Let <var>code point</var> be null if <var>pointer</var> is null, otherwise the
+   <a>index code point</a> for <var>pointer</var> in <a>index EUC-KR</a>.
+
+   <li><p>If <var>code point</var> is non-null, return a code point whose value is
+   <var>code point</var>.
+
+   <li><p>If <var>byte</var> is an <a>ASCII byte</a>, <a>restore</a> <var>byte</var> to
+   <var>ioQueue</var>.
+
+   <li><p>Return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>If <var>byte</var> is in the range 0x81 to 0xFE, inclusive, set
+ <a>EUC-KR lead</a> to <var>byte</var> and return <a>continue</a>.
+
+ <li><p>Return <a>error</a>.
+</ol>
+
+
+<h4 id=euc-kr-encoder dfn export>EUC-KR encoder</h4>
+
+<p><a>EUC-KR</a>'s <a for=/>encoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>Let <var>pointer</var> be the <a>index pointer</a> for
+ <var>code point</var> in <a>index EUC-KR</a>.
+
+ <li><p>If <var>pointer</var> is null, return <a>error</a> with
+ <var>code point</var>.
+
+ <li><p>Let <var>lead</var> be <var>pointer</var> / 190 + 0x81.
+
+ <li><p>Let <var>trail</var> be <var>pointer</var> % 190 + 0x41.
+
+ <li><p>Return two bytes whose values are <var>lead</var> and <var>trail</var>.
+</ol>
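
<p class=note>The EUC-KR decoder and encoder use the same 190-column layout in opposite directions. The following non-normative Python sketch shows only that arithmetic. The function names are hypothetical, "/" is taken to mean integer division, and lookups in <a>index EUC-KR</a> are left out.

```python
from typing import Optional


def euc_kr_pointer(lead: int, byte: int) -> Optional[int]:
    """Decoder direction: lead/trail bytes to a pointer, or None for a bad trail byte."""
    if 0x41 <= byte <= 0xFE:
        return (lead - 0x81) * 190 + (byte - 0x41)
    return None


def euc_kr_pointer_to_bytes(pointer: int) -> bytes:
    """Encoder direction: pointer to the two bytes that are returned."""
    lead = pointer // 190 + 0x81
    trail = pointer % 190 + 0x41
    return bytes([lead, trail])


# Bytes 0xB0 0xA1 give pointer 9026, and that pointer maps back to the same bytes.
pointer = euc_kr_pointer(0xB0, 0xA1)
print(pointer, euc_kr_pointer_to_bytes(pointer).hex())
```

<p class=note>Unlike Shift_JIS, no byte values are skipped within the trail range, so a single offset per byte is enough.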
+
+
+
+<h2 id=legacy-miscellaneous-encodings>Legacy miscellaneous encodings</h2>
+
+<h3 id=replacement dfn export>replacement</h3>
+
+<p class=note>The <a>replacement</a> <a for=/>encoding</a> exists to prevent certain
+attacks that abuse a mismatch between <a for=/>encodings</a> supported on
+the server and the client.
+
+
+<h4 id=replacement-decoder dfn export>replacement decoder</h4>
+
+<p><a>replacement</a>'s <a for=/>decoder</a> has an associated
+<dfn id=replacement-error-returned-flag>replacement error returned</dfn> (initially false).
+
+<p><a>replacement</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a>, return <a>finished</a>.
+
+ <li><p>If <a>replacement error returned</a> is false, set
+ <a>replacement error returned</a> to true and return <a>error</a>.
+
+ <li><p>Return <a>finished</a>.
+</ol>
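
<p class=note>The effect of these steps is that any non-empty input produces exactly one error and nothing else. A minimal, non-normative Python sketch follows. The class and function names are hypothetical, end-of-queue is modeled as <code>None</code>, and the surrounding loop assumes an error mode that emits U+FFFD for each error.

```python
from typing import Optional

FINISHED, ERROR = "finished", "error"


class ReplacementDecoder:
    def __init__(self) -> None:
        self.error_returned = False  # "replacement error returned"

    def handler(self, byte: Optional[int]) -> str:
        if byte is None:             # end-of-queue
            return FINISHED
        if not self.error_returned:
            self.error_returned = True
            return ERROR
        return FINISHED


def decode_replacement(data: bytes) -> str:
    """Drive the handler, emitting U+FFFD on error and stopping on finished."""
    decoder = ReplacementDecoder()
    output = []
    for item in list(data) + [None]:
        result = decoder.handler(item)
        if result == ERROR:
            output.append("\uFFFD")
        elif result == FINISHED:
            break
    return "".join(output)


print(len(decode_replacement(b"")), len(decode_replacement(b"~{<script>")))
```

<p class=note>Because the second byte already yields <a>finished</a>, the remainder of the input is never interpreted, whatever it contains.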
+
+
+<h3 id=common-infrastructure-for-utf-16be-and-utf-16le>Common infrastructure for <a>UTF-16BE/LE</a></h3>
+
+<p><dfn export>UTF-16BE/LE</dfn> is <a>UTF-16BE</a> or <a>UTF-16LE</a>.
+
+
+<h4 id=shared-utf-16-decoder dfn export>shared UTF-16 decoder</h4>
+
+<p class=note>A byte order mark has priority over a label as it has been found to be more accurate
+in deployed content. Byte order mark handling is therefore not part of the
+<a>shared UTF-16 decoder</a> algorithm, but rather of the <a>decode</a> algorithm.
+
+<p><a>shared UTF-16 decoder</a> has an associated <dfn>UTF-16 lead byte</dfn> and
+<dfn id=utf-16-lead-surrogate>UTF-16 leading surrogate</dfn> (both initially null), and
+<dfn id=utf-16be-decoder-flag>is UTF-16BE decoder</dfn> (initially false).
+
+<p><a>shared UTF-16 decoder</a>'s <a>handler</a>, given <var>ioQueue</var> and
+<var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and either
+ <a>UTF-16 lead byte</a> or <a>UTF-16 leading surrogate</a> is non-null, set
+ <a>UTF-16 lead byte</a> and <a>UTF-16 leading surrogate</a> to null, and return
+ <a>error</a>.
+
+ <li><p>If <var>byte</var> is <a>end-of-queue</a> and
+ <a>UTF-16 lead byte</a> and <a>UTF-16 leading surrogate</a> are null, return
+ <a>finished</a>.
+
+ <li><p>If <a>UTF-16 lead byte</a> is null, set <a>UTF-16 lead byte</a> to
+ <var>byte</var> and return <a>continue</a>.
+
+ <li>
+  <p>Let <var>code unit</var> be the result of:
+
+  <dl class=switch>
+   <dt><a>is UTF-16BE decoder</a> is true
+   <dd><p>(<a>UTF-16 lead byte</a> &lt;&lt; 8) + <var>byte</var>.
+   <dt><a>is UTF-16BE decoder</a> is false
+   <dd><p>(<var>byte</var> &lt;&lt; 8) + <a>UTF-16 lead byte</a>.
+  </dl>
+
+  <p>Then set <a>UTF-16 lead byte</a> to null.
+
+ <li>
+  <p>If <a>UTF-16 leading surrogate</a> is non-null:
+
+  <ol>
+   <li><p>Let <var>leadingSurrogate</var> be <a>UTF-16 leading surrogate</a>.
+
+   <li><p>Set <a>UTF-16 leading surrogate</a> to null.
+
+   <li><p>If <var>code unit</var> is a <a for=/>trailing surrogate</a>, then return a
+   <a>scalar value from surrogates</a> given <var>leadingSurrogate</var> and <var>code unit</var>.
+
+   <li><p>Let <var>byte1</var> be <var>code unit</var> &gt;&gt; 8.
+
+   <li><p>Let <var>byte2</var> be <var>code unit</var> &amp; 0x00FF.
+
+   <li><p>Let <var>bytes</var> be a <a for=/>list</a> of two bytes whose values are <var>byte1</var>
+   and <var>byte2</var>, if <a>is UTF-16BE decoder</a> is true; otherwise <var>byte2</var> and
+   <var>byte1</var>.
+
+   <li><p><a>Restore</a> <var>bytes</var> to <var>ioQueue</var> and return <a>error</a>.
+  </ol>
+
+ <li><p>If <var>code unit</var> is a <a for=/>leading surrogate</a>, then set
+ <a>UTF-16 leading surrogate</a> to <var>code unit</var> and return <a>continue</a>.
+
+ <li><p>If <var>code unit</var> is a <a for=/>trailing surrogate</a>, then return <a>error</a>.
+
+ <li><p>Return code point <var>code unit</var>.
+</ol>
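
<p class=note>The handler above can be sketched as a single loop. This is a non-normative Python illustration under stated assumptions: the function name <code>decode_utf16</code> is hypothetical, every error is replaced by U+FFFD, the I/O queue is modeled with a <code>deque</code>, and byte order mark handling is left out since it belongs to the <a>decode</a> algorithm.

```python
from collections import deque


def decode_utf16(data: bytes, big_endian: bool = False) -> str:
    queue = deque(data)
    lead_byte = None        # "UTF-16 lead byte"
    lead_surrogate = None   # "UTF-16 leading surrogate"
    out = []
    while True:
        if not queue:  # end-of-queue
            if lead_byte is not None or lead_surrogate is not None:
                out.append("\uFFFD")  # truncated code unit or unpaired surrogate
            return "".join(out)
        byte = queue.popleft()
        if lead_byte is None:
            lead_byte = byte
            continue
        if big_endian:
            code_unit = (lead_byte << 8) + byte
        else:
            code_unit = (byte << 8) + lead_byte
        lead_byte = None
        if lead_surrogate is not None:
            leading = lead_surrogate
            lead_surrogate = None
            if 0xDC00 <= code_unit <= 0xDFFF:
                # Scalar value from surrogates.
                out.append(chr(0x10000 + ((leading - 0xD800) << 10) + (code_unit - 0xDC00)))
                continue
            # Not a trailing surrogate: put its bytes back and report the error.
            byte1, byte2 = code_unit >> 8, code_unit & 0x00FF
            restored = (byte1, byte2) if big_endian else (byte2, byte1)
            queue.extendleft(reversed(restored))
            out.append("\uFFFD")
            continue
        if 0xD800 <= code_unit <= 0xDBFF:
            lead_surrogate = code_unit
            continue
        if 0xDC00 <= code_unit <= 0xDFFF:
            out.append("\uFFFD")  # trailing surrogate without a leading one
            continue
        out.append(chr(code_unit))


print(decode_utf16(b"\x3d\xd8\x00\xde") == "\U0001F600")
```

<p class=note>Restoring the two bytes of the non-matching code unit means it is decoded again on its own, so an unpaired leading surrogate costs one U+FFFD without swallowing the code unit after it.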
+
+
+<h3 id=utf-16be dfn export>UTF-16BE</h3>
+
+<h4 id=utf-16be-decoder dfn export>UTF-16BE decoder</h4>
+
+<p><a>UTF-16BE</a>'s <a for=/>decoder</a> is <a>shared UTF-16 decoder</a> with
+its <a>is UTF-16BE decoder</a> set to true.
+
+
+<h3 id=utf-16le dfn export>UTF-16LE</h3>
+
+<p class=note>"<code>utf-16</code>" is a <a for=encoding>label</a> for <a>UTF-16LE</a> to deal with
+deployed content.
+
+
+<h4 id=utf-16le-decoder dfn export>UTF-16LE decoder</h4>
+
+<p><a>UTF-16LE</a>'s <a for=/>decoder</a> is <a>shared UTF-16 decoder</a>.
+
+
+<h3 id=x-user-defined dfn export>x-user-defined</h3>
+
+<p class=note>While technically this is a <a>single-byte encoding</a>,
+it is defined separately as it can be implemented algorithmically.
+
+<!--
+This encoding is silly, however, the web depends on it:
+
+https://krijnhoetmer.nl/irc-logs/whatwg/20121003#l-461
+https://krijnhoetmer.nl/irc-logs/whatwg/20121010#l-812
+
+https://stackoverflow.com/questions/6986789/why-are-some-bytes-prefixed-with-0xf7-when-using-charset-x-user-defined-with-xm
+-->
+
+<h4 id=x-user-defined-decoder dfn export>x-user-defined decoder</h4>
+
+<p><a>x-user-defined</a>'s <a for=/>decoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>byte</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>byte</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>byte</var> is an <a>ASCII byte</a>, return
+ a code point whose value is <var>byte</var>.
+
+ <li><p>Return a code point whose value is 0xF780 + <var>byte</var> &minus; 0x80.
+</ol>
+
+
+<h4 id=x-user-defined-encoder dfn export>x-user-defined encoder</h4>
+
+<p><a>x-user-defined</a>'s <a for=/>encoder</a>'s <a>handler</a>, given
+<var>ioQueue</var> and <var>code point</var>, runs these steps:
+
+<ol>
+ <li><p>If <var>code point</var> is <a>end-of-queue</a>, return
+ <a>finished</a>.
+
+ <li><p>If <var>code point</var> is an <a>ASCII code point</a>, return
+ a byte whose value is <var>code point</var>.
+
+ <li><p>If <var>code point</var> is in the range U+F780 to U+F7FF, inclusive, return
+ a byte whose value is <var>code point</var> &minus; 0xF780 + 0x80.
+
+ <li><p>Return <a>error</a> with <var>code point</var>.
+</ol>
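
<p class=note>Both x-user-defined handlers reduce to a fixed offset between bytes 0x80 to 0xFF and code points U+F780 to U+F7FF. A minimal, non-normative Python sketch follows. The function names are hypothetical, and <code>None</code> stands in for returning <a>error</a>.

```python
from typing import Optional


def x_user_defined_decode(byte: int) -> int:
    """Decoder direction: one byte to one code point."""
    if byte <= 0x7F:                  # ASCII byte
        return byte
    return 0xF780 + byte - 0x80


def x_user_defined_encode(code_point: int) -> Optional[int]:
    """Encoder direction: one code point to one byte, or None for an error."""
    if code_point <= 0x7F:            # ASCII code point
        return code_point
    if 0xF780 <= code_point <= 0xF7FF:
        return code_point - 0xF780 + 0x80
    return None


print(hex(x_user_defined_decode(0xFF)), x_user_defined_encode(0x00E9))
```

<p class=note>Since the mapping is a bijection over all 256 byte values, decoding with this encoding and then taking the low 8 bits of each code point recovers the original bytes.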
+
+
+
+<h2 id=browser-ui>Browser UI</h2>
+
+<p>Browsers are encouraged to not enable overriding the encoding of a resource. If such a feature is
+nonetheless present, browsers should not offer <a>UTF-16BE/LE</a> as an option, due to the
+aforementioned security issues. Browsers should also disable this feature if the resource was
+decoded using <a>UTF-16BE/LE</a>.
+
+
+
+<h2 class=no-num id=implementation-considerations>Implementation considerations</h2>
+
+<p>Instead of supporting <a for=/>I/O queues</a> with arbitrary <a for="I/O queue">restore</a>, the
+<a for=/>decoders</a> for <a for=/>encodings</a> in this standard could be implemented with:
+
+<ol>
+ <li><p>The ability to unread the current byte.
+
+ <li>
+  <p>A single-byte buffer for <a>gb18030</a> (an <a>ASCII byte</a>) and <a>ISO-2022-JP</a> (0x24 or
+  0x28).
+
+  <p class=example id=example-gb18030-implementation-strategy>For <a>gb18030</a> when hitting a
+  bogus byte while <a>gb18030 third</a> is not 0x00, <a>gb18030 second</a> could be moved into the
+  single-byte buffer to be returned next, and <a>gb18030 third</a> would be the new
+  <a>gb18030 first</a>, checked for not being 0x00 after the single-byte buffer was returned and
+  emptied. This is possible as the range for the first and third byte in <a>gb18030</a> is
+  identical.
+</ol>
+
+<p>The <a>ISO-2022-JP encoder</a> needs <a>ISO-2022-JP encoder state</a> as additional state, but
+other than that, none of the <a for=/>encoders</a> for <a for=/>encodings</a> in this standard
+require additional state or buffers.
+
+
+
+<h2 class=no-num id=acknowledgments>Acknowledgments</h2>
+
+<p>Many people have helped make encodings more interoperable over the years and
+thereby furthered the goals of this standard. Likewise, many people have helped
+make this standard what it is today.
+
+<p>With that, many thanks to
+Adam Rice,
+Alan Chaney,
+Alexander Shtuchkin,
+Allen Wirfs-Brock,
+Andreu Botella,
+Aneesh Agrawal,
+Arkadiusz Michalski,
+Asmus Freytag,
+Ben Noordhuis,
+Bnaya Peretz,
+Boris Zbarsky,
+Bruno Haible,
+Cameron McCormack,
+Charles McCathieNeville,
+Christopher Foo,
+CodifierNL, <!-- Codifier on GitHub -->
+David Carlisle,
+Domenic Denicola,
+Dominique Hazaël-Massieux,
+Doug Ewell,
+Erik van der Poel,
+譚永鋒 (Frank Yung-Fong Tang),
+Glenn Maynard,
+Gordon P. Hemsley,
+Henri Sivonen,
+Ian Hickson,
+J. King,
+James Graham,
+Jeffrey Yasskin,
+John Tamplin,
+Joshua Bell,
+村井純 (Jun Murai),
+신정식 (Jungshik Shin),
+Jxck,
+강 성훈 (Kang Seonghoon),<!-- space is intentional: https://www.w3.org/Bugs/Public/show_bug.cgi?id=27675#c2 -->
+川幡太一 (Kawabata Taichi),
+Ken Lunde,
+Ken Whistler,
+Kenneth Russell,
+田村健人 (Kent Tamura),
+Leif Halvard Silli,
+Luke Wagner,
+Maciej Hirsz,
+Makoto Kato,
+Mark Callow,
+Mark Crispin,
+Mark Davis,
+Martin Dürst,
+Masatoshi Kimura,
+Mattias Buelens,
+Ms2ger,
+Nigel Megitt,
+Nigel Tao,
+Norbert Lindenberg,
+Øistein E. Andersen,
+Peter Krefting,
+Philip Jägenstedt,
+Philip Taylor,
+Richard Ishida,
+Robbert Broersma,
+Robert Mustacchi,
+Ryan Dahl,
+Sam Sneddon,
+Shawn Steele,
+Simon Montagu,
+Simon Pieters,
+Simon Sapin,
+Stephen Checkoway,
+寺田健 (Takeshi Terada),
+Vyacheslav Matva,
+Wolf Lammen, and
+成瀬ゆい (Yui Naruse)
+for being awesome.
+
+<p>This standard is written by <a href=https://annevankesteren.nl/ lang=nl>Anne van Kesteren</a>
+(<a href=https://www.apple.com/>Apple</a>, <a href=mailto:annevk@annevk.nl>annevk@annevk.nl</a>).
+The <a href=#api>API</a> chapter was initially written by Joshua Bell
+(<a href=https://www.google.com/>Google</a>).

Name +	Labels +
The Encoding +
UTF-8 +	"`unicode-1-1-utf-8`" +
	"`unicode11utf8`" +
	"`unicode20utf8`" +
	"`utf-8`" +
	"`utf8`" +
	"`x-unicode20utf8`" +
Legacy single-byte encodings +
IBM866 +	"`866`" +
	"`cp866`" +
	"`csibm866`" +
	"`ibm866`" +
ISO-8859-2 +	"`csisolatin2`" +
	"`iso-8859-2`" +
	"`iso-ir-101`" +
	"`iso8859-2`" +
	"`iso88592`" +
	"`iso_8859-2`" +
	"`iso_8859-2:1987`" +
	"`l2`" +
	"`latin2`" +
ISO-8859-3 +	"`csisolatin3`" +
	"`iso-8859-3`" +
	"`iso-ir-109`" +
	"`iso8859-3`" +
	"`iso88593`" +
	"`iso_8859-3`" +
	"`iso_8859-3:1988`" +
	"`l3`" +
	"`latin3`" +
ISO-8859-4 +	"`csisolatin4`" +
	"`iso-8859-4`" +
	"`iso-ir-110`" +
	"`iso8859-4`" +
	"`iso88594`" +
	"`iso_8859-4`" +
	"`iso_8859-4:1988`" +
	"`l4`" +
	"`latin4`" +
ISO-8859-5 +	"`csisolatincyrillic`" +
	"`cyrillic`" +
	"`iso-8859-5`" +
	"`iso-ir-144`" +
	"`iso8859-5`" +
	"`iso88595`" +
	"`iso_8859-5`" +
	"`iso_8859-5:1988`" +
ISO-8859-6 +	"`arabic`" +
	"`asmo-708`" +
	"`csiso88596e`" +
	"`csiso88596i`" +
	"`csisolatinarabic`" +
	"`ecma-114`" +
	"`iso-8859-6`" +
	"`iso-8859-6-e`" +
	"`iso-8859-6-i`" +
	"`iso-ir-127`" +
	"`iso8859-6`" +
	"`iso88596`" +
	"`iso_8859-6`" +
	"`iso_8859-6:1987`" +
ISO-8859-7 +	"`csisolatingreek`" +
	"`ecma-118`" +
	"`elot_928`" +
	"`greek`" +
	"`greek8`" +
	"`iso-8859-7`" +
	"`iso-ir-126`" +
	"`iso8859-7`" +
	"`iso88597`" +
	"`iso_8859-7`" +
	"`iso_8859-7:1987`" +
	"`sun_eu_greek`" +
ISO-8859-8 +	"`csiso88598e`" +
	"`csisolatinhebrew`" +
	"`hebrew`" +
	"`iso-8859-8`" +
	"`iso-8859-8-e`" +
	"`iso-ir-138`" +
	"`iso8859-8`" +
	"`iso88598`" +
	"`iso_8859-8`" +
	"`iso_8859-8:1988`" +
	"`visual`" +
ISO-8859-8-I +	"`csiso88598i`" +
	"`iso-8859-8-i`" +
	"`logical`" +
ISO-8859-10 +	"`csisolatin6`" +
	"`iso-8859-10`" +
	"`iso-ir-157`" +
	"`iso8859-10`" +
	"`iso885910`" +
	"`l6`" +
	"`latin6`" +
ISO-8859-13 +	"`iso-8859-13`" +
	"`iso8859-13`" +
	"`iso885913`" +
ISO-8859-14 +	"`iso-8859-14`" +
	"`iso8859-14`" +
	"`iso885914`" +
ISO-8859-15 +	"`csisolatin9`" +
	"`iso-8859-15`" +
	"`iso8859-15`" +
	"`iso885915`" +
	"`iso_8859-15`" +
	"`l9`" +
ISO-8859-16 +	"`iso-8859-16`" +
KOI8-R +	"`cskoi8r`" +
	"`koi`" +
	"`koi8`" +
	"`koi8-r`" +
	"`koi8_r`" +
KOI8-U +	"`koi8-ru`" +
KOI8-U +	"`koi8-u`" +
macintosh +	"`csmacintosh`" +
	"`mac`" +
	"`macintosh`" +
	"`x-mac-roman`" +
windows-874 +	"`dos-874`" +
	"`iso-8859-11`" +
	"`iso8859-11`" +
	"`iso885911`" +
	"`tis-620`" +
	"`windows-874`" +
windows-1250 +	"`cp1250`" +
	"`windows-1250`" +
	"`x-cp1250`" +
windows-1251 +	"`cp1251`" +
	"`windows-1251`" +
	"`x-cp1251`" +
windows-1252 +	"`ansi_x3.4-1968`" +
	"`ascii`" +
	"`cp1252`" +
	"`cp819`" +
	"`csisolatin1`" +
	"`ibm819`" +
	"`iso-8859-1`" +
	"`iso-ir-100`" +
	"`iso8859-1`" +
	"`iso88591`" +
	"`iso_8859-1`" +
	"`iso_8859-1:1987`" +
	"`l1`" +
	"`latin1`" +
	"`us-ascii`" +
	"`windows-1252`" +
	"`x-cp1252`" +
windows-1253 +	"`cp1253`" +
	"`windows-1253`" +
	"`x-cp1253`" +
windows-1254 +	"`cp1254`" +
	"`csisolatin5`" +
	"`iso-8859-9`" +
	"`iso-ir-148`" +
	"`iso8859-9`" +
	"`iso88599`" +
	"`iso_8859-9`" +
	"`iso_8859-9:1989`" +
	"`l5`" +
	"`latin5`" +
	"`windows-1254`" +
	"`x-cp1254`" +
windows-1255 +	"`cp1255`" +
	"`windows-1255`" +
	"`x-cp1255`" +
windows-1256 +	"`cp1256`" +
	"`windows-1256`" +
	"`x-cp1256`" +
windows-1257 +	"`cp1257`" +
	"`windows-1257`" +
	"`x-cp1257`" +
windows-1258 +	"`cp1258`" +
	"`windows-1258`" +
	"`x-cp1258`" +
x-mac-cyrillic +	"`x-mac-cyrillic`" +
x-mac-cyrillic +	"`x-mac-ukrainian`" +
Legacy multi-byte Chinese (simplified) encodings +
GBK +	"`chinese`" +
	"`csgb2312`" +
	"`csiso58gb231280`" +
	"`gb2312`" +
	"`gb_2312`" +
	"`gb_2312-80`" +
	"`gbk`" +
	"`iso-ir-58`" +
	"`x-gbk`" +
gb18030 +	"`gb18030`" +
Legacy multi-byte Chinese (traditional) encodings +
Big5 +	"`big5`" +
	"`big5-hkscs`" +
	"`cn-big5`" +
	"`csbig5`" +
	"`x-x-big5`" +
Legacy multi-byte Japanese encodings +
EUC-JP +	"`cseucpkdfmtjapanese`" +
	"`euc-jp`" +
	"`x-euc-jp`" +
ISO-2022-JP +	"`csiso2022jp`" +
ISO-2022-JP +	"`iso-2022-jp`" +
Shift_JIS +	"`csshiftjis`" +
	"`ms932`" +
	"`ms_kanji`" +
	"`shift-jis`" +
	"`shift_jis`" +
	"`sjis`" +
	"`windows-31j`" +
	"`x-sjis`" +
Legacy multi-byte Korean encodings +
EUC-KR +	"`cseuckr`" +
	"`csksc56011987`" +
	"`euc-kr`" +
	"`iso-ir-149`" +
	"`korean`" +
	"`ks_c_5601-1987`" +
	"`ks_c_5601-1989`" +
	"`ksc5601`" +
	"`ksc_5601`" +
	"`windows-949`" +
Legacy miscellaneous encodings +
replacement +	"`csiso2022kr`" +
	"`hz-gb-2312`" +
	"`iso-2022-cn`" +
	"`iso-2022-cn-ext`" +
	"`iso-2022-kr`" +
	"`replacement`" +
UTF-16BE +	"`unicodefffe`" +
UTF-16BE +	"`utf-16be`" +
UTF-16LE +	"`csunicode`" +
	"`iso-10646-ucs-2`" +
	"`ucs-2`" +
	"`unicode`" +
	"`unicodefeff`" +
	"`utf-16`" +
	"`utf-16le`" +
x-user-defined +	"`x-user-defined`" +
Index				Notes +
index Big5 +	index-big5.txt +	index Big5 visualization +	index Big5 BMP coverage +	This matches the Big5 standard in combination with the + Hong Kong Supplementary Character Set and other common extensions. +
index EUC-KR +	index-euc-kr.txt +	index EUC-KR visualization +	index EUC-KR BMP coverage +	This matches the KS X 1001 standard and the Unified Hangul Code, more commonly known together + as Windows Codepage 949. It covers the Hangul Syllables block of Unicode in its entirety. The + Hangul block whose top left corner in the visualization is at pointer 9026 is in the Unicode + order. Taken separately, the rest of the Hangul syllables in this index are in the Unicode order, + too. +
index gb18030 +	index-gb18030.txt +	index gb18030 visualization +	index gb18030 BMP coverage +	This matches the GB18030-2022 standard for code points encoded as two bytes, except for + 0xA3 0xA0 which maps to U+3000 to be compatible with deployed content. This index covers the + CJK Unified Ideographs block of Unicode in its entirety. Entries from that block that are above or + to the left of (the first) U+3000 in the visualization are in the Unicode order. + +
index gb18030 ranges +	index-gb18030-ranges.txt +			This index works different from all others. Listing all code points would result + in over a million items whereas they can be represented neatly in 207 ranges combined with trivial + limit checks. It therefore only superficially matches the GB18030-2000 standard for code points + encoded as four bytes. The change for the GB18030-2005 revision is handled inline by the + index gb18030 ranges code point and index gb18030 ranges pointer algorithms below + that accompany this index. And the changes for the GB18030-2022 revision are handled differently + again to not further increase the number of byte sequences mapping to Private Use code points. The + relevant Private Use code points are mapped in the gb18030 encoder directly through a side + table to preserve compatibility with how they were mapped before. +
index jis0208 +	index-jis0208.txt +	index jis0208 visualization, Shift_JIS visualization +	index jis0208 BMP coverage +	This is the JIS X 0208 standard including formerly proprietary + extensions from IBM and NEC. + +
index jis0212 +	index-jis0212.txt +	index jis0212 visualization +	index jis0212 BMP coverage +	This is the JIS X 0212 standard. It is only used by the EUC-JP decoder + due to lack of widespread support elsewhere. + +
index ISO-2022-JP katakana +	index-iso-2022-jp-katakana.txt +			This maps halfwidth to fullwidth katakana as per Unicode Normalization Form KC, except that + U+FF9E and U+FF9F map to U+309B and U+309C rather than U+3099 and U+309A. It is only used by the + ISO-2022-JP encoder. [[UNICODE]] +
IBM866	index-ibm866.txt	index IBM866 visualization	index IBM866 BMP coverage +
ISO-8859-2	index-iso-8859-2.txt	index ISO-8859-2 visualization	index ISO-8859-2 BMP coverage +
ISO-8859-3	index-iso-8859-3.txt	index ISO-8859-3 visualization	index ISO-8859-3 BMP coverage +
ISO-8859-4	index-iso-8859-4.txt	index ISO-8859-4 visualization	index ISO-8859-4 BMP coverage +
ISO-8859-5	index-iso-8859-5.txt	index ISO-8859-5 visualization	index ISO-8859-5 BMP coverage +
ISO-8859-6	index-iso-8859-6.txt	index ISO-8859-6 visualization	index ISO-8859-6 BMP coverage +
ISO-8859-7	index-iso-8859-7.txt	index ISO-8859-7 visualization	index ISO-8859-7 BMP coverage +
ISO-8859-8	index-iso-8859-8.txt	index ISO-8859-8 visualization	index ISO-8859-8 BMP coverage +
ISO-8859-8-I +	index-iso-8859-8.txt	index ISO-8859-8 visualization	index ISO-8859-8 BMP coverage +
ISO-8859-10	index-iso-8859-10.txt	index ISO-8859-10 visualization	index ISO-8859-10 BMP coverage +
ISO-8859-13	index-iso-8859-13.txt	index ISO-8859-13 visualization	index ISO-8859-13 BMP coverage +
ISO-8859-14	index-iso-8859-14.txt	index ISO-8859-14 visualization	index ISO-8859-14 BMP coverage +
ISO-8859-15	index-iso-8859-15.txt	index ISO-8859-15 visualization	index ISO-8859-15 BMP coverage +
ISO-8859-16	index-iso-8859-16.txt	index ISO-8859-16 visualization	index ISO-8859-16 BMP coverage +
KOI8-R	index-koi8-r.txt	index KOI8-R visualization	index KOI8-R BMP coverage +
KOI8-U	index-koi8-u.txt	index KOI8-U visualization	index KOI8-U BMP coverage +
macintosh	index-macintosh.txt	index macintosh visualization	index macintosh BMP coverage +
windows-874	index-windows-874.txt	index windows-874 visualization	index windows-874 BMP coverage +
windows-1250	index-windows-1250.txt	index windows-1250 visualization	index windows-1250 BMP coverage +
windows-1251	index-windows-1251.txt	index windows-1251 visualization	index windows-1251 BMP coverage +
windows-1252	index-windows-1252.txt	index windows-1252 visualization	index windows-1252 BMP coverage +
windows-1253	index-windows-1253.txt	index windows-1253 visualization	index windows-1253 BMP coverage +
windows-1254	index-windows-1254.txt	index windows-1254 visualization	index windows-1254 BMP coverage +
windows-1255	index-windows-1255.txt	index windows-1255 visualization	index windows-1255 BMP coverage +
windows-1256	index-windows-1256.txt	index windows-1256 visualization	index windows-1256 BMP coverage +
windows-1257	index-windows-1257.txt	index windows-1257 visualization	index windows-1257 BMP coverage +
windows-1258	index-windows-1258.txt	index windows-1258 visualization	index windows-1258 BMP coverage +
x-mac-cyrillic	index-x-mac-cyrillic.txt	index x-mac-cyrillic visualization	index x-mac-cyrillic BMP coverage +
Code point +	Bytes +
U+E78D +	0xA6 0xD9 +
U+E78E +	0xA6 0xDA +
U+E78F +	0xA6 0xDB +
U+E790 +	0xA6 0xDC +
U+E791 +	0xA6 0xDD +
U+E792 +	0xA6 0xDE +
U+E793 +	0xA6 0xDF +
U+E794 +	0xA6 0xEC +
U+E795 +	0xA6 0xED +
U+E796 +	0xA6 0xF3 +
U+E81E +	0xFE 0x59 +
U+E826 +	0xFE 0x61 +
U+E82B +	0xFE 0x66 +
U+E82C +	0xFE 0x67 +
U+E832 +	0xFE 0x6D +
U+E843 +	0xFE 0x7E +
U+E854 +	0xFE 0x90 +
U+E864 +	0xFE 0xA0 +
Pointer	Code points	Notes +
1133	U+00CA U+0304	Ê̄ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND MACRON) +
1135	U+00CA U+030C	Ê̌ (LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND CARON) +
1164	U+00EA U+0304	ê̄ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND MACRON) +
1166	U+00EA U+030C	ê̌ (LATIN SMALL LETTER E WITH CIRCUMFLEX AND CARON) +