From 47a3e55bf5ad15d02f5a228ac093e2aa4cbe010c Mon Sep 17 00:00:00 2001
From: Anne van Kesteren <annevk@annevk.nl>
Date: Mon, 26 Oct 2020 18:23:03 +0100
Subject: [PATCH] Clarify instance language around decoders and encoders

Also stop defaulting the error mode in "run" and "process"; callers now
pass it explicitly.

Fixes #240.
---
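Illustrative note: a minimal Python sketch of how "run" and "process" read
after this change, with a decoder instance that owns its handler and state
and an error mode that every caller must supply. The names END, CONTINUE,
FINISHED, Error, Utf8DecoderInstance, read and decode are hypothetical
stand-ins chosen for this sketch, not terms from the specification. I/O
queues are modelled with a deque and a list, only the decoder error modes
are covered, and the handler follows the UTF-8 decoder steps as I understand
them.

```python
from collections import deque

END, CONTINUE, FINISHED = object(), object(), object()  # sentinel results

class Error:
    """The "error" result (a code point is only attached by encoders)."""

class Utf8DecoderInstance:
    """One decoder instance: a handler plus the state it keeps between calls."""
    def __init__(self):
        self._reset()

    def _reset(self):
        self.code_point = self.seen = self.needed = 0
        self.lower, self.upper = 0x80, 0xBF

    def handler(self, queue, byte):
        if byte is END:
            if self.needed:
                self.needed = 0
                return Error()          # truncated sequence at end of input
            return FINISHED
        if self.needed == 0:
            if byte <= 0x7F:
                return [byte]
            if 0xC2 <= byte <= 0xDF:
                self.needed, self.code_point = 1, byte & 0x1F
            elif 0xE0 <= byte <= 0xEF:
                if byte == 0xE0: self.lower = 0xA0
                if byte == 0xED: self.upper = 0x9F
                self.needed, self.code_point = 2, byte & 0x0F
            elif 0xF0 <= byte <= 0xF4:
                if byte == 0xF0: self.lower = 0x90
                if byte == 0xF4: self.upper = 0x8F
                self.needed, self.code_point = 3, byte & 0x07
            else:
                return Error()
            return CONTINUE
        if not self.lower <= byte <= self.upper:
            self._reset()
            queue.appendleft(byte)      # restore the byte so it is re-read
            return Error()
        self.lower, self.upper = 0x80, 0xBF
        self.code_point = (self.code_point << 6) | (byte & 0x3F)
        self.seen += 1
        if self.seen != self.needed:
            return CONTINUE
        code_point = self.code_point
        self._reset()
        return [code_point]

def read(queue):
    # end-of-queue is returned without being removed.
    return END if queue[0] is END else queue.popleft()

def process(item, instance, in_queue, out_queue, mode):
    # No default for mode: the caller decides (decoder modes only here).
    assert mode in ("replacement", "fatal")
    result = instance.handler(in_queue, item)
    if result is CONTINUE:
        return result
    if result is FINISHED:
        out_queue.append(END)
        return result
    if isinstance(result, Error):
        if mode == "fatal":
            return result
        out_queue.append(0xFFFD)        # "replacement" mode
    else:
        out_queue.extend(result)
    return CONTINUE

def run(instance, in_queue, out_queue, mode):
    # Takes an existing instance; it no longer creates one itself.
    while True:
        result = process(read(in_queue), instance, in_queue, out_queue, mode)
        if result is not CONTINUE:
            return result

def decode(data, mode):
    out = []
    result = run(Utf8DecoderInstance(), deque([*data, END]), out, mode)
    return result, "".join(chr(c) for c in out if c is not END)
```

For example, decode(b"caf\xc3\xa9", "replacement")[1] gives "café", while
decode(b"\xff", "fatal")[0] is an Error. Keeping the state on the instance is
what lets a caller such as TextDecoder hold one instance across calls.
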
 encoding.bs | 121 ++++++++++++++++++++++++----------------------------
 1 file changed, 55 insertions(+), 66 deletions(-)
diff --git a/encoding.bs b/encoding.bs
index 7afd377..f95805e 100644
--- a/encoding.bs
+++ b/encoding.bs
@@ -238,8 +238,8 @@ This specification does not provide wrapper algorithms that would combine with <
 <h3 id=encoders-and-decoders>Encoders and decoders</h3>
 
 <p>Each <a for=/>encoding</a> has an associated <dfn>decoder</dfn> and most of them have an
-associated <dfn>encoder</dfn>. Each <a for=/>decoder</a> and <a for=/>encoder</a> have a
-<dfn>handler</dfn> algorithm. A <a>handler</a> algorithm takes an input
+associated <dfn>encoder</dfn>. Instances of <a for=/>decoders</a> and <a for=/>encoders</a> have a
+<dfn>handler</dfn> algorithm and might also have state. A <a>handler</a> algorithm takes an input
 <a for=/>I/O queue</a> and an <a for=list>item</a>, and returns
 <dfn>finished</dfn>, one or more <a for=list>items</a>, <dfn>error</dfn>
 optionally with a <a>code point</a>, or <dfn>continue</dfn>.
@@ -247,9 +247,8 @@ optionally with a <a>code point</a>, or <dfn>continue</dfn>.
 <p class="note no-backref">The <a>replacement</a> and <a>UTF-16BE/LE</a> <a for=/>encodings</a> have
 no <a for=/>encoder</a>.
 
-<p>An <dfn>error mode</dfn> as used below is "<code>replacement</code>" (default) or
-"<code>fatal</code>" for a <a for=/>decoder</a> and "<code>fatal</code>" (default) or
-"<code>html</code>" for an <a for=/>encoder</a>.
+<p>An <dfn>error mode</dfn> as used below is "<code>replacement</code>" or "<code>fatal</code>" for
+a <a for=/>decoder</a> and "<code>fatal</code>" or "<code>html</code>" for an <a for=/>encoder</a>.
 
 <p class=note>An XML processor would set <a for=/>error mode</a> to "<code>fatal</code>".
 [[XML]]
@@ -264,24 +263,17 @@ happening.
 [[HTML]]
 
 <p>To <dfn id=concept-encoding-run>run</dfn> an <a for=/>encoding</a>'s <a for=/>decoder</a> or
-<a for=/>encoder</a> <var>encoderDecoder</var> with input <a for=/>I/O queue</a> <var>input</var>,
-output <a for=/>I/O queue</a> <var>output</var>, and optional <a for=/>error mode</a>
+<a for=/>encoder</a> instance <var>encoderDecoder</var> with <a for=/>I/O queue</a>
+<var>input</var>, <a for=/>I/O queue</a> <var>output</var>, and <a for=/>error mode</a>
 <var>mode</var>, run these steps:
 
 <ol>
- <li><p>If <var>mode</var> is not given, then set it to "<code>replacement</code>" if
- <var>encoderDecoder</var> is a <a for=/>decoder</a>, otherwise "<code>fatal</code>".
-
- <li><p>Let <var>encoderDecoderInstance</var> be a new <var>encoderDecoder</var>.
-
  <li>
   <p>While true:
 
   <ol>
-   <li><p>Let <var>result</var> be the result of
-   <a>processing</a> the result of
-   <a>reading</a> from <var>input</var> for
-   <var>encoderDecoderInstance</var>, <var>input</var>, <var>output</var>, and
+   <li><p>Let <var>result</var> be the result of <a>processing</a> the result of <a>reading</a> from
+   <var>input</var> for <var>encoderDecoder</var>, <var>input</var>, <var>output</var>, and
    <var>mode</var>.
 
    <li><p>If <var>result</var> is not <a>continue</a>, then return <var>result</var>.
@@ -290,28 +282,23 @@ output <a for=/>I/O queue</a> <var>output</var>, and optional <a for=/>error mod
 
 <p>To <dfn id=concept-encoding-process>process</dfn> an <a for=list>item</a> <var>item</var> for an
 <a for=/>encoding</a>'s <a for=/>encoder</a> or <a for=/>decoder</a> instance
-<var>encoderDecoderInstance</var>, <a for=/>I/O queue</a> <var>input</var>, output
-<a for=/>I/O queue</a> <var>output</var>, and optional <a for=/>error mode</a> <var>mode</var>, run
-these steps:
+<var>encoderDecoder</var>, <a for=/>I/O queue</a> <var>input</var>, <a for=/>I/O queue</a>
+<var>output</var>, and <a for=/>error mode</a> <var>mode</var>, run these steps:
 
 <ol>
- <li><p>If <var>mode</var> is not given, then set it to "<code>replacement</code>" if
- <var>encoderDecoderInstance</var> is a <a for=/>decoder</a> instance, otherwise
- "<code>fatal</code>".
-
- <li><p>Assert: if <var>encoderDecoderInstance</var> is an <a for=/>encoder</a> instance,
- <var>mode</var> is not "<code>replacement</code>".
+ <li><p>Assert: if <var>encoderDecoder</var> is an <a for=/>encoder</a> instance, <var>mode</var> is
+ not "<code>replacement</code>".
 
- <li><p>Assert: if <var>encoderDecoderInstance</var> is a <a for=/>decoder</a> instance,
- <var>mode</var> is not "<code>html</code>".
+ <li><p>Assert: if <var>encoderDecoder</var> is a <a for=/>decoder</a> instance, <var>mode</var> is
+ not "<code>html</code>".
 
- <li><p>Assert: if <var>encoderDecoderInstance</var> is an <a for=/>encoder</a> instance,
- <var>item</var> is not a <a>surrogate</a>.
+ <li><p>Assert: if <var>encoderDecoder</var> is an <a for=/>encoder</a> instance, <var>item</var> is
+ not a <a>surrogate</a>.
 
- <li><p>Let <var>result</var> be the result of running <var>encoderDecoderInstance</var>'s
- <a>handler</a> on <var>input</var> and <var>item</var>.
+ <li><p>Let <var>result</var> be the result of running <var>encoderDecoder</var>'s <a>handler</a> on
+ <var>input</var> and <var>item</var>.
 
- <li><p>If <var>result</var> is <a>continue</a>, return <var>result</var>.
+ <li><p>If <var>result</var> is <a>continue</a>, then return <var>result</var>.
 
  <li>
   <p>Otherwise, if <var>result</var> is <a>finished</a>:
@@ -327,8 +314,8 @@ these steps:
   <p>Otherwise, if <var>result</var> is one or more <a for=list>items</a>:
 
   <ol>
-   <li><p>Assert: if <var>encoderDecoderInstance</var> is a <a for=/>decoder</a> instance,
-   <var>result</var> does not contain any <a>surrogates</a>.
+   <li><p>Assert: if <var>encoderDecoder</var> is a <a for=/>decoder</a> instance, <var>result</var>
+   does not contain any <a>surrogates</a>.
 
    <li><p><a>Push</a> <var>result</var> to <var>output</var>.
   </ol>
@@ -1005,8 +992,8 @@ queue of scalar values <var>output</var> (default « »), run these steps:
  <li><p>If <var>buffer</var> does not match 0xEF 0xBB 0xBF, <a>prepend</a> <var>buffer</var> to
  <var>ioQueue</var>.
 
- <li><p><a>Run</a> <a>UTF-8</a>'s <a for=/>decoder</a> with <var>ioQueue</var> and
- <var>output</var>.
+ <li><p><a>Run</a> an instance of <a>UTF-8</a>'s <a for=/>decoder</a> with <var>ioQueue</var>,
+ <var>output</var>, and "<code>replacement</code>".
 
  <li><p>Return <var>output</var>.
 </ol>
@@ -1015,8 +1002,8 @@ queue of scalar values <var>output</var> (default « »), run these steps:
 optional I/O queue of scalar values <var>output</var> (default « »), run these steps:
 
 <ol>
- <li><p><a>Run</a> <a>UTF-8</a>'s <a for=/>decoder</a> with <var>ioQueue</var> and
- <var>output</var>.
+ <li><p><a>Run</a> an instance of <a>UTF-8</a>'s <a for=/>decoder</a> with <var>ioQueue</var>,
+ <var>output</var>, and "<code>replacement</code>".
 
  <li><p>Return <var>output</var>.
 </ol>
@@ -1028,7 +1015,7 @@ given an optional I/O queue of scalar values <var>output</var> (default « »),
      -->
 
 <ol>
- <li><p>Let <var>potentialError</var> be the result of <a>running</a> <a>UTF-8</a>'s
+ <li><p>Let <var>potentialError</var> be the result of <a>running</a> an instance of <a>UTF-8</a>'s
  <a for=/>decoder</a> with <var>ioQueue</var>, <var>output</var>, and "<code>fatal</code>".
 
  <li><p>If <var>potentialError</var> is an <a>error</a>, then return failure.
@@ -1078,8 +1065,8 @@ these steps:
   than anything else. In a context where HTTP is used this is in violation of the semantics of the
   `<code>Content-Type</code>` header.
 
- <li><p><a>Run</a> <var>encoding</var>'s <a for=/>decoder</a> with <var>ioQueue</var> and
- <var>output</var>.
+ <li><p><a>Run</a> an instance of <var>encoding</var>'s <a for=/>decoder</a> with
+ <var>ioQueue</var>, <var>output</var>, and "<code>replacement</code>".
 
  <li><p>Return <var>output</var>.
 </ol>
@@ -1135,12 +1122,12 @@ is safe as it never triggers <a>errors</a>. [[HTML]]
 <ol>
  <li><p>Assert: <var>encoding</var> is not <a>replacement</a> or <a>UTF-16BE/LE</a>.
 
- <li><p>Return <var>encoding</var>'s <a for=/>encoder</a>.
+ <li><p>Return an instance of <var>encoding</var>'s <a for=/>encoder</a>.
 </ol>
 
 <p>To <dfn export>encode or fail</dfn> an I/O queue of scalar values <var>ioQueue</var> given an
-<a for=/>encoder</a> <var>encoder</var> and an I/O queue of bytes <var>output</var>, run these
-steps:
+<a for=/>encoder</a> instance <var>encoder</var> and an I/O queue of bytes <var>output</var>, run
+these steps:
 
 <ol>
  <li><p>Let <var>potentialError</var> be the result of <a>running</a> <var>encoder</var> with
@@ -1156,10 +1143,10 @@ steps:
 
 <div class=note id=pit-of-iso-2022-jp>
  <p>This is a legacy hook for URL percent-encoding. The caller will have to keep an
- <a for=/>encoder</a> alive as the <a>ISO-2022-JP encoder</a> can be in two different states when
- returning an <a>error</a>. That also means that if the caller emits bytes to encode the error in
- some way, these have to be in the range 0x00 to 0x7F, inclusive, excluding 0x0E, 0x0F, 0x1B, 0x5C,
- and 0x7E. [[URL]]
+ <a for=/>encoder</a> instance alive as the <a>ISO-2022-JP encoder</a> can be in two different
+ states when returning an <a>error</a>. That also means that if the caller emits bytes to encode the
+ error in some way, these have to be in the range 0x00 to 0x7F, inclusive, excluding 0x0E, 0x0F,
+ 0x1B, 0x5C, and 0x7E. [[URL]]
 
  <p>In particular, if upon returning an <a>error</a> the <a>ISO-2022-JP encoder</a> is in the
  <a lt="ISO-2022-JP decoder Roman">Roman</a> state, the caller cannot output 0x5C (\) as it will not
@@ -1171,7 +1158,7 @@ steps:
 
  <p>The return value is either the number representing the <a>code point</a> that could not be
  encoded or null, if there was no <a>error</a>. When it returns non-null the caller will have to
- invoke it again, supplying the same <a for=/>encoder</a> and a new output I/O queue.
+ invoke it again, supplying the same <a for=/>encoder</a> instance and a new output I/O queue.
 </div>
 
 
@@ -1268,7 +1255,7 @@ interface mixin TextDecoderCommon {
  <dd>An <a for=/>encoding</a>.
 
  <dt><dfn for=TextDecoderCommon oldids=textdecoder-decoder,textdecoderstream-decoder>decoder</dfn>
- <dd>A <a for=/>decoder</a>.
+ <dd>A <a for=/>decoder</a> instance.
 
  <dt><dfn for=TextDecoderCommon oldids=textdecoder-stream,textdecoderstream-stream,textdecodercommon-stream>I/O queue</dfn>
  <dd>An <a for=/>I/O queue</a> of bytes.
@@ -1419,10 +1406,10 @@ method steps are:
 
 <ol>
  <li><p>If <a>this</a>'s <a for=TextDecoder>do not flush</a> is false, then set <a>this</a>'s
- <a for=TextDecoderCommon>decoder</a> to a new <a for=/>decoder</a> for <a>this</a>'s
- <a for=TextDecoderCommon>encoding</a>, <a>this</a>'s <a for=TextDecoderCommon>I/O queue</a> to the
- <a for=/>I/O queue</a> of bytes « <a>end-of-queue</a> », and <a>this</a>'s
- <a for=TextDecoderCommon>BOM seen</a> to false.
+ <a for=TextDecoderCommon>decoder</a> to a new instance of <a>this</a>'s
+ <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a>, <a>this</a>'s
+ <a for=TextDecoderCommon>I/O queue</a> to the <a for=/>I/O queue</a> of bytes
+ « <a>end-of-queue</a> », and <a>this</a>'s <a for=TextDecoderCommon>BOM seen</a> to false.
 
  <li><p>Set <a>this</a>'s <a for=TextDecoder>do not flush</a> to
  <var>options</var>["{{TextDecodeOptions/stream}}"].
@@ -1554,8 +1541,8 @@ constructor steps are to do nothing.
    <li><p>Let <var>item</var> be the result of
    <a>reading</a> from <var>input</var>.
 
-   <li><p>Let <var>result</var> be the result of <a>processing</a> <var>item</var> for the
-   <a>UTF-8 encoder</a>, <var>input</var>, <var>output</var>.
+   <li><p>Let <var>result</var> be the result of <a>processing</a> <var>item</var> for an instance
+   of the <a>UTF-8 encoder</a>, <var>input</var>, <var>output</var>, and "<code>fatal</code>".
 
    <li>
     <p>Assert: <var>result</var> is not <a>error</a>.
@@ -1582,6 +1569,8 @@ method steps are:
  <a lt="get a reference to the buffer source">getting a reference to the bytes held by</a>
  <var>destination</var>.
 
+ <li><p>Let <var>encoder</var> be an instance of the <a>UTF-8 encoder</a>.
+
  <li>
   <p>Let <var>unused</var> be the <a for=/>I/O queue</a> of scalar values « <a>end-of-queue</a> ».
 
@@ -1597,8 +1586,8 @@ method steps are:
   <ol>
    <li><p>Let <var>item</var> be the result of <a>reading</a> from <var>source</var>.
 
-   <li><p>Let <var>result</var> be the result of running the <a>UTF-8 encoder</a>'s <a>handler</a>
-   on <var>unused</var> and <var>item</var>.
+   <li><p>Let <var>result</var> be the result of running <var>encoder</var>'s <a>handler</a> on
+   <var>unused</var> and <var>item</var>.
 
    <li><p>If <var>result</var> is <a>finished</a>, then <a for=iteration>break</a>.
 
@@ -1738,8 +1727,8 @@ constructor steps are:
  <li><p>set <a>this</a>'s <a for=TextDecoderCommon>ignore BOM</a> to
  <var>options</var>["{{TextDecoderOptions/ignoreBOM}}"].
 
- <li><p>Set <a>this</a>'s <a for=TextDecoderCommon>decoder</a> to a new <a for=/>decoder</a> for
- <a>this</a>'s <a for=TextDecoderCommon>encoding</a>, and set <a>this</a>'s
+ <li><p>Set <a>this</a>'s <a for=TextDecoderCommon>decoder</a> to a new instance of <a>this</a>'s
+ <a for=TextDecoderCommon>encoding</a>'s <a for=/>decoder</a>, and set <a>this</a>'s
  <a for=TextDecoderCommon>I/O queue</a> to a new <a for=/>I/O queue</a>.
 
  <li><p>Let <var>transformAlgorithm</var> be an algorithm which takes a <var>chunk</var> argument
@@ -1846,7 +1835,7 @@ TextEncoderStream includes GenericTransformStream;
 
 <dl>
  <dt><dfn for=TextEncoderStream>encoder</dfn>
- <dd>An <a for=/>encoder</a>.
+ <dd>An <a for=/>encoder</a> instance.
 
  <dt><dfn for=TextEncoderStream>pending high surrogate</dfn>
  <dd>Null or a <a for=/>surrogate</a>, initially null.
@@ -1887,8 +1876,8 @@ textReadable
 constructor steps are:
 
 <ol>
- <li><p>Set <a>this</a>'s <a for=TextEncoderStream>encoder</a> to <a>UTF-8</a>'s
- <a for=/>encoder</a>.
+ <li><p>Set <a>this</a>'s <a for=TextEncoderStream>encoder</a> to an instance of the
+ <a>UTF-8 encoder</a>.
 
  <li><p>Let <var>transformAlgorithm</var> be an algorithm which takes a <var>chunk</var> argument
  and runs the <a>encode and enqueue a chunk</a> algorithm with <a>this</a> and <var>chunk</var>.
@@ -1953,8 +1942,8 @@ constructor steps are:
    value</a> algorithm with <var>encoder</var>, <var>item</var> and <var>input</var>.
 
    <li><p>If <var>result</var> is not <a>continue</a>, then <a>process</a> <var>result</var> for
-   <a for=TextEncoderStream>encoder</a>, <var>input</var>, <var>output</var>.
-
+   <var>encoder</var>'s <a for=TextEncoderStream>encoder</a>, <var>input</var>, <var>output</var>,
+   and "<code>fatal</code>".
   </ol>
 </ol>
 
@@ -2023,7 +2012,7 @@ that are split between strings. [[!INFRA]]
 to be more accurate in deployed content. Therefore it is not part of the <a>UTF-8 decoder</a>
 algorithm but rather the <a>decode</a> and <a>UTF-8 decode</a> algorithms.
 
-<p><a>UTF-8</a>'s <a for=/>decoder</a>'s has an associated
+<p><a>UTF-8</a>'s <a for=/>decoder</a> has an associated
 <dfn>UTF-8 code point</dfn>, <dfn>UTF-8 bytes seen</dfn>, and
 <dfn>UTF-8 bytes needed</dfn> (all initially 0), a <dfn>UTF-8 lower boundary</dfn>
 (initially 0x80), and a <dfn>UTF-8 upper boundary</dfn> (initially 0xBF).