Normative: add option to omit padding #60

Merged (1 commit) on Jun 11, 2024
6 changes: 2 additions & 4 deletions README.md
@@ -40,6 +40,8 @@ Additional options are supplied in an options bag argument:

- `lastChunkHandling`: Recall that base64 decoding operates on chunks of 4 characters at a time, but the input may have some characters which don't fit evenly into such a chunk of 4 characters. This option determines how the final chunk of characters should be handled. The three options are `"loose"` (the default), which treats the chunk as if it had any necessary `=` padding (but throws if this is not possible, i.e. there is exactly one extra character); `"strict"`, which enforces that the chunk has exactly 4 characters (counting `=` padding) and that [overflow bits](https://datatracker.ietf.org/doc/html/rfc4648#section-3.5) are 0; and `"stop-before-partial"`, which stops decoding before the final chunk unless the final chunk has exactly 4 characters.

+- `omitPadding`: When encoding, whether to include `=` padding. Defaults to `false`, i.e., padding is included.

The hex methods do not take any options.
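
As an illustration of what `"loose"` handling implies for input produced with `omitPadding: true`, here is a minimal sketch of restoring the implied padding. The helper `restorePadding` is hypothetical and not part of the proposal; it assumes the input contains no whitespace and no existing `=` characters.

```js
// Hypothetical helper, not part of the proposal: restores the `=` padding
// that "loose" handling treats as implied.
function restorePadding(unpadded) {
  const remainder = unpadded.length % 4;
  if (remainder === 1) {
    // A single leftover character carries only 6 bits, less than one byte,
    // so no amount of padding makes the final chunk valid.
    throw new SyntaxError('cannot pad: exactly one extra character');
  }
  // remainder 0 -> '', remainder 2 -> '==', remainder 3 -> '='
  return unpadded + '='.repeat((4 - remainder) % 4);
}

console.log(restorePadding('SA'));   // 'SA=='
console.log(restorePadding('Zm8'));  // 'Zm8='
console.log(restorePadding('Zm9v')); // 'Zm9v'
```

The one-extra-character case is the only one that cannot be repaired, which matches the case the option description says throws.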

## Writing to an existing Uint8Array
@@ -89,10 +91,6 @@ For base64, you can specify either base64 or base64url for both the encoder and

For hex, both lowercase and uppercase characters (including mixed within the same string) will decode successfully. Output is always lowercase.

-### How is `=` padding handled?
-
-Padding is always generated. The base64 decoder allows specifying how to handle inputs without it with the `lastChunkHandling` option.

### How are the extra padding bits handled?

If the length of your input data isn't exactly a multiple of 3 bytes, then encoding it will use either 2 or 3 base64 characters to encode the final 1 or 2 bytes. Since each base64 character is 6 bits, this means you'll be using either 12 or 18 bits to represent 8 or 16 bits, which means you have an extra 4 or 2 bits which don't encode anything.
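
The arithmetic above can be checked with a small sketch. The helper `base64Length` is hypothetical and only computes output lengths; it does not encode anything.

```js
// Hypothetical helper for illustration: number of base64 characters
// produced for `byteLength` input bytes.
function base64Length(byteLength, omitPadding = false) {
  if (omitPadding) {
    // Each byte is 8 bits and each base64 character holds 6 bits.
    return Math.ceil((byteLength * 8) / 6);
  }
  // With padding, the output is always whole 4-character chunks.
  return Math.ceil(byteLength / 3) * 4;
}

// Columns: input bytes, unpadded length, padded length, unused bits.
for (const n of [1, 2, 3]) {
  const chars = base64Length(n, true);
  const unusedBits = chars * 6 - n * 8;
  console.log(n, chars, base64Length(n), unusedBits);
}
// 1 2 4 4
// 2 3 4 2
// 3 4 4 0
```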
6 changes: 5 additions & 1 deletion playground/index-raw.html
@@ -86,7 +86,8 @@ <h3>Basic usage</h3>

<h3>Options</h3>
<p>The base64 methods take an optional options bag which allows specifying the alphabet as either <code>"base64"</code> (the default) or <code>"base64url"</code> (<a href="https://datatracker.ietf.org/doc/html/rfc4648#section-5">the URL-safe variant</a>).</p>
-<p>The base64 decoder also allows specifying the behavior for the final chunk with <code>lastChunkHandling</code>. Recall that base64 decoding operates on chunks of 4 characters at a time, but the input may have some characters which don't fit evenly into such a chunk of 4 characters. This option determines how the final chunk of characters should be handled. The three options are <code>"loose"</code> (the default), which treats the chunk as if it had any necessary <code>=</code> padding (but throws if this is not possible, i.e. there is exactly one extra character); <code>"strict"</code>, which enforces that the chunk has exactly 4 characters (counting <code>=</code> padding) and that <a href="https://datatracker.ietf.org/doc/html/rfc4648#section-3.5">overflow bits</a> are 0; and <code>"stop-before-partial"</code>, which stops decoding before the final chunk unless the final chunk has exactly 4 characters.
+<p>The base64 decoder also allows specifying the behavior for the final chunk with <code>lastChunkHandling</code>. Recall that base64 decoding operates on chunks of 4 characters at a time, but the input may have some characters which don't fit evenly into such a chunk of 4 characters. This option determines how the final chunk of characters should be handled. The three options are <code>"loose"</code> (the default), which treats the chunk as if it had any necessary <code>=</code> padding (but throws if this is not possible, i.e. there is exactly one extra character); <code>"strict"</code>, which enforces that the chunk has exactly 4 characters (counting <code>=</code> padding) and that <a href="https://datatracker.ietf.org/doc/html/rfc4648#section-3.5">overflow bits</a> are 0; and <code>"stop-before-partial"</code>, which stops decoding before the final chunk unless the final chunk has exactly 4 characters.</p>
+<p>The base64 encoder allows omitting padding by specifying <code>omitPadding: true</code>. The default is to include padding.</p>
<p>The hex methods do not have any options.</p>

<pre class="language-js"><code class="language-js">
@@ -109,6 +110,9 @@ <h3>Options</h3>
} catch {
console.log('with lastChunkHandling: "strict", overflow bits are rejected');
}

+console.log((new Uint8Array([72])).toBase64()); // 'SA=='
+console.log((new Uint8Array([72])).toBase64({ omitPadding: true })); // 'SA'
</code></pre>

<h3>Writing to an existing Uint8Array</h3>
5 changes: 3 additions & 2 deletions playground/polyfill-core.mjs
@@ -40,6 +40,7 @@ export function uint8ArrayToBase64(arr, options) {
if (alphabet !== 'base64' && alphabet !== 'base64url') {
throw new TypeError('expected alphabet to be either "base64" or "base64url"');
}
+let omitPadding = !!opts.omitPadding;

if ('detached' in arr.buffer && arr.buffer.detached) {
throw new TypeError('toBase64 called on array backed by detached buffer');
@@ -63,13 +64,13 @@ export function uint8ArrayToBase64(arr, options) {
lookup[(triplet >> 18) & 63] +
lookup[(triplet >> 12) & 63] +
lookup[(triplet >> 6) & 63] +
-'=';
+(omitPadding ? '' : '=');
} else if (i + 1 === arr.length) {
let triplet = arr[i] << 16;
result +=
lookup[(triplet >> 18) & 63] +
lookup[(triplet >> 12) & 63] +
-'==';
+(omitPadding ? '' : '==');
}
return result;
}
5 changes: 3 additions & 2 deletions spec.html
@@ -23,12 +23,13 @@ <h1>Uint8Array.prototype.toBase64 ( [ _options_ ] )</h1>
1. Let _alphabet_ be ? Get(_opts_, *"alphabet"*).
1. If _alphabet_ is *undefined*, set _alphabet_ to *"base64"*.
1. If _alphabet_ is neither *"base64"* nor *"base64url"*, throw a *TypeError* exception.
+1. Let _omitPadding_ be ToBoolean(? Get(_opts_, *"omitPadding"*)).
1. Let _toEncode_ be ? GetUint8ArrayBytes(_O_).
1. If _alphabet_ is *"base64"*, then
-1. Let _outAscii_ be the sequence of code points which results from encoding _toEncode_ according to the base64 encoding specified in section 4 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a>. Padding is included.
+1. Let _outAscii_ be the sequence of code points which results from encoding _toEncode_ according to the base64 encoding specified in section 4 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a>. Padding is included if and only if _omitPadding_ is *false*.
1. Else,
1. Assert: _alphabet_ is *"base64url"*.
-1. Let _outAscii_ be the sequence of code points which results from encoding _toEncode_ according to the base64url encoding specified in section 5 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a>. Padding is included.
+1. Let _outAscii_ be the sequence of code points which results from encoding _toEncode_ according to the base64url encoding specified in section 5 of <a href="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</a>. Padding is included if and only if _omitPadding_ is *false*.
1. Return CodePointsToString(_outAscii_).
</emu-alg>
</emu-clause>
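
Because the spec step applies ToBoolean, any truthy value enables the option, not only `true`. A minimal sketch of that coercion follows; the function `readOmitPadding` is a hypothetical stand-in for the option-reading step and only handles a missing options bag, not the full options validation.

```js
// Illustrative only: mirrors `ToBoolean(? Get(opts, "omitPadding"))`.
// `readOmitPadding` is a made-up name for this sketch.
function readOmitPadding(options = {}) {
  return Boolean(options.omitPadding);
}

console.log(readOmitPadding());                         // false (option absent)
console.log(readOmitPadding({ omitPadding: true }));    // true
console.log(readOmitPadding({ omitPadding: 0 }));       // false
console.log(readOmitPadding({ omitPadding: 'false' })); // true: non-empty strings are truthy
```

The last case is worth noting for callers who build the options bag from strings, such as query parameters or config files.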
8 changes: 8 additions & 0 deletions test-polyfill.mjs
@@ -26,6 +26,14 @@ test('standard vectors', async t => {
}
});

+test('omitPadding', async t => {
+for (let [string, result] of standardBase64Vectors) {
+await t.test(JSON.stringify(string), () => {
Comment on lines +29 to +31

Member: nbd, but why is this async? isn't this all sync?

Collaborator Author: I just find it easier to always await everything in node:test rather than thinking about which things need it.

+assert.strictEqual(stringToBytes(string).toBase64({ omitPadding: true }), result.replace(/=/g, ''));
+});
+}
+});

let malformedPadding = ['=', 'Zg=', 'Z===', 'Zm8==', 'Zm9v='];
test('malformed padding', async t => {
for (let string of malformedPadding) {