Encoding: GB18030-2022 encode #48239

Merged
merged 1 commit into from
Sep 19, 2024

annevk (Member) commented Sep 18, 2024

Correct a mistake in 62fb506. The leading 0x00 bytes are wrong.

annevk added a commit to whatwg/encoding that referenced this pull request Sep 18, 2024
This implements the Unicode Technical Committee recommendation around GB18030-2022 in a manner suitable for this standard, taking into account existing practice and the closeness between GBK and gb18030.

In particular, using the text file attached to https://www.unicode.org/L2/L2023/23003r-gb18030-recommendations.pdf this does the following:

1. Merges the first set of 18 mappings, which are bidirectional, directly into index gb18030, replacing existing PUA entries. This ends up impacting GBK and gb18030.
2. The second set of 18 mappings (from PUA to bytes) is encoded as an encoder-only table, for both GBK and gb18030.
3. The third set of 18 mappings (from bytes to code points) is ignored, as it is already covered by index gb18030 ranges. (Presumably it is included because the recommendation covers the transition from "Previous Mappings" to "Current Mappings" to "Recommended Mappings", whereas we are going directly from "Previous Mappings" to "Recommended Mappings".)
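
To illustrate how a bidirectional index entry and an encoder-only entry interact, here is a minimal Python sketch. It is not the standard's algorithm or data: `INDEX_GB18030`, `ENCODER_ONLY`, `decode_pair`, and `encode_code_point` are hypothetical names, and the single byte pair and code points are illustrative values that should be checked against the text file attached to the recommendation.

```python
from typing import Optional

# Toy stand-in for index gb18030: byte pair -> code point (used in both directions).
# Assumption: one illustrative entry only, not the real index.
INDEX_GB18030 = {(0xA6, 0xD9): 0xFE10}

# Toy stand-in for the encoder-only table: PUA code point -> byte pair.
# Assumption: illustrative entry; the decoder never consults this table.
ENCODER_ONLY = {0xE78D: (0xA6, 0xD9)}


def decode_pair(lead: int, trail: int) -> Optional[int]:
    """Decode a two-byte sequence using only the bidirectional index."""
    return INDEX_GB18030.get((lead, trail))


def encode_code_point(code_point: int) -> Optional[bytes]:
    """Encode a code point, checking the encoder-only table before the index."""
    if code_point in ENCODER_ONLY:
        return bytes(ENCODER_ONLY[code_point])
    for pair, mapped in INDEX_GB18030.items():
        if mapped == code_point:
            return bytes(pair)
    return None  # not covered by this toy table
```

In this sketch both the new code point and the old PUA code point encode to the same bytes, while decoding those bytes only ever yields the new code point, so a round trip through the encoder and decoder migrates PUA text to the recommended mapping.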

GBK is changed as well because Chromium and WebKit already have code in the wild that impacts GBK to some degree, and no fallout has been recorded. (At the moment they exclude the encoder-only table for GBK; including it would make the most sense compatibility-wise.) Additionally, GBK is already positioned as a rough subset of gb18030 in this standard, with the decoder being shared completely.

Tests: encoding/legacy-mb-schinese has some GB18030-2022 coverage already. The aim is to complete that with web-platform-tests/wpt#48239 and web-platform-tests/wpt#48240.

This supersedes #335. This fixes #27 and fixes #312.
@annevk annevk mentioned this pull request Sep 18, 2024
5 tasks
annevk added a commit to annevk/WebKit that referenced this pull request Sep 18, 2024
https://bugs.webkit.org/show_bug.cgi?id=279898

Reviewed by NOBODY (OOPS!).

Synchronize the encoding/ folder with upstream, using --clean-dest-dir
to remove many stale files (became stale due to changes from .html to
.any.js). Also gives us better coverage for the streams API.

This should not impact GB18030-2022 coverage as most of that was
already exported one way or another.
web-platform-tests/wpt#48239 tracks the latest
that is still missing upstream.

Otherwise this aligns with this commit upstream:
web-platform-tests/wpt@62fb506

* LayoutTests/imported/w3c/resources/resource-files.json:
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-basics-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-basics.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html:
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.worker-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.worker.html:
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.worker_1-1000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.worker_1001-2000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.worker_2001-3000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.worker_3001-last-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any_1-1000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any_1001-2000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any_2001-3000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any_3001-last-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-replacement-encodings-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-surrogates-utf8-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/api-surrogates-utf8.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any-expected.txt:
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any.js.headers: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any.serviceworker-expected.txt: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any-expected.txt.
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any.sharedworker-expected.txt: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any-expected.txt.
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/encodeInto.any.worker-expected.txt:
* LayoutTests/imported/w3c/web-platform-tests/encoding/eof-shift_jis-ref.html: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/eof-utf-8-one-ref.html: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/eof-utf-8-three-ref.html: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/eof-utf-8-two-ref.html: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/idlharness.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/idlharness.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/idlharness.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/idlharness.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/iso-2022-jp-decoder.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-korean/euc-kr/euckr-decode-ks_c_5601-1987-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-schinese/gb18030/gb18030-decoder-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-schinese/gb18030/gb18030-decoder.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-schinese/gb18030/gb18030-decoder.html: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-schinese/gb18030/gb18030-encoder.html:
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-schinese/gbk/gbk-decoder-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-schinese/gbk/gbk-decoder.html: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/legacy-mb-tchinese/big5/big5-encode-form-errors-extBa-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/resources/text-html-meta-charset.py: Added.
(main):
* LayoutTests/imported/w3c/web-platform-tests/encoding/resources/w3c-import.log:
* LayoutTests/imported/w3c/web-platform-tests/encoding/sharedarraybuffer.https-expected.txt:
* LayoutTests/imported/w3c/web-platform-tests/encoding/sharedarraybuffer.https.html:
* LayoutTests/imported/w3c/web-platform-tests/encoding/single-byte-decoder-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/backpressure.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/backpressure.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/backpressure.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/backpressure.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/backpressure.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-attributes.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-attributes.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-attributes.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-attributes.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-attributes.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-bad-chunks.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-bad-chunks.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-bad-chunks.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-bad-chunks.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-bad-chunks.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-ignore-bom.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-ignore-bom.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-ignore-bom.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-ignore-bom.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-incomplete-input.any.html:
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-incomplete-input.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-incomplete-input.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-incomplete-input.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-incomplete-input.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-non-utf8.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-non-utf8.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-non-utf8.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-non-utf8.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-non-utf8.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-split-character.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-split-character.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-split-character.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-split-character.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-utf8.any.js.headers: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-utf8.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-utf8.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-utf8.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/decode-utf8.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-bad-chunks.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-bad-chunks.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-bad-chunks.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-bad-chunks.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-utf8.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-utf8.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-utf8.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/encode-utf8.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/invalid-realm.window-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/invalid-realm.window.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/invalid-realm.window.js: Added.
(addIframe):
(promise_test.async t):
(string_appeared_here.promise_test.async t):
(string_appeared_here.return.stream.writable.close):
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/readable-writable-properties.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/readable-writable-properties.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/readable-writable-properties.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/readable-writable-properties.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/readable-writable-properties.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/stringification-crash.html: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/streams/w3c-import.log:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-arguments.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-byte-order-marks-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-byte-order-marks.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-copy.any.js.headers: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-copy.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-copy.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-copy.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-copy.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.html:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker.html:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_1-1000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_1001-2000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_2001-3000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_3001-4000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_4001-5000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_5001-6000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_6001-7000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any.worker_7001-last-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_1-1000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_1001-2000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_2001-3000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_3001-4000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_4001-5000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_5001-6000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_6001-7000-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-single-byte.any_7001-last-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal-streaming.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-fatal.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-ignorebom-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-ignorebom.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-streaming-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-streaming.any.js.headers: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-streaming.any.serviceworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-streaming.any.serviceworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-streaming.any.sharedworker-expected.txt: Added.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-streaming.any.sharedworker.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textdecoder-utf16-surrogates.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/textencoder-constructor-non-utf-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textencoder-utf16-surrogates-expected.txt: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/textencoder-utf16-surrogates.any.js:
* LayoutTests/imported/w3c/web-platform-tests/encoding/unsupported-labels.html: Removed.
* LayoutTests/imported/w3c/web-platform-tests/encoding/unsupported-labels.window-expected.txt: Renamed from LayoutTests/imported/w3c/web-platform-tests/encoding/unsupported-labels-expected.txt.
* LayoutTests/imported/w3c/web-platform-tests/encoding/unsupported-labels.window.html: Copied from LayoutTests/imported/w3c/web-platform-tests/encoding/api-invalid-label.any.html.
* LayoutTests/imported/w3c/web-platform-tests/encoding/unsupported-labels.window.js: Renamed from LayoutTests/imported/w3c/web-platform-tests/encoding/resources/unsupported-labels.window.js.
(string_appeared_here.forEach.label.async_test.t.frame.onload.t.step_func_done):
* LayoutTests/imported/w3c/web-platform-tests/encoding/w3c-import.log:
* LayoutTests/tests-options.json:

Ms2ger (Contributor) left a comment


Rubber stamp, but I would appreciate it if @hsivonen could take a look if he has time.

annevk (Member, Author) commented Sep 19, 2024

Thanks and agreed, but going to land this as this was already reviewed in WebKit and the current expectations are not something anyone should ever try to match.

@annevk annevk merged commit 06682bd into master Sep 19, 2024
19 checks passed
@annevk annevk deleted the gb18030-2022 branch September 19, 2024 08:41
annevk added a commit to whatwg/encoding that referenced this pull request Oct 4, 2024
This implements the Unicode Technical Committee recommendation around GB18030-2022 in a manner suitable for this standard, taking into account existing practice and the closeness between GBK and gb18030.

In particular, using the text file attached to https://www.unicode.org/L2/L2023/23003r-gb18030-recommendations.pdf this does the following:

1. Merges the first set of 18 mappings, which are bidirectional, directly into index gb18030, replacing existing PUA entries. This ends up impacting GBK and gb18030.
2. The second set of 18 mappings (from PUA to bytes) is encoded as an encoder-only table, for both GBK and gb18030.
3. The third set of 18 mappings (from bytes to code points) is ignored, as it is already covered by index gb18030 ranges. (Presumably it is included because the recommendation covers the transition from "Previous Mappings" to "Current Mappings" to "Recommended Mappings", whereas we are going directly from "Previous Mappings" to "Recommended Mappings".)

GBK is changed as well because Chromium and WebKit already have code in the wild that impacts GBK to some degree, and no fallout has been recorded. (At the moment they exclude the encoder-only table for GBK; including it would make the most sense compatibility-wise.) Additionally, GBK is already positioned as a rough subset of gb18030 in this standard, with the decoder being shared completely.

Tests: encoding/legacy-mb-schinese has some GB18030-2022 coverage already. This is completed with web-platform-tests/wpt#48239 and web-platform-tests/wpt#48240.

This supersedes #335. This fixes #27 and fixes #312.

This also updates the description of index gb18030 ranges to account for #22 (the change from GB18030-2000 to -2005), which it had not done until now.

4 participants