System.Text.Encodings.Web refactoring and code modernization #49373

GrabYourPitchforks · 2021-03-09T18:04:39Z

Background: System.Text.Encodings.Web contains significant amounts of unsafe code. Part of this is due to the fact that the abstraction itself is pointer-based. And part of it is due to efforts to increase performance in hot paths. However, because these code patterns end up in hot paths and tight loops, it's difficult to foresee all the different edge cases that might crop up. This can manifest itself as a reliability or a security problem. Given that this code is intended to run over untrusted input, this is not ideal.

High-level overview of this PR

This PR refactors the System.Text.Encodings.Web project, modernizes much of the code to use Span<T> and other safer APIs as appropriate, and fixes a handful of outstanding bugs.

- Unsafe code has been removed from hot paths where possible and refactored into separate reviewable and testable helper methods.
- Vestigial code paths in both the source project and the test project have been removed. (The test project in particular contained many ancient artifacts and non-shipping adapters resulting from this code originally existing in the old pre-1.0 aspnet repository.)
- The optimized workhorse logic for the inbox HTML, URL, JSON, and JSON-relaxed encoders has been moved into a single class, with the individual encoders now only responsible for dictating the representation of a single scalar value.
  - For custom encoders (where the user has subclassed one of our abstract types), we fall back to naïve but more universally correct logic.
  - For inbox encoders, the fast workhorse routine can make assumptions about how they'll escape data, giving significant performance wins over previous iterations of the logic. This optimized logic was previously unique to the JSON-relaxed encoder, but it has been generalized and extended to all inbox encoders as part of this refactoring.

A brief tour of the files

AllowedBmpCodePointsBitmap.cs - a bitmap of "allowed vs. disallowed" flags for all BMP characters. The implementation of this class is unsafe and requires close review. However, all entry points are guaranteed safe, and there's a standalone unit test exercising edge cases for the unsafe implementation.

AsciiByteMap.cs - a simple map of ASCII characters to single bytes, used for quick lookup by index. The implementation of this class is unsafe and requires close review. However, all entry points are guaranteed safe, and there's a standalone unit test exercising edge cases for the unsafe implementation.

Default[Html|Url|JavaScript]Encoder.cs - the in-box implementations of HtmlEncoder, UrlEncoder, and JavaScriptEncoder. There is no longer a separate implementation for the default inbox JSON encoder vs. the unsafe relaxed JSON encoder: they both filter down to shared logic in DefaultJavaScriptEncoder.cs. These files also contain the core "how do I perform HTML / URL / JS escaping?" logic. These files now contain only safe code, modulo overriding some existing unsafe APIs and forwarding the arguments elsewhere.

[Html|Url|JavaScript]Encoder.cs - provide static factories around the Default\*Encoder types. There's no longer any real logic in these types.

OptimizedInboxTextEncoder.cs - contains the shared "find which characters need escaping and write out the escaped form" logic used by all of the in-box encoders. There's no longer a separate code path for JSON vs. everything else. There are some unsafe method overrides, but for the most part they just forward arguments and don't do anything particularly interesting. **The implementation of GetIndexOfFirstCharToEncode is unsafe and requires close review.**

OptimizedInboxTextEncoder.Ascii.cs - contains optimized lookup tables for ASCII escaping. The implementation of these methods is unsafe and requires close review. However, all entry points are guaranteed safe, and there are standalone unit tests for these APIs.

OptimizedInboxTextEncoder.Ssse3.cs - contains SSSE3-optimized "find the first char / byte to escape" logic. **The implementation of these methods is unsafe and requires close review.**

SpanUtility.cs - contains helper methods for working with and writing data to spans. The implementation of these methods is unsafe and requires close review. However, all entry points are guaranteed safe, and there are standalone unit tests for these APIs.

TextEncoder.cs - contains naïve "find which characters need escaping and write out the escaped form" logic that can work for generalized encoding that doesn't fulfill the contracts provided by our inbox encoders. There are also shared helper optimization methods for handling string escaping, etc. The implementation of these methods is safe, modulo some unsafe method overrides that forward to safe alternatives.

Polyfill\*.cs - contains internal polyfill implementations for APIs which are missing from downlevel.
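
To make the "bitmap of allowed vs. disallowed flags for all BMP characters" idea from the tour concrete, here is a minimal safe-code sketch. It is not the library's AllowedBmpCodePointsBitmap; the type name `BmpAllowListSketch` and its members are hypothetical, and the array-of-`uint` layout is an assumption chosen for readability.

```csharp
using System;

// Hypothetical illustration; not the library's AllowedBmpCodePointsBitmap.
public sealed class BmpAllowListSketch
{
    // 65,536 BMP code units / 32 bits per uint = 2,048 words.
    private readonly uint[] _bits = new uint[0x10000 / 32];

    public void Allow(char value)
    {
        // A char is always in [0, 0xFFFF], so (value >> 5) is always in
        // [0, 2047] and the array's own bounds check can never fail.
        _bits[value >> 5] |= 1u << (value & 0x1F);
    }

    public void Forbid(char value)
    {
        _bits[value >> 5] &= ~(1u << (value & 0x1F));
    }

    public bool IsAllowed(char value)
    {
        return (_bits[value >> 5] & (1u << (value & 0x1F))) != 0;
    }

    // Supplementary code points (above U+FFFF) and negative values have no
    // slot in a BMP bitmap, so this sketch reports them as not allowed.
    public bool IsAllowed(int codePoint)
    {
        return (uint)codePoint <= char.MaxValue && IsAllowed((char)codePoint);
    }
}
```

Because every entry point takes a `char` or validates its `int` argument, no caller can index outside the bitmap, which is the property the tour describes as "all entry points are guaranteed safe".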

Of special note is that the unsafe code is refactored in such a way that only the implementations bolded above have unsafe entry points. Other helper types which have unsafe implementations (like AsciiByteMap) have guaranteed-safe entry points and perform argument validation, and these helper types have their own suite of unit tests to help exercise edge cases. This should give high confidence that these helpers remain safe to call even in the face of a safely-written workhorse routine passing them bad data. The APIs bolded above (with unsafe entry points) are the ones that require closer review since they cannot be exercised in isolation from within unit tests. However, the unit test file InboxEncoderCommonTests.cs does try its best to provide various-length inputs to help detect issues. The unit tests are also scaffolded with the BoundedMemory<T> infrastructure to provide further detection of out-of-bounds memory accesses.
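
As an illustration of what "guaranteed-safe entry points" with argument validation can look like, here is a small sketch in fully safe code. The type `AsciiMapSketch` and its methods are hypothetical and are not the library's AsciiByteMap; the convention that a stored value of 0 means "no entry" is an assumption made for this example.

```csharp
using System;

// Hypothetical illustration of a helper with guaranteed-safe entry points;
// not the library's AsciiByteMap.
public sealed class AsciiMapSketch
{
    // One slot per ASCII code point; 0 is reserved to mean "no entry".
    private readonly byte[] _map = new byte[128];

    public void Insert(char key, byte value)
    {
        if (key >= 128) throw new ArgumentOutOfRangeException(nameof(key));
        if (value == 0) throw new ArgumentOutOfRangeException(nameof(value));
        _map[key] = value;
    }

    public bool TryLookup(int codePoint, out byte value)
    {
        // A single unsigned comparison rejects negative and non-ASCII input
        // before the array is touched.
        if ((uint)codePoint < (uint)_map.Length)
        {
            value = _map[codePoint];
            return value != 0;
        }

        value = 0;
        return false;
    }
}
```

A unit test can then feed such a helper deliberately bad arguments (negative values, values just past 127) in isolation, which is the kind of edge-case coverage the paragraph above refers to.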

Performance

Performance numbers and discussion will be left as a comment within the issue.

Other notes for reviewers

The package no longer builds for netstandard2.1 or netcoreapp3.0. Instead, everything is unified as follows:

- net60: inbox version as part of the .NET 6 wave.
- netcoreapp31: OOB version to install into .NET Core 3.1 apps.
- net461: OOB version for .NET Framework 4.6.1+ (see Eric's comment here).
- netstandard20: OOB version for all other platforms and runtimes.

.NET Core 3.0 is already out of support, and .NET Core 2.1 will be out of support by the time this package RTMs. I don't think there's a need to include special DLLs targeting these runtimes. Additionally, even though this is not checked in yet, I'd like to stop harvesting the netstandard1.0 DLL into this package. Pretty much all apps should be targeting a netstandard2.0-capable platform at this point.

The existing SSE2 and ADVSIMD optimizations have been removed as part of this PR. The reason for this is that there's no longer a need for a "does this vector contain only ASCII bytes?" helper method. Instead, the SIMD ASCII-processing code paths have been written in terms of a pshufb-equivalent. For x86, this requires SSSE3. The ARM64 equivalent code path was never checked in to this library. That work will need to take place in order to restore the performance on ARM64. (/cc @carlossanlop @eiriktsarpalis)
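
For readers unfamiliar with why a byte shuffle helps here: pshufb acts as sixteen parallel lookups into a 16-entry byte table, so a per-nibble table can classify a whole vector of ASCII bytes at once. The scalar sketch below shows one common way to lay out such a table. It is an assumption for illustration, not a transcription of this PR's code, and the names `NibbleLookupSketch`, `BuildTable`, and `IndexOfFirstByteToEscape` are hypothetical.

```csharp
using System;

// Scalar model of a nibble-indexed allow table, the kind of lookup a
// pshufb-style shuffle can perform for 16 bytes at a time.
public static class NibbleLookupSketch
{
    // Entry l (indexed by the low nibble) holds one bit per high nibble:
    // bit h is set when the ASCII byte ((h << 4) | l) is allowed unescaped.
    public static byte[] BuildTable(Func<byte, bool> isAllowed)
    {
        var table = new byte[16];
        for (int b = 0; b < 128; b++)
        {
            if (isAllowed((byte)b))
            {
                table[b & 0xF] |= (byte)(1 << (b >> 4));
            }
        }
        return table;
    }

    // Returns the index of the first byte that is non-ASCII or not allowed,
    // or -1 if every byte can be copied through as-is.
    public static int IndexOfFirstByteToEscape(ReadOnlySpan<byte> data, ReadOnlySpan<byte> table)
    {
        if (table.Length != 16) throw new ArgumentException("Table must have 16 entries.", nameof(table));

        for (int i = 0; i < data.Length; i++)
        {
            byte b = data[i];
            if (b >= 0x80 || (table[b & 0xF] & (1 << (b >> 4))) == 0)
            {
                return i;
            }
        }
        return -1;
    }
}
```

A vectorized version would replace the loop body with two shuffles (one keyed on low nibbles, one producing the high-nibble bit), an AND, and a compare against zero; that part is omitted here because it depends on the target instruction set.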

Fixes #39829.
Fixes #45994.
Fixes #48519.

Ref: CVE-2021-26701 (MSRC 62749)

- Refactor unsafe code from TextEncoder workhorse routines into standalone helpers
- Fix bounds check logic in workhorse routines
- Remove vestigial code from the library and unit test project
- Add significant unit test coverage for the workhorse routines and unsafe helpers

ghost · 2021-03-09T18:04:46Z

Tagging subscribers to this area: @tarekgh, @eiriktsarpalis, @layomia
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	GrabYourPitchforks
Assignees:	-
Labels:	`area-System.Text.Encodings.Web`
Milestone:	-

GrabYourPitchforks · 2021-03-09T18:04:49Z

Performance results

Raw performance numbers

Method	Job	Toolchain	Arg	Encoder	Mean	Error	StdDev	Ratio	RatioSD
FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	HTML	8.694 ns	0.0539 ns	0.0478 ns	2.92	0.03
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	<div (...)/div> [38]	HTML	2.972 ns	0.0230 ns	0.0204 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	HTML	6.325 ns	0.0491 ns	0.0435 ns	0.02	0.00
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	<div (...)/div> [38]	HTML	319.785 ns	3.3888 ns	3.0041 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	HTML	91.049 ns	1.2822 ns	1.1366 ns	0.43	0.01
EncodeToStringUtf16	Job-BRHPCW	main	<div (...)/div> [38]	HTML	211.922 ns	1.3099 ns	1.2253 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	HTML	66.198 ns	0.5649 ns	0.5007 ns	0.31	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	<div (...)/div> [38]	HTML	216.714 ns	0.9033 ns	0.8008 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	HTML	55.146 ns	0.2212 ns	0.1961 ns	0.43	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	<div (...)/div> [38]	HTML	128.954 ns	0.6804 ns	0.6031 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Default	7.189 ns	0.1735 ns	0.2065 ns	1.38	0.04
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Default	5.258 ns	0.0283 ns	0.0265 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Default	5.606 ns	0.0227 ns	0.0212 ns	0.96	0.01
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Default	5.867 ns	0.0406 ns	0.0339 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Default	102.222 ns	0.6249 ns	0.4879 ns	0.44	0.00
EncodeToStringUtf16	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Default	233.505 ns	0.8457 ns	0.7911 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Default	71.749 ns	0.1845 ns	0.1541 ns	0.33	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Default	214.514 ns	0.5218 ns	0.4357 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Default	53.996 ns	0.2096 ns	0.1858 ns	0.42	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Default	129.970 ns	0.9891 ns	0.8769 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Relaxed	7.055 ns	0.0332 ns	0.0311 ns	0.85	0.01
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Relaxed	8.323 ns	0.0383 ns	0.0358 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Relaxed	5.615 ns	0.0203 ns	0.0190 ns	0.95	0.01
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Relaxed	5.911 ns	0.0436 ns	0.0387 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Relaxed	59.882 ns	0.4918 ns	0.4600 ns	0.39	0.00
EncodeToStringUtf16	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Relaxed	154.046 ns	1.1448 ns	0.9560 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Relaxed	46.645 ns	0.2107 ns	0.1868 ns	0.33	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Relaxed	139.924 ns	0.3554 ns	0.3151 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	JSON-Relaxed	65.963 ns	0.3243 ns	0.3033 ns	0.57	0.01
EncodeToBufferUtf8	Job-BRHPCW	main	<div (...)/div> [38]	JSON-Relaxed	115.284 ns	0.9305 ns	0.8704 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	URL	7.178 ns	0.0400 ns	0.0355 ns	1.88	0.03
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	<div (...)/div> [38]	URL	3.817 ns	0.0686 ns	0.0642 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	URL	6.263 ns	0.0366 ns	0.0343 ns	0.02	0.00
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	<div (...)/div> [38]	URL	326.791 ns	4.4165 ns	4.1312 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	URL	84.182 ns	1.0765 ns	1.0069 ns	0.38	0.00
EncodeToStringUtf16	Job-BRHPCW	main	<div (...)/div> [38]	URL	224.395 ns	1.0459 ns	0.9784 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	<div (...)/div> [38]	URL	63.745 ns	0.3234 ns	0.3025 ns	0.30	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	<div (...)/div> [38]	URL	214.040 ns	0.7012 ns	0.5855 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	<div (...)/div> [38]	URL	61.078 ns	0.2520 ns	0.2358 ns	0.39	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	<div (...)/div> [38]	URL	157.103 ns	1.7207 ns	1.6096 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	HTML	9.042 ns	0.1458 ns	0.1364 ns	0.29	0.00
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	The q(...) dog. [44]	HTML	31.407 ns	0.1234 ns	0.1154 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	HTML	8.910 ns	0.0453 ns	0.0424 ns	0.03	0.00
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	The q(...) dog. [44]	HTML	332.453 ns	3.5721 ns	3.3414 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	HTML	7.796 ns	0.0469 ns	0.0439 ns	0.24	0.00
EncodeToStringUtf16	Job-BRHPCW	main	The q(...) dog. [44]	HTML	32.159 ns	0.1531 ns	0.1432 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	HTML	15.169 ns	0.0854 ns	0.0798 ns	0.37	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	The q(...) dog. [44]	HTML	41.074 ns	0.3558 ns	0.3328 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	HTML	13.366 ns	0.0721 ns	0.0674 ns	0.12	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	The q(...) dog. [44]	HTML	114.260 ns	0.7250 ns	0.6782 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Default	8.224 ns	0.0471 ns	0.0440 ns	0.64	0.00
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Default	12.845 ns	0.0530 ns	0.0496 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Default	7.335 ns	0.0394 ns	0.0349 ns	0.71	0.00
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Default	10.261 ns	0.0523 ns	0.0437 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Default	7.407 ns	0.0617 ns	0.0547 ns	0.52	0.02
EncodeToStringUtf16	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Default	14.645 ns	0.3429 ns	0.4918 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Default	15.193 ns	0.0691 ns	0.0612 ns	0.73	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Default	20.960 ns	0.1974 ns	0.1846 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Default	14.895 ns	0.3277 ns	0.3218 ns	0.12	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Default	120.529 ns	1.0115 ns	0.8446 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Relaxed	9.141 ns	0.2108 ns	0.2165 ns	0.49	0.01
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Relaxed	18.614 ns	0.1031 ns	0.0914 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Relaxed	8.393 ns	0.2006 ns	0.5107 ns	0.35	0.01
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Relaxed	23.774 ns	0.1035 ns	0.0968 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Relaxed	9.884 ns	0.7963 ns	2.3478 ns	0.45	0.07
EncodeToStringUtf16	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Relaxed	21.797 ns	0.1602 ns	0.1499 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Relaxed	15.261 ns	0.0958 ns	0.0849 ns	0.59	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Relaxed	25.686 ns	0.1634 ns	0.1448 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	JSON-Relaxed	14.727 ns	0.1172 ns	0.1039 ns	0.13	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	The q(...) dog. [44]	JSON-Relaxed	114.112 ns	0.7554 ns	0.7066 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	URL	7.182 ns	0.0645 ns	0.0604 ns	1.24	0.01
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	The q(...) dog. [44]	URL	5.801 ns	0.0428 ns	0.0357 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	URL	6.112 ns	0.0390 ns	0.0346 ns	0.02	0.00
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	The q(...) dog. [44]	URL	327.942 ns	1.8296 ns	1.5278 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	URL	79.203 ns	0.4831 ns	0.4519 ns	0.32	0.00
EncodeToStringUtf16	Job-BRHPCW	main	The q(...) dog. [44]	URL	248.049 ns	1.5316 ns	1.4327 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	The q(...) dog. [44]	URL	58.186 ns	0.2487 ns	0.2205 ns	0.25	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	The q(...) dog. [44]	URL	230.507 ns	0.8315 ns	0.7778 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	The q(...) dog. [44]	URL	57.857 ns	0.2623 ns	0.2325 ns	0.33	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	The q(...) dog. [44]	URL	174.010 ns	1.9451 ns	1.8194 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	HTML	8.513 ns	0.0500 ns	0.0468 ns	2.44	0.01
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	HTML	3.483 ns	0.0177 ns	0.0157 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	HTML	11.363 ns	0.0419 ns	0.0392 ns	0.04	0.00
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	HTML	321.798 ns	2.2619 ns	2.1158 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	HTML	656.485 ns	5.2886 ns	4.6882 ns	0.72	0.01
EncodeToStringUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	HTML	914.875 ns	6.3256 ns	5.6075 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	HTML	595.451 ns	4.0190 ns	3.5628 ns	0.68	0.01
EncodeToBufferUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	HTML	878.506 ns	17.5687 ns	16.4337 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	HTML	778.128 ns	2.9941 ns	2.6542 ns	0.40	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	HTML	1,940.292 ns	21.4708 ns	19.0333 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Default	7.065 ns	0.0454 ns	0.0425 ns	1.34	0.01
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Default	5.261 ns	0.0289 ns	0.0271 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Default	10.291 ns	0.1681 ns	0.1573 ns	1.76	0.03
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Default	5.842 ns	0.0357 ns	0.0298 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Default	504.770 ns	2.4158 ns	2.2597 ns	0.54	0.01
EncodeToStringUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Default	933.988 ns	18.2717 ns	20.3089 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Default	475.763 ns	2.1843 ns	1.9363 ns	0.53	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Default	904.113 ns	3.0337 ns	2.3685 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Default	571.503 ns	3.2688 ns	2.8977 ns	0.28	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Default	2,044.989 ns	18.8413 ns	17.6242 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Relaxed	38.437 ns	0.2185 ns	0.2043 ns	0.58	0.01
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Relaxed	66.766 ns	0.5176 ns	0.4842 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Relaxed	245.429 ns	3.0871 ns	2.8876 ns	0.98	0.01
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Relaxed	251.336 ns	1.4148 ns	1.3234 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Relaxed	37.532 ns	0.1692 ns	0.1582 ns	0.54	0.00
EncodeToStringUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Relaxed	70.017 ns	0.2584 ns	0.2158 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Relaxed	43.956 ns	0.1667 ns	0.1559 ns	0.60	0.00
EncodeToBufferUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Relaxed	73.799 ns	0.4131 ns	0.3662 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	JSON-Relaxed	250.019 ns	1.6540 ns	1.5472 ns	0.64	0.01
EncodeToBufferUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	JSON-Relaxed	390.873 ns	1.4870 ns	1.3909 ns	1.00	0.00

FindFirstCharToEncodeUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	URL	7.149 ns	0.0388 ns	0.0363 ns	1.81	0.01
FindFirstCharToEncodeUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	URL	3.954 ns	0.0200 ns	0.0187 ns	1.00	0.00

FindFirstCharToEncodeUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	URL	11.343 ns	0.0374 ns	0.0350 ns	0.03	0.00
FindFirstCharToEncodeUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	URL	333.462 ns	2.7372 ns	2.5604 ns	1.00	0.00

EncodeToStringUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	URL	588.343 ns	2.6690 ns	2.2287 ns	0.73	0.01
EncodeToStringUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	URL	800.937 ns	6.3436 ns	5.2972 ns	1.00	0.00

EncodeToBufferUtf16	Job-QJLEZH	encoder	Лорем(...) хис. [68]	URL	523.123 ns	3.7607 ns	3.5177 ns	0.70	0.01
EncodeToBufferUtf16	Job-BRHPCW	main	Лорем(...) хис. [68]	URL	751.730 ns	4.6596 ns	4.3586 ns	1.00	0.00

EncodeToBufferUtf8	Job-QJLEZH	encoder	Лорем(...) хис. [68]	URL	607.154 ns	2.3643 ns	2.2115 ns	0.32	0.00
EncodeToBufferUtf8	Job-BRHPCW	main	Лорем(...) хис. [68]	URL	1,889.955 ns	7.3792 ns	6.1619 ns	1.00	0.00

Benchmark code

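using System;
using System.Buffers;
using System.Runtime.CompilerServices;
using System.Text;
using System.Text.Encodings.Web;
using BenchmarkDotNet.Attributes;

namespace ConsoleAppBenchmark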
{
    [SkipLocalsInit]
    public class TextEncoderRunner
    {
        [Params(
            "The quick brown fox jumps over the lazy dog.", // no escaping needed ever
            "<div id=\"myDiv\">Escape &amp; me!</div>", // contains some HTML / URL / JSON-sensitive chars
            "Лорем ипсум долор сит амет, цоммуне малуиссет цонцлудатуряуе ад хис.")] // Cyrillic lipsum; no escaping needed (when Cyrillic allowed)
        public string Arg { get; set; }
        private byte[] _argUtf8;
        private char[] _scratchBuffer = new char[1024];
        private byte[] _scratchUtf8Buffer = new byte[1024];

        [Params("HTML", "URL", "JSON-Default", "JSON-Relaxed")]
        public string Encoder { get; set; }
        private TextEncoder _encoder;

        [GlobalSetup]
        public void Setup()
        {
            _argUtf8 = Encoding.UTF8.GetBytes(Arg);
            _encoder = Encoder switch
            {
                "HTML" => HtmlEncoder.Default,
                "URL" => UrlEncoder.Default,
                "JSON-Default" => JavaScriptEncoder.Default,
                "JSON-Relaxed" => JavaScriptEncoder.UnsafeRelaxedJsonEscaping,
                _ => throw new Exception("Unknown encoder."),
            };
        }

        [Benchmark]
        public unsafe int FindFirstCharToEncodeUtf16()
        {
            string arg = Arg;
            _ = arg.Length; // deref; prove not null

            fixed (char* pArg = arg)
            {
                return _encoder.FindFirstCharacterToEncode(pArg, arg.Length);
            }
        }

        [Benchmark]
        public int FindFirstCharToEncodeUtf8()
        {
            byte[] argUtf8 = _argUtf8;
            _ = argUtf8.Length; // deref; prove not null
            return _encoder.FindFirstCharacterToEncodeUtf8(argUtf8);
        }

        [Benchmark]
        public string EncodeToStringUtf16()
        {
            return _encoder.Encode(Arg);
        }

        [Benchmark]
        public OperationStatus EncodeToBufferUtf16()
        {
            string arg = Arg;
            _ = arg.Length; // deref; prove not null

            char[] dest = _scratchBuffer;
            _ = dest.Length; // deref; prove not null

            return _encoder.Encode(arg, dest, out _, out _);
        }

        [Benchmark]
        public OperationStatus EncodeToBufferUtf8()
        {
            byte[] argUtf8 = _argUtf8;
            _ = argUtf8.Length; // deref; prove not null

            byte[] dest = _scratchUtf8Buffer;
            _ = dest.Length; // deref; prove not null

            return _encoder.EncodeUtf8(argUtf8, dest, out _, out _);
        }
    }
}

Performance discussions

Performance is generally better across the board, often significantly so. The performance improvement comes from three main places:

- The "skip over all ASCII chars which don't require encoding" logic is now SIMD-optimized (on x64) for all encoders, not just the JSON encoder.
- The newly refactored helper methods reduce the number of unnecessary bounds checks in the safe workhorse routines. Bounds checking still takes place, but it is folded into the subsequent dereference or otherwise results in a future bounds check being elided where possible.
- The helper routines utilize data structures with simplified (C-style) memory layouts rather than bouncing through array-based indirections.

We also take advantage of recent PRs like #49180 to reduce the number of duplicate checks occurring inside our hot paths, opting to hoist these checks outside of the loop where possible.

The notable exception to the performance improvement is the FindFirstCharToEncodeUtf16 method. This method incurs a fixed (O(1)) overhead on method entry due to setting up the SIMD data structures. If the first character to encode occurs at the very beginning of the string, this overhead will show up as a 3 - 4 ns loss when compared to a simple char-by-char loop. Since the runtime of such a method was already very low, this 3 - 4 ns loss appears to be a significant overhead when seen as a ratio. I do not believe this to affect the common use case for these APIs, as the typical calling pattern is to call Encode(string) or similar API. That API uses FindFirstCharToEncode* as a workhorse routine, and the linear (O(n)) savings we see from the Encode* methods more than make up for any fixed loss due to SIMD overhead.

Tornhoof · 2021-03-09T20:15:20Z

src/libraries/System.Text.Encodings.Web/src/System/Text/Encodings/Web/DefaultHtmlEncoder.cs

+            {
+                if (value.Value == '<')
+                {
+                    if (!SpanUtility.TryWriteBytes(destination, (byte)'&', (byte)'l', (byte)'t', (byte)';')) { goto OutOfSpace; }


Maybe I'm missing something crucial here, but wouldn't it be better to pass a uint directly and call BinaryPrimitives.ReverseEndianness if required? ReverseEndianness is a single bswap, compared to the multiple shifts and ORs in TryWriteBytes. Or even pass the appropriately byte-reversed value directly if required?
If I read the sharplab asm correctly, apparently having something like

uint v = (byte)'&' << 24 | (byte)'l' << 16 | (byte)'t' << 8 | (byte)';';
if (BitConverter.IsLittleEndian) { v = BinaryPrimitives.ReverseEndianness(v); }

is by now evaluated as a runtime constant (never realized that before)
https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIAYACY8gOgCEBXAMy5il3YBLAHbYoATwDcNek1YAlDsIyD8MFgGEI+AA6CANnwDKfAG6CwMXNNqNmLRctXqAksr4QdJqOcvWaMgDMTKQMGgwA3jQMMUzBxCgMALIAFACUkdGx2dlGOtjCADzA4hgwAHwMwPocwAwAvAy4GNhgANbY+voQYFWlMADapHQAujY5EwwcIhgMpg0MKSVlaQDkAGSrDIWFDKSJhIvLMGv6WzsM5EgMh0v9axjnuwAcN0f3q5KfWZOxglyLNiCDBaYSmPhlKAsFy4AAywIwhgAosIACaCAppH6/TLUHGTeaNIGiCQABSgqmBgnBAnkMHB/BgKPRBWEVlwKVMaXG+IYAF8AnjeQAVCQAdQpZTY/Q51VqaDm3OxAuo2OyAySMAwAAsIKiXLp9ClNTq9QadPoAPI6FQQYQCACCAHMnbBcLhqTA3PoRCInWkRmrYsR4lcqhAIPoGKLxBLgTBpWUOXkCsV+pVcPlhArpso5p0OCcgzEokL8f9FpmCixYTBhE6dQxyo0UFiyzjS7ycgBVe3YXgsONlXudQROtmowq5jDlFKwAGa/DQcRJMS4bWdFgAcS1dN4sGElhSVeEaQVpgLJx5XZixAA7AwMFBC9ecSquzB9LgYMWcp2b7eD5cJ036vr876TCqfJAA

that would allow you to keep the bytes visible and not require some "strange" magic value.

The goal was to keep the call site readable: "I'm writing these bytes in this order." If we want to change the implementation to call ReverseEndianness as an implementation detail I'm fine with that. You're right in that the ReverseEndianness API was intended to be optimized by the JIT as const input -> const output.
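A minimal sketch of what that could look like, assuming a helper with the same shape as the call site in the diff (the class name and method body here are hypothetical, not the PR's implementation). The bytes are composed in call order as a big-endian value, and BinaryPrimitives.TryWriteUInt32BigEndian performs both the length check and, on little-endian machines, the byte swap.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical sketch; not the SpanUtility implementation from this PR.
public static class FourByteWriteSketch
{
    // Writes a, b, c, d to span[0..4] in that order. Returns false if the span is too short.
    public static bool TryWriteBytes(Span<byte> span, byte a, byte b, byte c, byte d)
    {
        // Compose the value so that 'a' is the most significant byte (big-endian order).
        uint bigEndian = ((uint)a << 24) | ((uint)b << 16) | ((uint)c << 8) | d;

        // Reverses the byte order internally when running on a little-endian machine.
        return BinaryPrimitives.TryWriteUInt32BigEndian(span, bigEndian);
    }
}
```

The call site stays readable ("these bytes, in this order"), and when every argument is a constant the JIT has the opportunity to fold the shifts and the swap into a single constant store.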

gfoidl

(Only halfway through the code so far... enough for me today.)

gfoidl · 2021-03-09T20:28:40Z

src/libraries/System.Text.Encodings.Web/src/System/Text/Encodings/Web/SpanUtility.cs

+                uint value;
+                if (BitConverter.IsLittleEndian)
+                {
+                    value = ((uint)d << 24) | ((uint)c << 16) | ((uint)b << 8) | a;


I believe writing the individual bytes is faster, as:

- this produces quite a lot of machine code
- CPUs have a store buffer, so flushing to L1 will be done in bigger "chunks" anyway

A quick micro-benchmark (on kaby-lake) proves that:

| Method |     Mean |     Error |    StdDev | Ratio | RatioSD |
|------- |---------:|----------:|----------:|------:|--------:|
| A      | 1.335 ns | 0.0656 ns | 0.1022 ns |  1.00 |    0.00 |
| B      | 1.178 ns | 0.0620 ns | 0.0608 ns |  0.85 |    0.08 |

A...your code
B...individual writes

What's the asm output on your box? On my box this code produces a single mov dword ptr [foo], CONST instruction, which beats the performance of four mov byte ptr [foo + i], CONST instructions.

mov dword ptr [foo], CONST

Really a const if a, b, ... are arguments?
Or just in the specific case where the method is inlined and the arguments can be evaluated as constant values?
In this case of course that's better.

For asm (note: I'm not on latest main-branch):

; Bench.A()
; ...
       cmp       r8d,4
       jl        short M00_L02
       shl       ecx,18
       shl       r10d,10
       or        ecx,r10d
       shl       r9d,8
       or        ecx,r9d
       or        eax,ecx
       mov       [rdx],eax
       mov       eax,1
       jmp       short M00_L03
M00_L02:
       xor       eax,eax
M00_L03:
       ret

; Bench.B()
; ...
       cmp       r8d,4
       jl        short M00_L02
       mov       [rdx],al
       mov       [rdx+1],r9b
       mov       [rdx+2],r10b
       mov       [rdx+3],cl
       mov       eax,1
       jmp       short M00_L03
M00_L02:
       xor       eax,eax
M00_L03:
       ret

The micro-benchmark is very flaky, but B is always faster.
Although it's a micro benchmark, that doesn't take into account that the store buffer may be full, only one store can be dispatched per cycle, etc. as it can be on real world workloads.

@gfoidl The specific use case for this API is that all value parameters are constants. That's also called out in the devdoc on the API. This causes the JIT to const-fold everything.
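For readers following along, the two variants under discussion can be sketched roughly as below. These are hypothetical stand-ins written with safe APIs, not the benchmarked code: variant A combines the bytes and issues one 32-bit store, variant B issues four single-byte stores. Both produce the same bytes; the thread is only about which one the JIT compiles better when the arguments are, or are not, constants.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical names; a sketch of the two approaches, not the code that was benchmarked.
public static class WriteVariantsSketch
{
    // Variant A: combine into one value, then a single 32-bit store.
    public static bool TryWriteCombined(Span<byte> span, byte a, byte b, byte c, byte d)
    {
        // 'a' goes to the lowest address, so it occupies the low 8 bits of a little-endian value.
        uint value = ((uint)d << 24) | ((uint)c << 16) | ((uint)b << 8) | a;
        return BinaryPrimitives.TryWriteUInt32LittleEndian(span, value);
    }

    // Variant B: four individual byte stores after one explicit length check.
    public static bool TryWriteIndividually(Span<byte> span, byte a, byte b, byte c, byte d)
    {
        if (span.Length < 4)
        {
            return false;
        }
        span[0] = a;
        span[1] = b;
        span[2] = c;
        span[3] = d;
        return true;
    }
}
```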

GrabYourPitchforks · 2021-03-09T21:50:10Z

I'm investigating why this is failing in CI. Unit tests pass cleanly on my box.

GrabYourPitchforks · 2021-03-10T00:36:26Z

Hello JSON crew! You're pinged on this review because it changes the underlying System.Text.Encodings.Web implementation, and I had to adjust one of the System.Text.Json unit tests to account for the change. The System.Text.Json-specific change is cf8e998. It's a unit test only change. Basically, when the encoder sees invalid UTF-* data, it replaces that data with U+FFFD ('�') in the response. The unit test change makes the test resilient against the response containing either a literal '�' character or the escaped "\uFFFD" form, which are equivalent in JSON.
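As a rough illustration of what "resilient against either form" means, a check along these lines accepts both representations. The helper name is hypothetical and this is not the actual test code from cf8e998.

```csharp
using System;

// Hypothetical helper; not the System.Text.Json test change itself.
public static class ReplacementCharCheckSketch
{
    // True if the JSON text contains U+FFFD either literally or as a \uFFFD escape.
    public static bool ContainsReplacementChar(string json)
    {
        return json.Contains('\uFFFD')
            || json.Contains("\\uFFFD", StringComparison.OrdinalIgnoreCase);
    }
}
```

The case-insensitive comparison is there because JSON permits the hex digits of a \u escape in either case.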

GrabYourPitchforks · 2021-03-10T02:27:19Z

The "all configurations" broken CI leg should be fixed by #49396.

- Update test csproj to include missing polyfills - Fix net461 test compilation failures

GrabYourPitchforks · 2021-03-10T07:33:23Z

/azp run runtime

azure-pipelines · 2021-03-10T07:33:46Z

Azure Pipelines successfully started running 1 pipeline(s).

gfoidl

I checked the SSE code and this looks good to me.

Left some nits.

gfoidl · 2021-03-10T12:06:10Z

...s/System.Text.Encodings.Web/src/System/Text/Encodings/Web/OptimizedInboxTextEncoder.Ssse3.cs

+                } while ((i += 16) < lastLegalIterationFor16CharRead);
+            }
+
+            if ((lengthInBytes & 8) != 0)


👍 this is clever and produces nice test jmp-combo that can be fused.
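For anyone unfamiliar with the trick: once the main loop has consumed every whole 16-byte block, fewer than 16 bytes remain, so the remainder equals lengthInBytes & 15 and each bit of the original length says whether an 8-byte, a 4-byte, or a shorter tail exists. The sketch below shows the control flow on a toy "sum the bytes" routine. The names and the loop bound are my own and differ from the PR's code.

```csharp
using System;

// Toy example of tail handling by testing bits of the length; not the PR's SSSE3 code.
public static class TailBitsSketch
{
    public static int Sum(ReadOnlySpan<byte> data)
    {
        int length = data.Length;
        int i = 0;
        int sum = 0;

        // Main loop: whole 16-byte blocks.
        int lastLegalOffset = length - 16; // greatest offset where 16 bytes can still be read
        if (lastLegalOffset >= 0)
        {
            do
            {
                sum += SumBlock(data.Slice(i, 16));
            } while ((i += 16) <= lastLegalOffset);
        }

        // Fewer than 16 bytes remain, and (length - i) == (length & 15),
        // so individual bits of 'length' describe the tail without recomputing it.
        if ((length & 8) != 0) { sum += SumBlock(data.Slice(i, 8)); i += 8; }
        if ((length & 4) != 0) { sum += SumBlock(data.Slice(i, 4)); i += 4; }
        if ((length & 3) != 0) { sum += SumBlock(data.Slice(i, length & 3)); }

        return sum;
    }

    private static int SumBlock(ReadOnlySpan<byte> block)
    {
        int s = 0;
        foreach (byte b in block) { s += b; }
        return s;
    }
}
```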

gfoidl · 2021-03-10T12:23:11Z

src/libraries/System.Text.Encodings.Web/src/System/Text/Encodings/Web/SpanUtility.cs

+            if (span.Length >= 6)
+            {
+                ulong value64;
+                uint value32;


Super nit: naming, we have value, hi, lo and these. Can this be unified? I like the value64 and value32 approach most.

I renamed them to abcd and ef, depending on what values they're intended to hold. I think this naming is a little clearer. Let me know what you think!

Let me know what you think!

Now it's clear and a good naming that I like.

(Sorry for not replying earlier)

gfoidl · 2021-03-10T13:25:03Z

...s/System.Text.Encodings.Web/src/System/Text/Encodings/Web/OptimizedInboxTextEncoder.Ssse3.cs

+
+            if ((lengthInBytes & 3) != 0)
+            {
+                Debug.Assert(lengthInBytes - i <= 3);


Should the other branches have a Debug.Assert too?
Like Debug.Assert(lengthInBytes - i >= 8 && lengthInBytes - i < 16);, etc.

eiriktsarpalis · 2021-03-10T15:48:51Z

Performance results

I'd be curious to see benchmark results on arm64.

eerhardt · 2021-03-10T19:06:30Z

I'm seeing a ~4.3KB .br compressed size regression in the default Blazor WASM app with this change:

Left is before, right is after:

GrabYourPitchforks · 2021-03-10T19:08:25Z

I'd be curious to see benchmark results on arm64.

Might be a good excuse to learn how to compile and run perf tests on my SPX device. :)

Per the comments at the top of the issue, I expect this will regress performance on arm64 for the "nothing needs to be escaped" code paths. However, the arm64 code paths here really needed to be reworked anyway in order to support pshufb-like semantics. Once that work is done, I expect arm64 performance here to be better than it was for 5.0.

GrabYourPitchforks · 2021-03-10T19:15:53Z

@eerhardt Interesting. I'm curious about the System.Private.CoreLib change in particular, as the only file touched there was https://github.com/dotnet/runtime/pull/49373/files#diff-3a22ee85ff262ccdbad92a3c073a523a039016bc6ee6cab752621493d6d46693, and I'm not sure how that trivial a change could have caused a 2KB regression. How does one begin investigating this?

eerhardt · 2021-03-10T19:38:12Z

I'm not sure how that trivial a change could have caused a 2KB regression

With trimming, changes that are the most impactful are often the higher level changes that cause more, or less, code to be kept in dependent assemblies. So refactoring System.Text.Encodings.Web can change it to use more APIs / code from CoreLib. Which means those APIs can no longer be trimmed after the refactoring.

How does one begin investigating this?

Here are the steps I use to investigate size changes:

1. Install the latest 6.0 SDK: https://github.com/dotnet/installer#installers-and-binaries
   - I usually install the .zip to some place like C:\dotnet and then put that on my $PATH
2. dotnet new blazorwasm
3. dotnet publish -c Release

This gets you the "before" app in bin\Release\net6.0\publish\wwwroot\_framework. You can see both the uncompressed .dll and the compressed .dll.br files. We care most about the .br compressed files' size. But you can use the uncompressed .dll for analysis.

Now to get that app to use your change, what I typically do is "replace the NuGet package files with locally built files". I'm sure there are other approaches, but I found this to be the easiest.

1. Find the path to the runtime NuGet package being used
   - The way I do this is use /bl in the publish command above and look for the $task illink in the .binlog, and grab the linker command line, which shows the path.
   - It is of the form C:\Users\eerhardt\.nuget\packages\microsoft.netcore.app.runtime.browser-wasm\6.0.0-preview.3.21157.6\runtimes\browser-wasm\lib\net6.0
2. Build the libraries you are changing locally for Release
   - .\build.cmd mono.corelib -os browser -arch wasm -c Release is how you build corelib for blazor wasm
3. Copy the built libraries into the browser-wasm nuget package above, replacing the official libraries
4. dotnet publish -c Release again, which will publish using your local libraries
5. Compare / analyze the results

Note: Sometimes the latest SDK and the latest main branch have changes between them. So it is good practice to do steps 1-5 above using the "before your changes" commit and the "after your changes" commit. This is what I did to get the numbers above.

Tools for analysis I've found helpful are:

ILSpy to inspect what is and isn't there
ApiReviewer, and show internal methods, which will give you a diff of methods that are there now vs. before
Trimming Lens from the mono/linker repo

GrabYourPitchforks · 2021-03-11T00:42:55Z

@eerhardt With the latest commit (d13f660), I removed the Vector<T> dependency. This also should have removed the Vector128<T> dependency, because Vector128<T> now only exists as a field inside an explicit-layout fixed-size struct, and since nobody references that field it should be safe to remove both the field and any remaining compile-time references to the Vector128<T> type itself. But mono's iltrim utility for some reason is not taking this opportunity to trim that. If this is indeed a legal trim optimization (and I believe it is), then I think this is something that should be addressed in that tool rather than worked around on our side.

With this commit System.Private.CoreLib.dll.br is 1300 bytes above baseline.

I'm still looking into possible improvements in System.Text.Encodings.Web.dll itself.

eerhardt · 2021-03-11T00:44:27Z

But mono's iltrim utility for some reason is not taking this opportunity to trim that. If this is indeed a legal trim optimization (and I believe it is), then I think this is something that should be addressed in that tool rather than worked around on our side.

Can you open an issue for this in https://github.com/mono/linker ?

GrabYourPitchforks · 2021-03-11T01:10:21Z

I can work around this for now by making AsVector a property instead of a field, but I'll need to disassemble again to make sure I'm not undoing the optimizations we got from #49180.

internal readonly ref readonly Vector128<byte> AsVector
    => ref Unsafe.As<byte, Vector128<byte>>(ref Unsafe.AsRef(in AsBytes[0]));

Edit: Looks like it's interfering with the optimizations and causing register stack spilling. :(

GrabYourPitchforks · 2021-03-17T01:04:16Z

@eiriktsarpalis I was experimenting with arm64 SIMD enablement over in my personal fork based on some feedback from @tannergooding. I still need to test the code, but the logic in that file is fairly close to the logic in the SSSE3-specific code paths. Trying to figure out how to get it over to my SPX device for perf testing.

Edit: Something's going on with BenchmarkDotNet on that box, but I am able to perform some basic smoke testing. Things appear to be working correctly. I'll hold the ARM64 commit for now and send it as a separate PR once this is done. That way I can enlist help for running benchmarks and we can dedicate that separate PR just for ARM64-related discussion.

Edit x2: Basic console app and stopwatch never fails. :)

Baseline (release/5.0): 46 ns for HtmlEncoder.Default.FindFirstCharacterToEncodeUtf8(u8"The quick brown fox jumps over the lazy dog.")
advsimd-optimized: 8.8 ns for same input. (-81% wall clock time taken)
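For reference, a "basic console app and stopwatch" measurement of this kind can be sketched as below. This is not the harness used for the numbers above: the class name, warm-up count, and iteration count are assumptions, and a loop like this is far noisier than BenchmarkDotNet.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.Encodings.Web;

// Hypothetical stopwatch harness; not the one used to produce the figures quoted above.
public static class StopwatchBenchSketch
{
    private static int s_sink; // keeps the result observable so the calls are not optimized away

    // Returns the average wall-clock time per call, in nanoseconds.
    public static double MeasureNanosecondsPerCall(byte[] utf8, int iterations)
    {
        HtmlEncoder encoder = HtmlEncoder.Default;

        // Warm up so the measured loop runs fully JIT-compiled code.
        for (int i = 0; i < 10_000; i++)
        {
            s_sink += encoder.FindFirstCharacterToEncodeUtf8(utf8);
        }

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            s_sink += encoder.FindFirstCharacterToEncodeUtf8(utf8);
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds * 1_000_000.0 / iterations;
    }

    public static void Run()
    {
        byte[] input = Encoding.UTF8.GetBytes("The quick brown fox jumps over the lazy dog.");
        double ns = MeasureNanosecondsPerCall(input, 10_000_000);
        Console.WriteLine($"{ns:F1} ns per call");
    }
}
```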

GrabYourPitchforks · 2021-03-17T17:57:45Z

Most recent 2 commits are unit test changes only to respond to PR feedback, no source or packaging changes.

GrabYourPitchforks · 2021-03-17T21:55:19Z

CI "build all configurations" failure is known issue which should be resolved by #49781.

Edit: Since that PR might bake for a few days, I've cherry-picked two updates to the package baseline in the latest iteration of this PR to help unblock CI. The resulting merge conflict should be minimal.

GrabYourPitchforks · 2021-03-18T19:05:52Z

Latest commit (76f04f7) is merge from origin/main and conflict resolution, no code changes from previous PR review.

GrabYourPitchforks · 2021-03-19T00:12:07Z

Failing wasm test appears to be #48079.
Failing staging test appears to be a transient package server outage unrelated to the earlier "allConfigurations" packaging issues we were seeing.

adamsitnik · 2021-03-23T16:50:26Z

@GrabYourPitchforks we got some nice improvements from this PR: DrewScoggins/performance-2#4632

GrabYourPitchforks · 2021-03-23T17:01:19Z

@adamsitnik This is probably also reflected in DrewScoggins/performance-2#4666. It's good to keep an eye on the System.Text.Json tests specifically to ensure that we didn't regress anything there.

lewing · 2021-03-25T21:53:35Z

@GrabYourPitchforks we're seeing a big browser-wasm regression in System.Text.Json over a range that includes this #50260

GrabYourPitchforks · 2021-03-26T00:47:07Z

@lewing Thanks for the pointer. I'll respond over in that thread.

kunalspathak · 2022-06-08T19:39:54Z

I'll hold the ARM64 commit for now and send it as a separate PR once this is done

Was this ever done or did we ever measure the performance difference between x64 and arm64 and if there is a gap? Also, is there a tracking issue to optimize it for ARM64 (if it is slow)?

GrabYourPitchforks · 2022-06-09T16:07:42Z

Also, is there a tracking issue to optimize it for ARM64 (if it is slow)?

It was addressed by #49847.

kunalspathak · 2022-06-09T17:49:20Z

Thanks. It is surprising that none of the MicroBenchmarks improvements were noticed or we might have missed triaging the improvements. CC: @DrewScoggins

GrabYourPitchforks added the area-System.Text.Encodings.Web label Mar 9, 2021

Tornhoof reviewed Mar 9, 2021

gfoidl reviewed Mar 9, 2021

Make WriteStringInvalidCharacter resilient against U+FFFD escaping

cf8e998

GrabYourPitchforks requested review from eiriktsarpalis, jozkee, layomia and steveharter as code owners March 10, 2021 00:33

Fix broken tests & x86 edge case detection

ee279d4

- Update test csproj to include missing polyfills - Fix net461 test compilation failures

gfoidl reviewed Mar 10, 2021

eerhardt reviewed Mar 10, 2021

eerhardt mentioned this pull request Mar 10, 2021

Include "simple" UTF-8 validation and transcoding logic for interpreted and low-footprint scenarios #48006

Closed

Remove Vector<T> dependency in wasm

d13f660

GrabYourPitchforks mentioned this pull request Mar 11, 2021

Linker is not opportunistically trimming unused fields from explicit-layout structs dotnet/linker#1883

Closed

Fix build break on netcoreapp3.1

3eec93e

runfoapp bot mentioned this pull request Mar 16, 2021

Mono System.Text.Json.Tests on Windows timing out #42677

Open

eiriktsarpalis reviewed Mar 16, 2021

eiriktsarpalis reviewed Mar 16, 2021

eiriktsarpalis approved these changes Mar 17, 2021

GrabYourPitchforks added 2 commits March 17, 2021 10:53

Rename files JavaScriptStringEncoder -> JavaScriptEncoder

d0d3d2f

Rename test methods JavaScriptString -> JavaScript

e6f4433

GrabYourPitchforks added 2 commits March 17, 2021 16:30

Cherry-pick updates to package baseline

0d84fc4

Merge remote-tracking branch 'origin/main' into encoder

76f04f7

GrabYourPitchforks merged commit c569bc1 into dotnet:main Mar 19, 2021

GrabYourPitchforks deleted the encoder branch March 19, 2021 01:16

GrabYourPitchforks mentioned this pull request Mar 19, 2021

Add ADVSIMD64 optimizations for System.Text.Encodings.Web #49847

Merged

runfoapp bot mentioned this pull request Mar 22, 2021

SafeHandle use-after-dispose in FileSystemWatcher on OSX #30056

Closed

mattjohnsonpint mentioned this pull request Mar 25, 2021

[WASM] Major size regression in dotnet.wasm #50210

Closed

GrabYourPitchforks mentioned this pull request Mar 26, 2021

[wasm] Major performance regression serializing and deserializing json #50260

Closed

tannergooding mentioned this pull request Apr 20, 2021

Intrinsicify (Sse, Axv2, Arm64, wasm) JsonReaderHelper.IndexOfOrLessThan #41097

Closed

ghost locked as resolved and limited conversation to collaborators Apr 25, 2021

karelz added this to the 6.0.0 milestone May 20, 2021
