JSON serialization performance on Blazor #40318
steveharter
Thanks for looking into this. So this deserialization slowness affects nearly every Blazor WebAssembly app, everywhere, right? Small requests and large requests alike will take 35 Also, any ideas for working around--is the only way to make responses smaller?
August 4, 2020
Steve Harter
Overview - serialization performance on Blazor
The 5.0 performance of the System.Text.Json serializer is slow under the Blazor client but mostly consistent with other libraries, and is ~2x faster than 3.1. A rough expected multiplier is that the Mono interpreter should be at most 15x slower than the Mono JIT, and the serializer is in line with that in real-world scenarios.
It is believed that the serializer and underlying reader\writer are not currently doing anything systematic that makes the serializer exceptionally slower under Blazor than non-serializer scenarios. However, there are cases where focused improvements can be made, including improving Span<T> performance. Some super-slow areas were identified and recently fixed (or are in progress).
The 15x multiplier compares the Mono interpreter against Mono JIT because that is the easiest way to measure progress (without trying to run Blazor each time). However, the community will compare Blazor against Core JIT and currently that multiplier is ~35-90x, at least for the main scenario covered here -- some scenarios will be faster or slower than that, but the 35-90x range should be a good approximation of a standard real-world scenario and what will likely be reported by the community.
For Blazor, this may or may not be performant enough depending on JSON payload size or object graph complexity. There have been earlier reports from the community that serialization is too slow although there have been improvements since then.
Blazor performance is poor, in general, because the IL is interpreted, not JITted or Crossgen'd. In addition, the WASM layer adds an additional 1.3-1.6x multiplier. Overall, these factors are known and are an expected tradeoff for the current architecture.
Improving Performance
Blazor performance can be improved in several ways; some example PRs and issues shown below:
Primary real-world benchmark and observations
A real-world benchmark for a large object graph (339,803 characters) where serialize\deserialize is run 10 times:
The "Blazor vs Core" column uses both a 2x and a 40x warmup factor. Although Blazor runs on the Mono interpreter, not Core JIT, the comparison is included because that is likely what the community will compare against. The 2x warmup factor is likely more realistic than a 40x warmup for typical browser scenarios. The Core JIT, due to tiered JIT support, becomes faster after a few runs, and local tests suggest about 40 runs are needed to reach steady-state maximum performance.
Observations from this single benchmark:
5.0 work remaining
The serialization investigation work I performed to understand the current performance characteristics and potential areas to improve is complete. I am not aware of any remaining low-hanging fruit, although there are cases the Mono team could intrinsify or improve.
One slow area is support for Span<T>. Since the serializer (and underlying reader\writer) uses Span<T> in many places, some overall improvement should be gained. Here's a simple benchmark doing new Span<byte>(bytes).Slice() (100,000,000 iterations).
However, since there are so many iterations in the benchmark (100 million), a significant perf gain is not expected (100 million operations take 7.6 seconds). This issue is covered by interp Intrinsify span ctor.
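As a rough sanity check on the figures quoted above, the per-iteration cost can be worked out directly. The Python sketch below is illustrative only: the constant and function names are made up for this example, and the only inputs are the 100 million iterations and 7.6 seconds stated in the post.

```python
# Figures quoted in the post: 100 million iterations in 7.6 seconds.
ITERATIONS = 100_000_000
TOTAL_SECONDS = 7.6


def nanoseconds_per_iteration(total_seconds: float, iterations: int) -> float:
    """Average cost of a single iteration, in nanoseconds."""
    return total_seconds / iterations * 1e9


per_op_ns = nanoseconds_per_iteration(TOTAL_SECONDS, ITERATIONS)
print(f"{per_op_ns:.0f} ns per iteration")
```

At tens of nanoseconds per span construction and slice, the operation only becomes noticeable when it is repeated many millions of times, which is consistent with the post's expectation that intrinsifying it will not yield a large overall gain.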
Other areas:
System.Text.Encodings.Web.DefaultJavaScriptEncoderBasicLatin:FindFirstCharacterToEncode, although in that case there will likely need to be a fast path (when the default encoder is used) vs. a slow path (non-default encoder).
Since these are non-trivial work items, each would need a cost-benefit analysis and additional tests to ensure quality.
Other activity from the Mono team members and suggestions:
Note it is possible that much of the interpreter work done in 5.0 is "throw-away" pending future plans (if the interpreter is not used).
Post-5.0 \ future
After 5.0, other layering options to improve performance will likely be considered that will avoid the IL interpreter overhead. This will likely be AOT-based such as compiling IL to native code, compiling IL to WASM bytecode or compiling IL to WASM binary.
There is a JSON code-gen effort that will be ready in 6.0 and is expected to improve throughput by 10-15% and minimize the startup time of the first (de)serialization. It requires a code-gen plugin to generate the (de)serialization code for the POCOs in a given project.
Benchmarks
Existing JSON Benchmarks from https://github.com/dotnet/performance
These show the differences between Mono JIT vs. Mono interpreter.
There are about 300 JSON benchmarks, which range from 2.5x slower to 35x slower with a median of ~11x slower.
The benchmarks in the 20x+ range are mostly micro-benchmarks of the reader and writer or very small POCOs (which will be super quick), not general real-world (de)serialization scenarios of larger object graphs or collections. However, some Base64 benchmarks fall into this 20x+ range and can be improved if we intrinsify that; Base64 is used when a property is byte[].
Consider the benchmark System.Text.Json.Serialization.Tests.ReadJson<IndexViewModel>.DeserializeFromUtf8Bytes, which uses a real-world object model and runs ~12x slower. The corresponding System.Text.Json.Serialization.Tests.WriteJson<IndexViewModel>.SerializeToUtf8Bytes runs ~8x slower. These numbers should get a bit faster with Improve JSON (de)serialization perf of longer strings on mono.
Instructions to view benchmarks:
WASM
WASM (de)serialization performance has improved by around 2x since the first release of Blazor (3.1x). See page 14 at
https://msit.powerbi.com/view?r=eyJrIjoiYTZjMTk3YjEtMzQ3Yi00NTI5LTg5ZDItNmUyMGRlOTkwMGRlIiwidCI6IjcyZjk4OGJmLTg2ZjEtNDFhZi05MWFiLTJkN2NkMDExZGI0NyIsImMiOjV9
or this image:
todo: from the graph, determine when the switch to the 5.0 runtime version of System.Text.Json.dll occurred. There have been several System.Text.Json perf improvements in 5.0.
Large object graph benchmark
This benchmark is used in the "Primary real-world benchmark and observations" section above.
Small POCO
In order to measure string-size performance, a small POCO was used. This was also used in #39733.
The string size of the Text property ranged from 10 characters to 10,240 characters.
Serializer details
Some notes on the serializer that highlight the main areas of work.
The serializer uses the Utf8JsonWriter and deserializer uses the Utf8JsonReader. The majority of time is spent in the reader\writer, not the (de)serializer.
Some areas that are hot spots for past performance work:
Misc notes:
Span<T>, which is very slow under both Mono JIT and the Mono interpreter. Common scenarios include new Span<T> and Slice().
System.Text.Encodings.Web assembly and the serializer allows a pluggable encoder that can be specified on JsonSerializerOptions. Because the encoder is pluggable, it is more difficult to create interpreter intrinsics for that.
Stream and are ~10% slower than non-async that use byte[] or string. However, the async mode supports a true streaming mode: during deserialization, no "drain" is done upfront and the deserializer will continue to ask the Stream for more data; during serialization, FlushAsync() is called automatically to minimize buffer size and memory consumption. Thus the async mode may perform better than sync mode in large-payload, memory-exhausted scenarios, or when the Stream itself is reading or writing over a higher-latency endpoint or physical file, since the (de)serializer can start work without waiting for the whole Stream to load.