This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
[release/3.0] Fix BytesConsumed and error messages when reading JSON payloads within a multi-segment ReadOnlySequence via Utf8JsonReader #40422
Ports #40303 and #40349 to 3.0
Addresses https://github.com/dotnet/corefx/issues/39974 for both valid JSON and invalid JSON (for the latter, by making the exception message consistent and accurate)
cc @steveharter, @ericstj, @danmosemsft, @ahsonkhan @eerhardt, @Anipik, @wtgodbe, @bartonjs, @stephentoub, @GSPP
Description
For processing valid JSON input:
We missed updating the bytes consumed in one instance when parsing numbers within JSON that is contained within a multi-segment ReadOnlySequence<byte>. This change also fixes the other places where consumed is updated so that they are tracked correctly. Additionally, we now consistently recover the necessary reader state when the user passes in an incomplete payload, so they can continue with more data on subsequent reads. It also fixes setting up the initial positions in the constructor when the first segment happens to be empty.
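As an illustration of the "continue with more data on subsequent reads" pattern, here is a minimal sketch using the public `Utf8JsonReader` API. It uses plain spans rather than a multi-segment sequence to keep the example short, and the class name `ResumeDemo` and the sample payload are assumptions made for the example, not code from this PR.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class ResumeDemo
{
    // Reads "[1, 25]" in two steps and returns the total number of tokens seen.
    public static int CountTokens()
    {
        byte[] payload = Encoding.UTF8.GetBytes("[1, 25]");
        int tokens = 0;

        // First chunk is "[1, 2": the final number may be incomplete,
        // so the reader is told that more data can follow.
        ReadOnlySpan<byte> firstChunk = payload.AsSpan(0, 5);
        var reader = new Utf8JsonReader(firstChunk, isFinalBlock: false, state: default);
        while (reader.Read())
        {
            tokens++;
        }

        // Read() returned false because it needs more data. Capture how far
        // the reader got and the state required to resume.
        int consumed = (int)reader.BytesConsumed;
        JsonReaderState state = reader.CurrentState;

        // Resume from the first unconsumed byte, now with the full payload available.
        reader = new Utf8JsonReader(payload.AsSpan(consumed), isFinalBlock: true, state: state);
        while (reader.Read())
        {
            tokens++;
        }

        return tokens;
    }

    public static void Main() => Console.WriteLine(CountTokens());
}
```

The caller relies on `BytesConsumed` to decide which bytes to carry over into the next read, which is why an inaccurate value breaks this pattern.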
For processing invalid JSON input:
There were a few places where the exception message and the values we returned for usability and diagnostics were inconsistent (or incorrect) when the user passed in multi-segment data. Also, even though we don't provide guarantees that the reader state is recoverable after an error, certain properties like Line Number and Position In Line are still useful. This change also avoids incorrectly incrementing the line number when the reader sees escaped newline characters within quoted strings.
Customer Impact
The bug was customer-reported as part of testing various JSON payloads (both valid/invalid) and making sure the behavior is consistent.
When polling Utf8JsonReader.BytesConsumed, the user will now see a consistent result regardless of which input source contained their data (whether it was a span, or a multi-segment sequence where the number being parsed straddled a segment boundary). For example, after reading "2e2" split into three segments, we previously reported BytesConsumed as 2 instead of 3.
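The "2e2" scenario can be reproduced with a sketch along these lines. `Utf8Segment` and `SegmentDemo` are hypothetical helpers written for this example (the framework does not ship a public segment type), and the code is not taken from the tests added in this PR.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

// Hypothetical helper: one link in a chain of UTF-8 segments.
public sealed class Utf8Segment : ReadOnlySequenceSegment<byte>
{
    public Utf8Segment(string text) => Memory = Encoding.UTF8.GetBytes(text);

    // Links a new segment after this one and returns the new segment.
    public Utf8Segment Append(string text)
    {
        var next = new Utf8Segment(text) { RunningIndex = RunningIndex + Memory.Length };
        Next = next;
        return next;
    }
}

public static class SegmentDemo
{
    // Reads "2e2" split as "2" | "e" | "2" and returns BytesConsumed.
    public static long ConsumedAcrossSegments()
    {
        var first = new Utf8Segment("2");
        Utf8Segment last = first.Append("e").Append("2");
        var sequence = new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length);

        var reader = new Utf8JsonReader(sequence, isFinalBlock: true, state: default);
        while (reader.Read()) { }
        return reader.BytesConsumed;
    }

    // Reads the same payload from a single contiguous buffer for comparison.
    public static long ConsumedFromSpan()
    {
        var reader = new Utf8JsonReader(Encoding.UTF8.GetBytes("2e2"));
        while (reader.Read()) { }
        return reader.BytesConsumed;
    }

    public static void Main()
    {
        Console.WriteLine(ConsumedAcrossSegments());
        Console.WriteLine(ConsumedFromSpan());
    }
}
```

With this fix in place, both methods are expected to return the same value (3, the full length of the payload).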
When the user observes the exception for invalid JSON (or programmatically polls the Line Number/Position properties), the message and values are now accurate in many of the edge cases. This helps with the end-user experience and diagnostics.
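For reference, a caller might poll these diagnostics roughly as follows. This is a sketch over a single span for brevity; the sample payload (an escaped `\n` inside a string, then a real newline, then a malformed literal) and the `ErrorDemo` name are assumptions chosen for the example.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class ErrorDemo
{
    // Returns the location reported by JsonException, or nulls if the JSON is valid.
    public static (long? Line, long? BytePosition) Locate(byte[] utf8Json)
    {
        try
        {
            var reader = new Utf8JsonReader(utf8Json);
            while (reader.Read()) { }
            return (null, null);
        }
        catch (JsonException ex)
        {
            // Both properties are nullable; the reader populates them on failure.
            return (ex.LineNumber, ex.BytePositionInLine);
        }
    }

    public static void Main()
    {
        // "x\\ny" holds an escaped newline (backslash + 'n') inside a JSON string;
        // the "\n" after the comma is a real line break; "tru" is an invalid literal.
        byte[] json = Encoding.UTF8.GetBytes("{\"a\": \"x\\ny\",\n  \"b\": tru }");
        var (line, bytePosition) = Locate(json);
        Console.WriteLine($"Line: {line}, BytePositionInLine: {bytePosition}");
    }
}
```

Only the real line break should contribute to the reported line number; the escaped sequence inside the string should not.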
Regression?
No.
Risk
Low. Significant test cases were added and this only affects an edge use case on multi-segment buffers where the user is relying on the BytesConsumed property (low usage). The other changes involve the exception path for invalid JSON. The only risk is that we are updating code that was previously prone to off-by-one errors, but we have significant tests for all the various number inputs.