Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Moving the Utf8Formatter and Utf8Parser into S.P.Corelib #20934

Merged
merged 5 commits into from
Nov 12, 2018
Merged

Moving the Utf8Formatter and Utf8Parser into S.P.Corelib #20934

merged 5 commits into from
Nov 12, 2018

Conversation

tannergooding
Copy link
Member

As per #20873 (comment), I split the PR into multiple.

This is the third PR and moves the Utf8 Formatter and Parser into S.P.Corelib.

  • As part of the update, it finishes fixing the Utf8Parser for Single/Double to match the new Utf16Parser changes.

@tannergooding
Copy link
Member Author

// results.

number.HasNonZeroTail = (c != '0');
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change necessary as part of moving this into corelib? Does this have any impact on throughput, seeing as though we're adding additional logic in what appears to be a few tight loops?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CoreFX tests won't pass otherwise.

The Utf8Parser is using the NumberBuffer implementation that was already in S.P.Corelib now. The Digit buffer it contains is dynamically sized to be exactly the right size (which is enough digits to hold the largest exact string + 1 digit for rounding and + 1 digit for the null terminator). It also tracks whether any digits that were specified past the rounding digit where zero, which allows it to be correctly rounded in all scenarios.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check perf here, but I would expect some minimal impact (most of which should hopefully be handled by the branch predictor).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub, we have no perf tests that currently exercise this code (which is decimal, double, and float). We only have tests that cover the integer types.

  • Working on getting some basic tests to cover this now...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really basic perf tests (just mirroring the utf16 ones) here: dotnet/corefx#33427

Copy link
Member Author

@tannergooding tannergooding Nov 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub, @GrabYourPitchforks, perf numbers look similar to the Utf16Parser/Utf16Formatter differences.

For integers (both parsing and formatting), we are the essentially the same (some faster, some slower; all look within std deviation).
For floating-point (decimal, single, and double), we are essentially the same for formatting (some faster, some slower; all look within std deviation).

For floating-point parsing, we are faster on numbers that are <= 15 digits and with exponents <= 22. For those outside the range, we are up to ~5x slower but are now returning the correct result.

  • As mentioned on the Utf16Parser issue, we can improve this further for numbers <= 19 digits and with exponents <= 27 by using 80-bit extended precision multiplication/division
    We can improve this for numbers <= 34 digits and with exponents <= 38 by using 128-bit floating-point multiplication/division

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't seem to upload the CSV files right now, GitHub is having issues doing so.

@tannergooding
Copy link
Member Author

@stephentoub, how do we want to handle the mirror?

Merging this will cause the changes to mirror to CoreFX, which will cause System.Memory to no longer build.
The correct fix is to remove the code from System.Memory and ensure the type forward to S.P.Corelib exists, but that can't be done until after the CoreCLR package update happens.

I think the options are:

  • Wait and do the mirror and package update together, or...
  • For the mirror, add an additional commit which removes the existing NumberBuffer from System.Memory and replaces it with a link to the shared NumberBuffer implementation from corelib
    Then, remove all links and add the type-forward when the package update occurs

@jkotas
Copy link
Member

jkotas commented Nov 12, 2018

Add temporary ifdefs around the offending code

@tannergooding
Copy link
Member Author

Add temporary ifdefs around the offending code

I think there is too much code that was changed to do that, it would basically be putting an #ifdef around all changes in this PR.

@stephentoub
Copy link
Member

stephentoub commented Nov 12, 2018

I think the options are

It shouldn't (famous last words) be a long delay between this being merged and a new build being produced and consumed into corefx. If ifdef'ing is too painful, you could just wait to merge the mirror PR until that's happened, at which point you can add a commit with the appropriate fix-ups to the mirror PR. Reasonable?

@tannergooding
Copy link
Member Author

Reasonable?

That is basically the first option, I listed 😄

Wait and do the mirror and package update together

@stephentoub
Copy link
Member

Ah, then yeah, if ifdef'ing doesn't work, that seems reasonable.

//
public static bool TryFormatThrowFormatException(out int bytesWritten)
{
bytesWritten = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: swap this line with the line immediately below. Otherwise the write has an observable side effect that the JIT cannot elide.
(Don't let this block checkin.)

//
public static bool TryParseThrowFormatException(out int bytesConsumed)
{
bytesConsumed = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nit as earlier.

@tannergooding tannergooding merged commit 80c0a0e into dotnet:master Nov 12, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…clr#20934)

* Moving the Utf8Formatter and Utf8Parser into S.P.Corelib

* Doing some minimal cleanup to lineup types and get the Utf8Parser/Utf8Formatter building

* Updating the Utf8 Float Parser to have different buffers for Single vs Double

* Fixing the Utf8Parser to track trailing zero digits and to properly mark the end of the buffer

* Fixing a couple of issues in Utf8Parser.Number


Commit migrated from dotnet/coreclr@80c0a0e
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants