Porting NumberToDouble to managed code. #20080

tannergooding · 2018-09-20T20:20:46Z

This ports the double/single parsing code to managed code.

tannergooding · 2018-09-20T20:22:23Z

This is related to #19999, which ported the formatting code.

CoreRT already has a similar port here: https://github.com/dotnet/corert/blob/master/src/System.Private.CoreLib/src/System/Number.CoreRT.cs#L302

tannergooding · 2018-09-20T20:23:09Z

This does not attempt to fix any of the known bugs that exist in the parsing code.

tannergooding · 2018-09-20T20:24:24Z

src/System.Private.CoreLib/shared/System/Number.NumberToDouble.cs

+                }
+                else
+                {
+                    return number.sign ? -0.0 : 0.0;


I did some minor cleanup here to remove the goto.
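For readers less familiar with signed zero, here is a minimal sketch of why the branch above distinguishes `-0.0` from `0.0`. The names `SignedZero`, `Zero`, and `IsNegativeZero` are hypothetical and are not part of the PR. `isNegative` stands in for `number.sign`.

```csharp
using System;

public static class SignedZero
{
    // Same shape as the expression in the diff; isNegative is a stand-in for number.sign.
    public static double Zero(bool isNegative) => isNegative ? -0.0 : 0.0;

    // -0.0 compares equal to 0.0, so the sign has to be read from the bit pattern:
    // the IEEE 754 sign bit makes the reinterpreted 64-bit integer negative.
    public static bool IsNegativeZero(double value) =>
        value == 0.0 && BitConverter.DoubleToInt64Bits(value) < 0;
}
```

The two zeros compare equal with `==`, so the sign is only observable through the bit pattern or through operations such as `1.0 / value`, which yields negative infinity for `-0.0`.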

danmoseley · 2018-09-20T22:28:55Z

src/System.Private.CoreLib/shared/System/Number.NumberToDouble.cs

+namespace System
+{
+    internal unsafe partial class Number
+    {


Is there anything useful you are able to say about the algorithm used, for the benefit of anyone maintaining it?

Possibly, but there weren't any comments like that in the native code, so I can really only describe what I interpret the code to be doing.

Given that the algorithm is currently known to be incorrect, I would think this should be replaced with a "correct" parsing algorithm soonish (hopefully) and we can add some proper algorithm comments then.

tannergooding · 2018-09-20T22:40:46Z

CC. @jkotas, since you helped review the previous PR as well.

tannergooding · 2018-09-20T23:01:43Z

As with the previous PR...

I ran the Roslyn RealParser suite locally and did a basic benchmark on both float and double, covering 267,386,880 values in the input range (including both denormal and normal inputs).

Benchmarking was done with Tiered Jitting disabled.

The below benchmarks are just for double and show a small 1.09% regression in elapsed time.

[Profiler screenshots: Native, Managed]

jkotas · 2018-09-21T05:09:39Z

> The below benchmarks are just for double and show a small 1.09% regression in elapsed time.

Is this a fair benchmark to use? This benchmark is not dominated by the number formatting that you are touching.

tannergooding · 2018-09-21T05:47:23Z

NumberToDouble is the parsing code, which is the dominant code being tested overall (ParseNumber and NumberToDouble itself, as well as much less costly intermediate calls).

I could remove all of the formatting calls, but then we wouldn't have a benchmark that covers a wide range of inputs (currently testing 267m inputs). I would think just testing a few values thousands of times would be less fair overall, given that the algorithm has to do more or less work depending on the input string length.

jkotas · 2018-09-21T11:51:27Z

The numbers you posted say native NumberToDouble took 4.597s (exclusive?), managed NumberToDouble 5.094s, the total time is ~120s, and the top unrelated formatting methods took 44s, which is ~10x more than NumberToDouble. These numbers say to me that NumberToDouble regressed about 10% and that the actual time spent in NumberToDouble is a small fraction of the total.

I think that the right number to quote for this change is the regression range and average for calling Double.TryParse. What do the CoreFX microbenchmarks for double parsing say?

> benchmark that covers a wide range of inputs (currently testing 267m inputs)

What is the distribution of these inputs? This would be fair only if the distribution of these inputs represents what one would expect the distribution of parsing inputs to be in real-world apps.

tannergooding · 2018-09-21T15:56:40Z

> The numbers you posted say native NumberToDouble took 4.597s (exclusive?), managed NumberToDouble 5.094s,

Yes, that looks like the case. Given that the only code that changed between them was changing NumberToDouble from native to managed, I had only pulled the regression numbers for the total CPU time.

> the total time is ~120s

That may not be the right number to look at. You probably want to look at CPU time (which covers the time actually spent executing the program, and not any collection overhead):

Elapsed time is the wall time from the beginning to the end of collection.
CPU Time is time during which the CPU is actively executing your application.

> What do the CoreFX microbenchmarks for double parsing say?

We do not have any CoreFX microbenchmarks for double parsing yet (or if we do, I couldn't find them, because they aren't in the same assembly as the formatting tests or other double/single tests). It is one of the items I am working on.

> What is the distribution of these inputs?

It is 267m inputs evenly distributed across the finite input range of double. Due to the actual distribution of double values, the majority of them will be in an expected user-input range and a few of them will be extremely large or extremely small (subnormal) numbers.

jkotas · 2018-09-21T16:11:29Z

> We do not have any CoreFX microbenchmarks for double parsing

Should we add some before changing this?

> It is 267m inputs evenly distributed across the finite input range of double.

How are the string lengths of these inputs distributed? I expect that real-world programs frequently parse short strings that do not use full precision - we should cover that case.

tannergooding · 2018-09-21T16:56:54Z

> Should we add some before changing this?

I'm working on adding some now.

> How are the string lengths of these inputs distributed? I expect that real-world programs frequently parse short strings that do not use full precision - we should cover that case.

The code is doing:

double d1 = BitConverter.Int64BitsToDouble((long)(bits));
string s1 = d1.ToString();
d1 = double.Parse(s1);

double d2 = -d1;
string s2 = d2.ToString();
d2 = double.Parse(s2);

Which gives us a range of string input lengths:

[01]: 11
[02]: 99
[03]: 1000
[04]: 10020
[05]: 100230
[06]: 1002360
[07]: 2745872
[08]: 4212994
[09]: 4920747
[10]: 5021742
[11]: 4928777
[12]: 5020955
[13]: 4994085
[14]: 4986497
[15]: 5076973
[16]: 6055297
[17]: 5406024
[18]: 4503349
[19]: 13433937
[20]: 102880126
[21]: 92085785

The following additional metrics might be interesting:

Total Numbers:              267,386,880
Positive Numbers:           133,693,440
Negative Numbers:           133,693,440
Numbers with a decimal:     32,885,022
Numbers with an exponent:   201,181,932 (all also contain a decimal point)
Integer Numbers:            33,319,926
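A length histogram like the one above could be gathered with something along these lines. This is only a sketch under stated assumptions: the benchmark's actual loop and stride are not shown in the thread, so `InputShape`, `LengthHistogram`, and the `stride` parameter are hypothetical. It also formats with the invariant culture and uses `TryParse`, so that it runs the same way on any machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class InputShape
{
    // Bit pattern of +Infinity; every smaller non-negative pattern is a finite double.
    private const ulong PositiveInfinityBits = 0x7FF0000000000000;

    // Walks finite bit patterns with a fixed stride (hypothetical parameter),
    // round-trips +d and -d through ToString/TryParse, and counts string lengths.
    public static SortedDictionary<int, long> LengthHistogram(ulong stride)
    {
        if (stride == 0) throw new ArgumentOutOfRangeException(nameof(stride));

        var histogram = new SortedDictionary<int, long>();
        ulong bits = 0;
        while (true)
        {
            double d = BitConverter.Int64BitsToDouble((long)bits);
            Tally(histogram, d);
            Tally(histogram, -d);

            // Stop before the next step would reach or pass +Infinity (also avoids overflow).
            if (PositiveInfinityBits - bits <= stride) break;
            bits += stride;
        }
        return histogram;
    }

    private static void Tally(SortedDictionary<int, long> histogram, double value)
    {
        string s = value.ToString(CultureInfo.InvariantCulture);
        // Exercise the parser on the formatted text; the parsed value is not needed here.
        double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture, out _);

        histogram.TryGetValue(s.Length, out long count);
        histogram[s.Length] = count + 1;
    }
}
```

Because the samples come from formatting doubles, the length distribution depends on the runtime's default `ToString` precision, which is part of why most inputs here are long strings rather than the short ones jkotas asks about.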

tannergooding · 2018-09-21T22:49:27Z

Numbers from the CoreFX benchmarks I am adding are:

   System.Runtime.Performance.Tests.dll                                                                                                             | Metric   | Unit | Iterations |    Average |    STDEV.S |        Min |     Max
  :------------------------------------------------------------------------------------------------------------------------------------------------ |:-------- |:----:|:----------:| ----------:| ----------:| ----------:| -------:
   System.Tests.Perf_Double.DefaultTryParse(input: "0", innerIterations: 10000000)                                                                  | Duration | msec |     17     |    588.690 |      4.277 |    581.576 | 595.027
   System.Tests.Perf_Double.DefaultTryParse(input: "-0.0", innerIterations: 10000000)                                                               | Duration | msec |     16     |    633.134 |      6.304 |    626.540 | 651.333
   System.Tests.Perf_Double.DefaultTryParse(input: "1", innerIterations: 1000000)                                                                   | Duration | msec |    150     |     66.872 |      1.420 |     64.905 |  73.088
   System.Tests.Perf_Double.DefaultTryParse(input: "-1", innerIterations: 1000000)                                                                  | Duration | msec |    142     |     70.627 |      4.002 |     68.318 | 108.493
   System.Tests.Perf_Double.DefaultTryParse(input: "1.7976931348623157E+308", innerIterations: 100000)                                              | Duration | msec |    686     |     14.585 |      1.116 |     13.486 |  25.640
   System.Tests.Perf_Double.DefaultTryParse(input: "-1.7976931348623157E+308", innerIterations: 100000)                                             | Duration | msec |    665     |     15.041 |      1.631 |     13.949 |  28.559
   System.Tests.Perf_Double.DefaultTryParse(input: "2.2250738585072009E-308", innerIterations: 100000)                                              | Duration | msec |    671     |     14.911 |      0.642 |     14.233 |  21.663
   System.Tests.Perf_Double.DefaultTryParse(input: "-2.2250738585072009E-308", innerIterations: 100000)                                             | Duration | msec |    659     |     15.177 |      0.523 |     14.538 |  23.675
   System.Tests.Perf_Double.DefaultTryParse(input: "2.2250738585072014E-308", innerIterations: 100000)                                              | Duration | msec |    676     |     14.809 |      0.665 |     14.110 |  20.489
   System.Tests.Perf_Double.DefaultTryParse(input: "-2.2250738585072014E-308", innerIterations: 100000)                                             | Duration | msec |    657     |     15.232 |      1.374 |     14.411 |  35.720
   System.Tests.Perf_Double.DefaultTryParse(input: "2.7182818284590451", innerIterations: 1000000)                                                  | Duration | msec |     73     |    137.142 |     10.991 |    129.995 | 190.404
   System.Tests.Perf_Double.DefaultTryParse(input: "-2.7182818284590451", innerIterations: 1000000)                                                 | Duration | msec |     74     |    135.912 |      2.717 |    133.242 | 151.757
   System.Tests.Perf_Double.DefaultTryParse(input: "3.1415926535897931", innerIterations: 1000000)                                                  | Duration | msec |     75     |    134.697 |      8.483 |    128.895 | 174.130
   System.Tests.Perf_Double.DefaultTryParse(input: "-3.1415926535897931", innerIterations: 1000000)                                                 | Duration | msec |     75     |    134.653 |      1.419 |    132.088 | 138.892
   System.Tests.Perf_Double.DefaultTryParse(input: "4.94065645841247E-324", innerIterations: 100000)                                                | Duration | msec |    704     |     14.204 |      1.110 |     13.444 |  30.665
   System.Tests.Perf_Double.DefaultTryParse(input: "-4.94065645841247E-324", innerIterations: 100000)                                               | Duration | msec |    690     |     14.495 |      1.047 |     13.730 |  27.376
   System.Tests.Perf_Double.DefaultTryParse(input: "∞", innerIterations: 10000000)                                                                  | Duration | msec |     16     |    640.379 |     12.881 |    620.160 | 661.798
   System.Tests.Perf_Double.DefaultTryParse(input: "-∞", innerIterations: 10000000)                                                                 | Duration | msec |     16     |    648.858 |      4.701 |    643.843 | 662.045
   System.Tests.Perf_Double.DefaultTryParse(input: "NaN", innerIterations: 10000000)                                                                | Duration | msec |     17     |    603.244 |      4.212 |    595.898 | 612.024
   System.Tests.Perf_Single.DefaultTryParse(input: "0", innerIterations: 10000000)                                                                  | Duration | msec |     17     |    603.986 |     12.243 |    584.432 | 635.513
   System.Tests.Perf_Single.DefaultTryParse(input: "-0.0", innerIterations: 10000000)                                                               | Duration | msec |     16     |    645.467 |      7.618 |    632.851 | 658.606
   System.Tests.Perf_Single.DefaultTryParse(input: "1", innerIterations: 1000000)                                                                   | Duration | msec |    150     |     67.108 |      2.127 |     64.419 |  80.196
   System.Tests.Perf_Single.DefaultTryParse(input: "-1", innerIterations: 1000000)                                                                  | Duration | msec |    139     |     72.015 |      2.983 |     67.907 |  84.758
   System.Tests.Perf_Single.DefaultTryParse(input: "1.17549421E-38", innerIterations: 100000)                                                       | Duration | msec |    882     |     11.345 |      0.705 |     10.592 |  21.740
   System.Tests.Perf_Single.DefaultTryParse(input: "-1.17549421E-38", innerIterations: 100000)                                                      | Duration | msec |    852     |     11.737 |      0.512 |     11.025 |  15.971
   System.Tests.Perf_Single.DefaultTryParse(input: "1.17549435E-38", innerIterations: 100000)                                                       | Duration | msec |    887     |     11.272 |      0.530 |     10.586 |  17.378
   System.Tests.Perf_Single.DefaultTryParse(input: "-1.17549435E-38", innerIterations: 100000)                                                      | Duration | msec |    857     |     11.672 |      0.536 |     10.988 |  15.727
   System.Tests.Perf_Single.DefaultTryParse(input: "1.401298E-45", innerIterations: 100000)                                                         | Duration | msec |    931     |     10.700 |      0.528 |      9.916 |  15.130
   System.Tests.Perf_Single.DefaultTryParse(input: "-1.401298E-45", innerIterations: 100000)                                                        | Duration | msec |    908     |     11.014 |      0.535 |     10.361 |  15.990
   System.Tests.Perf_Single.DefaultTryParse(input: "2.71828175", innerIterations: 1000000)                                                          | Duration | msec |    101     |     99.531 |      2.497 |     96.171 | 114.522
   System.Tests.Perf_Single.DefaultTryParse(input: "-2.71828175", innerIterations: 1000000)                                                         | Duration | msec |     97     |    103.742 |      1.853 |    100.555 | 108.782
   System.Tests.Perf_Single.DefaultTryParse(input: "3.14159274", innerIterations: 1000000)                                                          | Duration | msec |    100     |    100.383 |      2.902 |     96.441 | 112.945
   System.Tests.Perf_Single.DefaultTryParse(input: "-3.14159274", innerIterations: 1000000)                                                         | Duration | msec |     95     |    105.349 |      5.387 |    100.270 | 130.419
   System.Tests.Perf_Single.DefaultTryParse(input: "3.40282347E+38", innerIterations: 100000)                                                       | Duration | msec |    893     |     11.193 |      0.606 |     10.418 |  15.594
   System.Tests.Perf_Single.DefaultTryParse(input: "-3.40282347E+38", innerIterations: 100000)                                                      | Duration | msec |    854     |     11.704 |      0.804 |     10.824 |  19.799
   System.Tests.Perf_Single.DefaultTryParse(input: "∞", innerIterations: 10000000)                                                                  | Duration | msec |     16     |    635.425 |     18.877 |    617.934 | 678.445
   System.Tests.Perf_Single.DefaultTryParse(input: "-∞", innerIterations: 10000000)                                                                 | Duration | msec |     16     |    644.618 |      4.971 |    638.880 | 653.558
   System.Tests.Perf_Single.DefaultTryParse(input: "NaN", innerIterations: 10000000)                                                                | Duration | msec |     17     |    605.072 |     10.871 |    594.756 | 630.581

This is in comparison to: dotnet/corefx#32392 (comment)

The "worst" case regression (for double) looks to be 25.8% for double.MaxValue (from 11.6ms avg to 14.6ms avg -- 100k inner iterations).
The "best" case regression (for double) looks to be 1.07% for -0.0 (from 626.4ms avg to 633.1ms avg -- 10m inner iterations).
The "average" regression (for all double) looks to be 4.64% (from 3739.317ms total to 3912.856ms total).
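The percentages above follow from a plain relative-difference calculation. A minimal sketch, where `Regression` and `PercentSlower` are hypothetical helper names and the inputs are the rounded averages quoted in this comment, so the last digit can differ slightly from the quoted figures:

```csharp
using System;

public static class Regression
{
    // Relative slowdown of candidate versus baseline, expressed in percent.
    public static double PercentSlower(double baselineMs, double candidateMs)
    {
        if (baselineMs <= 0) throw new ArgumentOutOfRangeException(nameof(baselineMs));
        return (candidateMs - baselineMs) / baselineMs * 100.0;
    }
}
```

For example, `Regression.PercentSlower(3739.317, 3912.856)` comes out at roughly 4.64, matching the "average" figure, and `Regression.PercentSlower(626.4, 633.1)` at roughly 1.07.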

jkotas · 2018-09-21T23:25:00Z

LGTM otherwise

* Porting NumberToDouble to managed code.
* Deleting bcltype/number.cpp and bcltype/number.h
* Fixing NumberToDouble to call Int64BitsToDouble, rather than DoubleToInt64Bits
* Some minor code cleanup in NumberToDouble for better readability.
* Some additional code cleanup in the Number.NumberToDouble.cs code

Signed-off-by: dotnet-bot
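The Int64BitsToDouble/DoubleToInt64Bits fix listed in the commit message is easy to miss in review, because both calls can compile in the same position. The thread does not show the original faulty line, so the sketch below is only an assumed illustration. `BitsVersusValue`, `Reinterpret`, and `MixedUp` are hypothetical names; the two `BitConverter` methods are the real framework APIs.

```csharp
using System;

public static class BitsVersusValue
{
    // Reinterpret: the 64 bits are taken as the IEEE 754 encoding of the result.
    public static double Reinterpret(long bits) => BitConverter.Int64BitsToDouble(bits);

    // Mix-up: C# implicitly converts the long to a double by value, the call returns
    // that double's encoding as a long, and the long is converted by value again.
    // It compiles without any warning but produces an unrelated number.
    public static double MixedUp(long bits) => BitConverter.DoubleToInt64Bits(bits);
}
```

With the bit pattern `0x3FF0000000000000`, `Reinterpret` yields `1.0`, while `MixedUp` yields a very large value, because the implicit `long` to `double` conversions hide the mistake from the compiler.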

EgorBo · 2018-09-23T18:15:31Z

@tannergooding I am not sure if it's a bug or not but (I am testing your managed impl in mono):

var str = (-1234d).ToString("#,,", nfi);

str is "-". Is that expected?

nfi is from some test case:

        var nfi = new NumberFormatInfo();

        nfi.NaNSymbol = "NaN";
        nfi.PositiveSign = "+";
        nfi.NegativeSign = "-";
        nfi.PerMilleSymbol = "x";
        nfi.PositiveInfinitySymbol = "Infinity";
        nfi.NegativeInfinitySymbol = "-Infinity";

        nfi.NumberDecimalDigits = 5;
        nfi.NumberDecimalSeparator = ".";
        nfi.NumberGroupSeparator = ",";
        nfi.NumberGroupSizes = new int[] { 3 };
        nfi.NumberNegativePattern = 2;

        nfi.CurrencyDecimalDigits = 2;
        nfi.CurrencyDecimalSeparator = ".";
        nfi.CurrencyGroupSeparator = ",";
        nfi.CurrencyGroupSizes = new int[] { 3 };
        nfi.CurrencyNegativePattern = 8;
        nfi.CurrencyPositivePattern = 3;
        nfi.CurrencySymbol = "$";

        nfi.PercentDecimalDigits = 5;
        nfi.PercentDecimalSeparator = ".";
        nfi.PercentGroupSeparator = ",";
        nfi.PercentGroupSizes = new int[] { 3 };
        nfi.PercentNegativePattern = 0;
        nfi.PercentPositivePattern = 0;
        nfi.PercentSymbol = "%";

tannergooding · 2018-09-23T21:45:29Z

@EgorBo, that's a bug, but not from this change. It's from #19775.

I'll take a look and see if I can resolve the issue.

tannergooding added 2 commits September 20, 2018 13:14

Porting NumberToDouble to managed code.

32ce315

Deleting bcltype/number.cpp and bcltype/number.h

95d9ff9

tannergooding commented Sep 20, 2018

krwq reviewed Sep 20, 2018

krwq reviewed Sep 20, 2018

Fixing NumberToDouble to call Int64BitsToDouble, rather than DoubleToInt64Bits

d5820a7

krwq reviewed Sep 20, 2018

danmoseley reviewed Sep 20, 2018

danmoseley reviewed Sep 20, 2018

danmoseley reviewed Sep 20, 2018

Some minor code cleanup in NumberToDouble for better readability.

1f6e7ba

jkotas reviewed Sep 21, 2018

jkotas reviewed Sep 21, 2018

Some additional code cleanup in the Number.NumberToDouble.cs code

cf6f443

tannergooding merged commit 09cc49e into dotnet:master Sep 22, 2018

EgorBo mentioned this pull request Sep 23, 2018

[WIP][corlib] Managed NumberToDouble and DoubleToNumber implementations from CoreCLR mono/mono#10763

Closed

tannergooding mentioned this pull request Sep 23, 2018

Updating NumberToStringFormat to not print the sign if there are no digits being returned #20109

Merged

ViktorHofer mentioned this pull request Sep 24, 2018

Revert "Add more VB operator tests" dotnet/corefx#32439

Merged

EgorBo mentioned this pull request Oct 3, 2018

cherry-pick Number changes to mono/corefx from coreclr (shared) mono/corefx#146

Closed
