Faster Integer Range Operators #931

manofstick · 2016-02-05T09:43:22Z

As discussed in #520, the current implementation of the range operators is kind of slow. The discussion on #520 listed some issues with the implementation provided there: in some cases the resultant series didn't match, and the exceptions thrown when Current is in an invalid state didn't match either (although the MSDN docs say that such calls are undefined, so they could probably return anything anyway...). These issues, I believe, have been addressed in this implementation.

A lighter implementation of the range operators that conforms (i think?) to current impl

msftclas · 2016-02-05T09:43:27Z

Hi @manofstick, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution!
You've already signed the contribution license agreement. Thanks!

The agreement was validated by Microsoft and real humans are currently evaluating your PR.

TTYL, MSBOT;

forki · 2016-02-05T09:45:45Z

@manofstick do you have any numbers that show that your implementation is indeed faster?

Do we need to add additional tests?

manofstick · 2016-02-05T09:48:49Z

Some performance numbers:

let sw = System.Diagnostics.Stopwatch.StartNew ()

let data = seq {
    let r = System.Random 42
    for i = 1 to 10000000 do
        yield r.NextDouble ()    
}

let average = data |> Seq.average
let minimum = data |> Seq.min
let maximum = data |> Seq.max

System.Console.WriteLine ("{0} ({1},{2},{3})", sw.ElapsedMilliseconds, minimum, average, maximum)

| State | x32 (ms) | x32 % of release | x64 (ms) | x64 % of release |
|---|---|---|---|---|
| release | 4664 | 100% | 2166 | 100% |
| branch | 1929 | 42% | 1278 | 59% |

Time in milliseconds

manofstick · 2016-02-05T09:58:40Z

@forki

This implementation is basically equivalent to @tpetricek's version (especially if you get rid of the exceptions in the Current property, although matching the existing functionality there was one of the comments that @latkin made).

I also have a "fast path" for the standard increment by 1, which is faster still, but really we're just tweaking a little bit; although it does affect the performance in the previously listed example by > 10%, so I thought it was worth it.

I probably need to do some more digging to see what existing tests there are and determine whether I need to add some (I probably do; hopefully I'll find some time over the w/end).

enricosada · 2016-02-05T10:01:40Z

@manofstick if you can rebase (or cherry-pick) and force push to remove the merge master commit 9344c31, it would be cleaner. Otherwise, no problem.

forki · 2016-02-05T10:04:51Z

Would it help if the implementation were special-cased for every type (or for some important ones, like int32)?

enricosada · 2016-02-05T10:05:22Z

src/fsharp/FSharp.Core/prim-types.fs

+            let inline variableStepIntegralRange n step m =
+                if step = LanguagePrimitives.GenericZero
+                then invalidArg "step" (SR.GetString(SR.stepCannotBeZero));
+                else


I think the else is not needed, because invalidArg raises an exception (so an if ... then is enough, like there). Less indentation.
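
A minimal sketch of the suggested shape, assuming a plain int step. `checkStep` and the message text are hypothetical stand-ins for the PR's inline function and `SR.GetString(SR.stepCannotBeZero)`. Because `invalidArg` raises and its result type is generic, it can sit in an `if ... then` with no `else` branch:

```fsharp
// Hypothetical helper for illustration; not the code in prim-types.fs.
let checkStep (step: int) : int =
    // invalidArg throws System.ArgumentException, so no else branch is needed
    // and the rest of the body stays at the outer indentation level.
    if step = 0 then invalidArg "step" "The step of a range cannot be zero."
    step
```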

manofstick · 2016-02-05T10:06:12Z

@enricosada

We use Mercurial at work and so I have never bothered to learn git. Send me commands, and I'll do 'em (...I'm just a noob living in GitHub Desktop land...)

enricosada · 2016-02-05T10:08:55Z

src/fsharp/FSharp.Core/prim-types.fs

+                                else state.Current
+
+                            member this.Current =
+                                box this.Current


Why box? Isn't this.Current the one from the generic IEnumerator<'a>, i.e. abstract Current : 'a with get?

manofstick · 2016-02-05

It's the "IEnumerator.Current : obj" version.

Probably a bad habit, but I find myself removing all explicit types.

Happy to put the ": obj" in if it makes this more explicit.

enricosada · 2016-02-05

Ah ok, my bad, it's ok.
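
For readers following the exchange, here is a small sketch of why both `Current` members appear. `countTo` is a hypothetical enumerator written for this illustration, not the PR's code: an object expression implementing `IEnumerator<int>` also has to implement the non-generic `System.Collections.IEnumerator`, whose `Current` returns `obj`, hence the `box`.

```fsharp
open System
open System.Collections
open System.Collections.Generic

// Hypothetical enumerator yielding 1..limit; for illustration only.
let countTo (limit: int) : IEnumerator<int> =
    let current = ref 0
    { new IEnumerator<int> with
          // Generic interface: Current : int
          member this.Current = current.Value
      interface IEnumerator with
          // Non-generic interface: Current : obj, so the value is boxed.
          member this.Current = box current.Value
          member this.MoveNext () =
              if current.Value < limit then
                  current.Value <- current.Value + 1
                  true
              else
                  false
          member this.Reset () = raise (NotSupportedException ())
      interface IDisposable with
          member this.Dispose () = () }
```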

manofstick · 2016-02-05T10:14:06Z

@forki

Due to the statically resolved types and the way the code is structured (i.e. I do a check for overflow combined with a check for exceeding the upper bound), I believe that the code is as efficient as it could be for all data sizes. (It is only a replacement for "integer" types, i.e. not floating point.) It could be split for positive and negative steps so that that check only occurs once, but as I'm dealing with the increment by 1 as a separate case, which I think is the most common usage, I wasn't too concerned.
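
The combined overflow and bound check can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: it is written for int only (the PR is generic via statically resolved types), it uses `Seq.unfold` rather than a hand-written enumerator, and the name `variableStepRange` and the message text are hypothetical. It relies on F#'s default unchecked integer arithmetic, where an overflowing addition wraps around.

```fsharp
// Hypothetical int-only sketch of a stepped range with overflow protection.
let variableStepRange (n: int) (step: int) (m: int) : seq<int> =
    if step = 0 then invalidArg "step" "The step of a range cannot be zero."
    // Counting up stops once we pass m; counting down stops once we drop below m.
    let inRange x = if step > 0 then x <= m else x >= m
    // State is Some current while elements remain, None once exhausted.
    let next state =
        match state with
        | None -> None
        | Some current ->
            let candidate = current + step   // wraps silently on overflow
            // If the addition wrapped, candidate lands on the wrong side of current.
            let wrapped = if step > 0 then candidate < current else candidate > current
            let following = if not wrapped && inRange candidate then Some candidate else None
            Some (current, following)
    Seq.unfold next (if inRange n then Some n else None)
```

With this shape, a range ending at `System.Int32.MaxValue` terminates instead of wrapping around to negative values.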

dsyme · 2016-02-05T10:47:38Z

src/fsharp/FSharp.Core/prim-types.fs

@@ -6497,6 +6497,95 @@ namespace Microsoft.FSharp.Core
                  interface IEnumerable with 
                      member x.GetEnumerator() = (gen() :> IEnumerator) }

+            [<NoEquality; NoComparison>]
+            type VariableStepIntegralRangeState<'a> = {


Can this be a struct to save an allocation?

forki · 2016-02-05

Or maybe we just use 3 different variables and remove this type altogether.

manofstick · 2016-02-05

Can't say I like the allocation, but without changes to the underlying F# code generator I don't think this is particularly easy to fix (without a lot of duplicated code per type). The problem with converting to a struct, or having the 3 variables separately (although, from a semantic perspective, I do like having the mutable state all grouped together), is that the class F# generates for a closure over let mutable blah = ... is just a ref type, so you are paying for the allocation anyway; it's just hidden. I then thought about creating a real type instead of just using the object expression, but because I'm using statically resolved types I would need an inline type, which doesn't exist. So I would have to provide a specific type per underlying integer type, which seems a bit excessive.
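
To illustrate the point about the hidden allocation, here is a small sketch assuming F# 4.0 or later, where a `let mutable` local captured by a closure is converted by the compiler into a ref cell. The function names are hypothetical. Both versions below behave the same and, as far as I understand the code generation, both pay for one heap-allocated cell:

```fsharp
// Captured mutable local: the compiler rewrites it as a heap-allocated ref cell.
let counterWithMutable () =
    let mutable n = 0
    fun () ->
        n <- n + 1
        n

// The same thing with the ref cell written out explicitly.
let counterWithRef () =
    let n = ref 0
    fun () ->
        n.Value <- n.Value + 1
        n.Value
```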

dsyme · 2016-02-05T10:53:33Z

Great work. Please run all tests :) (though I believe master has failing tests at the moment :( )

forki · 2016-02-05T10:55:24Z

How can master have failing tests? ;-)

dsyme · 2016-02-05T11:03:15Z

@forki I know, I know... Poor CI is not able to run enough tests; we have to crank it up.

dsyme · 2016-03-02T15:14:24Z

@manofstick I like this, but I suspect we need to add specific new tests to match the cases in the implementation. For example, I look at cases like this: https://github.com/manofstick/visualfsharp/blob/manofstick-perf-range-enumerators/src/fsharp/FSharp.Core/prim-types.fs#L6665 and I reckon we don't have test coverage for edge cases in floating point enumeration.

Are there specific test cases we could add? Perhaps someone could volunteer to do some hammering on this?

latkin · 2016-03-02T17:06:54Z

I had a partial list of nasty cases here: #520 (comment). Not sure if those are incorporated.

manofstick · 2016-03-06T09:19:37Z

Must say I was a bit slack here. I did kind of assume that strange bounds checks would have already been included in the original test suite, and that this was just a refactoring.

But anyway, when I get some time (he says, scrambling to try and put some time in behind the keyboard on a Sunday evening) I'll either ensure that such tests do already exist, or add some official ones.

(I did poke and prod them a bit, obviously, so I would be surprised if I stuffed up. But hey, coming off the back of my failed recursive types, I think I would like to double check my stuff!!)

manofstick · 2016-03-06T09:23:21Z

@dsyme / @latkin

Oh, there were no changes to the floating point enumerations... Were you just referring in general to having some extra tests?

dsyme · 2016-03-07T10:38:22Z

@manofstick You're right - ignore my comment on floating point, was looking at existing code rather than your new code. We can assume there is testing for the existing code.

dsyme · 2016-03-08T17:24:09Z

@manofstick Could you please check these integer test cases linked by @latkin above?

testSeq 1 10 (Some(Int32.MaxValue))  // new impl overflows, underflows
testSeq 1 10 (Some(-1))              // old impl returns 0-length seq, new impl underflows and gives huge seq
testSeq 3 1 None                     // old impl gives empty seq for a .. b with a > b

I don't know if those comments apply to this implementation - I don't think so - but the testing should be added if it's not there already. Also please add some kind of systematic testing for additional edge cases similar to these cases.
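
One way the three cases above could be pinned down as tests is sketched below. The real `testSeq` from #520 is not shown in this thread, so the definition here is a hypothetical stand-in that takes start, stop and an optional step and evaluates the built-in range operators. The expected values follow the behaviour described for the old implementation.

```fsharp
// Hypothetical stand-in for the testSeq helper referenced from #520.
let testSeq (a: int) (b: int) (step: int option) : int list =
    match step with
    | Some s -> [ a .. s .. b ]
    | None -> [ a .. b ]

// Step so large that the second element would overflow: only the start is produced.
let hugeStep = testSeq 1 10 (Some System.Int32.MaxValue)

// Negative step with start below stop: empty sequence.
let wrongDirection = testSeq 1 10 (Some(-1))

// Start above stop with the default step of 1: empty sequence.
let descendingBounds = testSeq 3 1 None
```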

If those test cases are already there and there's no additional testing to be done then this can be pulled.

Thanks

manofstick · 2016-03-09T09:04:07Z

OK; I'll dig in and punish edge cases (for all the various integer types). Hopefully find time this weekend.

manofstick · 2016-03-12T02:26:30Z

@dsyme

So much for me trusting that there would have been tests around edge cases to begin with! Actually, unless I'm mistaken, I can't find any testing around this functionality at all, beyond it being utilized in auxiliary functionality in tests in the collection modules. (I.e. I can't find testing around integer edge cases of the kind used for floating point sequences that you mentioned before as an example.)

Anyway, I am rather blind, my wife always grabs the thing that I'm looking for from right in front of me, so maybe I'm missing it.

But I'll proceed as if there is currently no testing on ranges; let me know if you know that there is. (The one place where there does seem to be a little of this kind of testing is in regards to BigIntType.)

dsyme · 2016-03-14T11:19:24Z

Great, thanks. There's some in tests\fsharp\core\libtest\test.fsx, e.g. IntegerLoopsWithMinAndMaxIntAndKnownBounds, IntegerLoopsWithMinAndMaxIntAndKnownBoundsGoingDown, but they only test int32. Minimally duplicating out all these tests for the other numeric types would make sense, plus systematically testing under different strides and edge cases

manofstick · 2016-03-16T08:59:44Z

The tests added give reasonable coverage (I feel) of the original functionality, but as singleStepRangeEnumerator special-cases single increments when the range bounds are not max, I need to add a couple of extra things around that. Anyway, out of time, as usual. Hopefully this weekend at the latest.

dsyme · 2016-03-17T16:45:00Z

src/fsharp/FSharp.Core.Unittests/FSharp.Core/PrimTypes.fs

+    [<Test>] member this.Int32  () = RangeTestsHelpers.signed   System.Int32.MinValue  System.Int32.MaxValue
+    [<Test>] member this.UInt32 () = RangeTestsHelpers.unsigned System.UInt32.MinValue System.UInt32.MaxValue
+    [<Test>] member this.Int64  () = RangeTestsHelpers.signed   System.Int64.MinValue  System.Int64.MaxValue
+    [<Test>] member this.UInt64 () = RangeTestsHelpers.unsigned System.UInt64.MinValue System.UInt64.MaxValue


What about IntPtr and UIntPtr? Thanks

KevinRansom · 2016-05-26T17:29:09Z

@manofstick @dsyme Hey guys, is this ready to pull?

It looks pretty great to me.

dsyme · 2016-06-07T13:51:12Z

Yes, LGTM

Paul Westcott added 2 commits February 3, 2016 13:37

Merge remote-tracking branch 'refs/remotes/Microsoft/master'

9344c31

Replacement implementation for Range operators

d5d4c49

A lighter implementation of the range operators that conforms (i think?) to current impl

msftclas added the cla-already-signed label Feb 5, 2016

enricosada reviewed Feb 5, 2016

Removed indentation

6f6fabf

dsyme reviewed Feb 5, 2016

Made changes as per code reviews

a82e6a3

manofstick changed the title ~~Manofstick perf range enumerators~~ Faster Integer Range Operators Feb 7, 2016

Merge remote-tracking branch 'refs/remotes/Microsoft/master' into manofstick-perf-range-enumerators

786d0b8

manofstick added 2 commits March 16, 2016 14:49

Tests for OperatorIntrinsics.Range*

1e41aae

Various tests for both signed and unsigned integer values or various bit sizings

Fix throwing of exception

288ea4c

After MoveNext has been returning false, we should have been throwing an exception. This has now been rectified.

dsyme reviewed Mar 17, 2016

manofstick added 2 commits March 19, 2016 13:43

Additional tests covering remaining cases

e00d1e5

- singleStepRangeEnumerator cases - exceptions for variableStepRangeEnumerator - IntPtr & UIntPtr

Nicer naming for Visual Studio's "Test Explorer"

a698a97

Well maybe it can be configured, but just shows the method name, not the class, so prepended Range to the names, which appears to be a mostly followed naming convention for tests.

KevinRansom merged commit 3617da4 into dotnet:master Jun 7, 2016
