Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster Integer Range Operators #931

Merged
merged 9 commits into from
Jun 7, 2016
109 changes: 99 additions & 10 deletions src/fsharp/FSharp.Core/prim-types.fs
Original file line number Diff line number Diff line change
Expand Up @@ -6497,6 +6497,95 @@ namespace Microsoft.FSharp.Core
interface IEnumerable with
member x.GetEnumerator() = (gen() :> IEnumerator) }

[<NoEquality; NoComparison>]
type VariableStepIntegralRangeState<'a> = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be a struct to save an allocation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe we just use 3 different variables and remove this type altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't say I like the allocation, but without changes to the underlying F# code generator I don't think this is particularly easy to fix (without a lot of duplicated code per type.) So the problem with converting to a struct, or having the 3 variables separately (although I do like from a semantic perspective the mutable state all grouped together) is that the class that F# generates in a closure from let mutable blah = ... is just a ref type, so you are paying for the allocation anyway, it's just hidden. So then I thought about actually creating a real type instead of just using the object expression, but because I'm using statically resolved types functionality I would need an inline type which doesn't exist. So I would have to provide a specific type per underlying integer type which seems a bit excessive.

mutable Started : bool
mutable Complete : bool
mutable Current : 'a
}
let inline variableStepIntegralRange n step m =
if step = LanguagePrimitives.GenericZero
then invalidArg "step" (SR.GetString(SR.stepCannotBeZero));
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the else it's not needed, because invalidArg raise an exception ( so a if then it's enough, like there
Less indentation

let variableStepRangeEnumerator () =
let state = {
Started = false
Complete = false
Current = Unchecked.defaultof<'a>
}

{ new System.Collections.Generic.IEnumerator<'a> with
member __.Dispose () = ()

member __.Current =
// according to IEnumerator<int>.Current documentation, the result of of Current
// is undefined prior to the first call of MoveNext and post called to MoveNext
// that return false (see https://msdn.microsoft.com/en-us/library/58e146b7%28v=vs.110%29.aspx)
// so we should be able to just return value here, and we could get rid of the
// complete variable which would be faster
if not state.Started then notStarted ()
elif state.Complete then alreadyFinished ()
else state.Current

member this.Current =
box this.Current
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why box? this.Current it' not of generic IEnumerator<'a> so abstract Current : 'a with get?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the "IEnumerable.Current : obj" version.

Probably a bad habbit, but I find myself removing all explicit types.

Happy to put the ": obj" in if it makes this more explicit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, my bad, it's ok


member __.Reset () =
state.Started <- false
state.Complete <- false
state.Current <- Unchecked.defaultof<_>

member __.MoveNext () =
if not state.Started then
state.Started <- true
state.Current <- n
state.Complete <-
( (step > LanguagePrimitives.GenericZero && state.Current > m)
|| (step < LanguagePrimitives.GenericZero && state.Current < m))
else
let next = state.Current + step
if (step > LanguagePrimitives.GenericZero && next > state.Current && next <= m)
|| (step < LanguagePrimitives.GenericZero && next < state.Current && next >= m)
then
state.Current <- next
else
state.Complete <- true

not state.Complete }

{ new System.Collections.Generic.IEnumerable<'a> with
member __. GetEnumerator () = variableStepRangeEnumerator ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why __ and this in same class? choose one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see it's like that in other class in same file. But that's needed or a copy/paste error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, just a habit where I use __ to denote not using the variable and 'this' when I am. (as usually I'm trying not to use the public interface, so it helps me note where I am).

Happy to follow any convention though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The golden rule it's use same convention of near code, so you are right, because the usage it's a bit mixed (x., this.x, __. ) in the file.
Dont know if there is a naming convention or not. If there is we can write that down as a quick note in Docs directory /cc @dsyme.
I think your habit is right, it's nice to have ( i'll use it in my code 😄 )

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wildcard self identifiers (ignore this) is a planned F# change but __ is as close as we can get until the change is made.

member this.GetEnumerator () = this.GetEnumerator () :> System.Collections.IEnumerator }

let inline simpleIntegralRange minValue maxValue n step m =
if step <> LanguagePrimitives.GenericOne || n > m || n = minValue || m = maxValue
then variableStepIntegralRange n step m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use

if ... then 
    expr
else
    expr

in the F# codebase, thanks

else
// a constrained, common simple iterator that is fast.
let singleStepRangeEnumerator () =
let mutable value : 'a = n - LanguagePrimitives.GenericOne
{ new System.Collections.Generic.IEnumerator<'a> with
member __.Dispose () = ()
member __.Current =
// according to IEnumerator<int>.Current documentation, the result of of Current
// is undefined prior to the first call of MoveNext and post called to MoveNext
// that return false (see https://msdn.microsoft.com/en-us/library/58e146b7%28v=vs.110%29.aspx)
// so we should be able to just return value here, which would be faster
if value < n then notStarted ()
elif value > m then alreadyFinished ()
else value
member this.Current = box this.Current
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please list the implementation of System.Collections.IEnumerator out separately, for clarity

member __.Reset () = value <- n - LanguagePrimitives.GenericOne
member __.MoveNext () =
if value < m then
value <- value + LanguagePrimitives.GenericOne
true
else false }

{ new System.Collections.Generic.IEnumerable<'a> with
member __. GetEnumerator () = singleStepRangeEnumerator ()
member this.GetEnumerator () = this.GetEnumerator () :> System.Collections.IEnumerator }

// For RangeStepGeneric, zero and add are functions representing the static resolution of GenericZero and (+)
// for the particular static type.
let inline integralRangeStep<'T,'Step> (zero:'Step) (add:'T -> 'Step -> 'T) (n:'T, step:'Step, m:'T) =
Expand Down Expand Up @@ -6576,16 +6665,16 @@ namespace Microsoft.FSharp.Core
interface System.Collections.IEnumerable with
member x.GetEnumerator() = (gen() :> System.Collections.IEnumerator) }

let RangeInt32 n step m : seq<int> = integralRangeStep 0 (+) (n,step,m)
let RangeInt64 n step m : seq<int64> = integralRangeStep 0L (+) (n,step,m)
let RangeUInt64 n step m : seq<uint64> = integralRangeStep 0UL (+) (n,step,m)
let RangeUInt32 n step m : seq<uint32> = integralRangeStep 0ul (+) (n,step,m)
let RangeIntPtr n step m : seq<nativeint> = integralRangeStep 0n (+) (n,step,m)
let RangeUIntPtr n step m : seq<unativeint> = integralRangeStep 0un (+) (n,step,m)
let RangeInt16 n step m : seq<int16> = integralRangeStep 0s (+) (n,step,m)
let RangeUInt16 n step m : seq<uint16> = integralRangeStep 0us (+) (n,step,m)
let RangeSByte n step m : seq<sbyte> = integralRangeStep 0y (+) (n,step,m)
let RangeByte n step m : seq<byte> = integralRangeStep 0uy (+) (n,step,m)
let RangeInt32 n step m : seq<int> = simpleIntegralRange Int32.MinValue Int32.MaxValue n step m
let RangeInt64 n step m : seq<int64> = simpleIntegralRange Int64.MinValue Int64.MaxValue n step m
let RangeUInt64 n step m : seq<uint64> = simpleIntegralRange UInt64.MinValue UInt64.MaxValue n step m
let RangeUInt32 n step m : seq<uint32> = simpleIntegralRange UInt32.MinValue UInt32.MaxValue n step m
let RangeIntPtr n step m : seq<nativeint> = variableStepIntegralRange n step m
let RangeUIntPtr n step m : seq<unativeint> = variableStepIntegralRange n step m
let RangeInt16 n step m : seq<int16> = simpleIntegralRange Int16.MinValue Int16.MaxValue n step m
let RangeUInt16 n step m : seq<uint16> = simpleIntegralRange UInt16.MinValue UInt16.MaxValue n step m
let RangeSByte n step m : seq<sbyte> = simpleIntegralRange SByte.MinValue SByte.MaxValue n step m
let RangeByte n step m : seq<byte> = simpleIntegralRange Byte.MinValue Byte.MaxValue n step m
let RangeDouble n step m : seq<float> = floatingRange float (n,step,m)
let RangeSingle n step m : seq<float32> = floatingRange float32 (n,step,m)
let RangeGeneric one add n m : seq<'T> = integralRange (one,add,n,m)
Expand Down