Performance benchmarks #32

Closed · wanted to merge 4 commits

Conversation

@mdavezac (Contributor) commented on Oct 9, 2016:

I've added a number of benchmarks in a single file to check whether operations between unitful objects are just as fast as their unitless counterparts.

Unfortunately, this is not always the case (assuming the code in the pull-request is correct :) ). Most notably, operations involving unitful arrays are quite expensive.

I'm not sure whether you want this in the package itself.
The benchmarks are arranged as tests and can be run with:

include("test/benchmarks.jl")

Individual benchmarks can be tried out in the REPL with:

judge_unit_benchmark(:((2u"m") * (2u"m^-1")))
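
For context, the kind of comparison this performs can be sketched directly with BenchmarkTools (an illustration only, not the PR's implementation; judge_unit_benchmark itself is defined in test/benchmarks.jl):

using Unitful, BenchmarkTools

# Benchmark a unitful expression and its unitless counterpart, then let
# BenchmarkTools classify the difference.
a, b = 2u"m", 2u"m^-1"
x, y = 2, 2
unitful  = @benchmark $a * $b
unitless = @benchmark $x * $y
judge(median(unitful), median(unitless))  # judged :improvement, :invariant, or :regression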

@ajkeller34 (Collaborator) commented:

Thank you so much for taking the time to implement benchmarking! For a package like this it is sorely needed.

I'll review this PR carefully when I get a moment to do so. It's disappointing that some operations seem slow, but keep in mind that, at least on 0.5.0, there are some bugs in Julia that could be impacting performance, notably JuliaLang/julia#18465. Rational numbers are used wherever exact unit conversions are possible, and it could be that the type instability described in that issue is causing wide-ranging performance problems. Of course, this is just a guess, and I need to look into what you found a little more.

@ajkeller34 (Collaborator) commented:

How about we do a little benchmarking case study? cc @timholy, since he uses this package and is interested in its performance.

Let's first consider addition when units are mixed, which was a problematic benchmark:

julia> using Unitful, BenchmarkTools

julia> a = 1u"km"
1 km

julia> b = 2u"m"
2 m

julia> @benchmark +($a,$b)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  144.00 bytes
  allocs estimate:  5
  minimum time:     10.72 μs (0.00% GC)
  median time:      12.44 μs (0.00% GC)
  mean time:        13.40 μs (0.00% GC)
  maximum time:     79.81 μs (0.00% GC)

Yikes! Why is that taking so long? Well, since you specified integers, an exact conversion is possible, so Rational numbers are used. As I mentioned above, we know there is currently a type instability with Rational (note the Any return type below):

julia> @code_warntype +(1u"km",2u"m")
Variables:
  #self#::Base.#+

# I've omitted some output here

  end::Any

The return type of Any is a bad sign. There may be other performance penalties associated with exact conversion, but a lot of it is probably this type instability. One can avoid both the type instability and any legitimate penalties of using Rationals by using floating-point numbers instead. The return type is then concrete, and the performance is much better:

julia> a = 1.0u"km"
1.0 km

julia> b = 2.0u"m"
2.0 m

julia> @benchmark +($a,$b)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     1000
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     6.00 ns (0.00% GC)
  median time:      6.00 ns (0.00% GC)
  mean time:        6.24 ns (0.00% GC)
  maximum time:     43.00 ns (0.00% GC)
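
(As an aside, this kind of instability can also be checked programmatically. A sketch, not part of the PR; on Julia 0.5 the test module was Base.Test rather than Test:)

using Unitful, Test

# @inferred throws if the inferred return type is not concrete.
@inferred 1.0u"km" + 2.0u"m"  # concrete return type: passes
@inferred 1u"km" + 2u"m"      # may throw on Julia versions affected by the Rational instability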

Now, let's compare with the performance of SIUnits:

julia> using SIUnits, SIUnits.ShortUnits

julia> a = 1.0km
1000.0 m

julia> b = 2.0m
2.0 m

julia> @benchmark +($a, $b)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     1000
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  0.00 bytes
  allocs estimate:  0
  minimum time:     2.00 ns (0.00% GC)
  median time:      2.00 ns (0.00% GC)
  mean time:        2.06 ns (0.00% GC)
  maximum time:     47.00 ns (0.00% GC)

So, why does SIUnits win here (2 ns vs. 6 ns)? Well, the only length unit SIUnits can internalize is the meter. Other length units specified by the user are simply converted to meters, as you see above. SIUnits wins because the benchmark doesn't count the computation needed to convert km to m; the conversion has already happened by the time the benchmark runs. Beyond performance, this automatic conversion to a single unit (say, meters) per dimension (length) makes other problems hard to solve: Keno/SIUnits.jl#22, Keno/SIUnits.jl#92, Keno/SIUnits.jl#57, Keno/SIUnits.jl#8, to name a few.
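
(For a like-for-like scalar comparison, one could pre-convert the Unitful quantity as well, so that neither package pays the km-to-m conversion inside the benchmark loop. A sketch; uconvert is the conversion function in current Unitful releases, and very old versions may have spelled this differently:)

using Unitful, BenchmarkTools

a = uconvert(u"m", 1.0u"km")  # 1000.0 m: conversion done once, outside the benchmark
b = 2.0u"m"
@benchmark +($a, $b)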

In the case of arrays, let's again consider floating-point numbers only for simplicity, and do another comparison:

julia> using Unitful

julia> A = [1.0u"km", 2.0u"m"]
2-element Array{Quantity{Float64, Dimensions:{𝐋}, Units:{m}},1}:
 1000.0 m
    2.0 m

julia> @benchmark .+($A, $A)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     973
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  144.00 bytes
  allocs estimate:  3
  minimum time:     73.00 ns (0.00% GC)
  median time:      79.00 ns (0.00% GC)
  mean time:        106.86 ns (20.18% GC)
  maximum time:     5.91 μs (97.90% GC)

julia> using SIUnits, SIUnits.ShortUnits

julia> B = [1.0km, 2.0m]
2-element Array{SIUnits.SIQuantity{Float64,1,0,0,0,0,0,0,0,0},1}:
 1000.0 m
    2.0 m

julia> @benchmark .+($B, $B)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     958
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  144.00 bytes
  allocs estimate:  3
  minimum time:     86.00 ns (0.00% GC)
  median time:      92.00 ns (0.00% GC)
  mean time:        121.36 ns (18.40% GC)
  maximum time:     6.27 μs (96.46% GC)

julia> C = [1000.0, 2.0]
2-element Array{Float64,1}:
 1000.0
    2.0

julia> @benchmark .+($C, $C)
BenchmarkTools.Trial: 
  samples:          10000
  evals/sample:     979
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  144.00 bytes
  allocs estimate:  3
  minimum time:     64.00 ns (0.00% GC)
  median time:      69.00 ns (0.00% GC)
  mean time:        96.29 ns (22.58% GC)
  maximum time:     5.82 μs (98.44% GC)

For both Unitful and SIUnits, getting a concretely-typed array requires unit conversion before the benchmark runs (no conversion is needed for Float64). In all cases the performance is pretty similar; I wouldn't read too much into the differences here, since I didn't do this very carefully.

Eventually (once I don't have to explain the type instability) I will write up some of this in the Unitful documentation, to emphasize differences between Unitful and SIUnits and enable users to choose the package that best suits their needs.

Review comment on test/benchmarks.jl:

using Unitful
using BenchmarkTools
using Base.Test
using DataFrames
@ajkeller34 (Collaborator) commented on Oct 10, 2016:

DataFrames no longer needed?

@mdavezac (Contributor, Author) replied:

Indeed. Sorry, the pull request is a bit dirty. In part, that's because I'm not sure how benchmarks are integrated into packages. As part of the testing framework? Outside of it? As a report? Not at all?

In any case, I'll correct that and the unused function.
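
(For what it's worth, one common convention is a benchmark/benchmarks.jl that defines a BenchmarkTools.BenchmarkGroup named SUITE, kept separate from the test suite and runnable with tools such as PkgBenchmark. A minimal sketch, not the layout used in this PR:)

using BenchmarkTools, Unitful

const SUITE = BenchmarkGroup()
SUITE["scalar mul"]  = @benchmarkable (2.0u"m") * (2.0u"m^-1")
SUITE["array bcast"] = @benchmarkable A .* A setup=(A = [1.0u"m", 2.0u"m"])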

Review comment on test/benchmarks.jl:

using Base.Test
using DataFrames

function benchmark()
@ajkeller34 (Collaborator) commented:

Is this function used anywhere? If I try it out I get the following:

julia> benchmark()
ERROR: UndefVarError: benchmark! not defined
 in benchmark() at /Users/ajkeller/.julia/v0.5/Unitful/test/benchmarks.jl:10

@mdavezac (Contributor, Author) commented:

What about the case of arrays with a single unit type, e.g. [1u"m", 1u"m"] .* [1u"m", 1u"m"], where the eltype is concrete? Is there any reason to think it should be slower than the corresponding operation between bare arrays? If it's worth it, I'd like to see whether I can figure out a fix.
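
(Such an array does have a concrete element type, so storage and dispatch match a bare numeric array. A quick check, not from the PR; isconcretetype is the modern name, Julia 0.5 called it isleaftype:)

using Unitful

A = [1u"m", 1u"m"]
isconcretetype(eltype(A))  # true: every element shares the same Quantity type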

@timholy (Contributor) commented on Oct 10, 2016:

You'd need to avoid using integers in that comparison, due to JuliaLang/julia#18465. Once that bug is fixed, it shouldn't matter.
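
(A float-only version of that comparison might look like the following sketch, in the spirit of the suggestion above rather than the PR's benchmark file:)

using Unitful, BenchmarkTools

A = [1.0u"m", 2.0u"m"]  # concrete eltype, no Rational conversions involved
C = [1.0, 2.0]          # bare Float64 baseline

@benchmark $A .* $A
@benchmark $C .* $C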

@mdavezac (Contributor, Author) commented:

I've tried this with arrays of floats and verified that Unitful does not add much overhead, if any.
However, the results are a bit flaky, so I'm not sure a unit-test format is the right way to go.
Closing this for now.

@mdavezac closed this on Oct 13, 2016