Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete package? #6

Open
devmotion opened this issue Jul 5, 2019 · 6 comments
Open

Delete package? #6

devmotion opened this issue Jul 5, 2019 · 6 comments

Comments

@devmotion
Copy link
Member

In the following benchmarks the samplers in Distributions are as fast as the implementations in this package:

using Distributions, PoissonRandom, StatsFuns
using Plots

function n_count(rng, λ, n)
  tmp = 0
  for i in 1:n
    tmp += PoissonRandom.count_rand(rng,λ)
  end
  tmp
end

function n_pois(rng,λ,n)
  tmp = 0
  for i in 1:n
    tmp += pois_rand(rng,λ)
  end
  tmp
end

function n_ad(rng, λ, n)
  tmp = 0
  for i in 1:n
    tmp += PoissonRandom.ad_rand(rng, λ)
  end
  tmp
end

function n_dist(λ,n)
  tmp = 0
  for i in 1:n
    tmp += rand(Poisson(λ))
  end
  tmp
end

function n_rfunctions(λ, n)
  tmp = 0
  for i in 1:n
    tmp += convert(Int, StatsFuns.RFunctions.poisrand(λ))
  end
  tmp
end

function n_countsampler(rng, λ::Float64, n)
  tmp = 0
  for i in 1:n
    tmp += rand(rng, Distributions.PoissonCountSampler(λ))
  end
  tmp
end

function n_adsampler(rng, λ::Float64, n)
  tmp = 0
  for i in 1:n
    tmp += rand(rng, Distributions.PoissonADSampler(λ))
  end
  tmp
end

function time_λ!(rng, times, λ::Float64, n)
  times[1] = @elapsed n_count(rng, λ, n)
  times[2] = @elapsed n_ad(rng, λ, n)
  times[3] = @elapsed n_pois(rng, λ, n)
  times[4] = @elapsed n_dist(λ, n)
  times[5] = @elapsed n_rfunctions(λ, n)
  times[6] = @elapsed n_countsampler(rng, λ, n)
  times[7] = @elapsed n_adsampler(rng, λ, n)

  nothing
end

function plot_benchmark(rng)
    times = Matrix{Float64}(undef, 7, 20)

    # Compile
    time_λ!(rng, view(times, :, 1), 5, 5_000_000)

    # Run with a bunch of λ
    for λ in 1:20
        time_λ!(rng, view(times, :, λ), float(λ), 5_000_000)
    end

    plot(times',
         labels = ["count_rand", "ad_rand", "pois_rand", "Distributions",
                   "RFunctions", "PoissonCountSampler", "PoissonADSampler"],
         lw = 3)
end

I get

using Random
Random.seed!(1234)
plot_benchmark(Random.GLOBAL_RNG)
savefig("global_rng.png")

image

and

using RandomNumbers
plot_benchmark(Xorshifts.Xoroshiro128Plus(1234))
savefig("xoroshiro128plus.png")

image

Actually, performing the benchmarks of n_count and n_ad with integer values for λ (as done in the README) leads to subpar performance compared with Distributions. For n_count this is probably due to the comparison of integer and float values in https://github.com/JuliaDiffEq/PoissonRandom.jl/blob/master/src/PoissonRandom.jl#L11, as @code_typed indicates:

julia> @code_typed PoissonRandom.count_rand(Random.GLOBAL_RNG, 5)
CodeInfo(
1%1  = invoke PoissonRandom.randexp(_2::MersenneTwister)::Float64
2%2  = φ (#1 => 0, #3 => %14)::Int64%3  = φ (#1 => %1, #3 => %16)::Float64%4  = (Base.sitofp)(Float64, λ)::Float64%5  = (Base.lt_float)(%3, %4)::Bool%6  = (Base.eq_float)(%3, %4)::Bool%7  = (Base.lt_float)(%4, 9.22337e18)::Bool%8  = (Base.and_int)(%6, %7)::Bool%9  = (Base.fptosi)(Int64, %4)::Int64%10 = (Base.slt_int)(%9, λ)::Bool%11 = (Base.and_int)(%8, %10)::Bool%12 = (Base.or_int)(%5, %11)::Bool
└──       goto #4 if not %12
3%14 = (Base.add_int)(%2, 1)::Int64%15 = invoke PoissonRandom.randexp(_2::MersenneTwister)::Float64%16 = (Base.add_float)(%3, %15)::Float64
└──       goto #2
4return %2
) => Int64

julia> @code_typed PoissonRandom.count_rand(Random.GLOBAL_RNG, 5.0)
CodeInfo(
1%1 = invoke PoissonRandom.randexp(_2::MersenneTwister)::Float64
2%2 = φ (#1 => 0, #3 => %6)::Int64%3 = φ (#1 => %1, #3 => %8)::Float64%4 = (Base.lt_float)(%3, λ)::Bool
└──      goto #4 if not %4
3%6 = (Base.add_int)(%2, 1)::Int64%7 = invoke PoissonRandom.randexp(_2::MersenneTwister)::Float64%8 = (Base.add_float)(%3, %7)::Float64
└──      goto #2
4return %2
) => Int64

julia> @code_typed rand(Random.GLOBAL_RNG, Distributions.PoissonCountSampler(5.0))
CodeInfo(
1%1 = (Base.getfield)(s, )::Float64
└── %2 = invoke Distributions.randexp(_2::MersenneTwister)::Float64
2%3 = φ (#1 => 0, #3 => %7)::Int64%4 = φ (#1 => %2, #3 => %9)::Float64%5 = (Base.lt_float)(%4, %1)::Bool
└──      goto #4 if not %5
3%7 = (Base.add_int)(%3, 1)::Int64%8 = invoke Distributions.randexp(_2::MersenneTwister)::Float64%9 = (Base.add_float)(%4, %8)::Float64
└──      goto #2
4return %3
) => Int64

Hence, should we deprecate this package by adding a deprecation warning to pois_rand? How many DiffEq packages depend on PoissonRandom? Would it be sufficient to implement pois_rand (if we want to keep it) in DiffEqJump using the samplers in Distributions?

@ChrisRackauckas
Copy link
Member

Just the DiffEqJump stuff uses it, and yeah one of the main purposes was to not have the binaries that Distributions.jl pulls in.

@devmotion
Copy link
Member Author

Fair enough. It just felt to me like the README

So this package ends up about 30% or so faster than Distributions.jl (the method at the far edge is λ-independent so that goes on forever).

suggests that the implementation in this package is somewhat superior and faster than the one in Distributions. That's actually not the case, and if you call count_rand with integer-valued intensities (as done in the benchmarks in the README) the performance is even worse.

@ChrisRackauckas
Copy link
Member

it used to be better 🤷‍♂ .

@rafaqz
Copy link

rafaqz commented Jul 26, 2019

It still is better overall for simple pois_rand (and maybe others), as it allocates less than the alternatives.

I get 10% improvement in a large simulation using rand_pois where rand(Poisson(x)) was only 10% of the total time anyway, because it was 99.5% of the allocations.

@devmotion
Copy link
Member Author

To get up to the same speed, you have to use the samplers in Distributions and should not run rand(Poisson(x)). As the benchmark in the first comment above shows, rand(rng, Distributions.PoissonCountSampler(λ)) and rand(rng, Distributions.PoissonADSampler(λ)) are as fast as PoissonRandom.count_rand(rng,λ) and PoissonRandom.ad_rand(rng, λ).

@rafaqz
Copy link

rafaqz commented Jul 27, 2019

Ok I see you're right, they have basically identical performance and no allocations.

Distributions should swap the code comments for real documenation, those types are pretty much impossible to find or understand currently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants