Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize top_set_bit(::BigInt) and hash(::Real) #49996

Merged
merged 5 commits into from
Jun 2, 2023
Merged

Conversation

LilithHafner
Copy link
Member

@LilithHafner LilithHafner commented May 30, 2023

EDIT: this PR now does a fair bit more than #49986 did, see later comments.

Re-lands #49986, but computes abs first. This should be safe now because ndigits0z(x, 2) and top_set_bit(abs(x)) are semantically equivalent whereas before I was falsely assuming x > 0.

The performance gain is now only 1.4x instead of 1.5x, which could be noise or the cost of abs (probably noise)

@LilithHafner LilithHafner added performance Must go faster hashing re-land This relands a PR that was previously merged but was later reverted. labels May 30, 2023
@LilithHafner
Copy link
Member Author

Empirical check that top_set_bit(abs(x)) == Base.ndigits0z(x, 2) is valid:

for T in [UInt8, Int8] 
    for i in typemin(T):typemax(T)
        for j in (i, big(i), Int(i), unsigned(Int(i)))
            Base.top_set_bit(abs(j)) == Base.ndigits0z(j, 2) || error()
        end
    end
end

@oscardssmith
Copy link
Member

While we're at it, is there a reason we're doing hash(Int64(num) << Int(pow), h) rather than hash(Int64(x))? (and similarly hash(Float64(x)) instead of hash(Int64(num) << Int(pow), h)

@LilithHafner
Copy link
Member Author

I suppose the existing version could be better for types that are slow to decompose and to convert to integers or floating-point values. I benchmarked no difference on any of

@btime hash(x) setup=(x=rand(Int)//1)
@btime hash(x) setup=(x=rand(Int)//1024)
@btime hash(x) setup=(x=rand(Int)//rand(Int))
@btime hash(x) setup=(x=rand(Int128))

for switching from num << pow to x in those three places, so the case to make that switch is pretty weak.

@oscardssmith
Copy link
Member

oscardssmith commented May 30, 2023

The first three won't hit it since there is a fast path for hash(x::Rational{<:BitInteger64}, h::UInt). The last will almost never hit it since random Int128s will almost always be bigger than typemax(Int64). You would want to test something like rational{BigInt} (where the pieces were rational) or something like that. We also should possibly be using unsafe_trunc here since we have verified that the arguments are in bounds.

@LilithHafner
Copy link
Member Author

I tested

@btime hash(x) setup=(x=big(rand(Int))//big(1))
@btime hash(x) setup=(x=big(rand(Int128))//big(1))

And got poor results

Int Int128
Before 16.784 23.218
PR 18.078 24.264
no bitshift 21.210 24.707

The reason for the regression for this PR is likely that top_set_bit(abs(x::BigInt)) is slow because the compiler can't infer nothrow. I have a fix for that.

@LilithHafner
Copy link
Member Author

@btime Base.top_set_bit(abs(x)) setup=(x=big(rand(Int128))); gets 1.7x faster with the second commit of this PR (though the compiler still can't infer nothrow), which changes the performance figures due to the first commit and the proposed change from bitshifting to just using the original x.

Int Int128
Before 16.825 23.218
PR 14.988 19.433
no bitshift 16.951 19.474

These performance data do not indicate that we should switch from bitshift to direct conversion. It's possible that that could be good in some cases, but I'll leave that for a different PR.

@LilithHafner LilithHafner changed the title Use top_set_bit to optimize hash(Real) (re-land) Optimize top_set_bit(::BigInt) and use it in hash(::Real) (re-land) May 30, 2023
@LilithHafner LilithHafner changed the title Optimize top_set_bit(::BigInt) and use it in hash(::Real) (re-land) Optimize top_set_bit(::BigInt) and hash(::Real) May 30, 2023
base/float.jl Outdated
end # typemin(Int64) handled by Float64 case
left <= 1024 && left - right <= 53 && return hash(ldexp(Float64(num),pow), h)
left <= 1024 && left - right <= 53 && return hash(ldexp(Float64(num), den_z), h)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure that with this change num will fit in a Float64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, assuming num and den are relatively prime (which we already rely on because if num = 5 and den = 3 we produce a different result than if num = 15 and den = 9).

den either has 0 trailing zeros or a nonzero number of trailing zeros.

  • In the first case, den_z = 0, so if ldexp(Float64(num), den_z) is representable as a Float64, then by ldexp(Float64(num), den_z) === ldexp(Float64(num), 0) === Float64(num), so is Float64(num).
  • In the second case, den_z ≠ 0, so num_z = 0, because we can't have both num and den be even because they are relatively prime. This means that every bit between 1 and top_set_bit(x) is significant, and we check that there are no more than 53 significant bits, so top_set_bit(x) <= 53, and therefore x is a small integer that is representable as a Float64.

@LilithHafner
Copy link
Member Author

I reran all the benchmarks that have come up in this and #49998 and reported before => after all benchmarks are allocation free.

@btime Base.hash(x, UInt(1234)) setup=(x=Int128(rand(Int)));
  # 12.262 ns => 3.958 ns
@btime Base.hash(x, UInt(1234)) setup=(x=Int128(rand(Int128)));
  # 24.975 ns => 16.367 ns
@btime Base.top_set_bit(abs(x)) setup=(x=big(rand(Int)));
  # 11.469 ns => 2.541 ns
@btime Base.top_set_bit(abs(x)) setup=(x=big(rand(Int128)));
  # 11.512 ns => 2.583 ns
@btime hash(x) setup=(x=big(rand(Int))//big(1));
  # 33.442 ns => 24.807 ns
@btime hash(x) setup=(x=big(rand(Int128))//big(1));
  # 41.625 ns => 35.247 ns
@btime hash(x) setup=(x=rand(Int)//1);
  # 3.708 ns  => 3.750 ns
@btime hash(x) setup=(x=rand(Int)//1024);
  # 3.837 ns  => 3.833 ns
@btime hash(x) setup=(x=rand(Int)//rand(Int));
  # 14.780 ns => 14.780 ns
@btime hash(x) setup=(x=rand(Int128));
  # 23.845 ns => 16.784 ns

@oscardssmith
Copy link
Member

To be clear: are those benchmarks including #49998?

@LilithHafner
Copy link
Member Author

Those results are just this PR.

@oscardssmith
Copy link
Member

interesting. That somehow seems to be faster than #49998 which is rather surprising to me. I guess our compiler is doing a really good job cleaning this up for Int128.

@LilithHafner
Copy link
Member Author

Yes. It took a fair bit of fiddling before it matched the performance of #49998, but once it did I knew it was pretty good :). I expect the reason this appears more impactful than #49998 is that the impact is more pronounced on my machine than yours.

@oscardssmith oscardssmith requested a review from vtjnash May 30, 2023 20:36
@oscardssmith
Copy link
Member

Test failures seem real.

@LilithHafner
Copy link
Member Author

Oops, I hadn't noticed :)

@oscardssmith
Copy link
Member

I think this is good to go, but I do kind of want a review from @vtjnash here.

@oscardssmith
Copy link
Member

Closing since #49996 gets the same performance.

@oscardssmith oscardssmith reopened this Jun 2, 2023
@oscardssmith
Copy link
Member

(oops, closed the wrong one)

@oscardssmith oscardssmith merged commit 1961019 into master Jun 2, 2023
@oscardssmith oscardssmith deleted the reland-49986 branch June 2, 2023 14:45
oscardssmith pushed a commit that referenced this pull request Jun 6, 2023
* fixup for 49996 and add test from 50065

---------

Co-authored-by: Lilith Hafner <[email protected]>
oscardssmith added a commit that referenced this pull request Jul 26, 2023
This fixes 2 bugs introduced by #49996 and #50041.
Closes #50628.
KristofferC pushed a commit that referenced this pull request Aug 10, 2023
This fixes 2 bugs introduced by #49996 and #50041.
Closes #50628.

(cherry picked from commit c777c71)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hashing performance Must go faster re-land This relands a PR that was previously merged but was later reverted.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants