Auto merge of #375 - QuarticCat:feedback, r=Amanieu · rust-lang/hashbrown@c93ef8f

Commit

Auto merge of #375 - QuarticCat:feedback, r=Amanieu

Add `#[inline(always)]` to `find_inner`

## Benchmark

Boost at most 6%.

```text
test clone_from_large            ... bench:       5,562 ->   5,523 ns/iter (+/- 96)
test clone_from_small            ... bench:          52 ->      52 ns/iter (+/- 0)
test clone_large                 ... bench:       5,764 ->   5,760 ns/iter (+/- 36)
test clone_small                 ... bench:          71 ->      72 ns/iter (+/- 12)
test grow_insert_ahash_highbits  ... bench:      24,918 ->  25,229 ns/iter (+/- 710)
test grow_insert_ahash_random    ... bench:      24,590 ->  24,824 ns/iter (+/- 279)
test grow_insert_ahash_serial    ... bench:      24,896 ->  24,892 ns/iter (+/- 3,497)
test grow_insert_std_highbits    ... bench:      43,482 ->  43,342 ns/iter (+/- 706)
test grow_insert_std_random      ... bench:      43,074 ->  43,674 ns/iter (+/- 592)
test grow_insert_std_serial      ... bench:      43,305 ->  43,335 ns/iter (+/- 1,059)
test insert_ahash_highbits       ... bench:      25,025 ->  25,110 ns/iter (+/- 96)
test insert_ahash_random         ... bench:      25,017 ->  25,329 ns/iter (+/- 7,387)
test insert_ahash_serial         ... bench:      24,895 ->  25,041 ns/iter (+/- 692)
test insert_erase_ahash_highbits ... bench:      25,049 ->  25,284 ns/iter (+/- 1,191)
test insert_erase_ahash_random   ... bench:      24,836 ->  24,742 ns/iter (+/- 380)
test insert_erase_ahash_serial   ... bench:      23,832 ->  23,563 ns/iter (+/- 611)
test insert_erase_std_highbits   ... bench:      43,699 ->  43,265 ns/iter (+/- 684)
test insert_erase_std_random     ... bench:      44,499 ->  44,193 ns/iter (+/- 670)
test insert_erase_std_serial     ... bench:      45,250 ->  43,969 ns/iter (+/- 544)
test insert_std_highbits         ... bench:      35,552 ->  34,444 ns/iter (+/- 739)
test insert_std_random           ... bench:      34,484 ->  34,596 ns/iter (+/- 508)
test insert_std_serial           ... bench:      34,778 ->  34,658 ns/iter (+/- 1,003)
test iter_ahash_highbits         ... bench:       1,292 ->   1,292 ns/iter (+/- 89)
test iter_ahash_random           ... bench:       1,299 ->   1,295 ns/iter (+/- 72)
test iter_ahash_serial           ... bench:       1,306 ->   1,295 ns/iter (+/- 18)
test iter_std_highbits           ... bench:       1,277 ->   1,306 ns/iter (+/- 31)
test iter_std_random             ... bench:       1,278 ->   1,295 ns/iter (+/- 30)
test iter_std_serial             ... bench:       1,303 ->   1,310 ns/iter (+/- 54)
test lookup_ahash_highbits       ... bench:       4,766 ->   4,767 ns/iter (+/- 44)
test lookup_ahash_random         ... bench:       4,735 ->   4,761 ns/iter (+/- 45)
test lookup_ahash_serial         ... bench:       4,670 ->   4,696 ns/iter (+/- 98)
test lookup_fail_ahash_highbits  ... bench:       4,438 ->   4,232 ns/iter (+/- 55)
test lookup_fail_ahash_random    ... bench:       4,233 ->   4,243 ns/iter (+/- 85)
test lookup_fail_ahash_serial    ... bench:       3,928 ->   3,970 ns/iter (+/- 307)
test lookup_fail_std_highbits    ... bench:      14,177 ->  14,927 ns/iter (+/- 165)
test lookup_fail_std_random      ... bench:      14,217 ->  15,081 ns/iter (+/- 188)
test lookup_fail_std_serial      ... bench:      14,253 ->  14,949 ns/iter (+/- 1,439)
test lookup_std_highbits         ... bench:      14,700 ->  14,797 ns/iter (+/- 940)
test lookup_std_random           ... bench:      14,759 ->  14,996 ns/iter (+/- 609)
test lookup_std_serial           ... bench:      14,784 ->  14,928 ns/iter (+/- 293)
test rehash_in_place             ... bench:     213,678 -> 218,933 ns/iter (+/- 3,012)
```

## ASM Difference

Here's the major difference of generated assembly of `HashMap::<u64, u64>::get`:

![image](https://user-images.githubusercontent.com/70888415/202892229-033e080e-beea-4d80-ab71-74091ad54423.png)

When using `#[inline]`, the last `self.bucket(index)` of `find` doesn't get optimized away. According to `@nbdd0121,` `#[inline(always)]` happens in MIR inlining, while `#[inline]` (likely) happens in LLVM inlining.

Loading branch information

bors committed Nov 20, 2022

2 parents 5d49276 + 7d6c2a8 commit c93ef8f

src/raw/mod.rs

            
                      Original file line number
                      Diff line number
                      Diff line change
                  
    @@ -1189,7 +1189,7 @@ impl<A: Allocator + Clone> RawTableInner<A> {
  
        /// Searches for an element in the table. This uses dynamic dispatch to reduce the amount of

        /// code generated, but it is eliminated by LLVM optimizations.

        #[inline]

        #[inline(always)]

        fn find_inner(&self, hash: u64, eq: &mut dyn FnMut(usize) -> bool) -> Option<usize> {

            let h2_hash = h2(hash);

            let mut probe_seq = self.probe_seq(hash);

0 comments on commit `c93ef8f`

Please sign in to comment.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Commit

There are no files selected for viewing

0 comments on commit `c93ef8f`

Commit

There are no files selected for viewing

0 comments on commit c93ef8f

0 comments on commit `c93ef8f`