Merge #35189
35189: exec: use go runtime hash functions in hashjoiner r=yuzefovich,jordanlewis a=asubiotto

The previous hash functions, derived from Java's hashCode implementations,
were good enough for deriving a hash that could be used for equality
(my understanding is that hashCode is primarily used for this), but the
distribution properties of these functions were unclear. Additionally,
hashing was not supported for some types.
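
To make the distribution concern concrete, here is a minimal sketch of a Java-style `31*h + x` fold. It is not the removed cockroach code: `javaStyleHash`, `distinctBuckets`, `stridedKeys`, and the power-of-two bucket count are all assumptions chosen for illustration. With keys spaced 1024 apart and 1024 buckets, every key lands in the same bucket.

```go
package main

import "fmt"

// javaStyleHash folds a tuple of integer columns the way Java's
// Arrays.hashCode does: start at 1, then h = 31*h + element.
// Hypothetical helper, not code from this commit.
func javaStyleHash(key []uint64) uint64 {
	h := uint64(1)
	for _, v := range key {
		h = 31*h + v
	}
	return h
}

// distinctBuckets counts how many of nBuckets slots receive at least one key.
func distinctBuckets(keys [][]uint64, nBuckets uint64) int {
	seen := make(map[uint64]struct{})
	for _, k := range keys {
		seen[javaStyleHash(k)%nBuckets] = struct{}{}
	}
	return len(seen)
}

// stridedKeys returns n single-column keys: 0, stride, 2*stride, ...
func stridedKeys(n, stride uint64) [][]uint64 {
	keys := make([][]uint64, 0, n)
	for i := uint64(0); i < n; i++ {
		keys = append(keys, []uint64{i * stride})
	}
	return keys
}

func main() {
	// 1000 keys that differ only above bit 10 all map to one bucket.
	fmt.Println(distinctBuckets(stridedKeys(1000, 1024), 1024)) // prints 1
}
```

A single-column key hashes to `31 + v`, so the low bits of the hash are just the low bits of the value shifted by a constant. A mixing hash such as the one added in this commit is meant to avoid that dependence.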

This commit copies Go's non-cryptographic hash functions used for maps
to support hashing for all types and get good distribution properties
with minimal performance impact. The hash algorithm used is derived
from:

https://github.com/Cyan4973/xxHash
https://github.com/google/cityhash

And was tested using https://github.com/aappleby/smhasher.

Release note: None

The performance impact is noticeable (these are int64 columns), which makes me happy about the hash joiner's performance with respect to theoretical limits. Profiles show more CPU usage at the hashing stage, as expected. I don't think there's a way to avoid this impact.
```
name                                                                   old time/op    new time/op    delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24         299µs ±25%     367µs ± 5%  +22.64%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24      18.1ms ± 1%    19.7ms ± 3%   +8.79%  (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24      266ms ± 2%     294ms ± 2%  +10.26%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24        385µs ± 5%     397µs ± 4%   +3.17%  (p=0.029 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24     27.3ms ± 2%    27.6ms ± 3%     ~     (p=0.218 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24     969ms ± 6%     988ms ± 6%     ~     (p=0.123 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24          360µs ± 2%     363µs ± 6%     ~     (p=0.400 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24       18.9ms ± 2%    20.7ms ± 3%   +9.31%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24       281ms ± 2%     310ms ± 1%  +10.26%  (p=0.000 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24         391µs ± 4%     402µs ± 6%   +2.79%  (p=0.043 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24      26.5ms ± 2%    27.3ms ± 1%   +2.99%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24      751ms ± 1%     781ms ± 4%   +3.94%  (p=0.000 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24          452µs ± 3%     485µs ± 1%   +7.14%  (p=0.000 n=10+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24       31.0ms ± 2%    33.6ms ± 5%   +8.14%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24       469ms ± 1%     506ms ± 4%   +7.97%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24         496µs ± 4%     519µs ± 2%   +4.67%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24      42.2ms ± 4%    44.9ms ± 4%   +6.24%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24      1.48s ± 5%     1.50s ± 2%     ~     (p=0.211 n=10+9)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24           466µs ± 4%     500µs ± 2%   +7.28%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24        31.4ms ± 2%    34.1ms ± 3%   +8.62%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24        485ms ± 3%     518ms ± 4%   +6.83%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24          516µs ± 4%     517µs ± 2%     ~     (p=0.684 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24       40.1ms ± 3%    42.1ms ± 3%   +5.01%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24       1.10s ± 4%     1.17s ± 5%   +6.12%  (p=0.000 n=10+10)

name                                                                   old speed      new speed      delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24       453MB/s ±29%   358MB/s ± 5%  -21.03%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24     926MB/s ± 1%   852MB/s ± 3%   -8.07%  (p=0.000 n=9+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24   1.01GB/s ± 2%  0.91GB/s ± 2%   -9.29%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24      341MB/s ± 5%   331MB/s ± 4%   -3.07%  (p=0.029 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24    615MB/s ± 2%   609MB/s ± 3%     ~     (p=0.218 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24   277MB/s ± 6%   272MB/s ± 6%     ~     (p=0.123 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24        365MB/s ± 2%   362MB/s ± 6%     ~     (p=0.400 n=9+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24      887MB/s ± 2%   812MB/s ± 3%   -8.51%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24     955MB/s ± 2%   866MB/s ± 1%   -9.32%  (p=0.000 n=10+9)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24       335MB/s ± 4%   326MB/s ± 6%   -2.67%  (p=0.043 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24     633MB/s ± 3%   614MB/s ± 1%   -2.93%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24    357MB/s ± 1%   344MB/s ± 4%   -3.76%  (p=0.000 n=9+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24        290MB/s ± 3%   270MB/s ± 1%   -6.70%  (p=0.000 n=10+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24      540MB/s ± 2%   500MB/s ± 5%   -7.46%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24     573MB/s ± 1%   531MB/s ± 4%   -7.31%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24       264MB/s ± 4%   253MB/s ± 2%   -4.46%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24     397MB/s ± 4%   374MB/s ± 4%   -5.87%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24    181MB/s ± 5%   179MB/s ± 4%     ~     (p=0.143 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24         281MB/s ± 4%   262MB/s ± 2%   -6.81%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24       534MB/s ± 2%   492MB/s ± 3%   -7.89%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24      554MB/s ± 3%   519MB/s ± 4%   -6.38%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24        254MB/s ± 5%   253MB/s ± 2%     ~     (p=0.684 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24      419MB/s ± 3%   399MB/s ± 3%   -4.76%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24     244MB/s ± 4%   229MB/s ± 3%   -6.25%  (p=0.000 n=10+9)

name                                                                   old alloc/op   new alloc/op   delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24         758kB ± 0%     758kB ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24      40.1MB ± 0%    40.1MB ± 0%     ~     (p=0.792 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24      598MB ± 0%     598MB ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24        779kB ± 0%     779kB ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24     42.5MB ± 0%    42.5MB ± 0%     ~     (p=0.240 n=8+10)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24     636MB ± 0%     636MB ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24          761kB ± 0%     761kB ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24       40.4MB ± 0%    40.4MB ± 0%     ~     (p=0.177 n=10+10)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24       602MB ± 0%     602MB ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24         782kB ± 0%     782kB ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24      42.7MB ± 0%    42.7MB ± 0%     ~     (p=0.619 n=8+8)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24      640MB ± 0%     640MB ± 0%     ~     (p=0.137 n=10+8)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24          758kB ± 0%     758kB ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24       40.1MB ± 0%    40.1MB ± 0%     ~     (p=1.000 n=9+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24       598MB ± 0%     598MB ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24         779kB ± 0%     779kB ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24      42.5MB ± 0%    42.5MB ± 0%     ~     (p=0.373 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24      636MB ± 0%     636MB ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24           761kB ± 0%     761kB ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24        40.4MB ± 0%    40.4MB ± 0%     ~     (p=0.529 n=8+9)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24        602MB ± 0%     602MB ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24          782kB ± 0%     782kB ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24       42.7MB ± 0%    42.7MB ± 0%   -0.00%  (p=0.013 n=10+8)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24       640MB ± 0%     640MB ± 0%     ~     (all equal)

name                                                                   old allocs/op  new allocs/op  delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-24          99.0 ± 0%      99.0 ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-24       1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-24      24.8k ± 0%     24.8k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=2048-24          101 ± 0%       101 ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=262144-24      1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=false/rows=4194304-24     24.8k ± 0%     24.8k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=2048-24            101 ± 0%       101 ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=262144-24        1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=true/rows=4194304-24       24.8k ± 0%     24.8k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=2048-24           103 ± 0%       103 ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=262144-24       1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=true/distinct=false/rows=4194304-24      24.8k ± 0%     24.8k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-24           99.0 ± 0%      99.0 ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-24        1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-24       24.8k ± 0%     24.8k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=2048-24           101 ± 0%       101 ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=262144-24       1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=false/rows=4194304-24      24.8k ± 0%     24.8k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=2048-24             101 ± 0%       101 ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=262144-24         1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=true/rows=4194304-24        24.8k ± 0%     24.8k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=2048-24            103 ± 0%       103 ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=262144-24        1.71k ± 0%     1.71k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=true/distinct=false/rows=4194304-24       24.8k ± 0%     24.8k ± 0%     ~     (all equal)
```

Co-authored-by: Alfonso Subiotto Marqués <[email protected]>
craig[bot] and asubiotto committed Mar 4, 2019
2 parents 8ad3b7f + f63890e commit af7425a
Showing 5 changed files with 219 additions and 29 deletions.
40 changes: 22 additions & 18 deletions pkg/sql/exec/execgen/cmd/execgen/overloads.go
@@ -245,10 +245,11 @@ type bytesCustomizer struct{}
 // variable-set semantics.
 type decimalCustomizer struct{}

-// float32Customizer and float64Customizer are necessary since float32 and
-// float64 require additional logic for hashing.
-type float32Customizer struct{}
-type float64Customizer struct{}
+// floatCustomizers are used for hash functions.
+type floatCustomizer struct{ width int }
+
+// intCustomizers are used for hash functions.
+type intCustomizer struct{ width int }

 func (boolCustomizer) getCmpOpAssignFunc() assignFunc {
 	return func(op overload, target, l, r string) string {
@@ -264,11 +265,11 @@ func (boolCustomizer) getCmpOpAssignFunc() assignFunc {
 func (boolCustomizer) getHashAssignFunc() assignFunc {
 	return func(op overload, target, v, _ string) string {
 		return fmt.Sprintf(`
-			x := uint64(0)
+			x := 0
 			if %[2]s {
 				x = 1
 			}
-			%[1]s = x
+			%[1]s = %[1]s*31 + uintptr(x)
 		`, target, v)
 	}
 }
@@ -289,11 +290,9 @@ func (bytesCustomizer) getCmpOpAssignFunc() assignFunc {
 func (bytesCustomizer) getHashAssignFunc() assignFunc {
 	return func(op overload, target, v, _ string) string {
 		return fmt.Sprintf(`
-			_temp := 1
-			for b := range %s {
-				_temp = _temp*31 + b
-			}
-			%s = uint64(hash)
+			sh := (*reflect.SliceHeader)(unsafe.Pointer(&%[1]s))
+			%[2]s = memhash(unsafe.Pointer(sh.Data), %[2]s, uintptr(len(%[1]s)))
 		`, v, target)
 	}
 }
@@ -319,20 +318,21 @@ func (decimalCustomizer) getHashAssignFunc() assignFunc {
 			if err != nil {
 				panic(fmt.Sprintf("%%v", err))
 			}
-			%[1]s = math.Float64bits(d)
+			%[1]s = f64hash(noescape(unsafe.Pointer(&d)), %[1]s)
 		`, target, v)
 	}
 }

-func (float32Customizer) getHashAssignFunc() assignFunc {
+func (c floatCustomizer) getHashAssignFunc() assignFunc {
 	return func(op overload, target, v, _ string) string {
-		return fmt.Sprintf("%s = uint64(math.Float32bits(%s))", target, v)
+		return fmt.Sprintf("%[1]s = f%[3]dhash(noescape(unsafe.Pointer(&%[2]s)), %[1]s)", target, v, c.width)
 	}
 }

-func (float64Customizer) getHashAssignFunc() assignFunc {
+func (c intCustomizer) getHashAssignFunc() assignFunc {
 	return func(op overload, target, v, _ string) string {
-		return fmt.Sprintf("%s = math.Float64bits(%s)", target, v)
+		return fmt.Sprintf("%[1]s = memhash%[3]d(noescape(unsafe.Pointer(&%[2]s)), %[1]s)", target, v, c.width)
 	}
 }

@@ -341,8 +341,12 @@ func registerTypeCustomizers() {
 	registerTypeCustomizer(types.Bool, boolCustomizer{})
 	registerTypeCustomizer(types.Bytes, bytesCustomizer{})
 	registerTypeCustomizer(types.Decimal, decimalCustomizer{})
-	registerTypeCustomizer(types.Float32, float32Customizer{})
-	registerTypeCustomizer(types.Float64, float64Customizer{})
+	registerTypeCustomizer(types.Float32, floatCustomizer{width: 32})
+	registerTypeCustomizer(types.Float64, floatCustomizer{width: 64})
+	registerTypeCustomizer(types.Int8, intCustomizer{width: 8})
+	registerTypeCustomizer(types.Int16, intCustomizer{width: 16})
+	registerTypeCustomizer(types.Int32, intCustomizer{width: 32})
+	registerTypeCustomizer(types.Int64, intCustomizer{width: 64})
 }

 // Avoid unused warning for Assign, which is only used in templates.
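
The new customizers rely on `fmt`'s explicit argument indexes (`%[1]s`, `%[2]s`, `%[3]d`) to reuse the hash variable and splice the bit width into a function name. The sketch below copies the two format strings from the diff and prints what they expand to. The wrapper `hashAssign` and the variable names `p` and `v` are assumptions for illustration.

```go
package main

import "fmt"

// Format strings copied from floatCustomizer and intCustomizer in the diff.
// %[1]s is the running hash variable, %[2]s the value, %[3]d the bit width.
const (
	floatFormat = "%[1]s = f%[3]dhash(noescape(unsafe.Pointer(&%[2]s)), %[1]s)"
	intFormat   = "%[1]s = memhash%[3]d(noescape(unsafe.Pointer(&%[2]s)), %[1]s)"
)

// hashAssign is a hypothetical stand-in for the closures returned by
// getHashAssignFunc: it renders the Go statement that execgen would emit.
func hashAssign(format, target, v string, width int) string {
	return fmt.Sprintf(format, target, v, width)
}

func main() {
	fmt.Println(hashAssign(floatFormat, "p", "v", 64))
	// p = f64hash(noescape(unsafe.Pointer(&v)), p)
	fmt.Println(hashAssign(intFormat, "p", "v", 32))
	// p = memhash32(noescape(unsafe.Pointer(&v)), p)
}
```

Because the target appears on both sides of the generated assignment, each column's hash is seeded with the hash accumulated so far.
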
183 changes: 183 additions & 0 deletions pkg/sql/exec/hash.go
@@ -0,0 +1,183 @@
+// Copyright 2014 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the golang.org/LICENSE file.
+
+// Hashing algorithm inspired by
+// xxhash: https://code.google.com/p/xxhash/
+// cityhash: https://code.google.com/p/cityhash/
+// Most of the code in this file is copied from the go runtime package. These
+// are the hash functions used for go maps.
+
+package exec
+
+import (
+	"math/rand"
+	"unsafe"
+)
+
+const (
+	ptrSize = 4 << (^uintptr(0) >> 63) // unsafe.Sizeof(uintptr(0)) but an ideal const
+	c0      = uintptr((8-ptrSize)/4*2860486313 + (ptrSize-4)/4*33054211828000289)
+	c1      = uintptr((8-ptrSize)/4*3267000013 + (ptrSize-4)/4*23344194077549503)
+	// Constants for multiplication: four random odd 64-bit numbers.
+	m1 = 16877499708836156737
+	m2 = 2820277070424839065
+	m3 = 9497967016996688599
+	m4 = 15839092249703872147
+)
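
The `ptrSize`, `c0`, and `c1` expressions pick a value per word size using only constant arithmetic. The sketch below restates that trick in isolation. `selectConst` and `matchesSizeof` are hypothetical helpers; the numeric constants are the `c0` operands from the diff.

```go
package main

import (
	"fmt"
	"unsafe"
)

// ptrSize is 4 on 32-bit platforms and 8 on 64-bit ones: shifting an
// all-ones uintptr right by 63 leaves 1 only when uintptr is 64 bits wide.
const ptrSize = 4 << (^uintptr(0) >> 63)

// selectConst reproduces the arithmetic behind c0 and c1 with the word size
// as a parameter: (8-size)/4 is 1 only for size 4, (size-4)/4 only for size 8.
func selectConst(size, c32, c64 uint64) uint64 {
	return (8-size)/4*c32 + (size-4)/4*c64
}

// matchesSizeof reports whether the constant agrees with unsafe.Sizeof.
func matchesSizeof() bool {
	return ptrSize == unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println(matchesSizeof())                                   // true
	fmt.Println(selectConst(4, 2860486313, 33054211828000289))     // 2860486313
	fmt.Println(selectConst(8, 2860486313, 33054211828000289))     // 33054211828000289
}
```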

+// hashKey is used to seed the hash function.
+var hashKey [4]uintptr
+
+func init() {
+	rand.Read((*[len(hashKey) * ptrSize]byte)(unsafe.Pointer(&hashKey))[:])
+	hashKey[0] |= 1 // make sure these numbers are odd
+	hashKey[1] |= 1
+	hashKey[2] |= 1
+	hashKey[3] |= 1
+}
+
+func readUnaligned32(p unsafe.Pointer) uint32 {
+	return *(*uint32)(p)
+}
+
+func readUnaligned64(p unsafe.Pointer) uint64 {
+	return *(*uint64)(p)
+}
+
+// Should be a built-in for unsafe.Pointer?
+//go:nosplit
+func add(p unsafe.Pointer, x uintptr) unsafe.Pointer {
+	return unsafe.Pointer(uintptr(p) + x)
+}
+
+//go:linkname noescape runtime.noescape
+func noescape(p unsafe.Pointer) unsafe.Pointer

+func memhash(p unsafe.Pointer, seed, s uintptr) uintptr {
+	h := uint64(seed + s*hashKey[0])
+tail:
+	switch {
+	case s == 0:
+	case s < 4:
+		h ^= uint64(*(*byte)(p))
+		h ^= uint64(*(*byte)(add(p, s>>1))) << 8
+		h ^= uint64(*(*byte)(add(p, s-1))) << 16
+		h = rotl31(h*m1) * m2
+	case s <= 8:
+		h ^= uint64(readUnaligned32(p))
+		h ^= uint64(readUnaligned32(add(p, s-4))) << 32
+		h = rotl31(h*m1) * m2
+	case s <= 16:
+		h ^= readUnaligned64(p)
+		h = rotl31(h*m1) * m2
+		h ^= readUnaligned64(add(p, s-8))
+		h = rotl31(h*m1) * m2
+	case s <= 32:
+		h ^= readUnaligned64(p)
+		h = rotl31(h*m1) * m2
+		h ^= readUnaligned64(add(p, 8))
+		h = rotl31(h*m1) * m2
+		h ^= readUnaligned64(add(p, s-16))
+		h = rotl31(h*m1) * m2
+		h ^= readUnaligned64(add(p, s-8))
+		h = rotl31(h*m1) * m2
+	default:
+		v1 := h
+		v2 := uint64(seed * hashKey[1])
+		v3 := uint64(seed * hashKey[2])
+		v4 := uint64(seed * hashKey[3])
+		for s >= 32 {
+			v1 ^= readUnaligned64(p)
+			v1 = rotl31(v1*m1) * m2
+			p = add(p, 8)
+			v2 ^= readUnaligned64(p)
+			v2 = rotl31(v2*m2) * m3
+			p = add(p, 8)
+			v3 ^= readUnaligned64(p)
+			v3 = rotl31(v3*m3) * m4
+			p = add(p, 8)
+			v4 ^= readUnaligned64(p)
+			v4 = rotl31(v4*m4) * m1
+			p = add(p, 8)
+			s -= 32
+		}
+		h = v1 ^ v2 ^ v3 ^ v4
+		goto tail
+	}
+
+	h ^= h >> 29
+	h *= m3
+	h ^= h >> 32
+	return uintptr(h)
+}
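
The `s < 4` branch of `memhash` reads three bytes at offsets `0`, `s>>1`, and `s-1`, which covers every byte of a 1 to 3 byte input without a loop. The sketch below only computes those offsets. `smallInputOffsets` is a hypothetical helper, not part of the commit.

```go
package main

import "fmt"

// smallInputOffsets returns the three byte offsets read by memhash's
// 1..3-byte branch: the first byte, the middle byte, and the last byte.
func smallInputOffsets(s uintptr) [3]uintptr {
	return [3]uintptr{0, s >> 1, s - 1}
}

func main() {
	for s := uintptr(1); s < 4; s++ {
		fmt.Println(s, smallInputOffsets(s))
	}
	// 1 [0 0 0]
	// 2 [0 1 1]
	// 3 [0 1 2]
}
```

The larger branches use the same idea with overlapping 4-byte and 8-byte reads anchored at the start and at the end of the input (for example `p` and `p+s-4` when `4 <= s <= 8`).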

+func memhash8(p unsafe.Pointer, h uintptr) uintptr {
+	return memhash(p, h, 1)
+}
+
+func memhash16(p unsafe.Pointer, h uintptr) uintptr {
+	return memhash(p, h, 2)
+}
+
+func memhash32(p unsafe.Pointer, seed uintptr) uintptr {
+	h := uint64(seed + 4*hashKey[0])
+	v := uint64(readUnaligned32(p))
+	h ^= v
+	h ^= v << 32
+	h = rotl31(h*m1) * m2
+	h ^= h >> 29
+	h *= m3
+	h ^= h >> 32
+	return uintptr(h)
+}
+
+func memhash64(p unsafe.Pointer, seed uintptr) uintptr {
+	h := uint64(seed + 8*hashKey[0])
+	h ^= uint64(readUnaligned32(p)) | uint64(readUnaligned32(add(p, 4)))<<32
+	h = rotl31(h*m1) * m2
+	h ^= h >> 29
+	h *= m3
+	h ^= h >> 32
+	return uintptr(h)
+}
+
+// Note: in order to get the compiler to issue rotl instructions, we
+// need to constant fold the shift amount by hand.
+// TODO: convince the compiler to issue rotl instructions after inlining.
+func rotl31(x uint64) uint64 {
+	return (x << 31) | (x >> (64 - 31))
+}
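
For readers unfamiliar with the hand-folded rotate, `rotl31` computes the same value as `bits.RotateLeft64(x, 31)` from the standard library's `math/bits` package. The check below is a sketch for illustration. `agreesWithStdlib` is a hypothetical helper, and the commit itself does not use `math/bits`.

```go
package main

import (
	"fmt"
	"math/bits"
)

// rotl31 is copied from the diff: rotate a 64-bit value left by 31 bits.
func rotl31(x uint64) uint64 {
	return (x << 31) | (x >> (64 - 31))
}

// agreesWithStdlib reports whether rotl31 matches bits.RotateLeft64 for x.
func agreesWithStdlib(x uint64) bool {
	return rotl31(x) == bits.RotateLeft64(x, 31)
}

func main() {
	fmt.Println(agreesWithStdlib(0x0123456789abcdef)) // true
	fmt.Println(rotl31(1) == 1<<31)                   // true
	fmt.Println(rotl31(1<<33) == 1)                   // true: bit 33 wraps around to bit 0
}
```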

+// NOTE: Because NaN != NaN, a map can contain any
+// number of (mostly useless) entries keyed with NaNs.
+// To avoid long hash chains, we assign a random number
+// as the hash value for a NaN.
+
+func f32hash(p unsafe.Pointer, h uintptr) uintptr {
+	f := *(*float32)(p)
+	switch {
+	case f == 0:
+		return c1 * (c0 ^ h) // +0, -0
+	case f != f:
+		// TODO(asubiotto): fastrand relies on some stack internals.
+		//return c1 * (c0 ^ h ^ uintptr(fastrand())) // any kind of NaN
+		return c1 * (c0 ^ h ^ uintptr(rand.Uint32()))
+	default:
+		return memhash(p, h, 4)
+	}
+}
+
+func f64hash(p unsafe.Pointer, h uintptr) uintptr {
+	f := *(*float64)(p)
+	switch {
+	case f == 0:
+		return c1 * (c0 ^ h) // +0, -0
+	case f != f:
+		// TODO(asubiotto): fastrand relies on some stack internals.
+		//return c1 * (c0 ^ h ^ uintptr(fastrand())) // any kind of NaN
+		return c1 * (c0 ^ h ^ uintptr(rand.Uint32())) // any kind of NaN
+	default:
+		return memhash(p, h, 8)
+	}
+}
8 changes: 3 additions & 5 deletions pkg/sql/exec/hashjoiner.go
@@ -446,11 +446,9 @@ func (ht *hashTable) loadBatch(batch ColBatch) {
 // key is a tuple of various types, rehash is used to apply a transformation on
 // the resulting hash value based on an element of the key of a specified type.
 //
-// The current integer tuple hashing heuristic is based off of Java's
-// Arrays.hashCode(int[]) and only supports int8, int16, int32, and int64
-// elements. float32 and float64 are hashed according to their respective 32-bit
-// and 64-bit integer representation. bool keys are hashed as a 1 for true and 0
-// for false. bytes are hashed as an array of int8 integers.
+// We currently use the same hash functions used by go's maps.
+// TODO(asubiotto): Once https://go-review.googlesource.com/c/go/+/155118/ is
+// in, we should use the public API.
 //
 // initHash initializes the hash value of each key to its initial state for
 // rehashing purposes.
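
The TODO above anticipates a public hashing API. As a hedged sketch only: the standard library has shipped a seeded, non-cryptographic hash in `hash/maphash` since Go 1.14. Whether that package is what the linked change became is an assumption here, and `hashBytes` and `sameSeedSameHash` are hypothetical helpers rather than code from this repository.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// hashBytes hashes b under the given seed with hash/maphash.
func hashBytes(seed maphash.Seed, b []byte) uint64 {
	var h maphash.Hash
	h.SetSeed(seed)
	h.Write(b) // maphash.Hash.Write never returns an error
	return h.Sum64()
}

// sameSeedSameHash checks that hashing is deterministic for a fixed seed,
// which is what a hash join needs within a single process.
func sameSeedSameHash(b []byte) bool {
	seed := maphash.MakeSeed()
	return hashBytes(seed, b) == hashBytes(seed, b)
}

func main() {
	fmt.Println(sameSeedSameHash([]byte("join key"))) // true
}
```

Seeds from `maphash.MakeSeed` are random per call, so hash values are only comparable when they share a seed, much like the random `hashKey` in `hash.go`.
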
16 changes: 10 additions & 6 deletions pkg/sql/exec/hashjoiner_tmpl.go
@@ -26,7 +26,8 @@ package exec
 import (
 	"bytes"
 	"fmt"
-	"math"
+	"reflect"
+	"unsafe"

 	"github.com/cockroachdb/cockroach/pkg/sql/exec/types"
 	"github.com/cockroachdb/cockroach/pkg/sql/sem/tree"
@@ -37,8 +38,11 @@ import (
 // Dummy import to pull in "tree" package.
 var _ tree.Datum

-// Dummy import to pull in "math" package
-var _ = math.Pi
+// Dummy import to pull in "unsafe" package
+var _ unsafe.Pointer
+
+// Dummy import to pull in "reflect" package
+var _ reflect.SliceHeader

 // Dummy import to pull in "bytes" package.
 var _ bytes.Buffer
@@ -140,9 +144,9 @@ func _REHASH_BODY(buckets []uint64, keys []interface{}, nKeys uint64, _SEL_STRIN
 	// {{define "rehashBody"}}
 	for i := uint64(0); i < nKeys; i++ {
 		v := keys[_SEL_IND]
-		var hash uint64
-		_ASSIGN_HASH(hash, v)
-		buckets[i] = buckets[i]*31 + hash
+		p := uintptr(buckets[i])
+		_ASSIGN_HASH(p, v)
+		buckets[i] = uint64(p)
 	}
 	// {{end}}
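
The new `rehashBody` feeds each row's running hash back in as the seed for the next column instead of combining with `*31`. Below is a self-contained sketch of that column-at-a-time chaining, assuming int64 columns. `mix64` follows the shape of `memhash64` from this commit but uses a fixed key (`fixedKey`, an assumption) in place of the random `hashKey`, and `rehash` and `hashRows` are hypothetical names.

```go
package main

import "fmt"

// Multiplication constants copied from hash.go in this commit.
const (
	m1 = 16877499708836156737
	m2 = 2820277070424839065
	m3 = 9497967016996688599
)

// fixedKey stands in for the random hashKey[0]; any odd value works here.
var fixedKey uint64 = 0x9e3779b97f4a7c15

func rotl31(x uint64) uint64 { return (x << 31) | (x >> (64 - 31)) }

// mix64 folds one 64-bit value into a running hash, seeded by that hash.
func mix64(v, seed uint64) uint64 {
	h := seed + 8*fixedKey
	h ^= v
	h = rotl31(h*m1) * m2
	h ^= h >> 29
	h *= m3
	h ^= h >> 32
	return h
}

// rehash folds one key column into the per-row running hashes.
func rehash(buckets []uint64, col []int64) {
	for i, v := range col {
		buckets[i] = mix64(uint64(v), buckets[i])
	}
}

// hashRows hashes row tuples stored column-wise, one column at a time.
func hashRows(cols ...[]int64) []uint64 {
	if len(cols) == 0 {
		return nil
	}
	buckets := make([]uint64, len(cols[0]))
	for _, col := range cols {
		rehash(buckets, col)
	}
	return buckets
}

func main() {
	h := hashRows([]int64{1, 2, 1}, []int64{7, 7, 7})
	fmt.Println(h[0] == h[2]) // true: rows 0 and 2 hold the same tuple (1, 7)
	fmt.Println(h[0] == h[1]) // false: every step of mix64 is invertible
}
```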

1 change: 1 addition & 0 deletions pkg/sql/exec/noescape.s
@@ -0,0 +1 @@
+// Empty assembly file to allow go:linkname to work.
