Use slices package for most sorting #27474

Closed

Conversation

danlaine

This PR changes many, but not all, uses of sorting functions from the sort package to use the slices package instead.

sort.Sort relies on sort.Interface, which often requires implementing structs whose sole purpose is to satisfy this interface. This change allows for the removal of several such structs (e.g. type authors []string in build/update-license.go).
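For illustration, here is a minimal sketch of the boilerplate this removes. The authors wrapper is modeled on the one in build/update-license.go; the import path was golang.org/x/exp/slices when this PR was opened, and is plain slices in the standard library from Go 1.21 onward.

package main

import (
	"fmt"
	"sort"

	"golang.org/x/exp/slices" // "slices" in the standard library from Go 1.21 onward
)

// Before: a type whose sole purpose is to satisfy sort.Interface.
type authors []string

func (a authors) Len() int           { return len(a) }
func (a authors) Less(i, j int) bool { return a[i] < a[j] }
func (a authors) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }

func main() {
	names := []string{"carol", "alice", "bob"}

	// Before: wrap the slice just to sort it.
	sort.Sort(authors(names))

	// After: sort the slice directly, no wrapper type needed.
	slices.Sort(names)

	fmt.Println(names) // [alice bob carol]
}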

sort.Slice uses reflection internally, which is slow. Using slices.Sort is faster. This benchmark:

// Imports assumed by this snippet: crypto/rand for the random input (math/rand
// has the same Read signature) and, at the time of this PR,
// golang.org/x/exp/slices for the generic sort ("slices" in the standard
// library from Go 1.21 onward).
import (
	"crypto/rand"
	"fmt"
	"sort"
	"testing"

	"golang.org/x/exp/slices"
)

func BenchmarkSliceSort(b *testing.B) {
	sizes := []int{8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192}

	for _, size := range sizes {
		b.Run(
			fmt.Sprintf("slices.Sort size=%d", size),
			func(b *testing.B) {
				elts := make([]byte, size)
				for i := 0; i < b.N; i++ {
					_, _ = rand.Read(elts)
					slices.Sort(elts)
				}
			},
		)
	}

	for _, size := range sizes {
		b.Run(
			fmt.Sprintf("sort.Slice size=%d", size),
			func(b *testing.B) {
				elts := make([]byte, size)
				for i := 0; i < b.N; i++ {
					_, _ = rand.Read(elts)
					sort.Slice(elts, func(i, j int) bool {
						return elts[i] < elts[j]
					})
				}
			},
		)
	}
}

gives these results on my machine:

Running tool: /usr/local/go/bin/go test -benchmem -run=^$ -bench ^BenchmarkSliceSort$ github.com/ava-labs/avalanchego/utils

goos: linux
goarch: amd64
pkg: github.com/ava-labs/avalanchego/utils
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
BenchmarkSliceSort/slices.Sort_size=8-16         	11840335	       103.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=16-16        	 4276635	       279.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=32-16        	 1684501	       767.7 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=64-16        	  723175	      1658 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=128-16       	  309732	      3844 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=256-16       	  135510	      8675 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=512-16       	   62653	     19329 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=1024-16      	   29211	     41380 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=2048-16      	   15036	     79901 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=4096-16      	    7947	    150979 ns/op	       0 B/op	       0 allocs/op
BenchmarkSliceSort/slices.Sort_size=8192-16      	    4092	    287255 ns/op	       2 B/op	       0 allocs/op
BenchmarkSliceSort/sort.Slice_size=8-16          	 4451560	       267.9 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=16-16         	 1977598	       607.6 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=32-16         	  867464	      1382 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=64-16         	  373876	      3226 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=128-16        	  158400	      7455 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=256-16        	   71409	     16632 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=512-16        	   32967	     37212 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=1024-16       	   15610	     77329 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=2048-16       	    7797	    152502 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=4096-16       	    4143	    293553 ns/op	      56 B/op	       2 allocs/op
BenchmarkSliceSort/sort.Slice_size=8192-16       	    2154	    561113 ns/op	      59 B/op	       2 allocs/op

I didn't replace all usage of sort.Sort. Namely, I left it in places where:

  1. Doing so would've changed an exported type in a non-test file.
  2. The struct implementing sort.Interface also implements heap.Interface (so it can't be removed anyway).
  3. Multiple slices are being sorted at once (e.g. capacitySort in eth/protocols/snap/sync.go; see the sketch after this list).
  4. Anywhere else I just missed :)
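To illustrate case 3, here is a hypothetical sketch of the lockstep-sorting pattern (the type and fields below are made up; the real capacitySort differs): when two parallel slices must be reordered together, a sort.Interface implementation with a custom Swap remains the natural tool, so those call sites keep sort.Sort.

package main

import (
	"fmt"
	"sort"
)

// lockstepSort reorders two parallel slices together, keyed on caps.
type lockstepSort struct {
	ids  []string
	caps []float64
}

func (s lockstepSort) Len() int           { return len(s.ids) }
func (s lockstepSort) Less(i, j int) bool { return s.caps[i] < s.caps[j] }
func (s lockstepSort) Swap(i, j int) {
	s.ids[i], s.ids[j] = s.ids[j], s.ids[i]
	s.caps[i], s.caps[j] = s.caps[j], s.caps[i]
}

func main() {
	s := lockstepSort{
		ids:  []string{"a", "b", "c"},
		caps: []float64{3, 1, 2},
	}
	sort.Sort(s)
	fmt.Println(s.ids, s.caps) // [b c a] [1 2 3]
}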

Dan Laine added 30 commits June 14, 2023 10:49
@@ -315,7 +308,9 @@ func (s *Snapshot) signers() []common.Address {
 	for sig := range s.Signers {
 		sigs = append(sigs, sig)
 	}
-	sort.Sort(signersAscending(sigs))
+	slices.SortFunc(sigs, func(a, b common.Address) bool {
Author

I think you could reasonably define a Less function on common.Address. I decided not to do so in this PR, but it's something for the maintainers to consider.
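A hypothetical sketch of what such a helper could look like (common.Address is a 20-byte array in go-ethereum; the Less method below is not part of the actual package):

package common

import "bytes"

// Address mirrors go-ethereum's common.Address (a 20-byte account address).
type Address [20]byte

// Less is a hypothetical helper, not part of the real common package. With it,
// call sites could sort with slices.SortFunc(addrs, Address.Less) instead of
// spelling out the comparison inline.
func (a Address) Less(other Address) bool {
	return bytes.Compare(a[:], other[:]) < 0
}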


-// Swap swaps the elements with indexes i and j.
-func (hs hashes) Swap(i, j int) { hs[i], hs[j] = hs[j], hs[i] }
+func hashesLess(a, b common.Hash) bool {
Author

As with common.Address, the maintainers might consider whether it would be reasonable to implement Less on common.Hash.

@danlaine marked this pull request as ready for review June 14, 2023 19:48
@rjl493456442
Member

The numbers are pretty impressive! The PR is very large though, touching lots of files. We will discuss it in our PR triage.

@holiman
Contributor

holiman commented Jun 16, 2023

Another variant of the benchmark:

func BenchmarkSliceSort(b *testing.B) {
	sizes := []int{8 /*16, 32, */, 64 /*128, 256, */, 512 /*1024, 2048, */, 4096, 8192}

	for _, size := range sizes {
		src := make([]byte, size)
		rand.Read(src)
		data := make([]byte, size)
		b.Run(fmt.Sprintf("slices.Sort size=%d", size),
			func(b *testing.B) {
				for i := 0; i < b.N; i++ {
					copy(data, src)
					slices.Sort(data)
				}
			},
		)
		b.Run(fmt.Sprintf("slices.SortFunc size=%d", size),
			func(b *testing.B) {
				for i := 0; i < b.N; i++ {
					copy(data, src)
					//slices.Sort(data)
					slices.SortFunc(data, func(a, b byte) bool {
						return a < b
					})
				}
			},
		)
		b.Run(fmt.Sprintf("sort.Slice size=%d", size),
			func(b *testing.B) {
				for i := 0; i < b.N; i++ {
					copy(data, src)
					sort.Slice(data, func(i, j int) bool {
						return data[i] < data[j]
					})
				}
			},
		)
	}
}

Results

BenchmarkSliceSort/slices.Sort_size=8
BenchmarkSliceSort/slices.Sort_size=8-8         	14945247	        74.58 ns/op
BenchmarkSliceSort/slices.SortFunc_size=8
BenchmarkSliceSort/slices.SortFunc_size=8-8     	12745206	        95.24 ns/op
BenchmarkSliceSort/sort.Slice_size=8
BenchmarkSliceSort/sort.Slice_size=8-8          	 3224732	       414.2 ns/op
BenchmarkSliceSort/slices.Sort_size=64
BenchmarkSliceSort/slices.Sort_size=64-8        	 1298708	      1303 ns/op
BenchmarkSliceSort/slices.SortFunc_size=64
BenchmarkSliceSort/slices.SortFunc_size=64-8    	  468710	      2975 ns/op
BenchmarkSliceSort/sort.Slice_size=64
BenchmarkSliceSort/sort.Slice_size=64-8         	  349065	      2895 ns/op
BenchmarkSliceSort/slices.Sort_size=512
BenchmarkSliceSort/slices.Sort_size=512-8       	  143390	      8908 ns/op
BenchmarkSliceSort/slices.SortFunc_size=512
BenchmarkSliceSort/slices.SortFunc_size=512-8   	   29079	     42405 ns/op
BenchmarkSliceSort/sort.Slice_size=512
BenchmarkSliceSort/sort.Slice_size=512-8        	   51410	     50609 ns/op
BenchmarkSliceSort/slices.Sort_size=4096
BenchmarkSliceSort/slices.Sort_size=4096-8      	    6682	    181700 ns/op
BenchmarkSliceSort/slices.SortFunc_size=4096
BenchmarkSliceSort/slices.SortFunc_size=4096-8  	    3901	    556160 ns/op
BenchmarkSliceSort/sort.Slice_size=4096
BenchmarkSliceSort/sort.Slice_size=4096-8       	    2316	    571302 ns/op
BenchmarkSliceSort/slices.Sort_size=8192
BenchmarkSliceSort/slices.Sort_size=8192-8      	    2896	    506532 ns/op
BenchmarkSliceSort/slices.SortFunc_size=8192
BenchmarkSliceSort/slices.SortFunc_size=8192-8  	    1839	    889690 ns/op
BenchmarkSliceSort/sort.Slice_size=8192
BenchmarkSliceSort/sort.Slice_size=8192-8       	    1056	   1085926 ns/op

This is worth pursuing, but perhaps it would be better to bite off smaller chunks. We've done this before on larger PRs that touch everything (e.g. when converting from atomic.AddUint64 to the typed atomic.Uint64): we split the work into reasonable chunks, one or a few packages per PR.
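For context, the earlier migration mentioned above was a similarly mechanical change; a minimal sketch of its before and after (illustrative only):

package main

import (
	"fmt"
	"sync/atomic"
)

func main() {
	// Before: a bare uint64 updated through package-level atomic functions.
	var oldCounter uint64
	atomic.AddUint64(&oldCounter, 1)

	// After (Go 1.19+): the typed atomic.Uint64 wrapper.
	var newCounter atomic.Uint64
	newCounter.Add(1)

	fmt.Println(atomic.LoadUint64(&oldCounter), newCounter.Load()) // 1 1
}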

@danlaine
Author

@holiman Sure, I'll close this PR and open a bunch of smaller ones that only touch one or two packages.
