Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nerdsniped - GC2023 #2

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

dancantos
Copy link

@dancantos dancantos commented Nov 10, 2023

Attempting an improvement on the (use of the) Trigram function.

We do this by moving the slice allocation outside the trigram function. Trigrams is only used by Tokenize, and that function can actually compute the result slice size while cleaning up the input. The trigram function now takes the result slice and a location to start populating the output.

Since this inherently changes the function signature, we unexport it, and export the Trigram method as a wrapper for this unexported function that implements the old function signature. Since the exported method must behave as per the old code, it still needs to do its own memory alloc, and as a result does not see much improvement. Most of the improvement here is seen in the Tokenize method.

Benchmarks on Tokenize (on my machine) show ~1.5x improvement

# NEW
goos: darwin
goarch: arm64
pkg: indexer
BenchmarkTokenize-10    	  139582	      7814 ns/op	    8760 B/op	     243 allocs/op
PASS
ok  	indexer	1.287s

# OLD
goos: darwin
goarch: arm64
pkg: indexer
BenchmarkTokenize-10    	  101376	     11273 ns/op	   21608 B/op	     312 allocs/op
PASS
ok  	indexer	1.369s

UPDATE: credit to @Merovius for this idea
Second optimization: remove the alloc for a []rune slice in the trigram function by walking the string with 3 variables to keep track of a start, middle, and end index, using the utf8 package DecodeRune function to find an appropriate increment each iteration without the need to allocate the rune slice.

This brings the final time down to ~2.6us per call to Tokenize, a ~4.2x improvment

goos: darwin
goarch: arm64
pkg: indexer
BenchmarkTokenize-10    	  458174	      2656 ns/op	    6848 B/op	       4 allocs/op
PASS
ok  	indexer	2.477s

improvments inherently involve a change in function signature to
allow the calling function to pass in the result slice and start
location as input parameters. This allows the caller to do all the
memory allocation.

This helps us as the Tokenize method can pre-compute the final
slice size before doing a single large allocation.

The new function is unexported, while the main Trigram method
remains exported to preserve the func signature. However, since
this method must still allocate its own memory, there is no
significant improvement here. What IS improved greater is the
way Tokenize uses the unexported trigram function.

There is also a removal of an unecessary if branch.
@boyter
Copy link
Owner

boyter commented Nov 12, 2023

Sweet. Im going to do a benchmark of all of the PR's against my best effort and turn it into a blog post, and then merge the winner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants