Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exp/orderbook: Improve performance of path finding implementation #3818

Merged
merged 1 commit into from
Aug 12, 2021

Conversation

tamirms
Copy link
Contributor

@tamirms tamirms commented Aug 11, 2021

I have implemented 3 changes which improve the performance of the path finding implementation:

  1. remove visited map[string]bool and instead scan through the visited []xdr.Asset slice to determine if we have visited a node at some point in the current path of the DFS
  2. abort the dfs if we have already visited all the terminal nodes
  3. abort the dfs if we're at the max path limit before processing all outgoing edges

The first optimization takes advantage of the fact that doing a linear scan over a small list is faster than a map lookup. Since we're limiting the depth of our DFS to at most 5, the visited list will be small enough that iterating over the list is faster than a map lookup.

The second optimization is effective when we are processing path finding requests with just 1 or 2 destination assets. Most of the path finding traffic fits this case. When we have a few destination assets there is no point in continuing the DFS after we have visited the destination assets.

The third optimization is the most impactful one. Assume we have a max path length of 3, this means our path should have no more than three edges, src -> asset1 -> asset2 -> dest. The dfs function was enforcing the max path length by checking that the length of the visited list was not greater than maxPathLength. This check was done at the very beginning of the function.

I realized that we should move this check to a later point in the code. Specifically, we should first check if we are at a terminal node and then we can check if the maxPathLength condition. By doing that we eliminate the case where we are at the end of a maxPathLength path. We do not need to traverse the node's outgoing edges because the path is going to exceed maxPathLength.

Here are the benchmarks for these changes:

goos: darwin
goarch: amd64
pkg: github.com/stellar/go/exp/orderbook
cpu: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
BenchmarkMultipleDestinationAssets
BenchmarkMultipleDestinationAssets-12    	     260	   4426879 ns/op
BenchmarkTestData
BenchmarkTestData-12                     	       9	 119356122 ns/op
PASS

As a basis of comparison, here are the benchmarks for the code without the changes:

goos: darwin
goarch: amd64
pkg: github.com/stellar/go/exp/orderbook
cpu: Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
BenchmarkMultipleDestinationAssets
BenchmarkMultipleDestinationAssets-12    	      21	  55507880 ns/op
BenchmarkTestData
BenchmarkTestData-12                     	       1	2824609748 ns/op
PASS

There is room for improvement but the other ideas I had would affect the accuracy of the results. The nice thing about these improvements is that the output of the path finding implementation remains identical. Once the orderbook becomes sufficiently large that the path finding implementation becomes too slow even with these improvements, we can start looking at changes which sacrifice accuracy for efficiency.

@tamirms tamirms requested a review from a team August 11, 2021 12:59
@ire-and-curses
Copy link
Member

Wow - that's a factor 12 speedup! 🎉

@2opremio
Copy link
Contributor

Awesome!!

@tamirms tamirms merged commit 65cd406 into stellar:master Aug 12, 2021
@tamirms tamirms deleted the pathfinding-opt branch August 12, 2021 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants