Skip to content

Improving Traveler: Performance

bonzaiferroni edited this page Jun 6, 2017 · 1 revision

The basic algorithm that traveler uses hasn't changed much, but I spent some time investigating small ways that I could improve performance. Much thanks to all the people in #help who put up with my questions/tests. It is best to use the averages in the following graphs as just a rule of thumb, I discovered that there were order effects and there were also small things that might skew the results (like whether one creep gets "stuck" at the end because the other stopped in front of it).

I will follow the same battery of tests that I used in my examination of travelTo vs. moveTo;

Short distances: Path length ~20

Short distances

This gif shows the new visualization that happens when a traveler finds a new path. This is really helpful for debugging and adds pretty much no overhead.

Short distances profile

We don't see much difference here, the pathfinding was fairly quick for both versions and path-following ticks were pretty close. The spike at the end shouldn't be counted against oldTraveler, it was because the newTraveler creep had stopped in front of it after reaching the destination, after the "stuck timer" triggered it found a new path which brought it to the destination.

Medium distances: Path length ~50

Medium distances

Medium distances performance

Oddly satisfying how they seem to fit each other like a glove, really. Had to double check that I really was using two different instances (here's a full gist of the testing code show instantiation).

Moving through traffic

traffic animation

I thought this would be a handy test, one of the potential weak points of Traveler is that it doesn't account for other creeps unless a creep determines that it is "stuck". If creeps need to repath often or they lose a lot of ticks waiting to move, that is a problem.

Because watching creeps move in real time can be like watching paint dry, this is a history view. The newTraveler creep is the one selected.

traffic performance

One thing that concerned me at first, you don't see the initial pathfinding bump in cost. This is because my profiler was averaging the performance over a span of 5 ticks. Since the creeps were idle for the ticks preceding the pathfinding call, this brought the average down. Unfortunately this is a necessary evil; my grafana app cannot check the status every tick. We would potentially lose information if we didn't do some kind of averaging. It should just be noted that the initial pathfinding costs could be mitigated due to this.

Pretty similar performance. Again, the last bump shouldn't be counted against oldTraveler, he was just the last one to arrive and had to do a stuck-repath. These creeps-included pathfinding calls tend to cost a lot because building the matrix is more intense in rooms with a decent number of creeps, and the cost of using the matrix is also higher. Overall, both performed well and both versions do a decent job of avoiding getting stuck as long as the roads are free of idle creeps.

Very long distances: ~1200

very long map

To make sure the long distance pathing still worked as expected, I had them both go on another epic journey.

Very long performance

Both creeps were able to make the journey in one piece! There appears to be a much higher initial spike for newTraveler, but then it paths less often as well.

A performance close-up

I discovered something unexpected in my testing, there were order effects that influenced performance. In all the previous tests, I called newTraveler's function first, and then oldTraveler's. Here is what you see when you zoom in:

order effects 1

newTraveler seemed to be consistently worse than oldTraveler. I spent a long time trying to figure out why this was the case, because I had made a few "optimizations" that should have given it slightly less overhead.

When I switched the order, I saw newTraveler now beat oldTraveler by quite a bit. After randomizing the order that the tests were performed, I saw this.

order effects 2

Much better! Overall, I found that on ticks where no pathfinding is needed, newTraveler was about .01 cpu faster, just over 5%. That may not seem like much, but with ~500 creeps moving each tick that comes to ~5 cpu!

Pathfinding performance seemed to favor newTraveler as well, but more systematic testing should be done to verify that.