-
Notifications
You must be signed in to change notification settings - Fork 38
Profiling Code
Official doc with information about how to interpret the output: https://github.com/gperftools/gperftools/blob/master/docs/cpuprofile.html
Install google performance tools:
Ubuntu
sudo apt-get install google-perftools libgoogle-perftools-dev graphviz
OSX Newer (2018) instructions can be found here.
brew install google-perftools graphviz
Then when you run your code, you need to define CPUPROFILE and LD_PRELOAD
Ubuntu
CPUPROFILE=/tmp/my_executable.prof LD_PRELOAD=/usr/lib/libprofiler.so.0 ./my_executable
OSX
CPUPROFILE=/tmp/my_executable.prof DYLD_INSERT_LIBRARIES=/usr/local/Cellar/google-perftools/2.1/lib/libprofiler.dylib ./my_executable
Ubuntu 18.04
In Ubuntu 18.04 libprofiler.so
is locate at:
/usr/lib/x86_64-linux-gnu/libprofiler.so
Finally, generate the call-graph pdf:
Ubuntu
google-pprof --pdf my_executable /tmp/my_executable.prof > my_executable_profiling.pdf
OSX
pprof --pdf my_executable /tmp/my_executable.prof > my_executable_profiling.pdf
The call graph then looks like:
CPU Profiling with ROS on Ubuntu
Profiling with as roslaunch is easy, since all subprocesses get captured automatically.
First, run roslaunch with the profiler:
env CPUPROFILE=/tmp/my_executable.prof LD_PRELOAD=/usr/lib/libprofiler.so.0 roslaunch dvs_tracking slam_ros.launch
Second, find out the PID of the node you want to profile. It is printed with every LOG(INFO) command.
Finally, generate the PDF call graph (replace PATH_TO_EXECUTABLE and PID):
google-pprof --pdf /home/odroid/catkin_ws/devel/lib/PATH_TO_EXECUTABLE /tmp/my_executable.prof_PID > profile.pdf
Warning: Your node has to shut down properly in order to write the profile to disk. If it doesn't, the profile could be empty.
Other useful variables:
CPUPROFILE_FREQUENCY=x
where the default x = 100 (value is in Hz).
// Pull in the sm_timing definitions
#include <sm_timing/Timer.h>
//Enable timing.
typedef sm::timing::Timer Timer;
//Disable timing by switching the typedef.
typedef sm::timing::DummyTimer Timer;
After adding this definition you can now time specific code segments by adding a timer call:
Timer expensive_operation_timer("ExpensiveOperation");
DoExpensiveOperation();
expensive_operation_timer.Stop();
When you want to print out the timing information you call:
sm::timing::Timing::Print(std::cout);
//or similarly
LOG(INFO) << sm::timing::Timing::Print();
You will get a table containing the following information:
SM Timing
---------
//a b c d e f g
ExpensiveOperation 5 00:00:09:00.12 (00:00:08:56.67 +- 00:00:00.15) [00:00:08.00 00:00:09.000]
Where the columns denote:
a) Name of the timer.
b) Number of calls.
c) Overall time of all calls.
d) Mean time of a call.
e) Std-dev of the times.
f) Min time of a call.
g) Max time of a call.
Then when you run your code, you need to define CPUPROFILE and LD_PRELOAD
Ubuntu
HEAPPROFILE=/tmp/mybin.hprof LD_PRELOAD=/usr/lib/libtcmalloc.so.4 ./my_executable
OSX
HEAPPROFILE=/tmp/mybin.hprof DYLD_INSERT_LIBRARIES=/usr/local/Cellar/google-perftools/2.1/lib/libtcmalloc.dylib ./my_executable
Finally, generate the call-graph pdf:
Ubuntu
google-pprof --pdf my_executable /tmp/profile.0001.heap > my_executable_heap_profiling.pdf
OSX
pprof --pdf my_executable /tmp/profile.0001.heap > my_executable_heap_profiling.pdf
Depending on your settings you will get many *.heap files that are created during the lifetime of your application.