- https://ebpf.io/: Sandboxed in-kernel program technology that makes it easier to build profiling utilities. Many ready-made tools with Python front ends are available in https://github.com/iovisor/bcc
- DTrace: Available on Solaris and macOS, but not on Ubuntu. Notable companion tools are `prstat` and `mpstat`. `prstat` is not available on Ubuntu, but it can be replicated with `htop` and `ps`
- collectl: Full system level profiling including CPU, disk, memory and network
- perf: CPU level performance counters
- gprof: sampling and instrumentation aware profiling
- google perf tools
- Heaptrack: a heap memory profiler for Linux
- jemalloc: a memory allocator with built-in heap profiling
- ETW: Event Tracing for Windows
- macOS Instruments: profiling tool built on top of DTrace
- Renderdoc: Multi platform graphics debugger for OpenGL and Vulkan
- Windows Perf Analyzer: Like `htop` if it could plot lines. Windows only, but recently added support for Android
- htop: Visualize utilization as bar charts or line charts, issue commands to processes
- Magic Trace: High resolution programmable traces
- pprof: pprof is a tool for visualization and analysis of profiling data
- parca: Continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability
- parca-agent: eBPF-based always-on profiler auto-discovering targets in Kubernetes and systemd, zero code changes or restarts needed! Supports multiple languages: C/C++, Rust, Go, Python, Ruby, Java, etc.
- viztracer
- psutil: Like `htop` but from within your Python code
- pyinstrument: Python call stack visualizer
- pycallgraph: Visualize call stack as a graph (Maintenance mode)
- py-spy: Sampling profiler for Python
- line profiler: Line by line profiling
- palanteer: Fanciest UI, looks like something out of the matrix
- yappi: multi threaded profiling
- PyCharm profiler: Built-in profiler in PyCharm
- TAU
- gprof2dot: Graphical call stack visualizer (Maintenance mode)
- snakeviz: Visualize Python cProfile data
- scalene: CPU and GPU based profiling with a web GUI
- pprofile: Very low overhead line profiler
- austin-python: Line-level very low overhead time & memory profiler with web & terminal UI
- py-perf: A low-overhead sampling CPU profiler for Python implemented using eBPF
- rbspy: Sampling CPU profiler for Ruby
- rbperf: Low-overhead sampling profiler and tracer for Ruby implemented in BPF
- vernier: Next generation CRuby profiler
- JProfiler: Java profiler for CPU and multithreading, with a graphical call stack visualizer
- Java visual VM: Bundled with JDK
- Unity profiler: profiling tools specific for game development
- Tracy: Windows only but very comprehensive and helpful for game development
- Callgrind: Valgrind extension
- Chrome profiler: Support for throttling and emulating weak hardware
- PyTorch profiler: Visual profiles of computations and data loading for PyTorch models, requires changes to code
- PyTorch memory profiler: Can help debug OOMs and memory spikes
- ARM profiling: ARM specific profiling tools, heavyweight UI
- Intel VTune
- Intel GPA: Intel Graphics performance analyzer
- pynvml: Like `nvidia-smi` for your code with deeper level instrumentation
- NVIDIA visual profiler
- NVIDIA tools
- GPU View: Windows specific GPU profiling
- ROC profiler: AMD ROCm profiler
- Omniperf: AMD profiler for MI100 and MI200 accelerators
- NVIDIA NCU: Infinitely more useful than NVIDIA's nsys, does a godbolt style view on PTX and gives actionable performance hints
- Flame Graphs: flame graphs vs flame charts, off-CPU profiling, icicle charts and more
- How to read icicle and flame graphs: Flame graphs and icicle graphs are a great way to visualize performance profiles. In this post, we will learn how to read and interpret them.
- Sampling vs Tracing: Sampling-based profilers are easier to use since they don't require any code changes, while instrumentation-based profilers require code changes but are generally more informative
- C++ performance tools: reddit post with tons of links
- pyreverse: Get Python classes and then visualize with `graphviz`
- pdb: Use step-in or line-by-line stepping to understand how your code works
- IntelliJ UML Class diagrams
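
To make the instrumentation side of the "Sampling vs Tracing" entry concrete, here is a minimal sketch using only Python's standard library `cProfile` and `pstats` modules, the same data format that snakeviz visualizes. The functions `slow_sum`, `workload`, and `profile_workload` are hypothetical names invented for this illustration, and the workload itself is arbitrary.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so it shows up clearly in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    """Hypothetical workload: call slow_sum a few times."""
    return [slow_sum(50_000) for _ in range(5)]


def profile_workload():
    """Run workload() under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()       # start recording every function call
    result = workload()
    profiler.disable()      # stop recording

    # Render the ten most expensive entries, sorted by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_workload()
    print(report)
```

This approach requires wrapping the code you want to measure, which is the trade-off the list describes. A sampling profiler such as py-spy instead attaches to a running process from the outside. To browse the same data graphically, one option is to save a profile with `python -m cProfile -o out.prof script.py` and open it with `snakeviz out.prof`.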