Kernel profiling with LIKWID
LIKWID is a great tool for analysing low-level performance. It collects the information provided by hardware counters in the CPU and presents it to the user as a set of more easily understood metrics.
LIKWID provides a number of different executables, the most important for us being:
- likwid-bench: A benchmarking tool to determine platform-specific metrics such as peak floating-point throughput and memory bandwidth.
- likwid-perfctr: A tool that measures hardware performance counters.
To install LIKWID follow the instructions here. You will also need to install LIKWID's Python bindings.
Problems
Sometimes there are issues accessing the counters on a Linux machine. For example:
    Warning: Counter PMC3 cannot be used if Restricted Transactional Memory feature is enabled and
    bit 0 of register TSX_FORCE_ABORT is 0. As workaround write 0x1 to TSX_FORCE_ABORT:
    sudo wrmsr 0x10f 0x1
A fix is to install msr-tools with
    sudo apt-get install msr-tools
and then write 0x1 to the TSX_FORCE_ABORT register, as the warning suggests, with
    sudo wrmsr 0x10f 0x1
To get an idea of how fast your kernel actually is, it is essential that you benchmark your system in advance. Some useful metrics you can gather are:
- Peak throughput (single-threaded)
    $ likwid-bench -W S0:2GB:1 -t peakflops
  Peak vectorised throughput can also be tested with peakflops_avx (and similar).
- Peak streaming memory bandwidth (single-threaded)
    $ likwid-bench -W S0:2GB:1 -t load
  You can also experiment with the bandwidth of different cache levels by varying the amount of data being transferred (e.g. -W S0:2GB:1 -> -W S0:20MB:1).
  Note that there are other tests that can be run to determine memory bandwidth (e.g. -t stream) and these may produce different results depending on the number of loads and stores.
Alongside these benchmarks, likwid-topology can also give you some insight into how your CPU is laid out.
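Varying the working-set size by hand gets tedious, so here is a minimal sketch that builds the likwid-bench command lines for a sweep over several sizes. The helper names (`bench_command`, `sweep`) and the list of sizes are hypothetical and only for illustration; the flags are the ones used in the commands above. Pick sizes that bracket the cache sizes reported by likwid-topology on your machine.

```python
def bench_command(test, size, domain="S0", threads=1):
    """Build the likwid-bench command line for a single workgroup.

    `test` is a likwid-bench test name such as "load" or "peakflops",
    `size` is the working-set size as a string (e.g. "20MB").
    """
    workgroup = f"{domain}:{size}:{threads}"
    return ["likwid-bench", "-W", workgroup, "-t", test]


def sweep(test, sizes):
    """Return one command per working-set size, in the order given."""
    return [bench_command(test, size) for size in sizes]


# Hypothetical sizes: small ones should fit in cache, large ones should not.
commands = sweep("load", ["20kB", "200kB", "20MB", "2GB"])
for cmd in commands:
    print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or the lists can be passed to `subprocess.run(cmd, check=True)` on a machine where likwid-bench is installed.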
To profile the kernels you need to:
- Initialise LIKWID in your script.
    import pylikwid
    pylikwid.markerinit()
- Annotate the kernel with LIKWID's marker API. This ensures that we are not accidentally profiling anything that we don't care about. To do this, either apply the following diff in PyOP2:
    diff --git a/pyop2/base.py b/pyop2/base.py
    index e4469890..80f6f413 100644
    --- a/pyop2/base.py
    +++ b/pyop2/base.py
    @@ -3566,7 +3566,10 @@ class ParLoop(object):
             # in case it's reused.
             for g in self._reduced_globals.keys():
                 g._data[...] = 0
    +        import pylikwid
    +        pylikwid.markerstartregion("run1")
             self._compute(iterset.core_part, fun, *arglist)
    +        pylikwid.markerstopregion("run1")
             self.global_to_local_end()
             self._compute(iterset.owned_part, fun, *arglist)
             self.reduction_begin()
  or use the PyOP2 branch with-likwid-markers, which you can find here: https://github.com/OP2/PyOP2/tree/sv/with-likwid-markers.
- Run likwid-perfctr with the performance group of interest. For example, a good starting point is to run:
    $ likwid-perfctr -C S0:1 -g MEM_DP -m python myscript.py
  or
    $ likwid-perfctr -C S0:1 -g FLOPS_DP -m python myscript.py
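The marker calls from the steps above can also be wrapped in a small context manager, so the start and stop calls always stay paired. This is a sketch rather than anything PyOP2 provides: `likwid_region` and `main` are hypothetical names, the summation is a stand-in for a real kernel, and the fallback to no-ops when pylikwid is missing is an assumption made so the script still runs unprofiled. The final `pylikwid.markerclose()` call is, as far as we know, what writes the marker results out for likwid-perfctr to read; check the pylikwid documentation for your version.

```python
from contextlib import contextmanager

try:
    import pylikwid
except ImportError:
    # Assumption: without pylikwid we skip profiling instead of failing.
    pylikwid = None


@contextmanager
def likwid_region(name):
    """Measure the enclosed block as the LIKWID marker region `name`."""
    if pylikwid is not None:
        pylikwid.markerstartregion(name)
    try:
        yield
    finally:
        # Stop the region even if the enclosed block raises.
        if pylikwid is not None:
            pylikwid.markerstopregion(name)


def main():
    if pylikwid is not None:
        pylikwid.markerinit()
    with likwid_region("run1"):
        # Stand-in for the kernel you actually want to profile.
        total = sum(i * 0.5 for i in range(1000))
    if pylikwid is not None:
        pylikwid.markerclose()
    return total


if __name__ == "__main__":
    print(main())
```

Run it under likwid-perfctr with the `-m` flag, as in the commands above, to get per-region counters for "run1".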
Here are some additional resources you may find useful: