-
Notifications
You must be signed in to change notification settings - Fork 2k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
tests/bench_timers: Add benchmark for periph_timer and xtimer
This test is intended to collect statistics about the runtime delays in the periph_timer implementation. This tool is mainly intended to detect problems in the low level driver implementation which can be difficult to detect from higher level systems such as xtimer.
- Loading branch information
Joakim Nohlgård
committed
May 29, 2018
1 parent
5be4b7f
commit 857eaf4
Showing
8 changed files
with
1,777 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
include ../Makefile.tests_common | ||
|
||
# These boards only have a single timer in their periph_conf.h, needs special | ||
# CFLAGS configuration to build properly | ||
SINGLE_TIMER_BOARDS = \ | ||
native \ | ||
nucleo32-f031 \ | ||
nucleo32-f042 \ | ||
# | ||
|
||
# These boards have too little RAM to support collecting detailed statistics | ||
# with the default settings of TEST_MIN and TEST_MAX, so DETAILED_STATS will be | ||
# disabled by default for these boards unless explicitly enabled | ||
SMALL_RAM_BOARDS = \ | ||
nucleo32-f031 \ | ||
# | ||
|
||
FEATURES_REQUIRED = periph_timer | ||
|
||
# Use RTT as a wall clock reference, if available | ||
FEATURES_OPTIONAL = periph_rtt | ||
|
||
USEMODULE += random | ||
USEMODULE += fmt | ||
USEMODULE += matstat | ||
USEMODULE += xtimer | ||
|
||
ifeq (,$(findstring TIM_TEST_DEV,$(CFLAGS))) | ||
ifneq (,$(filter $(BOARD),$(SINGLE_TIMER_BOARDS))) | ||
CFLAGS += -DTIM_TEST_DEV=TIMER_DEV\(0\) -DTIM_REF_DEV=TIMER_DEV\(0\) | ||
endif | ||
endif | ||
|
||
ifeq (,$(findstring DETAILED_STATS,$(CFLAGS))) | ||
ifneq (,$(filter $(BOARD),$(SMALL_RAM_BOARDS))) | ||
CFLAGS += -DDETAILED_STATS=0 | ||
endif | ||
endif | ||
|
||
# Shortcut to configure the build for testing xtimer against a periph_timer reference | ||
.PHONY: test-xtimer | ||
test-xtimer: CFLAGS+=-DTEST_XTIMER -DTIM_TEST_FREQ=XTIMER_HZ -DTIM_TEST_DEV=XTIMER_DEV | ||
test-xtimer: all | ||
|
||
# Shortcut to configure the build for testing Kinetis LPTMR against a PIT reference | ||
# Usage: make BOARD=frdm-k22f test-kinetis-lptmr flash | ||
.PHONY: test-kinetis-lptmr | ||
test-kinetis-lptmr: CFLAGS+=-DTIM_TEST_DEV=TIMER_LPTMR_DEV\(0\) -DTIM_TEST_FREQ=32768 -DTIM_REF_DEV=TIMER_PIT_DEV\(0\) | ||
test-kinetis-lptmr: all | ||
|
||
# Reset the default goal. | ||
.DEFAULT_GOAL := | ||
|
||
include $(RIOTBASE)/Makefile.include |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,341 @@ | ||
# Benchmark test for RIOT timers | ||
|
||
This test is intended to collect statistics about the runtime delays in the | ||
periph_timer implementation. This tool is mainly intended to detect problems in | ||
the low level driver implementation which can be difficult to detect from | ||
higher level systems such as xtimer. | ||
|
||
## Testing periph/timer | ||
|
||
The basic idea of the test is to generate a wide variety of calls to the | ||
periph/timer API, in order to detect problems with the software implementation. | ||
The test consists of a single thread of execution which will set random | ||
timeouts on a periph_timer device. A reference timer is used to measure the | ||
time it takes until the callback is called. The difference between the expected | ||
time and the measured time is computed, and the mean and variance of the | ||
recorded values are calculated. The results are printed as a table on stdout | ||
every 30 seconds. All of the test scenarios used in this application are | ||
based on experience from real world bugs encountered during the development of | ||
RIOT and other systems. | ||
|
||
### periph/timer functions tested | ||
|
||
Both timer_set and timer_set_absolute calls are mixed in a random order, to | ||
ensure that both functions are working correctly. | ||
|
||
### Rescheduling | ||
|
||
Some bugs may occur only when the application attempts to replace an already | ||
set timer by calling timer_set again while the timer is running. This is dubbed | ||
"resched" in this application. The application will issue rescheduling timer | ||
calls in the normal operation of the test to catch this class of problems. | ||
|
||
### Setting stopped timers | ||
|
||
Another class of bugs is the kind where a timer is not behaving correctly if it | ||
was stopped before setting a timer target. The timer should set the new target, | ||
but not begin counting towards the target until timer_start is called. This is | ||
covered by this application under the "stopped" category of test inputs. | ||
|
||
## General benchmark behaviour | ||
|
||
The benchmark application uses some techniques to improve the quality of the | ||
tests. | ||
|
||
### Avoiding phase lock | ||
|
||
The CPU time used for generation and setting of the next timer target in | ||
software is approximately constant across iterations. When the application flow | ||
is driven by the timer under test, via a mutex which is unlocked from the timer | ||
callback, the application will be "phase locked" to the timer under test. This | ||
peculiarity may hide certain implementation race conditions which only occur at | ||
specific phases of the timer ticks. In the test application, this problem is | ||
avoided by inserting random CPU delay loops at key locations: | ||
|
||
- After timer_stop | ||
- Before timer_set/timer_set_absolute | ||
- Before timer_start | ||
|
||
### Estimating benchmark CPU overhead | ||
|
||
An estimation of the overhead resulting from the CPU processing delays inside | ||
the benchmarking code is performed during benchmark initialization. The | ||
estimation algorithm attempts to perform the exact same actions that the real | ||
benchmark will perform, but without setting any timers. The measured delay is | ||
assumed to originate in the benchmark code and will be subtracted when | ||
computing the difference between expected and actual values. A warning is | ||
printed when the variance of the estimated CPU overhead is too high, this can | ||
be a sign that some other process is running on the CPU and disrupting the | ||
estimation, or a sign of serious problems inside the timer_read function. | ||
|
||
## Testing xtimer | ||
|
||
The benchmark application can be reconfigured for using xtimer instead of the | ||
low level periph/timer interface. Use the Makefile target test-xtimer to build | ||
a test for xtimer. | ||
|
||
### xtimer functions tested | ||
|
||
The application will mix calls to the following xtimer functions: | ||
|
||
- `_xtimer_set` | ||
- `_xtimer_set_absolute` | ||
- `_xtimer_periodic_wakeup` | ||
- `_xtimer_spin` | ||
|
||
These functions cover almost all uses of xtimer. The higher level functions | ||
such as `xtimer_usleep` and `xtimer_set_msg` all use these functions internally | ||
in the implementations. | ||
|
||
## Results | ||
|
||
When the test has run for a certain amount of time, the current results will be | ||
presented as a table on stdout. To assist with finding the source of any | ||
discrepancies, the results are split according to three parameters: | ||
|
||
- Function used: timer_set, timer_set_absolute | ||
- Rescheduled: yes/no | ||
- Stopped: yes/no | ||
|
||
### Expected results | ||
|
||
It is difficult to set a general expected result which applies to all | ||
platforms. However, the mean and variance of the differences should be small, | ||
in relation to the tick duration of the timer under test. What is "small" also | ||
depends the CPU core frequency, because of CPU processing time spent in the | ||
driver code and in the test code. For example, when testing a 1 MHz timer and | ||
comparing with another 1 MHz timer, the mean difference should likely be in | ||
single digits on a Cortex-M CPU. Testing a 32768 Hz timer and referencing a 1 | ||
MHz timer on the other hand should have a mean difference in double digits, the | ||
variance will also be greater because of the quantization errors and rounding | ||
in the tick conversion routine. If the mean or variance of the difference is | ||
exceptionally large, the row will be marked with "SIC!" to draw attention to | ||
the fact. | ||
|
||
### Interpreting timer_set_xxx statistics | ||
|
||
The first part of the results compares the reference time elapsed against the | ||
expected time. A negative value means that the timer alarm is triggered too | ||
soon, compared to the expected time. A positive value means that the timer | ||
alarm is triggered late, compared to the expected time. | ||
|
||
#### Expected variance | ||
|
||
The variance of the results should generally be as small as possible, except | ||
for the case where the reference timer is running at a higher frequency than | ||
the timer under test. There will always be a truncation error when setting a | ||
timer target because of the discrete timer counts that can be set. The | ||
benchmark will compute limits for the expected variance and show a message if | ||
the variance is less than or greater than the expected variance. | ||
|
||
##### Variance too low | ||
|
||
A variance lower than expected indicates that there is a phase correlation | ||
between the reference timer and the timer under test. This can only be | ||
explained by a software bug or a hardware failure. The primary suspect will be | ||
the reference timer. | ||
|
||
##### Variance too high | ||
|
||
A variance greater than expected indicates that there is some extra randomness | ||
in the timeout length, which can be caused by other processes interfering with | ||
the test, a software bug in the timer under test, or hardware failure. | ||
|
||
##### Mathematical explanation | ||
|
||
Let `X0` be the exact time that a timer_set call occurs, `X0` is a real number. | ||
Let `T` be the desired timeout, `T` is a positive integer because the timer can | ||
only count integer number of ticks. Let `x0 = floor(X0)` be the count that the | ||
timer is currently on when the timer is set, `x0` is an integer. If `X0` is | ||
uniformly random, then the truncation error `X0' = (X0 - x0)` will be a sample | ||
from a continuous uniform distribution in the interval `[0,1)` ticks. The timer | ||
target, `x1`, in the timer under test can be expressed as `x1 = x0 + T`. Let | ||
`k` be a real number that satisfies `Y = kx`, where `x` is an interval in the | ||
unit used by the timer under test, and `Y` is an interval in the unit used by | ||
the reference timer, `Y` is a real number. The time measured by the reference | ||
timer can be expressed as | ||
`Y = k(x1 - X0) = k(x0 + T - X0) = k(X0 - X0' + T - X0) = k(T - X0')`. | ||
`Y` will be a sample from a uniform random distribution in the interval | ||
`[kT, k(T + 1)]`. The variance of `Y` is defined as | ||
`Var(Y) = (k(T + 1) - kT)^2 / 12`. | ||
|
||
### Interpreting timer_read statistics | ||
|
||
A separate table is displayed for statistics on timer_read. This table compares | ||
the expected timer under test time after a timer has triggered, against the | ||
actual reported value from `timer_read(TIM_TEST_DEV)`. A positive value means | ||
that the TUT time has passed the timer target time. A negative value means | ||
that, according to the timer under test, the alarm target time has not yet been | ||
reached, even though the timer callback has been executed. | ||
|
||
### A note on BOARD=native | ||
|
||
When running on native, the application will be running as a normal process in | ||
a multi process system which means that most results will have a much greater | ||
variance than expected and the resulting differences will also be much larger | ||
than on a bare metal system. Be careful when drawing conclusions on results | ||
from native. | ||
|
||
## Configuration | ||
|
||
The timer under test and the reference timer can be chosen at compile time by | ||
defining TIM_TEST_DEV and TIM_REF_DEV, respectively. The frequencies for | ||
these timers can be configured through TIM_DEV_FREQ and TIM_REF_FREQ. For | ||
example, to compare timer device 2 running at 32768 Hz against a reference timer | ||
on timer device 0 (default) running at 1 MHz (default) on Mulle: | ||
|
||
CFLAGS='-DTIM_TEST_DEV=TIMER_DEV\(2\) -DTIM_TEST_FREQ=32768' BOARD=mulle make flash | ||
|
||
### Default configuration | ||
|
||
reference timer: TIMER_DEV(0) | ||
timer under test: TIMER_DEV(1) | ||
timer frequency: 1 MHz | ||
|
||
Define USE_REFERENCE=0 to compare a timer against itself. Be aware that this | ||
may hide a class of errors where the timer is losing ticks in certain | ||
situations, which a separate reference timer would be able to detect. | ||
|
||
### Example output | ||
|
||
Below is a short sample of a test of a 32768 Hz timer with a 1 MHz reference: | ||
|
||
------------- BEGIN STATISTICS -------------- | ||
Limits: mean: [-10, 41], variance: [58, 99] | ||
Target error (actual trigger time - expected trigger time), in reference timer ticks | ||
positive: timer is late, negative: timer is early | ||
=== timer_set running === | ||
interval count sum sum_sq min max mean variance | ||
1 - 2: 25705 455646 2002373 2 35 17 77 | ||
3 - 4: 25533 457326 1973191 2 35 17 77 | ||
5 - 8: 25412 451046 1970014 2 35 17 77 | ||
9 - 16: 25699 458563 2017198 2 36 17 78 | ||
17 - 32: 25832 457768 2018909 1 35 17 78 | ||
33 - 64: 25758 440454 1978830 1 35 17 76 | ||
65 - 128: 25335 414320 1942196 0 34 16 76 | ||
129 - 256: 25254 367689 1979616 -3 32 14 78 | ||
257 - 512: 25677 285822 2020736 -7 31 11 78 | ||
TOTAL 230205 3788634 18459378 -7 36 16 80 | ||
... | ||
|
||
The statistics above show that the timer implementation introduces a small | ||
delay on short timers. Note however that it is not clear whether this delay is | ||
in timer_set, or timer_read, but the combined effect of all error sources make | ||
the timer overshoot its target by on average 17 microseconds. The difference | ||
between min and max is close to the expected difference of approximately | ||
1/32768 s = 30.51 µs. The lower min and mean for the longer timer intervals are | ||
most likely caused by a drift between the 1 MHz clock and the 32.768 kHz clocks | ||
inside the CPU, and is expected when using two separate clock sources for the | ||
timers being compared. | ||
|
||
Below is a short sample of a test where there is something wrong with the timer | ||
implementation, again a 32768 Hz timer tested with a 1 MHz reference: | ||
|
||
=== timer_set_absolute resched === | ||
interval count sum sum_sq min max mean variance | ||
1 - 2: 152217 82324291 13339518497 38 1071 540 87635 <=== SIC! | ||
3 - 4: 152199 82254909 13301794832 38 1071 540 87397 <=== SIC! | ||
5 - 8: 152023 82085454 13264873203 38 1071 539 87256 <=== SIC! | ||
9 - 16: 152156 82236539 13265351119 38 1071 540 87183 <=== SIC! | ||
17 - 32: 152639 82606815 13267666812 38 1071 541 86922 <=== SIC! | ||
33 - 64: 151883 81836155 13268661551 37 1070 538 87361 <=== SIC! | ||
65 - 128: 152114 82151495 13228100922 36 1070 540 86962 <=== SIC! | ||
129 - 256: 152363 81806042 13278055359 34 1069 536 87148 <=== SIC! | ||
257 - 512: 152360 81235185 13303811951 29 1066 533 87318 <=== SIC! | ||
TOTAL 1369954 738536885 119197349719 29 1071 539 87008 <=== SIC! | ||
|
||
We can see that the variance, the maximum error, and the mean are very large. | ||
This particular timer implementation needs some work on its timer_set_absolute | ||
implementation. | ||
|
||
## Configuration details | ||
|
||
Configuration macros used by the application are described below | ||
|
||
### Settings related to timer hardware | ||
|
||
#### TIM_TEST_DEV | ||
|
||
Timer under test. Default: `TIMER_DEV(1)` | ||
|
||
#### TIM_REF_DEV | ||
|
||
Reference timer used for measuring the callback time error. Default: `TIMER_DEV(0)` | ||
|
||
#### TIM_TEST_FREQ | ||
|
||
Frequency of the timer under test, used when calling timer_init during the | ||
application startup. Default: `1000000`, 1 MHz | ||
|
||
#### TIM_REF_FREQ | ||
|
||
Frequency of the reference timer, used when calling timer_init during the | ||
application startup. Default: `1000000`, 1 MHz | ||
|
||
#### TIM_TEST_CHAN | ||
|
||
Timer channel used on the timer under test. Default: `0` | ||
|
||
#### USE_REFERENCE | ||
|
||
Shorthand for setting the reference timer to the same as the timer under test. | ||
Default: `0` | ||
|
||
### Settings related to result processing | ||
|
||
|
||
#### DETAILED_STATS | ||
|
||
Keep statistics per timer target offset value if set to 1, otherwise keep | ||
statistics only per scenario. Default: `1` | ||
|
||
#### LOG2_STATS | ||
|
||
Only used when `DETAILED_STATS == 1`. Statistics are grouped according to | ||
2-logarithms of the timer offset value, e.g. 1-2, 3-4, 5-8, 9-16 etc. This | ||
reduces memory consumption and creates a shorter result table for easier | ||
overview. Default: `1` | ||
|
||
#### TEST_PRINT_INTERVAL_TICKS | ||
|
||
The result table will be printed to standard output when this many reference | ||
timer ticks have passed since the last printout. Default: `((TIM_REF_FREQ) * 30)` | ||
|
||
### Settings related to timer input generation | ||
|
||
#### TEST_MIN | ||
|
||
Minimum timer offset tested for `timer_set_absolute`, in timer under test | ||
ticks. Default: `1` for timers < 1 MHz, `16` otherwise, to prevent setting a | ||
timer target in the past | ||
|
||
#### TEST_MIN_REL | ||
|
||
Minimum timer offset tested for `timer_set`, in timer under test ticks. | ||
Default: `0` | ||
|
||
#### TEST_MAX | ||
|
||
Maximum timer offset tested, in timer under test ticks. Default: `128` | ||
|
||
#### SPIN_MAX_TARGET | ||
|
||
The CPU busy wait loop will be calibrated during application start up so that | ||
the maximum random spin length is approximately equal to this many timer under | ||
test ticks. Default: `16` | ||
|
||
#### TEST_UNEXPECTED_STDDEV | ||
|
||
Mark the output row if the computed variance implies a standard deviation which | ||
is greater than this value. This value will be automatically recomputed to | ||
compensate for the variance resulting from the truncation in the tick | ||
conversion if the reference timer is running at a higher frequency than the | ||
timer under test. Default: `4` | ||
|
||
#### TEST_UNEXPECTED_MEAN | ||
|
||
Mark the output row if the computed mean absolute difference is greater than this | ||
number of timer under test ticks. This value will be automatically recomputed | ||
to compensate for the error resulting from the truncation in the tick | ||
conversion if the reference timer is running at a higher frequency than the | ||
timer under test. Default: `2` |
Oops, something went wrong.