-
Notifications
You must be signed in to change notification settings - Fork 6.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timer rewrite phase 3: drivers #10556
Conversation
Codecov Report
@@ Coverage Diff @@
## master #10556 +/- ##
==========================================
+ Coverage 48.34% 51.04% +2.69%
==========================================
Files 264 227 -37
Lines 42164 28601 -13563
Branches 10135 7145 -2990
==========================================
- Hits 20386 14599 -5787
+ Misses 17700 11212 -6488
+ Partials 4078 2790 -1288
Continue to review full report at Codecov.
|
ba7ce85
to
319033f
Compare
Update with some fixes to HPET... ...and a reworked nrf_rtc_timer driver that really wants review from Nordic folks. It works for me on a nrf52_pca10056 at least, but needs validation on broader hardware. The size should make review easy at least. |
@@ -1,513 +1,68 @@ | |||
/* | |||
* Copyright (c) 2016-2017 Nordic Semiconductor ASA | |||
* Copyright (c) 2018 Intel Corporation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pizi-nordic please review
319033f
to
c7a9b9e
Compare
Rebased. Fixed an issue on nrf where the driver and kernel don't care about the symbol name of the ISR, but one test does. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still afraid of using now()
time in z_clock_set_timeout()
.
Even with the lock there is small probability of slipping over clock cycle during each clock programming operation. Could we use last announcement as reference here? In such case we will have to read the now()
only when new timeout will be added to timeout queue, and all further operations will be error-free.
{ | ||
u32_t rtc_away, sys_away = 0; | ||
#define COUNTER_MAX 0x00ffffff | ||
#define CYC_PER_TICK (CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we use sys_clock_hw_cycles_per_tick() instead?
IMHO we should have single function/macro for core calculations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to deprecate that as an API, actually. It's dangerous due to precision loss and has a weirdness (it has to look like a function call) just to deal with HPET (which has to query the frequency at runtime). The only code that uses it is in the drivers, and I'd prefer they do the math themselves.
drivers/timer/nrf_rtc_timer.c
Outdated
sys_clock_hw_cycles_per_tick()); | ||
nrf_rtc_event_enable(SYS_CLOCK_RTC, RTC_EVTENSET_COMPARE0_Msk); | ||
nrf_rtc_int_enable(SYS_CLOCK_RTC, RTC_INTENSET_COMPARE0_Msk); | ||
RTC->PRESCALER = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use nrf_rtc_prescaler_set() here.
drivers/timer/nrf_rtc_timer.c
Outdated
|
||
/* Clear the event flag and possible pending interrupt */ | ||
RTC_CC_EVENT = 0; | ||
RTC->EVENTS_COMPARE[0] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use nrf_rtc_event_clear() here.
drivers/timer/nrf_rtc_timer.c
Outdated
{ | ||
unsigned int key; | ||
#ifdef CONFIG_TICKLESS_KERNEL | ||
ticks = ticks == K_FOREVER ? MAX_TICKS : ticks; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest parentheses around ticks == K_FOREVER
.
unsigned int key; | ||
#ifdef CONFIG_TICKLESS_KERNEL | ||
ticks = ticks == K_FOREVER ? MAX_TICKS : ticks; | ||
ticks = max(min(ticks - 1, (s32_t)MAX_TICKS), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ticks - 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because our API is werid. Historically we've passed a "1" to set_timeout() to indicate "at the next tick" (and then had to play games with a _TICK_ALIGN symbol needlessly). But in the code here we want to compute a delta from current time rounded up to the next tick boundary, so we start from ticks-1.
|
||
irq_disable(NRF5_IRQ_RTC1_IRQn); | ||
/* Round up to next tick boundary */ | ||
cyc = ticks * CYC_PER_TICK + counter_sub(t, last_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about potential overflow in these calculations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticks gets clamped to [0,MAX_TICKS] in the code above this spot, which is calibrated to prevent that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear: that's about preventing overflow beyond 24 bits of cycles in a single timeout.
There's no need to prevent overflow in the calculation of the next timeout value itself, the cycle counter is cyclic and just rolls over.
|
||
set_comparator(cyc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if while tick to cycle calculation will work correctly if we set tick equal to hw clock cycle. It looks that announce will be always scheduled 1 clock cycle earlier than expected in such case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the point. The cyc value here is the time of the last expiration (last_count) plus some integer number of CYC_PER_TICK increments. Unless the timer skips a cycle when the comparator fires, the ticks should be perfectly in-phase and evenly scheduled (modulo interrupt delivery slop, of course).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, I see what you're saying: in tickless, you HAVE to use the current time, because counting interrupts isn't sufficient. But the math is guaranteed not to underrun.
/* Calculate time to program into RTC */ | ||
rtc_away = (RTC_MASK - RTC_COUNTER); | ||
rtc_away = rtc_away > RTC_HALF ? RTC_HALF : rtc_away; | ||
#define MIN_DELAY 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this minimal delay?
In current implementation, this basically reduces timer resolution, which is a thing we would like to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to have a minimum value greater than zero here, otherwise it's possible that the computation of the "next" value based on the "current" timer counter will compute a value that is in the "past" by the timer it arrives in the comparator register.
I picked 1/10th of a tick to be conservative, but given the frequencies here something much lower is probably fine. You guys are the experts :)
e05ed1c
to
5ae7497
Compare
APIC timer refactoring is coming as part of x86_64, FWIW. In fact the need for that is sort of the initial germ that got me started on this whole timer saga. |
7b2b4e9
to
6f39b78
Compare
6f39b78
to
ca05fc1
Compare
Newer, and much smaller driver written to the new timer API. Supports all the features the old one did (including shutting off the clock when clock_always_on is disabled), should be faster in practice, and should be significantly more accurate due to the "lost cycle" trick applied in z_clock_set_timeout(). Signed-off-by: Andy Ross <[email protected]>
The SysTick driver is working with tickless now. Make this our first guinea pig. Signed-off-by: Andy Ross <[email protected]>
Qemu doesn't like tickless. By default[1] it tries to be realtime as vied by the host CPU -- presenting read values from hardware cycle counters and interrupt timings at the appropriate real world clock times according to whatever the simulated counter frequency is. But when the host system is loaded, there is always the problem that the qemu process might not see physical CPU time for large chunks of time (i.e. a host OS scheduling quantum -- generally about the same size as guest ticks!) leading to lost cycles. When those timer interrupts are delivered by the emulated hardware at fixed frequencies without software intervention, that's not so bad: the work the guest has to do after the interrupt generally happens synchronously (because the qemu process has just started running) and nothing notices the dropout. But with tickless, the interrupts need to be explicitly programmed by guest software! That means the driver needs to be sure it's going to get some real CPU time within some small fraction of a Zephyr tick of the right time, otherwise the computations get wonky. The end result is that qemu tends to work with tickless well on an unloaded/idle run, but not in situations (like sanitycheck) where it needs to content with other processes for host CPU. So, add a flag that drivers can use to "fake" tickless behavior when run under qemu (only), and enable it (only!) for the small handful of tests that are having trouble. [1] There is an -icount feature to implement proper cycle counting at the expense of real-world-time correspondence. Maybe someday we might get it to work for us. Signed-off-by: Andy Ross <[email protected]>
These tests was written to try to eliminate timer interrupts, but the mechanism chosen (setting TICKS_PER_SECOND to 1) is dangerously susceptible to overflow conditions on systems with fast cycle counters and high timeout durations. Actually the tests pass just fine if you use a conventional tick rate and use a tickless-capable driver (which eliminates interrupts too, which is the whole point), but there's no easy way in kconfig to do an "if" to select that condition for capable systems only. Just disable tickless to keep the same behavior. Signed-off-by: Andy Ross <[email protected]>
When TICKLESS_KERNEL is enabled, the current time in ticks is based on a hardware counter and not interrupt delivery (which is the whole point of tickless), so irq-locking does not prevent time from advancing. Disable this test in that configuration. Signed-off-by: Andy Ross <[email protected]>
This test was written with an outrageously long timeout of 25 seconds. That blows right through the 32 bit cycle counter on qemu_cortex_m3[1] and produces an essentially random delay instead of the desired number, causing a hang with the new SysTick driver in tickless mode. Push the number down so it doesn't overflow. The root cause, though, is that k_busy_wait() can take arguments it can't handle. It ought to have an outer loop or something so that it can spin for INT_MAX milliseconds correctly. [1] Which has a 12MHz clock rate. Many hardware implementations are much faster still. Signed-off-by: Andy Ross <[email protected]>
Add a TICKLESS_CAPABLE kconfig variable which is used by the kernel to select tickless mode's default automatically on drivers that support it (rather than having to set the default per-board). Select it from the ARM SysTick and Intel HPET drivers. Also remove the old qemu_cortex_m3 default settings which this replaces. Signed-off-by: Andy Ross <[email protected]>
Many drivers won't need to implement z_clock_idle_exit() or sys_clock_disable(). Make those weak stubs too. Signed-off-by: Andy Ross <[email protected]>
Rewritten along the lines of ARM SysTick. Implements only the new, simplified API. MUCH smaller. Works with tickless pervasively. No loss of functionality. Signed-off-by: Andy Ross <[email protected]>
If this function is itself interrupted by a timeslice event, the slicing state can be corrupted. Just re-use the scheduler lock instead of using a new spinlock; this is a low-latency function that won't deadlock. Found by inspection. Signed-off-by: Andy Ross <[email protected]>
Reworked using the older hardware interface code, but with an implementation of the new API only. Much smaller & simpler. As yet, tested (manually) only on a nrf52_pca10056 board. Signed-off-by: Andy Ross <[email protected]>
Rewritten driver along the lines of all the other new drivers, implementing the new timer API. Structurally, the machine timer is an up-counter with comparator, so it works broadly the same way HPET and NRF do. The quirk here is that it's a 64 bit counter, which needs a little more care. Unlike the other timer reworks, this driver has grown by a few lines as it used to be very simple. But in exchange, we get full tickless support on the platform. Fixes zephyrproject-rtos#10609 in the process (the 64 bit timer registers are unlatched for sub-word transfers, so you have to use careful ordering). Signed-off-by: Andy Ross <[email protected]>
Rewritten Xtensa CCOUNT driver along the lines of all the other new drivers. The new API permits much smaller code. Notably: The Xtensa counter is a 32 bit up-counter with a comparator register. It's in some sense the archetype of this kind of timer as it's the simplest of the bunch (everything else has quirks: NRF is very slow and 24 bit, HPET has a runtime frequency detection, RISC-V is 64 bit...). I should have written this one first. Note also that this includes a blacklist of the xtensa architecture on the tests/driver/ipm test. I'm getting spurious failures there where a k_sem_take() call with a non-zero timeout is being made out of the console output code in interrupt context. This seems to have nothing to do with the timer; I suspect it's because the old timer drivers would (incorrectly!) call z_clock_announce() in non-interrupt context in some contexts (e.g. "expiring really soon"). Apparently this test (or something in the IPM or Xtensa console code) was somehow relying on that on Xtensa. But IPM is a Quark thing and there's no particular reason to run this test there. Signed-off-by: Andy Ross <[email protected]>
ca05fc1
to
ee8fe57
Compare
Rebased and pushed a final version. All code is present, all tests are passing. All needed workaround (e.g. for nrf54_bsim's spin loops) are already in the tree. I think all criticism is addressed. Needs final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving in lieu of @pizi-nordic who is away this week
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RISC-V LGTM
Just like the last (still active) PR #10303 this builds on top of the previous one and adds extra patches.
In this case, it takes advantage of the much simpler timer driver API to actually implement much simpler drivers for ARM SysTick and Intel HPET. Seriously, these are like 4x smaller now.
They support all the features the old ones did, and now work with tickless pervasively (and enable tickless by default, as there's essentially no downside anymore).
Please review. Also please test if you can, especially on SysTick hardware (I've only run tests on Qemu and the two boards I have on my desk).
I have similar rewrites in progress for Xtensa and RISC-V MTIME (which are very similar in structure to HPET) and will append those here as I finish them.
While this can be merged as a single unit, it's probably best to merge the older pull request first to smooth the transition. I'll rebase this one as soon as the older one goes in.