Timers seem to get clobbered in Linux multi-threaded application #164
Firstly, your 'timer update' is always returning 1. I've found in the past that the stack can get really upset with that simplification. Here's my monotonic-clock-based timer that I've been using successfully for the past few years. It only needs a mutex and a monotonic clock with (at least) ms resolution. Ignore the off-brand function signatures.
I'm assuming your timer is configured as a cyclic timer, with the ISR being triggered every X milliseconds. I've had issues keeping precise output timing with that setup on our Linux system, as there are a bazillion more threads and programs running in parallel, causing timers to skew more or less severely over time.
Regardless, thank you very much! Great input. I'll try to check again what happens in the main loop: the timer seems to stop creating and refreshing the internal delta timers. I'll definitely make sure to avoid these headaches if I ever need to implement this stack on a real MCU.
Under Linux, just use std::chrono::steady_clock as your time source - no need for custom timers.
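As a concrete sketch of that suggestion: a periodic loop driven by `std::chrono::steady_clock` can stay drift-free if it sleeps until an *absolute* deadline (`sleep_until`) rather than for a relative delay (`sleep_for`), since late wake-ups are absorbed by the next cycle instead of accumulating. `processOnce` is a placeholder for whatever the stack's periodic processing call is, not a real API.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Drift-free periodic loop: advance an absolute deadline by 'period' each
// cycle and sleep until it. Individual wake-ups may be late under load,
// but the average rate stays exactly 1/period.
void runPeriodic(std::chrono::milliseconds period, int iterations,
                 const std::function<void()>& processOnce) {
    auto next = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        processOnce();
        next += period;                       // absolute deadline, no drift
        std::this_thread::sleep_until(next);  // lateness absorbed next cycle
    }
}
```

By contrast, `sleep_for(period)` after each iteration adds the processing time and scheduler latency to every cycle, which is one plausible source of the 120-180 ms spread mentioned below for a 150 ms target.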
I had a cyclic clock with chrono::steady_clock time points in a previous iteration, very similar to your snippet actually, but it yielded sub-optimal results: relatively high CPU usage for the periodic tick thread and very imprecise timing (e.g., the expected 150 ms TPDO rate more often skewed into the 120-180 ms range instead). The timers I'm using are actually provided by
I've taken inspiration from the Linux port in issue #140 and quickly turned it into a delta-timer, using OS-level timers with nanosecond(!) precision (`timer_create` and friends), which got me back to <2% timer variation. The result looked pretty good, until I realized, after ~2 hours of free-running, that the PDOs were no longer triggering. The effective behavior alternates between (1) no TPDO ever transmitting again until the application is restarted, or (2) one of the 2 active TPDOs stopping while the other keeps going normally, where a quick jump between Op-PreOp-Op might restore it temporarily.
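For readers unfamiliar with that API family, here is a minimal sketch of a periodic POSIX timer on `CLOCK_MONOTONIC` via `timer_create(2)` and `timer_settime(2)` (the setup is generic, not the author's actual NDA-covered code). Note the re-arm semantics: if `it_interval` is left zero the timer is one-shot, so any code path that rewrites the `itimerspec` without restoring the interval will silently stop the timer, which is exactly the symptom described above. Older glibc versions need `-lrt` at link time.

```cpp
#include <atomic>
#include <csignal>
#include <cstring>
#include <ctime>

// Tick counter incremented from the timer callback thread.
static std::atomic<int> g_ticks{0};

// SIGEV_THREAD callback: runs on a library-managed thread per expiration.
static void onTick(union sigval) { g_ticks.fetch_add(1); }

// Create and arm a periodic CLOCK_MONOTONIC timer firing every period_ns.
timer_t startPeriodicTimer(long period_ns) {
    struct sigevent sev;
    std::memset(&sev, 0, sizeof(sev));
    sev.sigev_notify = SIGEV_THREAD;  // callback thread, not a signal handler
    sev.sigev_notify_function = onTick;

    timer_t id;
    timer_create(CLOCK_MONOTONIC, &sev, &id);

    struct itimerspec its;
    its.it_value.tv_sec = 0;
    its.it_value.tv_nsec = period_ns;  // first expiration
    its.it_interval = its.it_value;    // then periodic; a zero interval = one-shot!
    timer_settime(id, 0, &its, nullptr);
    return id;
}
```

When debugging a "timer stopped firing" problem like this one, `timer_gettime(2)` is useful: if it reports an all-zero `itimerspec`, the timer was disarmed (or never re-armed) rather than lost by the kernel.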
In short, the soft-timers seem to get "corrupted" and the HAL timer never gets rearmed by the stack.
I can't provide the real application due to NDA, but I've reproduced the problem in this reduced project, which behaves pretty much the same: https://github.com/TheRealZago/canopen-timers. I've left some debugging notes I've acquired over the last 3 weeks of analyzing this problem, but it's extremely annoying to reproduce and debug.
If anyone has deployed this stack in a Linux environment, did you ever encounter this issue?
Otherwise, which interactions in the stack should I be tracing in more detail to figure out why the timers seem to get corrupted?
Before getting lapidated, I'm not expecting an "I HAZ CODES" solution, but I'd be very happy to get input from "experts" who've been working with this project for longer than I have... 😄