-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Many edge detection "interrupts" but level remains the same #149
Comments
When trying to cache the previous level detected in my own code, then only report changes up the chain, I have ran into difficulties due to the 'static bound on the FnMut that is passed to set_async_interrupt... I will keep trying though. |
I see this code in interrupt.rs:
Since that is sitting in a thread in a loop, per pin, would you accept a PR to filter out interrupts when the level has not actually changed? I would read the initial level before entering the loop, then compare always to previous and only call callback if level changed, and update previous. |
It has been several years since I last looked into this issue, so it's nice to have some fresh eyes on it. While it's certainly possible RPPAL is handling something incorrectly, my conclusion at the time was that this is a limitation of the underlying linux drivers, with triggered events getting lost when too many occur within a short period of time, likely because the interrupt handler is still running when the interrupt gets triggered again. A switch getting pressed would definitely cause a large amount of events, and would require some debouncing.
The problem with that approach is that we're hiding interrupt events. The level has changed. It's just that the Low between the two Highs didn't get reported, or vice versa. I think a better approach would be to assume an event was lost, and add one in between, but I'm not sure if I'm comfortable manufacturing events instead of just reporting the data we get from the drivers. Perhaps we just need better documentation to make it clear this can happen under certain circumstances, and let the end-user (in this case piggui) decide how best to handle it. |
Understood. Maybe there are bugs in the linux drivers, but that's getting a bit beyond my "expertise"... is there somewhere upstream for the drivers I could ask? |
It's a little beyond my expertise as well, as for anything other than direct GPIO register reads/writes I'm relying on linux drivers. Unfortunately I currently don't have enough time to invest in some dedicated testing to figure out under what circumstances interrupts are lost. If the cause is indeed too many interrupts within a short period of time, it would also happen for only rising or only falling edges. My guess would be these aren't bugs, but limitations of handling interrupts in linux. I would check the official Raspberry Pi forums first, as any Python library that works with interrupts in a similar way as RPPAL does would run into the same issue. |
I posted here on Raspberry Pi forum. |
It would be good to get to the bottom of this, and if at all possible NOT loose edges... But, meantime, FWIW I would have rrpal respect the "contract" with its users, to inform (only?) ok of level changes on an input....hence if the event from the Linux driver reads the same level as before I would not call the callback. |
I appreciate the feedback, but like I mentioned before, barring any bugs in RPPAL or the drivers, a level change did happen. Not calling the callback means that we'd be hiding events, which I don't think is the correct behavior here. We're getting the same level multiple times in a row because one or multiple interrupt triggers were not reported by the underlying drivers. If the cause is a timing issue, we would be exacerbating the problem by skipping even more events. There is an argument to be made for inserting a callback for the missing level change, but in the end RPPAL is just an interface library that makes it easier to access the various peripherals, and not a way to correct the data as reported by the system. I do need to add this to the documentation. For piggui the situation is different, as it's an application for end-users to interact with and report the status of the GPIO pins. I would probably insert the missing interrupt trigger for the most accurate reporting, but it's of course up to you if you think filtering out the events is the better choice. |
Fair enough. I guess it comes down to the definition of "event" for the user of rrpal. For me a level change event, should show a change in level. Two level changes that in fact both report the same level, is kind of a non-event in my book. :-) |
I came across the same issue yesterday. Getting many events in short succession (even with the same state) wasn't a big deal as debouncing it is common practice. The bigger issue though was that the last event of the sequence was not always the expected level. E.g., when pulling the pin from high to low, after some bounces, the last event was sometimes reporting high. Imo, this makes this API somewhat pointless. The only thing the interrupt could be used for then is to be notified that something change so that you can go and actively read the pin after some silence period but the level passed to the interrupt is not usable at all. |
I haven't noticed issues when detecting changes for both rising and falling edges on the same GPIO pin at reasonable speeds. It seems to skip events when the interrupt is triggered in rapid succession, like when pressing a switch. The documentation needs to make this clear. The way I would interpret the event is "a rising or falling edge was detected, this is the current pin level", with the logical conclusion that the level had an opposite value before that. If you want better performance, you would have to bypass the linux drivers or just go bare metal entirely. This isn't a limitation of RPPAL but of the underlying drivers, and while I could insert or hide events to make the output make more sense, I think it's more appropriate for the end-user to decide what to do in that situation. |
Piggy-backing on this issue thread, to mention that we've done a new release of the "piggui" GUI for controling and viewing GPIO hardware on a Raspberry Pi, using rppal. You can find the release notes with videos and more gifs here As a teaser, here is a YouTube video showing the LED and waveform display of an input, connected to a real hardware button. If you have ideas for future functionality, please contribute issues or start a discussion |
As a side note, I started working on transitioning to the GPIO v2 API (here) back when it wasn't certain the Pi 5 would still give us direct GPIO register access, but never completed the transition when it was clear we could use
So while this doesn't solve the issue, once I get around to moving the interrupt handling to the v2 API, which adds those event sequence numbers, at least I can detect when an event has been skipped, and possibly add that info to the callback. EDIT: Theoretically I could make the buffer as large as possible to limit how often it overflows if that's what's causing the missing events. It wouldn't solve the issue, just delay it, but it's something. |
Thanks for pursuing this. When I observed this (using println in my code), I saw repeated levels in events, very quickly.... i.e. it was only one or two events (the initial "down" on a button and maybe a bounce) before I observed a stream of 23 events, all with the same level.... so it didn't appear at first blushes like a buffer overrun of events (unless a buffer size of 1 or 2). FYI. |
I'm not sure what the buffer size is for the v1 API, or if it even buffers events at all, as the The only mention I found of a bug was regarding a race condition with interrupts on multiple GPIO pins, but I assume you weren't testing this on multiple pins at the same time. |
Since I'm adding the event sequence number to the callback, I might as well add the timestamp at the same time (best estimate of time of event occurrence, in nanoseconds, provided by the GPIO v2 API), which can be helpful to detect how much time has passed between two trigger events. |
v1 documentation. The buffer is fixed at 16. And if the buffer overflows the most recent event is discarded, which is really unfortunate for userspace debounce algorithms or anything interested in the final state. So for v1 you really don't want that buffer to overflow. v2 has debounce support, and that is applied before the event buffer. That is probably what you want if you are dealing with noisy hardware. Even v2 doesn't help with interrupts occurring faster then the kernel can handle them though. Those get lost with no indication, as they are too fast for the kernel to see. Can't say I'm a fan of mixing direct register access with using a Linux driver. You have no protection against contention so you might get some entertaining interactions and those would be a serious pain to debug. |
No, just one input connected to a button.
Yes, I add one later as soon as I receive the event, so one closer to the real time would be appreciated.
That is a strange choice....getting the last event and the last known level would seem a better choice to me.
That sounbds positive.
Off course, there is a limit to what the kernel can handle, always possible to be too quick for it. |
Which is exactly why it was changed in v2. Not sure why v1 was that way - it was probably just easier and simpler and no one thought too hard about the implications. Discarding the oldest is more work given the way kernel FIFOs operate, but it is well worth the effort. |
A contributor passed this to me:
Have you tried rppal on a Pi5? |
Thanks for the insight. That explains a lot, and adds even more incentive to upgrade to v2.
I would love to use the GPIO API for everything, but it has some limitations (in both v1 and v2) related to alternate function modes that makes it incompatible with part of RPPAL's feature set. It's something I'll be re-evaluating after the transition to v2, and limit direct register access only to functionality the API doesn't provide. |
RPPAL is compatible with the Pi 5. We moved away from using sysfs a long time ago, and the Pi 5 provides direct GPIO register access on the RP1 through |
Yeah, the GPIO character device is strictly a GPIO API. Call that a limitation if you like. There are other Linux subsystems and APIs for other pin functions. Switching pins between modes is very platform dependent and typically done with device-tree or some other API. |
Quick update on this issue. I've completed the transition to the v2 API for interrupts, and will merge the changes to master when I've had a chance for some hardware testing. The main change for RPPAL users is that all synchronous/asynchronous interrupt trigger functions/callbacks now return an pub struct Event {
/// Best estimate of time of event occurrence.
pub timestamp: SystemTime,
/// Sequence number for this event in the sequence of interrupt trigger events for this pin.
pub seqno: u32,
/// Pin level that triggered the interrupt.
pub level: Level,
} I'll probably replace |
FYI. We have just released "pigg" 0.3.3 and now I will tackle updating to the new API for 0.4! I was holding off on major API changes until I got this out the door. ==== "pigg" (GUI for Raspberry Pi GPIO interaction using Iced) v0.3 is out and ready for use Release notes: https://github.com/andrewdavidmackenzie ... /tag/0.3.3 The main new feature is the addition of Remote GPIO interaction between "piggui" and a remote "piglet" over iroh-net. Video with Dialog to connect: https://youtu.be/aToJ1aT7NeM Video using CLI argument to connect: https://youtu.be/zcEa_Oke014 |
I see last release was on May 18th, before these changes right? I'm ready to migrate to try this new code for our app. Are you planning to do a release with it? Should I give it a try using HEAD of master in the meantime, or still subject to change? Thanks |
Correct. The last release didn't have any of these changes yet. I've been a little swamped with other projects lately, which is why I haven't had a chance to get this completed. I'll try to make some time available this week to get the new API tested and merged in, after which I'll plan a release that includes them. |
@andrewdavidmackenzie the new changes are available on the master branch. I've also updated the dev documentation. Both synchronous and asynchronous interrupts now use the v2 API internally. Interrupt handlers now get an The event buffer size has been set to 1024 by default, but some events are still skipped when the interrupt is triggered a lot. I didn't notice any gaps in the sequence numbers, so it seems they're not caught at all by the underlying drivers. However, I've added support for an optional debounce period, which eliminates the issue where it makes sense to use one, like with button presses. I'll be cleaning up the code and documentation over the coming days, and will schedule a release for next week barring any serious issue. |
OK, I'll check it out.
I was passing on the timestamp to a remote node as a UTC time of when the
event happened, so will see what changes that brings in with it.
Andrew
…On Thu, Aug 8, 2024 at 12:46 PM Rene van der Meer ***@***.***> wrote:
@andrewdavidmackenzie <https://github.com/andrewdavidmackenzie> the new
changes are available on the master branch. I've also updated the dev
documentation <https://docs.golemparts.com/rppal-dev>.
Both synchronous and asynchronous interrupts now use the v2 API
internally. Interrupt handlers now get an Event struct instead of Level
as mentioned above, but timestamp has been changed to a Duration as it
specifies time elapsed since the system was booted, and level has been
replaced by trigger.
The event buffer size has been set to 1024 by default, however some events
are still skipped when the interrupt is triggered a lot. I didn't notice
any gaps in the sequence numbers, so it seems they're not caught at all by
the underlying drivers. However, I've added support for an optional
debounce period, which eliminates the issue where it makes sense to use
one, like with button presses.
I'll be cleaning up the code and documentation over the coming days, and
will schedule a release for next week barring any serious issue.
—
Reply to this email directly, view it on GitHub
<#149 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABKF4LF4CPJ4VBQRQP4NRO3ZQND7TAVCNFSM6AAAAABJTMIRXGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENZVGUYTIMRYGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Yeah, I originally intended to use |
You get that from the driver you use, as opposed to using now() when you
receive it?
…On Thu, Aug 8, 2024, 12:53 PM Rene van der Meer ***@***.***> wrote:
I was passing on the timestamp to a remote node as a UTC time of when the
event happened, so will see what changes that brings in with it.
Yeah, I originally intended to use SystemTime before I realized what the
original value was based on. I considered converting it in RPPAL, but
decided to leave that up to the user to avoid any unnecessary/double
conversions if they're just using it to check elapsed time between events.
—
Reply to this email directly, view it on GitHub
<#149 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABKF4LEAVBWXF6EPRAGN5D3ZQNE3BAVCNFSM6AAAAABJTMIRXGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENZVGUZDMOBWGE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Yep. Calculating the time based on when it's handled by RPPAL wouldn't be as accurate. You could use something like the sysinfo crate to get the Pi's boot time and calculate the timestamp in UTC. |
Is it safe to assume that only either Trigger::RisingEdge or Trigger::FallingEdge will ever be used in Event, as otherwise the level would be ambiguous (if Both were used)? I'd prefer Trigger and Level being two separate types, but can work with it if that's a valid assumption. Maybe add that to the doc comment for set_async_interrupt()? |
Correct, only one of those two values will ever be returned. I'll add a note to the documentation. |
OK, built using the new version from git. Later I'll look into the debounce you added, not this first iteration. Thanks! |
There is the option in the v2 uAPI to select the source clock for edge events. It defaults to the MONOTONIC clock so differences between event times are always consistent. But you can select the REALTIME clock if that is what you would prefer. Just be aware that any corrections to system time will invalidate any calculations based on comparing timestamps. The event timestamp is captured by the kernel interrupt handler and so provides the closest possible time to the physical edge. |
Thanks for pointing that out. I forgot there was an option to set which clock to use. That said, I'll keep it on monotonic precisely for the reason you mentioned, allowing comparisons without having to worry about changes to the system time. Conversion can be done by the user if needed. |
Closing this as the original issue has been addressed as much as we can in the upcoming 0.19.0 release, scheduled for next week. Feel free to open a new issue for any new interrupt-related issues. It's still possible to get the same event multiple times in a row, but based on the sequence numbers we're now adding to each event, these aren't caused by a buffer overflow as no events in the sequence are missing, which means these events are occurring faster than the kernel can handle them. To mitigate this, support for the v2 debounce feature has been added, filtering any input noise from buttons and similar components, which in my tests eliminated the double/missing events. Thanks all for getting this figured out. |
Works perfectly for me and (subjectively) better than before I think. I'll wait for you to do your 0.19.0 release before I merge my PR using it though, I prefer to not depend on the git version. Thanks! |
The timestamp in the let mut time = libc::timespec {
tv_sec: 0,
tv_nsec: 0,
};
let ret = unsafe { libc::clock_gettime(libc::CLOCK_MONOTONIC, &mut time) };
let start = std::time::Duration::new(time.tv_sec as u64, time.tv_nsec as u32); |
I'll look into using that. Thoughts on providing a mechanism in rrpal, to get a clock/Duration that corresponds to that uses in Events sent? (I had thought on asking a timestamp alongside the value provided by read(), but that would be a breaking change and doubt a read_with_timestamp() would be a great addition....?) |
I'll have to give it some thought. Providing a monotonic clock version of |
I'm told "note that CLOCK_MONOTONIC does not count suspended time, while CLOCK_BOOTTIME does" Probably not very important though |
You'll want to use the same clock as is used to calculate the timestamps. From a quick search it looks like it calls |
FYI - just to let you know I finally released 0.4.0 that updated to v2 API and uses hardware timestamps. Thanks again We have just published release 0.4.0 Install it (on your macos/linux/windows "host" and/or your Raspberry Pi) with:
This release has been a long time coming due to Summer vacations, being busy in general, struggling to get GH Action for our release Notable features and changes in this release include:
There are quite a few issues in that 0.5.0 Milestone that are related to small UI improvements, code improvements or the Pico port and allow me to continue to learn new aspects of rust. See the Release 0.4.0 Discussion topic on |
I was debugging some weirdness in my app ("piggui") and added some logging (println!()) to when it detects edges on inputs.
I'm using it with a pretty crappy button, that I imagine it has quite a bit of HW bounce in it.
So I imagined a flurry of high/low/high/low level changes in short succession.
When testing however:
etc.
So, while I expected "bounce" I didn't expect the interrupt to report multiple level changes in short succession with the same level being reported twice in a row.
Thanks for clarifying what best to do.
The text was updated successfully, but these errors were encountered: