GPIO driver should work around GPIO peripheral interrupt bugs (IDFGH-5910) #7602

Closed
Elijahg opened this issue Sep 25, 2021 · 6 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally

Comments

@Elijahg
Contributor

Elijahg commented Sep 25, 2021

Environment

  • Development Kit: [ESP32-Wrover-Kit, but same issue occurs with any hardware]
  • Kit version (for WroverKit/PicoKit/DevKitC): [v4]
  • Module or chip used: [ESP32-WROOM-32|ESP32-WROOM-32D|ESP32-WROVER]
  • IDF version (run git describe --tags to find it): v4.3.1
  • Build System: [|idf.py]
  • Compiler version (run xtensa-esp32-elf-gcc --version to find it):
    xtensa-esp32-elf-gcc (crosstool-NG esp-2021r1) 8.4.0
  • Operating System: [|macOS]
  • Using an IDE?: [Yes, VSCode]
  • Power Supply: [USB|external 3.3v]

Problem Description

I apologise that this is quite a rambling issue, but I feel there is a need to consolidate the related GPIO peripheral bugs such that Espressif can see that this is quite a big issue that really needs resolving.

There have been many reports about pretty major bugs with the GPIO interrupt peripheral, I've not trawled exhaustively but this is at least the 8th time similar has been reported with no input from Espressif. Some are reported in the arduino-esp32 repo but the issue is IDF not the Arduino core: #955, #1111, #1229, #1250, #2941, #4172, #6089. Espressif acknowledges there are issues with the interrupt peripheral as per the errata, but there is nothing about this bug in the general documentation nor in the examples, and Espressif seems to avoid replying to queries and bug reports around the GPIO interrupt peripheral - all of the above issues are closed without Espressif's input.

The errata briefly explains a workaround, though that doesn't entirely resolve all the issues, nor does it explain exactly what the hardware issues can be, only that the GPIO peripheral "may not trigger [edge]interrupts correctly". This needs to be much more specific. Does it miss interrupts entirely? Does it cause race conditions? Does it cause multiple firings per pin transition? Does it only work in one direction? Only certain pins? If you cannot be more specific, i.e. it's broken in multiple ways with no software workarounds and its use should be avoided entirely, you should bite the bullet and state that. Without this, people could potentially be wasting many hours trying to fix a bug in their code, which turns out to be a hardware problem. I can attest to this.

In the short term, documentation should be updated to bring the errata to the user's attention. Examples should be provided on how to avoid the peripheral's bugs, especially with regard to using level interrupts to detect an edge.

Expected Behavior

Reliable and predictable interrupt behaviour that avoids the observed issues above, which will need software workarounds for the hardware bugs in the GPIO peripheral. Since the workarounds presented in the errata don't seem to be reliable, this makes it non-trivial for users to avoid the issues. Ultimately, Espressif should as best they can provide workarounds in the driver software.

Actual Behaviour

The observed behaviour seems to be:

  • ISRs firing multiple times per pin transition when the transition time is greater than 2 µs, thanks for finding that out @BjarneBitscrambler - too much work went into that to be apparently ignored
  • ISRs firing twice per fast pin transition (apparently down to wrong intr_alloc_flags/esp_intr_flags)
  • ISRs firing a few times then breaking until reset
  • ISRs firing on the wrong edge

Some people seem to have success with changing the very poorly documented esp_intr_flags, some people by using gpio_intr_enable() rather than gpio_install_isr_service(). There seems to be little documentation around why someone would choose to use either gpio_install_isr_service() or gpio_intr_enable(). I'm sure a lot of the issues are due to poor documentation around shared interrupts and the flags that should be used with gpio_install_isr_service(), but some is also hardware.

@espressif-bot espressif-bot added the Status: Opened Issue is new label Sep 25, 2021
@github-actions github-actions bot changed the title GPIO driver should work around GPIO peripheral interrupt bugs GPIO driver should work around GPIO peripheral interrupt bugs (IDFGH-5910) Sep 25, 2021
@cdluv

cdluv commented Sep 25, 2021 via email

@negativekelvin
Contributor

Only one of the linked issues is about gpio interrupts

@Elijahg
Contributor Author

Elijahg commented Sep 25, 2021

Only one of the linked issues is about gpio interrupts

Oops, used the # ref but they were in the arduino-esp32 repo. Thanks for pointing that out!

@songruo
Collaborator

songruo commented Sep 30, 2021

Hi @Elijahg,

Thanks for summarizing all the GPIO interrupt related issues! After going through all the issues you referred to, I believe they can be attributed to the following two reasons (excluding the issues that are caused by incorrect API usage).

Spurious Interrupts Due to Slow Rise/Fall Time

When an interrupt signal has slow rising/falling edges, you may observe:

  • multiple interrupts being triggered per edge transition
  • interrupts being triggered on the wrong edge

Basically, whenever the voltage level is between VIL and VIH, the logical state of that GPIO is undefined (meaning either a 0 or a 1 could be sampled by the HW). The GPIO peripheral samples the logical state of a GPIO at 80 MHz. Thus a fast rising/falling edge means that only a minimal number of samples are taken while the GPIO is at an undefined voltage level.

However, this becomes an issue when the rise/fall time of a GPIO is slow, as it's possible that the GPIO peripheral takes multiple samples in the undefined voltage range.
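To put a rough number on this, the following back-of-the-envelope sketch counts how many 80 MHz samples fit into the time a signal spends between VIL and VIH. The function name is hypothetical, and the 2 µs figure is borrowed from the original report purely for illustration, assuming that whole time is spent in the undefined range.

```c
#include <stdint.h>

/* GPIO input sampling rate as stated above: 80 MHz, i.e. 80 samples per 1000 ns. */
#define GPIO_SAMPLE_MHZ 80ULL

/* Hypothetical helper: number of whole samples taken while the signal is
 * between VIL and VIH, given that time in nanoseconds. */
static uint64_t samples_in_undefined_range(uint64_t undefined_ns)
{
    return undefined_ns * GPIO_SAMPLE_MHZ / 1000ULL;
}

/* Example (illustrative input, not a measurement):
 *   samples_in_undefined_range(2000) gives 160, so a 2 us crossing leaves
 *   160 chances to sample a value that disagrees with the previous one. */
```

Each of those samples may read as either 0 or 1, which is why a single slow edge can look like several edges in both directions.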

Therefore, when using GPIO interrupts, we recommend users consider the following design guidelines to handle spurious interrupts:

  • Where possible, users should ensure that the edge transition time between VIL and VIH is as short as possible.
  • If short edge transitions are not feasible from the signal source, consider using a Schmitt trigger as a HW workaround.
  • A possible software workaround is to disable/ignore any further interrupts within a preset period of time upon receiving an initial interrupt, thus filtering out all subsequent spurious interrupts on a slow rising/falling edge. This is essentially the same logic as handling SW debounce.
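The software workaround in the last point could be sketched roughly as below. This is only a minimal illustration of the hold-off idea, not driver code: the type and function names are hypothetical, and the timestamp is passed in by the caller (on an ESP32 it could, for example, come from esp_timer_get_time(), which returns microseconds since boot).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-pin state for a hold-off ("debounce") filter. */
typedef struct {
    int64_t holdoff_us;     /* window in which further interrupts are ignored */
    int64_t last_accept_us; /* timestamp of the last accepted interrupt */
    bool    seen_first;     /* false until one interrupt has been accepted */
} edge_filter_t;

static void edge_filter_init(edge_filter_t *f, int64_t holdoff_us)
{
    f->holdoff_us = holdoff_us;
    f->last_accept_us = 0;
    f->seen_first = false;
}

/* Call once per interrupt with the current time in microseconds.
 * Returns true if the interrupt should be handled, false if it falls
 * inside the hold-off window of the previously accepted one. */
static bool edge_filter_accept(edge_filter_t *f, int64_t now_us)
{
    if (f->seen_first && (now_us - f->last_accept_us) < f->holdoff_us) {
        return false; /* treated as spurious: too close to the last accepted edge */
    }
    f->last_accept_us = now_us;
    f->seen_first = true;
    return true;
}
```

The hold-off period has to be longer than the slowest expected edge but shorter than the gap between genuine events, so it needs to be chosen per application.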

Interrupt Lost Due to HW Errata

The second problem is the potential for lost interrupts due to an ESP32 HW erratum (please refer to ESP32 ECO and Workarounds for Bugs, Section 3.14). This erratum only occurs when a GPIO interrupt is sampled in the exact same clock cycle as an interrupt status clear operation.

Workarounds are provided in the ECO documentation. The main idea is to change edge-triggered GPIO interrupts to level-triggered ones; however, we understand that this might not be suitable for all projects (it depends on the application). Maybe you can refer to this post to get some idea of how to make your own workaround if you encounter this issue.
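One way to picture the level-triggered idea is a small state machine that arms the interrupt for the level opposite to the current one and flips it each time it fires. The sketch below is an assumption about how such a workaround might be structured, not the exact procedure from the ECO document; all type and function names are hypothetical. On real hardware the re-arm step would presumably map to gpio_set_intr_type() with GPIO_INTR_HIGH_LEVEL or GPIO_INTR_LOW_LEVEL.

```c
#include <stdbool.h>

typedef enum { LEVEL_LOW = 0, LEVEL_HIGH = 1 } level_t;
typedef enum { EDGE_RISING, EDGE_FALLING } edge_t;

/* Hypothetical per-pin state: the level the interrupt is currently armed for. */
typedef struct {
    level_t armed_level;
} level_edge_t;

/* Arm for the opposite of the pin's current level, so the first level
 * interrupt corresponds to a real transition. */
static void level_edge_init(level_edge_t *s, level_t current_pin_level)
{
    s->armed_level = (current_pin_level == LEVEL_HIGH) ? LEVEL_LOW : LEVEL_HIGH;
}

/* Call when the level interrupt fires. Reports which edge the firing
 * corresponds to and re-arms for the opposite level, so the interrupt
 * does not fire continuously while the pin stays at the same level. */
static edge_t level_edge_on_interrupt(level_edge_t *s)
{
    edge_t edge = (s->armed_level == LEVEL_HIGH) ? EDGE_RISING : EDGE_FALLING;
    s->armed_level = (s->armed_level == LEVEL_HIGH) ? LEVEL_LOW : LEVEL_HIGH;
    /* Assumed hardware step: reconfigure the pin's interrupt type to
     * s->armed_level here, before returning from the ISR. */
    return edge;
}
```

Because a level interrupt is re-evaluated for as long as the level persists, it is not tied to a single sampling instant the way an edge is. The trade-off is that a pulse shorter than the time taken to re-arm can still go unnoticed.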

This hardware errata has been fixed on all later chips (ESP32S2, C3...).

I hope I have explained everything clearly. Given that this issue has been encountered by a significant number of users, I'll update the GPIO programming guide to describe these issues and their workarounds.

@espressif-bot espressif-bot added Resolution: Done Issue is done internally Status: Done Issue is done internally and removed Status: Opened Issue is new labels Oct 13, 2021
@Alvin1Zhang
Collaborator

Thanks for reporting, feel free to reopen.

@KaeLL
Contributor

KaeLL commented Oct 21, 2021

@songruo you should update the errata more so than the docs (if by programming guide you mean the docs).
