RF24 library does not use interrupt and starts polling when waiting for data to be sent #877

stefan123t · 2022-11-04T08:23:31Z

Thanks for your great library and endless effort you put into maintaining this!

~~Please read about common issues first. It addresses the most common problems that people have (whether they know it or not).~~
We have a very short and concise ISR callback which sets a flag only.

Describe the bug

The library is used to send datagrams to and receive them from our solar PV inverters (Hoymiles, TSUN and MBOG brands) via NRF24L01+ at the 250 kbps data rate. We use the IRQ output of the NRF module to trigger our receive code. At night there is little chance that the solar inverter can answer our requests, so only sending of packets occurs. This is when we reliably see a lot of SPI traffic which, according to our analysis, looks like constant polling of the status register (0x07) to check whether the send buffer is clear for the next datagram.

Please include:

Code to reproduce
For code, see our issue "NRF24 polling trotz aktiviertem IRQ" ("NRF24 polling despite enabled IRQ") lumapu/ahoy#83 (sorry, mostly German).
The code is under the tools/esp8266 path in a PlatformIO project: https://github.com/lumapu/ahoy/tree/main/tools/esp8266
If you need to see any specific code, we can answer you in our issue there or link the relevant sections from here.
Expected behaviour
We would assume that the IRQ is used for both sending and receiving.
Apparently the Interrupt is only triggered when new messages are received from the NRF24 module.
But not when we wait for the send buffer to be emptied / processed.
Here constant polling by querying the status register command 0x07 is used.
See the following screenshot by our project lead @lumapu, who traced this behaviour using his oscilloscope:
What device(s) are you using? Please specify make, model, and Operating System if applicable.
We use nRF24L01+ modules at the low 250 kbps data rate, which gives better range for reaching the manufacturer's inverters. The higher data rates of 1 Mbps and 2 Mbps are not supported by the inverter firmware. We recommend that our users use LNA+PA modules with external antennas, which usually work fine in PA_MIN / PA_LOW mode, whereas modules with circuit-board antennas may require PA_MAX / PA_HIGH to send / receive at the same distance. We also recommend stabilizing the voltage during sending across the NRF24 module's VCC / GND pins (1 & 2) with an electrolytic capacitor of ~47..100 uF.
On the MCU side we use ESP8266 modules (NodeMCU v3 and Wemos D1 mini / Pro) as well as ESP32 modules.

Additional context

The problem occurs when there is a lot of sending and the library starts to poll whether the send buffer of the NRF24L01+ module has been emptied. There are different interrupts which could be enabled according to the Nordic Semiconductor data sheets which would allow the Interrupt to be used for both Sending & Receiving as far as we investigated.

TMRh20 · 2022-11-04T10:39:01Z

If you want to use interrupts for sending, you can use the startWrite() function. The normal write() function will poll until data is sent, but startWrite() will just write the packet to the FIFO buffer and return you to your code. You can then use interrupts to determine if the packet was sent successfully or not.
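The pattern described above could be sketched roughly as follows. This is not the real RF24 class: `StubRadio` is a hypothetical stand-in so the control flow can be compiled and run without hardware, and its `startWrite()` / `whatHappened()` argument lists only mirror the RF24 signatures as I recall them. The names `irqFlag`, `onRadioIrq`, `serviceRadio` and `TxResult` are made up for illustration. On real hardware the TX events must also not be masked (see `maskIRQ()`), otherwise the IRQ pin never fires for a finished transmission.

```cpp
#include <cstdint>

// Hypothetical stand-in for the radio, NOT the real RF24 class.
// It only counts SPI transactions so the effect of the pattern is visible.
struct StubRadio {
    bool txPending = false;
    int spiTransactions = 0;

    // Load the payload into the TX FIFO and return immediately (no polling).
    void startWrite(const void* /*buf*/, uint8_t /*len*/, bool /*multicast*/) {
        ++spiTransactions;
        txPending = true;
    }

    // Report which event fired and clear the flags (one SPI transaction).
    void whatHappened(bool& tx_ok, bool& tx_fail, bool& rx_ready) {
        ++spiTransactions;
        tx_ok = txPending;   // assumption: the stub always "succeeds"
        tx_fail = false;
        rx_ready = false;
        txPending = false;
    }
};

// Set by the ISR, consumed by the main loop.
volatile bool irqFlag = false;

// Short ISR attached to the radio's IRQ pin: sets a flag and nothing else.
void onRadioIrq() { irqFlag = true; }

enum class TxResult { None, Sent, Failed };

// Called from the main loop. Touches the SPI bus only after an IRQ fired.
TxResult serviceRadio(StubRadio& radio) {
    if (!irqFlag) return TxResult::None;  // idle: no SPI traffic at all
    irqFlag = false;                      // clear before reading the radio
    bool tx_ok = false, tx_fail = false, rx_ready = false;
    radio.whatHappened(tx_ok, tx_fail, rx_ready);
    if (tx_ok) return TxResult::Sent;
    if (tx_fail) return TxResult::Failed;
    return TxResult::None;
}
```

With this structure the main loop can call `serviceRadio()` as often as it likes while waiting; the status is only read once per IRQ event instead of being polled.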

2bndy5 · 2022-11-04T10:43:07Z

This will sound like an info dump, but I really don't know the exact cause of the problem here. I figure if I just put everything on the table, something might lead to a solution 🤷🏼‍♂️ .

Constant "polling" of the status register (0x07) indicates (to me) that whatHappened() is getting called constantly or the app is stuck constantly transmitting. There are few places where we actually write to the 0x07 offset. Writing data to that register would actually look like 0x27 over MOSI. In fact, we usually get the STATUS byte from the 0x07 offset using the radio's non-op command (the 0xFF on MOSI) because we get the STATUS byte quicker that way (full duplex SPI transactions). The only time we need to write to the 0x07 offset is to reset the IRQ flags, which is done during most write methods and in whatHappened(). Since your app disabled auto-ack, it's hard to tell if it is stuck transmitting or constantly calling whatHappened(). With auto-ack enabled, write() would spam the radio with non-op commands until the auto-ack was returned from the receiver or the max auto-retries count was reached.
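When reading a logic-analyzer or oscilloscope trace, it can help to decode the first MOSI byte of each transaction. The sketch below follows the nRF24L01+ SPI command set (R_REGISTER is 000A AAAA, W_REGISTER is 001A AAAA, NOP is 0xFF); the helper name `describeCommand` is made up for illustration and is not part of the RF24 library.

```cpp
#include <cstdint>
#include <string>

// Command words from the nRF24L01+ SPI command set.
constexpr uint8_t R_REGISTER    = 0x00; // 000A AAAA: read register A
constexpr uint8_t W_REGISTER    = 0x20; // 001A AAAA: write register A
constexpr uint8_t REGISTER_MASK = 0x1F; // low 5 bits carry the register address
constexpr uint8_t NOP           = 0xFF; // no operation; STATUS is still clocked out on MISO
constexpr uint8_t NRF_STATUS    = 0x07; // address of the STATUS register

// Hypothetical helper: describe the first MOSI byte of an SPI transaction.
std::string describeCommand(uint8_t cmd) {
    if (cmd == NOP) return "NOP (STATUS returned on MISO)";
    const uint8_t reg = cmd & REGISTER_MASK;
    const uint8_t op  = cmd & 0xE0; // top 3 bits select the command
    const std::string target =
        (reg == NRF_STATUS) ? "STATUS" : "register " + std::to_string(reg);
    if (op == R_REGISTER) return "read " + target;
    if (op == W_REGISTER) return "write " + target;
    return "other command"; // payload, flush, and similar commands
}
```

So a 0x07 on MOSI is a read of STATUS, a 0x27 is a write to STATUS (for example to clear the IRQ flags), and a 0xFF is the non-op command.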

After calling whatHappened(), the IRQ pin should reset until triggered by another event. If there is another event that triggers the IRQ immediately, then this could lead to constant polling of the 0x07 offset. However, I see your project's hmRadio.h file calls maskIRQ(true, true, false), so it is unlikely that another event is getting triggered immediately. I don't know much about disabling ISRs on the ESP8266, but I would double check the macros your project is using to manipulate the MCU (DISABLE_IRQ and RESTORE_IRQ).
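To make the effect of maskIRQ(true, true, false) concrete, here is a small sketch of how the three arguments map onto the mask bits of the radio's CONFIG register, assuming (as I understand the RF24 documentation) that `true` means "mask", i.e. disable that event on the IRQ pin. The bit positions come from the nRF24L01+ datasheet; the helper names `irqMaskBits` and `irqPinReactsTo` are hypothetical.

```cpp
#include <cstdint>

// Bit positions of the interrupt mask bits in the CONFIG register (0x00).
// A set bit means the event is NOT reflected on the IRQ pin.
constexpr uint8_t MASK_RX_DR  = 6; // data received
constexpr uint8_t MASK_TX_DS  = 5; // data sent
constexpr uint8_t MASK_MAX_RT = 4; // max retries reached (send failed)

// Hypothetical helper using the same argument order as
// maskIRQ(tx_ok, tx_fail, rx_ready): true masks (disables) the event.
constexpr uint8_t irqMaskBits(bool tx_ok, bool tx_fail, bool rx_ready) {
    return static_cast<uint8_t>((tx_ok << MASK_TX_DS) |
                                (tx_fail << MASK_MAX_RT) |
                                (rx_ready << MASK_RX_DR));
}

// True if the IRQ pin still reacts to the event at the given bit position.
constexpr bool irqPinReactsTo(uint8_t maskBits, uint8_t bit) {
    return (maskBits & (1u << bit)) == 0;
}

// maskIRQ(true, true, false): only "data received" reaches the IRQ pin.
static_assert(irqMaskBits(true, true, false) == 0x30, "TX_DS and MAX_RT masked");
static_assert(irqPinReactsTo(0x30, MASK_RX_DR), "RX event still triggers IRQ");
static_assert(!irqPinReactsTo(0x30, MASK_TX_DS), "TX event does not trigger IRQ");
```

Under that assumption, an interrupt-driven send would need the two TX events left unmasked, for example maskIRQ(false, false, false).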

problems I noticed with the code

I see your project is using the *etPayloadSize() functions (for statically sized payloads), but you have dynamic payloads enabled. This is erroneous if the received payload is not exactly 32 bytes. It would be better to use getDynamicPayloadSize() because that will tell you the actual size of the payload you're about to read from the RX FIFO. So, the following snippet seems erroneous to me:

            mNrf24.setPayloadSize(MAX_RF_PAYLOAD_SIZE); // not used for dynamic payloads
            mNrf24.enableDynamicPayloads(); // payload size will be the amount of data passed to write*()

                        len = mNrf24.getPayloadSize(); // Does nothing over SPI; returns the int from setPayloadSize()
                        if(len > MAX_RF_PAYLOAD_SIZE)  // ??
                            len = MAX_RF_PAYLOAD_SIZE; // should never get executed
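A minimal sketch of the length check that would make sense with dynamic payloads, assuming the width passed in comes from getDynamicPayloadSize() and that MAX_RF_PAYLOAD_SIZE is 32 (the nRF24L01+ maximum; the project's actual macro value is an assumption here). The helper name `usablePayloadLength` is hypothetical.

```cpp
#include <cstdint>

// Assumption: the project's macro equals the nRF24L01+ maximum payload size.
constexpr uint8_t MAX_RF_PAYLOAD_SIZE = 32;

// Hypothetical helper: decide how many bytes to read from the RX FIFO,
// given the width reported for the next dynamic payload.
// A width of 0 or above 32 is treated as invalid (nothing to read); the
// datasheet advises flushing the RX FIFO when the width exceeds 32.
constexpr uint8_t usablePayloadLength(uint8_t reportedWidth) {
    return (reportedWidth == 0 || reportedWidth > MAX_RF_PAYLOAD_SIZE)
               ? 0
               : reportedWidth;
}

static_assert(usablePayloadLength(27) == 27, "short payloads keep their real length");
static_assert(usablePayloadLength(33) == 0, "over-long widths are rejected");
```

Unlike the value from getPayloadSize(), the length here varies per packet, so the check is no longer dead code.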

I also found this comment, which makes me think there are wiring/connection problems as well. It is possible that long wires can cause data to get corrupted in transit from radio to MCU (or vice versa).
Furthermore,

            if(!mNrf24.isChipConnected()) {
                DPRINTLN(DBG_WARN, F("WARNING! your NRF24 module can't be reached, check the wiring"));
            }

this warning could be detected earlier using RF24::begin() instead of RF24::isChipConnected(), just FYI.

stefan123t · 2022-11-04T12:26:00Z

Thanks for the responses. We will follow up on the suggested ideas and come back to you.

TMRh20 · 2024-02-18T03:20:17Z

Closing, all related issues appear to be closed. Please update if further info etc needed.

stefan123t · 2024-02-18T13:35:07Z

As far as I followed and understood, the solution now is to use startFastWrite() instead of startWrite(). The reason is that ACKs were off for a considerable time when switching between write and read mode using startWrite(). With startFastWrite(), the transition from write to read mode seems to be much quicker.
So switching between the modes left the PA+LNA switched off, which resulted in lost ACK packets and therefore led to retransmissions.

Maybe @lumapu or @tictrick can comment on the solution which was merged downstream in lumapu/ahoy#1414

@TMRh20 & @2bndy5 thanks for your valuable insights and suggestions!

I do not know whether you want to add a warning about this switching behaviour in normal mode when ACK is activated, add some caveat notes to the documentation, or simply make startFastWrite() the recommended default upstream?

TMRh20 · 2024-02-19T12:37:44Z

Re-opening issue as a reminder to put more info in the docs regarding the difference between write() functions. We have a pinned issue too, so obviously this is an issue to note better.

- Update comment regarding troubleshooting and CE Pin - Add info on the different write() functions #816 #877

* Update COMMON_ISSUES re: write() functions - Update comment regarding troubleshooting and CE Pin - Add info on the different write() functions #816 #877 * Formatting & IRQ info

DanielR92 mentioned this issue Nov 18, 2022

Backlog v0.6.0 lumapu/ahoy#142

Closed

19 tasks

stefan123t mentioned this issue Nov 18, 2022

NRF24 polling trotz aktiviertem IRQ lumapu/ahoy#83

Closed

beegee3 mentioned this issue Jan 26, 2023

HW Watchdog und Exceptions lumapu/ahoy#564

Closed

TMRh20 added the question label Jun 1, 2023

TMRh20 closed this as completed Feb 18, 2024

TMRh20 reopened this Feb 19, 2024

TMRh20 added the documention label Feb 19, 2024

TMRh20 self-assigned this Feb 24, 2024

TMRh20 added a commit that referenced this issue Feb 24, 2024

Update COMMON_ISSUES re write() function

5954cf5

TMRh20 mentioned this issue Feb 24, 2024

Update COMMON_ISSUES re: write() function #947

Merged

TMRh20 closed this as completed in #947 Feb 24, 2024

stefan123t mentioned this issue Oct 30, 2024

[Bug] MI Inverters not working with startFastWrite usage lumapu/ahoy#1434

Open

1 task
