Added a TXBnCTRL bit checker before sending the next CAN frame #54

pepeRossRobotics · 2022-01-27T11:45:00Z

This PR addresses the use of the library while transmitting UAVCAN frames at high speed (single and multi frame messages).

The issue was found when tryint to transmit IMU data at +50Hz and it was observed that many frames where propped on the transport layer. Investigation with candump showed that the error was in the message itself. by adding a small delay between frame transmission, they number of error layers dropeed significantly.

I did this the simplest way I could think of, so if there are any mods that are needed for more compatibility, or better ways to acheive this, please let me know and I am happy to make the changes.

Before the fix 1152 Transport Layer errors in less than a minute of up time

After applying the fix

pepeRossRobotics · 2022-01-27T11:51:02Z

For context, this library is being used with an RP2040 using the Arduino RP2040 library https://github.com/earlephilhower/arduino-pico.

aentinger · 2022-01-27T12:22:14Z

Hi 👋

Thank you for the PR, however the change does not make sense to me. With this changing you are waiting every time until all CAN transmit buffers are empty. This code checks if a transmit buffer is empty and only then proceeds to load the corresponding buffer. Possibly there are problems with frequency stability for your MCP2515 board (i.e. bad crystal), if the CAN timings are off things can go bad real fast. The other thing is, that maybe the status register contains not up-to-date data so maybe its better to check the individual TxBnCTRL instead of status in afore-linked code?

pepeRossRobotics · 2022-01-27T12:57:35Z

Hello, thanks for coming back to this PR so fast :)

I am new to this stack and maybe that is why I took this approach.

I found that just adding a wait between consecutive CAN frames reduced the errors, and the longer the wait the more errors were reduced. Because the frames were being transmitted with errors (usinf candump to visualise this), it made me think that the actual frame data was being corrupted somehow. And even on low frequency messages, that were multi frame, this was happening. so decided to slow down the actual frame transmission.

As everything is called from a LockGuard, I just made a small time wasting routine to test this assumption.
The way that I added the prototype wait was:

.
.
.
  /* Write to transmit buffer */
  _io.loadTxBuffer(txb, tx_buffer.buf);

  /* Request transmission */
  _io.requestTx(txb)
  for(uint32_t i = 0;i<25000;i++)
  {
    __asm__("nop\n\t");
  }

Of course having a random wait in the code was far from ideal, I check what I could check for signs that the CAN frame was transmitted and the next can frame was ready to be loaded.

My logic was that based on the flowchart on page 17 of the MCP2515 datasheet (copied below) that checking which was the state of the TXREQ bit when errors happened. From checking this bit I found out that the errors where mostly happening when one or more TXREQ bits (TXBnCTRL[3]) were set. So decided not to go to the next frame unil the bit is not set and transport layer errors plummeted. As you can see from the image above, there are still errors, but my calculation is that there is only error on 0.03% of the transmission, which is good enough for my application. The fact that there are still transport layer errors might mean that the root of he issue has not been addressed or that there is a compounding issue.

Would love to hear your thoughts.

Best,
Pepe

pepeRossRobotics · 2022-01-27T13:01:21Z

Just as extra information, the erros on the transport layer were present both with a COTS MCP2515 board, and a custom one. which might reduce the chances for it to be due to hardware issues, but would not rule them out.

aentinger · 2022-01-31T09:12:02Z

Superseded by #55.

joekeo added 2 commits January 27, 2022 11:35

Added a TXBnCTRL bit checker before sending the next CAN frame

cf88833

Fixed spell check typo

4f05864

aentinger mentioned this pull request Jan 30, 2022

Fix: Determine if CAN TX is ongoing by reading from the TXBnCTRL register. #55

Merged

aentinger closed this Jan 31, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added a TXBnCTRL bit checker before sending the next CAN frame #54

Added a TXBnCTRL bit checker before sending the next CAN frame #54

pepeRossRobotics commented Jan 27, 2022

pepeRossRobotics commented Jan 27, 2022

aentinger commented Jan 27, 2022

pepeRossRobotics commented Jan 27, 2022

pepeRossRobotics commented Jan 27, 2022

aentinger commented Jan 31, 2022

Added a TXBnCTRL bit checker before sending the next CAN frame #54

Added a TXBnCTRL bit checker before sending the next CAN frame #54

Conversation

pepeRossRobotics commented Jan 27, 2022

pepeRossRobotics commented Jan 27, 2022

aentinger commented Jan 27, 2022

pepeRossRobotics commented Jan 27, 2022

pepeRossRobotics commented Jan 27, 2022

aentinger commented Jan 31, 2022