-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
STM32 CAN attach RxIrq unusable. #9495
Comments
#5267 is likely a related issue. |
@ARMmbed/mbed-os-hal @ARMmbed/team-st-mcd |
@ARMmbed/team-st-mcd Could you please review the commit Flowm@c8a37ff? |
Unfortunately this commit is just a workaround, not a proper fix. It simply disables the mutex in Afterwards CAN communication works properly with |
You could subclass |
Hi |
@jeromecoutant is mutex really needed in CAN driver? |
@kjbracey-arm |
I was recently discussing CAN and interrupts here: #6714 I suggest reading through that, as it covers pretty much everything about interrupts and locks. I was not aware of this issue at the time, and I'm not quite sure how to read it. Is it the case that the example code over in that issue won't work on STM? (Signal event in IRQ handler, do read from thread woken by that event?) Locally overriding The HAL CAN API does seem to be woefully underspecified, so this is debatable. I'm not familiar with CAN, or typical hardware implementations, but it seems to be close analog to the UART. There we let the HAL be "thin" for typical UARTs - we don't require there to be buffering in the HAL, but because the HW buffers are small we in practice do need data to be read from interrupt. This is then covered by the separate classes We don't have such a split for |
that's exactly where the problem is - with current implementation. It would be nice if mbed would use buffering supported by HW (3 messages on STM platform) or maybe SW queue and split CAN as Serial - |
Yes, it needs to do one or the other. If hardware normally has sufficient buffers, and you're saying STM does, then apps relying on buffering to do reads from threads should be fine. The STM HAL should be fixed so that the example code here works, if it doesn't. |
It is hard to believe, that this issue still exists and nobody offered a fix or some working solution except disablling and reenabling CAN IRQ flag. |
Wow, just ran into this issue myself on a NUCLEO_F413ZH I am surprised the history of this is so long, detailed, and unsolved! Mbed certainly needs a refactor of its CAN HAL API... It's workable for now but there are many improvements that could be made.
From what I can tell Mbed does use the 3 message FIFO on STM -- the RX FIFOs are managed entirely by hardware. The STM My understanding of the intent behind the @kjbracey-arm @maclobdell I think it's about time the CAN API in Mbed gets some love. Maybe it could be worked onto the back end of the mysterious Mbed schedule? I would be willing to participate in defining an updated HAL spec. |
I have also noticed more unfinished or badly implemented CAN features. For example it is impossible to attach interrupt on Error IRQs. (EwIrq, EpIrq, BeIrq), because of always disabled Global ERRIE flag. Moreover LEC interrupt is not even defined. And IRQ on TX Overrun is also not possible to catch really. STM CAN really needs to be reimplemented from scratch. |
I'm happy to test new CAN things |
@TobinHall thank you for raising this issue.Please take a look at the following comments: Could you add some more detail to the description? A good description should be at least 25 words. NOTE: If there are fields which are not applicable then please just add 'n/a' or 'None'.This indicates to us that at least all the fields have been considered. |
We've updated our automation, I will fix the requirements . |
Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers. |
I believe there won't be any fix for this problem (as it's linked with STM32 hardware and Mbed "one for all" library) except a buffered/unbuffered solution. As Mutex prevent using read method in ISR context, but the only way that STM32 FDCAN have chosen to lower the IRQ flag is a reading. So, using attach method brings you to a deadlock. Thus, this deadlock cannot be solved as it’s may be impossible to remove the mutex from Mbed lib because some others CAN controllers may have the use of it, and STM32 won’t change their hardware (and I don’t wish to have a flag that can be cleared while message is still unread). The best solution at that time (that will remove any portability of Mbed code) is to remove the useless mutex for STM32 (because mutex is just useless with FDCAN, as, you just write or read in memory, it’s thread safe). I know it’s not “clean” but it’s the only solution that ensure a valid CAN use of IRQ for STM32 at least before Mbed create a buffered class. |
@donatieng please comment |
@donatieng to add to @jeromecoutant I have tested CAN using RxIRQ after removing mutex from CAN::read(). It works fine. |
I can confirm it works fine, i created a lib which remove mutex https://github.com/pilotak/NoMutexCAN |
Hi folks, Reading back on this thread - the only "proper" solutions I can see are:
Removing the mutex will of course work fine until you start using the instance concurrently. Not sure how much of a use case this is for CAN though, but definitely the current Mbed OS pattern for drivers. |
Reading back myself, I'm still not 100% clear. I'm potentially reading it differently to Donatien - we don't have to do an IRQ read for latency (which is UART's problem), but rather the STM HAL continuously asserts a level-based IRQ until a read? So we deadlock unless the app reads from the IRQ? I would say that that behaviour is contrary to any other attach method implementation I can think of in the Mbed HAL. It's generally assumed that HAL interrupts act as "edge-based", simulated in HAL software if necessary. IRQs should be free to defer reads to threads. Interrupts should be delatched in the HAL before making the attach upcall. (Not after, as that's racy if the attachee is reading directly). Can that not be done in the STM CAN implementation? Alternatively, mask the interrupt until the read happens? |
@kjbracey-arm The reference manual states that Hence delatching the interrupt is not possible until a read happens, not possible in HAL. |
You may not be able to suppress the interrupt source, but can it not be masked from the interrupt output, either in the FDCAN itself, or the NVIC? Mask it when asserted, and unmask after a read? I'm looking at the STM32G4 reference manual, which doesn't seem to tally with your text above, but the FDCAN_IE register? |
Masking might be the best bet. Looking at the STM32F072 reference manual (RM0091) (which has the bxCAN controller), the Of course, the Mbed CAN API's documentation (and intent, I think) about the RX filters doesn't agree with the STM32 HAL implementation, in which the filters don't work, and one FIFO is unavailable. |
Sorry if I'm a bit naive, I've been following this thread for some time now though. Why is it not possible to cache the received CAN messages in a FIFO and clear the flag in the interrupt? |
Because : There is a mutex in the read The read is the only way to clear the flag The ISR context doesn't allow Mutex. Remove the mutex, and problem is solved. But you are not using the mbed lib anymore. |
But what AGlassOfMilk suggests would be possible inside the HAL implementation - it could unload messages from the hardware into a FIFO for later delivery via the read call. Although what do you do when your FIFO's full? Maybe just throw stuff away, but not sure if that makes sense in the protocol. I'm unfamiliar with CAN. (I've seen things doing that for TCP - which is Very Bad. Turning a reliable protocol with end-to-end flow control forcing the sender to wait until you read, into something that just chucks data away if it's not read fast enough.) |
CAN is typically used in a periodic fashion (eg: broadcast temperature reading every 1s). Most protocols treat CAN as a lossy network and have retry/ACK logic if delivery must be guaranteed. There's also filters so you can route critical messages to a separate FIFO. Missing a packet typically means you will just get the same information next time it is sent. At some point there has to be a constraint on memory -- no machine has infinite RAM (though desktop applications sometimes seem like they assume so). The network speeds are also relatively slow (1Mbps max in classic CAN) so your application should have plenty of time to empty the FIFO and process messages. I think the FIFO approach would be appropriate in this case. |
I actually think that it will be good for me/user of mbed to decide if you want to catch all the messages ie.: generate IRQ that has to be cleared (and cached by circular buffer for example) or if you don't mind some looseness. On the other side @AGlass0fMilk is right - if it's a critical message there is alwas some kind of ACK mechanism so i can agree with him that FIFO approach should the way to go. |
Interrupt handling in STM HAL is not functional. Interrupt enabling and IRQ handling are done in STM CAN API itself. I have tried masking interrupts for legacy/bxCAN in STM CAN API IRQhandler, unmasking in can_read and deferring reads to thread context. This works fine. I have tried this on NUCLEO_F207ZG and NUCLEO-F767ZI platforms. It works without any problems. FDCAN does not have issues deferring reads to thread context, as masking and unmasking of interrupt is already been taken care in CAN API. I have verified this on NUCLEO-G474RE platform. |
Raised a PR for the implementation where can_read can be deferred to thread context as mentioned in the comment just above |
PR is now merged. |
Description of defect
In the CAN implementation for STM32 microcontrollers the CAN::read() function is responsible for clearing the flag that generates the Rx interrupt. This is because the flag is reset by a register that also controls the FIFO buffer. If the data is not pulled from the buffer before the flag is reset then the data will not be available.
The issue is that CAN::read() has uses a mutex which can't be used in an interrupt context.
Because you need to read the data before resetting the flag which is generating the interrupt and you cant read the data in an interrupt because of the mutex in the read function, it is impossible to use CAN with an interrupt.
As far as I can tell, this will affect all STM32 microcontrollers. It may also effect other platforms depending on their implementation.
Either the mutex needs to be removed or the data must be read and cached by a lower level ISR.
Target(s) affected by this defect ?
STM32
Toolchain(s) (name and version) displaying this defect ?
n/a
What version of Mbed-os are you using (tag or sha) ?
n/a
What version(s) of tools are you using. List all that apply (E.g. mbed-cli)
n/a
How is this defect reproduced ?
n/a
The text was updated successfully, but these errors were encountered: