Suggestion for calculating CRC/etc #63
Comments
Interesting idea! I now know of three ways to do the CRC:
It would be interesting to compare the three methods. I've had my doubts about whether 1 or 2 is faster on the RP2040, and 2 takes significantly more memory. 3 is probably hard to beat for both speed and memory consumption. Are you looking at https://github.com/raspberrypi/pico-extras/blob/master/src/rp2_common/pico_sd_card/sd_card.c? Are you doing any specific configuration of the sniffer?
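The list of methods did not survive in this copy of the thread. As an illustration only, and assuming two of them are the usual software approaches (bit-by-bit and table-driven), here is a minimal sketch of both for the CRC16 used on SD data blocks (polynomial 0x1021, initial value 0, MSB first). The function names are made up for this example and are not the library's.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit CRC16: polynomial 0x1021, initial value 0, MSB first.
   Needs no table, but runs 8 shift/XOR steps per input byte. */
uint16_t crc16_bitwise(const uint8_t *data, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Lookup table for the table-driven variant: 256 entries * 2 bytes = 512 bytes. */
static uint16_t crc16_table[256];

/* Fill the table once at startup (it could also be a const array in flash). */
void crc16_table_init(void) {
    for (int n = 0; n < 256; n++) {
        uint16_t crc = (uint16_t)(n << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
        crc16_table[n] = crc;
    }
}

/* Table-driven CRC16: one lookup per input byte, at the cost of the table. */
uint16_t crc16_table_driven(const uint8_t *data, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc = (uint16_t)((crc << 8) ^ crc16_table[(crc >> 8) ^ data[i]]);
    }
    return crc;
}
```

Both functions should return 0x31C3 for the nine ASCII bytes "123456789", which is the standard check value for this CRC variant. Call crc16_table_init() once before using crc16_table_driven(). Which one is faster on the RP2040 would still need to be measured.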
I never noticed that page. I was skulking around the code trying to turn anything and everything I could into DMA chains when I hit that CRC roadblock and found the sniffer in the SDK. It's useful to see them using it, though! At the moment I've just done this:
as a substitute for the spi_transfer function call in spi.c. It hasn't failed me yet (I did a 12-hour test run of writes, with the Pico entering dormancy between recording sessions), so clearly the CRCs it's generating are fine. 0x02 sets the sniffer to CRC16-CCITT (other modes are available too). I only put that bit of code in the RX path as well because I have no clue whether the library needs CRCs there; it might not be correct to do so. (I really just need performance for long write sessions, so RX is inconsequential to me and I haven't tested it at all.)

It'd be cool to know if you have any ideas on "fully DMA'ing" a multiple-block write. At the moment I can see that a block is TX'd, then an RX needs to compare the CRC and get the response. I can't see any way on the Pico to compare the CRCs without the CPU and fire off an interrupt if they're not identical. I had a thought of getting an external logic AND gate and using PIO for automatic CRC comparison (it would fire an IRQ if the AND doesn't come out as logic 1), but honestly I don't know how critical that "response" bit is. And then there's the issue of getting an AND gate :(
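On the question of how critical the "response" is: as far as I understand the SD SPI-mode protocol, on a write the card checks the CRC itself and reports the outcome in a one-byte data response token after each block, so the host only has to decode that byte rather than compare two CRCs. A minimal sketch of the decoding follows; the enum and function names are hypothetical and not taken from the library.

```c
#include <stdint.h>

/* Possible outcomes of a block write, as reported by the card.
   Names are made up for this sketch. */
typedef enum {
    SD_DATA_ACCEPTED,      /* status bits 010: data accepted */
    SD_DATA_CRC_ERROR,     /* status bits 101: rejected, CRC error */
    SD_DATA_WRITE_ERROR,   /* status bits 110: rejected, write error */
    SD_DATA_RESP_INVALID   /* byte does not look like a data response token */
} sd_data_resp_t;

/* Decode the data response token the card sends after each written block.
   Token layout (MSB to LSB): x x x 0 s2 s1 s0 1 */
sd_data_resp_t sd_decode_data_response(uint8_t token) {
    /* Bit 4 must be 0 and bit 0 must be 1; the top three bits are don't-care. */
    if ((token & 0x11) != 0x01) {
        return SD_DATA_RESP_INVALID;
    }
    switch ((token >> 1) & 0x07) {
        case 0x2: return SD_DATA_ACCEPTED;
        case 0x5: return SD_DATA_CRC_ERROR;
        case 0x6: return SD_DATA_WRITE_ERROR;
        default:  return SD_DATA_RESP_INVALID;
    }
}
```

For example, a token of 0xE5 decodes as accepted and 0xEB as a CRC error. A "fully DMA" write path would still need something to look at this byte before sending the next block.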
Actually, CRC is optional for SPI. You should be able to turn it off by putting something like
in

Anyway, I looked into using the DMA Sniffer. The Sniffer is a single, global resource. So, using that would reduce opportunities for concurrency. For example, if I have two SD cards on two SPIs (or an SPI and an SDIO) that are trying to read or write simultaneously, only one can use the Sniffer. So, the other will have to wait (or fall back to software CRC calculation). So, I'd need to have some kind of mutual exclusion lock. That's fine within the library, but what if some other DMA user in the system is using the Sniffer? There does not seem to be any sort of global reservation mechanism in the SDK for the Sniffer.

This might seem irrelevant to what you're doing, but have you considered using multiple SD cards and striping the data across them? I'm not sure what that would do to power consumption.

Anyway, the ZuluSCSI-firmware SDIO code overlaps the CRC calculation with the DMA transfer, and I realized that the same idea could be applied to the SPI. I've been able to get a significant speedup by overlapping CRC calculation with the DMA transfer. The time to write a 512-byte block went from 377 us to 313 us. The DMA transfer of the block data takes about 244 us, but the CRC16 calculation takes only about 66 us, so there is plenty of extra time.

Here's what it looks like when writing multiple blocks:

New code with overlapped CRC calculation:

With this change, for writing there is nothing to be gained by using the Sniffer to calculate the CRC unless the processor cores have something better to do during that 66 us than calculating the CRC for the block while the DMA is transferring it.

For reading, one can't calculate the CRC until the data has been received, but for multi-block transfers the CRC check of a block can be delayed until the DMA completion wait time of the following block, so it is possible to overlap the processing for all but one block.

These are the results I'm getting now with the system clock at the default 125 MHz and an SPI baud rate of 20833333 Hz:
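The delayed CRC check for multi-block reads described above can be sketched roughly as follows. This is not the library's code: dma_start_read() and dma_wait() are hypothetical stand-ins that use a plain memcpy so the ordering can be tried on any machine, and sd_block_t is an invented type that keeps each block's received CRC next to its data. On real hardware the start call would return immediately and the CRC check would run while the next transfer is in flight.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512

/* One received block plus the CRC16 the card sent after it (invented type). */
typedef struct {
    uint8_t  data[BLOCK_SIZE];
    uint16_t crc;
} sd_block_t;

/* Bitwise CRC16, polynomial 0x1021, initial value 0, MSB first. */
static uint16_t crc16(const uint8_t *data, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical DMA stand-ins: the "transfer" completes inside the start call,
   so dma_wait() has nothing to do in this host-side sketch. */
static void dma_start_read(sd_block_t *dst, const sd_block_t *src) {
    memcpy(dst, src, sizeof *dst);
}
static void dma_wait(void) { }

/* Read n blocks, checking block i-1 while block i is being transferred.
   Returns the index of the first block with a bad CRC, or -1 if all pass. */
int read_blocks_deferred_crc(sd_block_t *dst, const sd_block_t *card, size_t n) {
    if (n == 0) {
        return -1;
    }
    dma_start_read(&dst[0], &card[0]);
    for (size_t i = 1; i < n; i++) {
        dma_wait();                          /* block i-1 has fully arrived */
        dma_start_read(&dst[i], &card[i]);   /* kick off block i */
        /* This check overlaps with the transfer of block i. */
        if (crc16(dst[i - 1].data, BLOCK_SIZE) != dst[i - 1].crc) {
            dma_wait();                      /* don't leave a transfer in flight */
            return (int)(i - 1);
        }
    }
    dma_wait();
    /* The last block is the one check that cannot be overlapped. */
    if (crc16(dst[n - 1].data, BLOCK_SIZE) != dst[n - 1].crc) {
        return (int)(n - 1);
    }
    return -1;
}
```

The write side is simpler under the same assumptions: start the DMA for the block, compute its CRC while the transfer runs, wait for completion, then send the two CRC bytes.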
Hi;
This is just a suggestion regarding the calculation of CRC values in the library (I'm unsure as to whether this has been implemented in the SDIO port you produced but... anyway, I digress;)
The manual calculation of the CRC by the library could be replaced by what I assume is the hardware implementation the RP2040 provides. In my current use case I've changed spi_transfer (for TX only; I'm strictly writing) via
    // Enable the sniffer
    dma_sniffer_enable( pSPI->tx_dma, 0x02, true );
    channel_config_set_sniff_enable( &pSPI->tx_dma_cfg, true );
    dma_hw->sniff_data = 0;
    // Get CRC
    *crc = dma_hw->sniff_data;
I'm currently trying to make a specific case of using "full hardware" for multiple-block writes to minimize power use, and I've just come across this method for hardware evaluation of the CRC. I thought it might be useful.
Thanks again for the library!
(Also- if you remember the previous topic I had here- I got my SPI up to 50 MHz! It writes crazy fast now ^_^)
-Sebastian