-
Notifications
You must be signed in to change notification settings - Fork 501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
St7789 async dma and partial update #593
base: main
Are you sure you want to change the base?
St7789 async dma and partial update #593
Conversation
This is extremely interesting, thank you! Always eager to push the state of the art with our display updates and see what the Pico is capable of. I wont get any time to feed back on this until the new year (new baby plus seemingly constant waves of sickness have demolished my productivity and I have a lot of catching up to do) but it might pique the interest of @MichaelBell who's also had some input on our display driving code. If you want/need other screens to tinker with- not necessarily even to test or with any strings attached- let me know. |
You are always welcome to send me screens :) The PicoDisplay2 has arrived, haven't unpacked it yet. I am doing a ILI9341 "Pimoroni" driver at the moment, along with a XMP2046 touch screen driver. The ILI9341 has the same region and async code in, I'm going to also add new additional "pens" for direct updating via display drivers rather than via a frame buffer, this will open up larger displays to be used with the limited memory the pico has. I don't think you sell any displays based on these but I will put in another PR for this code when I have finished the example test code. Might be useful for some of your other customers. When I finish this I will add an example for the PicoDisplay2... p.s. Congratulations on the new baby. |
I have added an example for the Pico Display 2 as well. |
I have added the ili9341 and xpt2046 drivers and examples to this PR : #596 |
This looks awesome, I have also wanted to do partial updates and have some hacky patch to do that, but great to see a suggestion that could be merged. It would be nice to expose the partial update to MicroPython. I think the extra up to ~2KB memory allocation for the chaining DMA commands could be an issue - heap RAM is at a premium especially on PicoW builds. Maybe you could just have a single chain CB, and use the DMA complete interrupt to set it back up for each line? Though that might occasionally stall the SPI transfer and I don't know whether the st7789 can cope with that. That said, you haven't exposed this to MicroPython, so maybe it's not such a big deal unless that is done. Final minor comment: the indenting is inconsistent - I think you need to convert tabs into spaces. |
Your idea of using a single chain is good if the memory is an issue. What I would like to do is to implement a "minimal memory" display list type rendering, so you push updates to the renderer (rects, points, etc), the renderer just uses a ping pong buffer with 2 scanlines, sorts by updates by Y and intersects the objects with a single scanline which is rendered while the dma is transferring the other scanline until all updates are rendered. So higher cpu usage but less memory, not so great for whizing things round the screen but ok for UIs. The renderer can then run on the other core. Maybe the first step is just direct to spi pens with no frame buffer at all... What would be good is a pico with more memory! Concerning the tabs I used the Pimoroni VSCode workspace and hoped that the tabs would be ok, I will look for some hidden setting somewhere and see what is wrong... p.s. I have kept away from python for years using the tactic of finding someone else that knows it and getting them to do whatever needs doing, so I can't be much help with that! |
I have added a pr: #613 with an idea of how to use direct pens without frame buffers. |
For MicroPython, memory is a huge constraint. Pretty much any heap allocation outside of MicroPython's own managed "heap" will either immediately, or eventually, cause an OOM lockup. This is particularly severe on Pico W where the RAM constraints are even tighter and the slightest misstep will cause the MicroPython build to not even finish booting 💀 Any buffers we use, must be possible to allocate using MicroPython's pimoroni-pico/micropython/modules/picographics/picographics.cpp Lines 355 to 368 in 206d408
Even our C++ classes are allocated in MicroPython's heap using pimoroni-pico/micropython/modules/picographics/picographics.cpp Lines 372 to 403 in 206d408
This is a continuous pain point 😭 |
That is all greek to me :) So are you saying the C/C++ code here has to be limited in order to work with the python stuff, so the python stuff is a wrapper around the C/C++ stuff and the python "runtime" uses up enough ram to limit the c/c++ stuff? So 2KB of extra memory means the users of the C/C++ stuff don't get the functionality because the python stuff is near the limit? |
Yep, all of the C++ stuff here (or most of it, anyway) is wrapped directly via MicroPython bindings which are just a thin veneer over the underlying C++ class to handle type conversions, argument checking, throwing errors and all that usual user facing garbage you have to deal with in a non-compiled language. MicroPython eats RAM for breakfast, so the safest assumption is to not use any RAM the user hasn't supplied, or at least give the user the option and default to allocating if they don't supply a buffer. We didn't know this when we started out, quite a trial by fire 😬 |
Ok, Got you. So are we saying that dma_control_chain_blocks should be able to be passed in after being allocated in python or the whole idea of using the dma chaining needs to be changed? |
Set to 1 to use dma interrupts for partial updates instead of control chain. For Python to remove memory requirements for the control block buffer.
Ok, I just added a define USE_ASYNC_INTERRUPTS. If this is set to 1 then the control chain is not used but interrupts are used instead. I guess somehow you can get this define set outside of st7789.hpp depending on if using c++ or python. |
Thank you. IIRC from @MichaelBell's musings with the camera, chained DMA requires alignment that MicroPython can't guarantee and I think that might apply here, but I'm not sure. Either way the buffers would need to be allocated by MicroPython's functions, whether those are user-supplied or handled by the bindings is relatively inconsequential. Setting a define in the respective CMake files for MicroPython vs C++ builds should be straightforward. I am still in very much over my head here! |
The extra alignment was for use with DMA ring buffers, I haven't reviewed in detail but I don't think Andrew is doing that here. I think I did add some explicit alignment to 4-byte boundary on the camera thing just as belt and braces, but fairly sure that buffers from Python are 4-byte aligned unless it's a user provided buffer and they've given a non-multiple of 4 offset into a larger byte buffer. |
USE_ASYNC_INTERRUPTS=1 turns chains off entirely, it is then all handled in the interrupt handler instead. Basically it is then a bodge for stride DMA as the pico doesn't have it. So there is a new structure:
The data part points into the frame buffer, the size if the size in bytes of the scanline, count is the number of scanlines and width is the width of the frame buffer in bytes (stride). This structure is set up and the DMA is done one scanline at a time, when each scanline is complete the interrupt handler sets of the next DMA transfer until all of them are done. So there are no allocations being made at all. I think this was Michaels idea to begin with? |
I put in this PR to ask for some help testing this, basically it adds async DMA transfers for the st7789 and also partial updates.
There is a test example for the pico display for testing the new code, I am also waiting for a pico display 2. When that arrives I will port over the example for that.
I do not have the other st7789 Pimoroni displays though so these will not be tested, the PIO route is also not tested.
So is this PR something you guys would like to add in, you can get a nice speedup with these changes?
Can someone help with testing these other displays?
Cheers
Andy
Changes
Added async DMA for st7789. For 565 we don’t have to wait for the SPI transfer now.
Added partial update of display so we can just update a region via SPI.
Async DMA is also implemented for partial update for 565. As the DMA on the pico doesn’t support stride this is done using two DMA channels with chained transfers.
Added ability to destruct graphics and display.
For an overview look at pico_display_async_region.cpp, this is a test example, comments at top showing what is going on.
st7789 changes
Constructors changed to take use_async_dma parameter, defaults to false.
command() changed to take use_async_dma flag.
is_busy() implemented, returns true if waiting for async DMA.
wait_for_update_to_finish() added, this needs to be called if using async dma. Waits for dma to finish and disables the DMA chain.
Fix for display rotating for rotate180.
Added partial_update() implementation.
setup_dma_control_if_needed() added, sets up a second DMA channel for chained transfers.
Examples changes
Added example test code for the pico display: pico_display_async_region.cpp
pico_graphics changes
rect_convert_rgb565() added, like frame_convert_rgb565() but for Rects/Regions.
rect_convert() added for each pen.
create_owned_frame_buffer() added to create frame buffer owned by us
destructor added to free up frame buffer if owned.
types
Rect.equals() added to test equality of Rects.
pico_graphics_pen_1bit changes
Use create_owned_frame_buffer() to create frame buffer if none provided.
pico_graphics_pen_1bitY changes
Use create_owned_frame_buffer() to create frame buffer if none provided.
pico_graphics_pen_3bit changes
Use create_owned_frame_buffer() to create frame buffer if none provided.
Add empty rect_convert()
pico_graphics_pen_P4 changes
Use create_owned_frame_buffer() to create frame buffer if none provided.
Add rect_convert()
pico_graphics_pen_P8 changes
Use create_owned_frame_buffer() to create frame buffer if none provided.
Add rect_convert()
pico_graphics_pen_RGB332 changes
Use create_owned_frame_buffer() to create frame buffer if none provided.
Add rect_convert()
pico_graphics_pen_RGB565 changes
Use create_owned_frame_buffer() to create frame buffer if none provided.
pico_graphics_pen_RGB888 changes
Use create_owned_frame_buffer() to create frame buffer if none provided.