For those interested, I've given up on fixing ucglib due to lack of response on various PRs / issues, so I've solved the problems by writing a shiny new library, https://fornellas.github.io/eglib/, which solves many of the architectural problems with ucglib.
I have been working on a driver for ST7789 (#132). While doing so, I learned that the current HAL model has too many layers that do "nothing" and make things run at half the maximum speed.
The problem
Currently we have to define a callback function built around a switch(msg) statement, with one case per message, and it is set as com_cb in struct _ucg_t. This is used by other layers, like ucg_com_SendString. For example, this code:
translates to this on the wire:
SCK is running at 20MHz, but half of the time is still spent outside the send loop of the UCG_COM_MSG_SEND_STR case. If we remove ucg_com_SendString(ucg, 2, buff) and use this:
Then things run twice as fast, with negligible delay between sending bytes:
This is the difference between refreshing the screen at ~11Hz and at ~20Hz! I did some math: this shaves ~50 clock cycles on my STM32 running at 84MHz.
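To make the layering described above concrete, here is a minimal, self-contained sketch of a switch-based callback with a thin wrapper on top. All sketch_* names and the fake wire buffer are stand-ins invented for illustration; they are not the real ucglib types, constants, or signatures.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for this sketch only; NOT the real ucglib definitions. */
enum sketch_msg { SKETCH_MSG_POWER_UP, SKETCH_MSG_SEND_BYTE, SKETCH_MSG_SEND_STR };

typedef struct sketch_ucg sketch_ucg_t;
typedef int16_t (*sketch_com_cb_t)(sketch_ucg_t *ucg, int16_t msg,
                                   uint16_t arg, uint8_t *data);

struct sketch_ucg {
    sketch_com_cb_t com_cb; /* single entry point for every message */
    uint8_t wire[16];       /* fake bus: bytes that would be clocked out */
    size_t wire_len;
};

/* Pretend to shift one byte out over SPI. */
static void sketch_spi_write(sketch_ucg_t *ucg, uint8_t byte) {
    if (ucg->wire_len < sizeof ucg->wire) {
        ucg->wire[ucg->wire_len++] = byte;
    }
}

/* Switch-based callback: every call pays for an indirect call plus the switch. */
static int16_t sketch_com_cb(sketch_ucg_t *ucg, int16_t msg,
                             uint16_t arg, uint8_t *data) {
    switch (msg) {
    case SKETCH_MSG_POWER_UP:
        ucg->wire_len = 0;
        break;
    case SKETCH_MSG_SEND_BYTE:
        sketch_spi_write(ucg, (uint8_t)arg);
        break;
    case SKETCH_MSG_SEND_STR:
        while (arg-- > 0) {
            sketch_spi_write(ucg, *data++);
        }
        break;
    default:
        return 0;
    }
    return 1;
}

/* Extra layer on top: it only repackages its arguments into a message. */
static void sketch_com_send_string(sketch_ucg_t *ucg, uint16_t cnt, uint8_t *data) {
    ucg->com_cb(ucg, SKETCH_MSG_SEND_STR, cnt, data);
}
```

In this sketch, sending two bytes goes through the wrapper, an indirect call, and the switch before reaching the loop that actually writes to the bus, which is the kind of overhead measured above.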
Fix
We add another member to struct _ucg_t, which is a struct with pointers to functions for each of the cases from com_cb, like UCG_COM_MSG_POWER_UP, UCG_COM_MSG_POWER_DOWN etc:
ucg_Init can be updated to support this new struct (or we can have an alternative constructor).
This allows avoiding ucg_com_SendString altogether, allowing things to be fast:
Benchmarks:
Delay between sending each pack of 2 bytes for a few cases:
With send string:
ucg_com_SendString(ucg, 2, buff); : 620ns
ucg->com_cb_funcs.send_str(ucg, 2, buff); : 240ns, 60% faster
With send byte:
ucg_com_SendByte(ucg, buff[0]); ucg_com_SendByte(ucg, buff[1]); : 590ns
ucg->com_cb_funcs.send_byte(ucg, buff[0]); ucg->com_cb_funcs.send_byte(ucg, buff[1]); : 30ns, 94% faster
Compatibility
ucg_Init can have its signature updated to accept struct _ucg_com_cb_funcs as well, or we can add another constructor. We can also set defaults for these values in the constructor, such as ucg->com_cb_funcs.power_up=default_power_up, and forward calls to the existing com_cb function.
This allows all display drivers to continue to work, while some drivers can start onboarding onto this new system.
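The compatibility idea above could look roughly like the following sketch. It is an illustration under assumptions, not the proposed patch: the sketch_* types, the sketch_init constructor, the counters, and the fast_send_byte override are all hypothetical, and only the member names com_cb, com_cb_funcs, power_up, send_byte, send_str and default_power_up follow the names used in this issue.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for this sketch only; NOT the real ucglib types or constants. */
enum sketch_msg { SKETCH_MSG_POWER_UP, SKETCH_MSG_SEND_BYTE, SKETCH_MSG_SEND_STR };

typedef struct sketch_ucg sketch_ucg_t;
typedef int16_t (*sketch_com_cb_t)(sketch_ucg_t *ucg, int16_t msg,
                                   uint16_t arg, uint8_t *data);

/* One function pointer per former switch case (only a subset is shown). */
struct sketch_com_cb_funcs {
    void (*power_up)(sketch_ucg_t *ucg);
    void (*send_byte)(sketch_ucg_t *ucg, uint8_t byte);
    void (*send_str)(sketch_ucg_t *ucg, uint16_t cnt, uint8_t *data);
};

struct sketch_ucg {
    sketch_com_cb_t com_cb;                  /* legacy switch-based callback */
    struct sketch_com_cb_funcs com_cb_funcs; /* new direct entry points */
    unsigned legacy_calls;                   /* instrumentation for the sketch */
    unsigned bytes_sent;
};

/* Defaults forward to the legacy callback, so existing drivers keep working. */
static void default_power_up(sketch_ucg_t *ucg) {
    ucg->com_cb(ucg, SKETCH_MSG_POWER_UP, 0, NULL);
}
static void default_send_byte(sketch_ucg_t *ucg, uint8_t byte) {
    ucg->com_cb(ucg, SKETCH_MSG_SEND_BYTE, byte, NULL);
}
static void default_send_str(sketch_ucg_t *ucg, uint16_t cnt, uint8_t *data) {
    ucg->com_cb(ucg, SKETCH_MSG_SEND_STR, cnt, data);
}

/* Hypothetical constructor: funcs may be NULL, and so may any of its members. */
static void sketch_init(sketch_ucg_t *ucg, sketch_com_cb_t com_cb,
                        const struct sketch_com_cb_funcs *funcs) {
    ucg->com_cb = com_cb;
    ucg->legacy_calls = 0;
    ucg->bytes_sent = 0;
    ucg->com_cb_funcs.power_up =
        (funcs && funcs->power_up) ? funcs->power_up : default_power_up;
    ucg->com_cb_funcs.send_byte =
        (funcs && funcs->send_byte) ? funcs->send_byte : default_send_byte;
    ucg->com_cb_funcs.send_str =
        (funcs && funcs->send_str) ? funcs->send_str : default_send_str;
}

/* An existing driver's callback, unchanged; it only counts what it would send. */
static int16_t legacy_com_cb(sketch_ucg_t *ucg, int16_t msg,
                             uint16_t arg, uint8_t *data) {
    (void)data;
    ucg->legacy_calls++;
    switch (msg) {
    case SKETCH_MSG_SEND_BYTE: ucg->bytes_sent += 1;   break;
    case SKETCH_MSG_SEND_STR:  ucg->bytes_sent += arg; break;
    default: break;
    }
    return 1;
}

/* An onboarded driver's override: talks to the bus directly, no switch. */
static void fast_send_byte(sketch_ucg_t *ucg, uint8_t byte) {
    (void)byte; /* a real driver would write the byte to the SPI peripheral */
    ucg->bytes_sent++;
}
```

A driver that passes no overrides still goes through legacy_com_cb for everything, while a driver that supplies fast_send_byte skips the legacy path for that one operation and keeps the forwarding defaults for the rest.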
PR
I can write a PR for this implementation, if there's willingness to merge it.