-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pixel.show() crash with more than 73 pixel on ESP32s3 #375
Comments
Same problem here, also Esp32 (Heltec Wifi Kit), and at around 70 pixels |
I'm experiencing the same issue as well (crash/boot loop when driving more than around 70 pixels). This occurs on both my XIAO ESP32C3 boards and my ESP-WROOM-32 Dev Modules, with the following hardware/software:
Digging around in the Adafruit NeoPixel Library, using the v3.x board package means were also using IDF v5, which is pretty simple, and the NeoPixel Library doesn't appear to be to blame for this issue. I've isolated the crash to where it's calling: I'm not sure at this point where to turn for a fix though, since it looks like this is a problem with Espressif's v3 SDK. I'd really like to get this fixed, as I have a few projects with 150 or more pixels that I currently can't complete with this setup. I'm sure there's plenty of other people as well that expect to be able to drive larger pixel strings, so it'd be nice to get this working for everyone on the ESP32 platform. |
So I posted about this issue on the Espressif forums (https://www.esp32.com/viewtopic.php?f=13&t=40270) and they quite helpfully pointed me in the right direction; it appears that the culprit in the NeoPixel library is the While looking at this code more closely, I also noticed that it's calling |
@teknynja , good thinking, but neither of the provided examples allocate the buffer on the stack. The first has it as a global in .data. The second has it allocated from the heap via operator new[]. So that's not it. The single provided crash wasn't run through the symbolicator. See idf.py monitor and decoding at, e.g. https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/tools/idf-monitor.html Many IDEs offer convenient ways to do this. Note that the ESP32 parts have harsh rules about what registers can be poked at interrupt vs. base times and some other context rules. Is it possible that something is trying to refill a buffer and the hardware disappears from beneath it? That'll show that you'll be deep into a loop and the load/stores that have worked many, many times suddenly result in an invalid opcode (?) exception, thought the opcode hasn't changed. Offhand, is this perhaps taking long enough that the watchdog is firing? This caught my eye as I have a project that uses FastLED to actually feed the LEDs and it also blows up on s3, thought he breaking point is well into the many hundreds of LEDs. The breaking point doesn't seem to be absloute, but at X it'll run for hours and ad X+10 it'll crash within minutes. FastLED and NeoPixel are different code bases, so it's unlikely that they're bug-compatible. The unfortunate memory footprints with either RMT or SPI are bad on long/many strands of WS2812's. It's annoying enough that I've started investigating slightly more expensive parts that have real DRAM - and lots of it - for larger pixel budgets. When 64 and 256MB parts are still under $10 - and predictably fast - I have to ask if the PSRAM pain is worth the pull. |
@robertlipe - I'm definitely not an expert C/C++ coder, but to my eye void espShow(uint8_t pin, uint8_t *pixels, uint32_t numBytes, boolean is800KHz) {
rmt_data_t led_data[numBytes * 8]; // <== stack allocation??
if (!rmtInit(pin, RMT_TX_MODE, RMT_MEM_NUM_BLOCKS_1, 10000000)) {
log_e("Failed to init RMT TX mode on pin %d", pin);
return;
}
... the rather large I haven't had a chance yet to try rigging up a test with allocating that buffer from heap in the actual driver, but I expect that should at least clear up the crashing issue. |
Ah. That code wasn't in the original post.
Yes, that code is being silly. That's very much going to blow up in a
typical FreeRTOS (or most other multitasking) kind of environment.
It's absolutely worth the effort to have pixels[] correctly sized and
allocated (or at least have led_data allocated and reallocated if it grows)
so that this additional buffer isn't necessary. That's indeed the root of
this bug. Good find.
Since you seem to have a test case in hand, you might be able to push this
along with a (non-multithreaded) fix like changing
void espShow(uint8_t pin, uint8_t *pixels, uint32_t numBytes, boolean
is800KHz) {
rmt_data_t led_data[numBytes * 8]; // <== stack allocation??
to
void espShow(uint8_t pin, uint8_t *pixels, uint32_t numBytes, boolean
is800KHz) {
static uint32_t led_buf_size{0};
static rmt_data_t;
if (numBytes * 8 > led_buf_size) {
static led_data = malloc(numBytes * 8);
assert(led_buf_size);
led_buf_size numBytes * 8;
//rmt_data_t led_data[numBytes * 8]; // <== stack allocation??
This isn't really thread-safe, but that's probably OK for 99% of the
callers since it only grows. It also saves the malloc on every draw(),
keeping it but so that it only grows over time to reduce the
possibility of fragementation, failure, or flicker if its slow. It'll
quickly reach a stable state and hold onto that buffer.
Clean that up and propose it as a pull request if that basic approach
works. I'm coding from bed (not feeling well) and not a debugger so
there may be sloppiness in that.
A far, far better approach is to make the caller aware of led_buf_size
and let IT allocate and manage that. That solves the threading
problem. This is a case of trying to hide something that the caller
actually DOES need to know about. Premature abstraction.
Good luck.
…On Sun, Jun 9, 2024 at 7:37 AM teknynja ***@***.***> wrote:
@robertlipe <https://github.com/robertlipe> - I'm definitely not an
expert C/C++ coder, but to my eye
void espShow(uint8_t pin, uint8_t *pixels, uint32_t numBytes, boolean is800KHz) {
rmt_data_t led_data[numBytes * 8]; // <== stack allocation??
if (!rmtInit(pin, RMT_TX_MODE, RMT_MEM_NUM_BLOCKS_1, 10000000)) {
log_e("Failed to init RMT TX mode on pin %d", pin);
return;
}
...
the rather large led_data array is being allocated on the stack (plus in
my demo code, once I change that to a static allocation outside the
function, I'm able to send data to several hundred pixels without it
crashing).
I haven't had a chance yet to try rigging up a test with allocating that
buffer from heap in the actual driver, but I expect that should at least
clear up the crashing issue.
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD34T7VLRZYW6KK6Q73LZGREBPAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJWGUYTMOJVGQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I've submitted a pull-request with fixes for this issue. I'm able initialize the driver configured for 500 pixels and it seems to be working fine. Anyone who would like to try this code out can grab the changes at teknynja/Adafruit_NeoPixel/tree/esp32_rmt_memory_allocation_fix - let me know if it works for you or if you encounter any problems. Thanks again for the help from @robertlipe on getting me started on this patch. |
@teknynja Your code works in my case. Thank you and @robertlipe very much. Also the link seems to be dead. This link to teknynja's fixes should work: https://github.com/teknynja/Adafruit_NeoPixel/tree/esp32_rmt_memory_allocation_fix |
For anyone using this library in multi-threaded scenarios, I've created a new branch with the same fix but it also includes a mutex to make it thread safe (works just fine for single-threaded apps as well): teknynja/Adafruit_NeoPixel/tree/esp32_rmt_memory_allocation_fix_safe. This is the preferred update for both single and multi-threaded applications. |
I like this a lot.
There's a race. Condition at 51. Multiple threads could both enter and both
find the muted null and bith create them and the operate with different
lock instances in that first run. Pretty minor. Can it just be allocated in
a larger scope where it'll only be allocated once?
But overall, I really like this. Thanks for doing this and sharing it.
…On Wed, Jun 12, 2024, 12:00 PM teknynja ***@***.***> wrote:
For anyone using this library in multi-threaded scenarios, I've created a
new branch with the same fix but it also includes a mutex to make it thread
safe (works just fine for single-threaded apps as well):
teknynja/Adafruit_NeoPixel/tree/esp32_rmt_memory_allocation_fix_safe
<https://github.com/teknynja/Adafruit_NeoPixel/tree/esp32_rmt_memory_allocation_fix_safe>.
This is the preferred update for both single and multi-threaded
applications.
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD35RKD67MCB43FQXBY3ZHB5CJAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRTGUYTGMJUGE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Yeah, i wasn't sure how to avoid that without having a way to initialize with the instance... |
Right, you'll have an allocated an unreachable object and never have a way
to release it. That's not totally terrible. but we can do better. We need
something that acts like a mutex, while we initialize our mutices.
(mutexen? :-) mutexan? 🤠👢? ) This calls for what C++ nerds call a
singleton. How can we enforce locking while bringing up our locks?
We use a std library function that does it behind our back!
static std::once_flag flag_; // Notice the flag isn't even really used; it
just gives us an object we can put junk in.)
If you want a peek at how pros like Google do this, we can see the source
they used before C++11 was common in their super-useful abseil library here
(and while I dig abseil, it's 2024 so C++11 should be everywhere that
matters by now, so we'll use the version in the std:: library instead of
linking against Google's We're just using their source for education. (Yes,
I admit I had to look it up myself...):
https://github.com/abseil/abseil-cpp/blob/master/absl/base/call_once.h
Better still, we can see how they use it in their own tests. (foo_test.cc
tests foo in Google-land.)
https://github.com/abseil/abseil-cpp/blob/master/absl/base/call_once_test.cc
So I think line that code starts to look something more like:
static std::once_flag flag;
static std::mutex show_mutex;
std::call_once(flag, []() { show_mutex = xSemaphoreCreateMutex();
});
Now xSCM will be calle dexactly once and show_mutex will be set exactly
once.
Wait, what is this []() { stuff; } ?
That is a Lamba. It lets you create and use a function in place without
actually giving it a name. We could make a
set_mutex(void) {
show_mutext = xCSM()
}
... but now we have to expose show_mutex to it. We could do that by passing
in an pointer, and then
set_mutex(std::mutex* foop;)
*foop = xCSM()
which for such a tiny function isn't a big deal for a one-liner, but if
there were some heft to that function, and it neede access to lots of
internals, you can probably see this becoming a getter/setter nightmare.
There are ways to pass arguments to lambda:s auto addr = [](int x) {return
x+1;} and exotic ways to 'capture' more arguments and possibly defer the
binding until the actual call. We don't NEED any of that, so I'll keep the
syntax simple and give you a few keywords to satisfy your curiousity, if
I've not already sent you into the woods.
Honestly, it's also pretty rarethat these would be statics in a Real
Program, they'd be stored in some kind of a class holding espShow() and
friends, but I'm trying to keep this light and not rearrange much, keeping
with your original principles.
In case it helps, since I had to look up the syntax myself, here's the
standalone test program I used to sketch it out:
#include <mutex>
#include <thread>
#include <iostream>
void t() {
static std::once_flag flag;
static std::mutex show_mutex;
#if 1 || TEST
static void *p { nullptr };
std::call_once(flag, []() { p = malloc(1024); });
std::cout << p << std::endl;;
#else
std::call_once(flag, []() { show_mutex = xSemaphoreCreateMutex();
});
#endif
}
int main() {
t();
t();
t();
return 0;
}
Here, when building -Dtest, you can see that p is calling malloc each time
we're called (leaking) and we should see P dropping 1K each time if we just
called malloc each time, but you'll see we call malloc once and the
returned (OK, I didn't actually return it; I just printed 'p') actually
stays the same upon every call:
➜ /tmp c++ -std=c++20 x.cc -o x && ./x
0x121808800
0x121808800
0x121808800
So, we could have built a real function and passed the address of the
function to call_once. That function would then do the work and return -
once. But I chose instead to do the exact same thing in a lambda because
here, it's just a bit more tidy and keeps our internals internal without.
Where else can you use lambda? Anywhere you'd use a function pointer.
Perhaps obviously, they're handy for things like GUI callbacks, but qsort
lends iteslf well to lambda use as they're "just" unnamed functions. The
syntax is a little egg-headed, but knowing where to find good examples
helps. YOu can capture by rference [&] (which aligns to pointer syntax) or
bay value [=] which is fairly mnemonic. Varargs are possible, but let's
leave those alone. IMO, they make code easier to read when implemented
well, but the people that think that 1972 C was good enough for everyone
will always scream about them.
So today we learned about a
Singleton - a function that always returns one, correctly initialized
object. It's like a global and some people fuss about that design pattern.
It's difficult to define destruction order, for example and they share lots
of problems with globals like global mutable state and difficulty to mock
for tests. But here, you're always dealing with one THING and dealing with
it in a small, controlled way, so it seems OK. One reasonable description
is given at https://refactoring.guru/design-patterns/singleton
Lambda - Lambdas are just functions with no names (!) that can be used in
line and have a capture list (the thing between []) and variables that are
passed by reference & or by value = and a paramater list included in the
(), just like a function does. Tutorials like
https://smartbear.com/blog/c11-tutorial-lambda-expressions-the-nuts-and-bolts/
do a good job without just repeating the cppreference page, which aren't
always for time before coffee.
It's a lotta typing for what I think was like a three line "fix" to a
problem we've agreed wasn't a totally terrible problem.
My OS used to be used for the controls on submarines, for example, and my
final employer used to have s (true) saying that around (t)here, things
that happen "once in a million times" happen multiple times every day. So I
like to tidy things up as tight as I can.
As always, feel free to use that code in your PR (please) and I hope you
(and anyone else reading this) learned something along the way. Maybe it'll
help the reviewer recognize what we did and why.
Enjoy
…On Wed, Jun 12, 2024 at 1:23 PM teknynja ***@***.***> wrote:
Yeah, i wasn't sure how to avoid that without having a way to initialize
with the object...
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD3ZBGHRI7JH6EDH2XKDZHCGZDAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRTGY2TIMBSGQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
hmm, it doesn't appear we have access to the std library in this project (perhaps because this is just a plain ol Your example did get me looking into possible solutions from within the FreeRTOS framework itself, and I discovered the I'm wondering if I could just do something quick-n-dirty, like disabling interrupts around where I'm checking/creating the mutex... it should be pretty quick through that section, and is in keeping with the design philosophy of this library. (Edit: Ugh, disabling interrupts would only solve it on single core apps!) I nags at me too, having even a tiny crack like this race condition in the code - my day job is working on industrial process control systems and we'd never let something like that into production. Interestingly, I just started looking into c++ lambdas a short time ago to try and solve a different problem (I'm a big |
Good grief. It's c++ (I see an operator new) but it doesn't allow C++. The
main neopixel.cpp has a cpp extension and is definitely c++...but is about
half asm(). And Adafruit_NeoPixel::show( could be subclassed for the OS
specific parts instead of #iddeffing it in. Freakin' arduino! :-|
Can you just move the creation up somewhere before all the tasks are
created, ala
https://www.digikey.com/en/maker/projects/introduction-to-rtos-solution-to-part-7-freertos-semaphore-example/51aa8660524c4daba38cba7c2f5baba7
and
https://embetronicx.com/tutorials/rtos/freertos/freertos-mutex-tutorial-using-lpc2148/
and
https://community.nxp.com/t5/MQX-Software-Solutions-Knowledge/How-to-use-mutex-and-semaphores-in-a-FreeRTOS-and-SDK2-0-Project/ta-p/1114196
All those examples just hoist it before all those icky threads are
distracting it.
There's prior art for #ifdefs
inside Adafruit_NeoPixel::Adafruit_NeoPixel(). Then again, rp2040Show() is
doing a very similar sort of thing. Now you have a test inside what's in
what's surely the most critical path testing false for all but the very
first strand of pixels displayed.
Looking at bigger picture, I think I'd lean toward an #ifdef (sigh)
inside Adafruit_NeoPixel::Adafruit_NeoPixel() so that you do this only once
and don't have to take a branch at 120fps or whatever. No 'run_once' magic
needed.
I'm quickly learning that your "I do very little C++" was a bit
understated. :-)
…On Thu, Jun 13, 2024 at 12:04 PM teknynja ***@***.***> wrote:
hmm, it doesn't appear we have access to the std library in this project
(perhaps because this is just a plain ol .c file?). At least when I try
to build in the Arduino environment, it complains that it can't find the
header files. Also not sure how the maintainers would feel about adding
references to std:: since it doesn't appear they are using it anywhere else.
Your example did get me looking into possible solutions from within the
FreeRTOS framework itself, and I discovered the
xSemaphoreCreateMutexStatic
<https://www.freertos.org/xSemaphoreCreateMutexStatic.html> function
which looked *very* promising, but digging through the FreeRTOS source
code, it looks like that call is probably not be thread-safe either. :-/
I'm wondering if I could just do something quick-n-dirty, like disabling
interrupts around where I'm checking/creating the mutex... it should be
pretty quick through that section, and is in keeping with the design
philosophy of this library.
I nags at me too, having even a tiny crack like this race condition in the
code - my day job is working on industrial process control systems and we'd
never let something like that into production.
Interestingly, I just started looking into c++ lambdas a short time ago to
try and solve a different problem (I'm a big abuser user of lambdas in C#
though!)
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD377BPKFRAVJN7HEVCDZHHGLJAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRWGMYDIMZZHE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I was trying to avoid modifying the main In a perfect world, the mutex would be created in the startup thread before any child threads are spawned, but that's not a reasonable thing to expect a typical Arduino user to deal with. The best we could probably hope for is to require the Adafruit_NeoPixel instances to be created on the main thread, and then calls to |
I think that's exactly the needed solution.
Global Constructors are called before (the equivalent of) main() - so
inside a ctor, we know we're single threaded. Having the NeoPixel ctor
handle this is perfect. Stylistically, I'd probably do it in the body
instead of in the initializer list, but I could go either way.
This is why having it in a singleton (that's created as a global ctor) and
used by Adafruit_NeoPixel seems to work. That removes the synchronization
requirement from the caller., right?
…On Thu, Jun 13, 2024 at 2:46 PM teknynja ***@***.***> wrote:
I was trying to avoid modifying the main .cpp file, but at this point it
looks unavoidable. I could put the mutex in the initializer list, but that
feels pretty invasive (not to mention how ugly the #ifdefs around the
ctor would be) and it still doesn't completely close the race condition
(just moves it to the ctors racing). Even if we don't do a "test and set",
there's still an opportunity for a thread to stomp on another thread's
mutex.
In a perfect world, the mutex would be created in the startup thread
before any child threads are spawned, but that's not a reasonable thing to
expect a typical Arduino user to deal with. The best we could probably hope
for is to require the Adafruit_NeoPixel instances to be created on the main
thread, and then calls to .begin(), .show(), et al could be done on the
worker threads. Without low (cpu) level synchronization primitives, it
think it's gonna be tough to create a mutex on a thread that needs to be
protected with that very same mutex.
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD32SAUQKMLO6JUN3FBLZHHZJZAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRWGY2DINBTHE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Updated pull request #394 to move the initialization of the This is a slightly more "invasive" change then before, as there are minor changes now to the |
I am just a kibitzer myself (who ran into this issue), but here's my understanding of the situation, for reference. Background:
So far this is all fine. But we have some wrinkles:
To deal with this RMT driver interface change in ESP-IDF v5.x, Adafruit (Lady Ada herself!) changed the NeoPixel library. On ESP-IDF v5.x, instead of using the ESP-IDF RMT driver, it uses the arduino-esp32 wrapper driver, which in arduino-esp32 v3.x uses the new ESP-IDF RMT driver. BUT, there was a reason to bypass arduino-esp32 originally. (Remember I said I'd get to this.) The ESP32 RMT hardware can be programmed in various ways. One is just to give it a whole sequence of exact bit timing in a big buffer. (Like, "set the line HIGH for 350 microseconds, then LOW for 75 microseconds, then...".) This is the only mode the arduino-esp32 wrapper supports. But there's another mode which ESP-IDF supports, where you store the buffer in your own favorite format, and supply a little translation function that turns a chunk of buffer into a chunk of timed bits, and ESP-IDF calls that function from an interrupt as needed to translate a chunk at a time, which is much more memory efficient. This "streaming" mode is what the NeoPixel library uses on ESP-IDF v4.x. On v5.x, with the Arduino wrapper, it just allocates a (potentially) big buffer -- 32 bytes of buffer for every byte of pixel data! And it does so on the stack. (Note the ironic-in-hindsight exchange in #367 where @ladyada wonders if it's OK to allocate the buffer on the stack, https://github.com/me-no-dev notes that it could be pretty big and should maybe be allocated on the heap, but @ladyada said "the stack memory is fine", probably thinking about post-return references and not about stack exhaustion...)
Going forward, probably the NeoPixel library should have code that uses the new ESP-IDF RMT driver, which does still support streaming mode (but with a whole different interface). Then everything will be dunky hory, though that will be a bunch of work for someone to do. Meanwhile, allocating that big buffer on the heap is probably a good move. It's not a cure-all, though -- depending on the length of your pixel chain, you could end up running out of heap memory! (1000 pixels * 3 bytes/pixel * 32x blowup = 96KB, which is a lot of memory for an ESP32.) |
Nice writeup. A couple of us are kibitzing on this issue. Unfortunately, we can't seem to attract the attention of anyone capable/interested in offering feedback and/or pressing the 'merge' button. Your understanding of the current, very messy, state of affairs matches my own. I'm increasingly tempted to rip out the actual hardware-groping parts of several LED libraries and just rewiring them to use Espressif's own https://github.com/espressif/idf-extra-components/tree/master/led_strip at the bottom end and just using the existing stuff for HSV-RGB transitions, blending, fading, line-drawing, and all that. Maybe Espressif can keep their own code stable and working on a variety of their own chips and IDF version...and we have less Arduino-quality code in the world. Since you have a great "state of the union, mid 2024" edition going, I'll add one other bit of "worse news" to the list that will affect a substantial percentage of hobbyist/users in this market. Let's make it easy for the LLM harvesters (or any humans that actually read this stuff) to ingest it all on a single page. Platformio uses backrevved versions of, well, most everything for ESP32. That C++23 feature you're trying to use? Good luck with that six year old version of g++ they bundle. The reason for this is that Espressif and Platformio are having a very public love withdrawal. Platformio wants Espressif to pay them. Espressif keeps sending patches and trying to keep them running, but Platformio won't even accept the PRs. Espressif doesn't really see a need to pay Platformio to smear a layer of goo atop of their own IDF. This means if you're trying to run a new Espressif part with Platformio on the Arduino platform, you're probably hosed. The web is full of much semi-public feudin' between them, but https://www.cnx-software.com/2024/06/01/espressif-releases-arduino-esp32-core-3-0-0-but-platformio-support-in-doubt/ is a good start if you want to pull at that thread. This spat has been going on for at least a year at this point and I haven't yet seen either side blinking. The result is that a large percentage of the popular blinky code (because it's all dependent on RMT or the more stable SPI) just plain doesn't build right now if you check out the head versions of everything and are using Platformio. It's a mess. The fastled issue tracker (which was already struggling to keep up) is buried in duplicates of "it doesn't work" posts and sometiems "... but don't bother yourself actually fixing it, just downgrade instead" answers. Sidebar: Maybe there's a message here that spreading your development over what you thought was an IDE and what you thought was a convenience wrapper for IDF that happens to look like it was written for an ATMega (ugh) can substantially increase the risk to your project's velocity because parties that you're not directly paying are mad because nobody else is paying them either. Right now, Jason2866 is almost single-handedly holding up the support load and the build/integration issues to publish a mostly viable Arduino3/IDF5 hybrid and that seems like totally the wrong resolution, but it's a nice testament that open source makes that even possible. Sidebar to the sidebar: I'm not sure that all these "let's allow you to mix and match the incompatible drivers" flying around are great for the ecosystem either. If they didn't build versioned compatibility into the API (and sometimes, you just can't...) the effort would be better spent just fixing the callers, IMO. But Espressif doing goofy things like gratuitously renaming DMA registers on RMT between ESP32-S3 and ESP32-Nothing in the same build isn't a great confidence builder, either. Exactly none of this impacts the probable correctness of this PR which deserves to be reviewed and potentially merged. Teknynja's fix is unrelated to this squabble, tackles the very issue that LadaAda and MeNoDev discussed, and is independently confirmed to work. It should be merged. Withholding stability fixes is not helping the state of the blinkyverse right now. 73px on what's currently the flagship Espressif part (P4 any day now...but without WiFi /shruggie) is a pretty low bar. As for it not being a cure-all, that's certainly true; but we know one configuration where you're never going to get 96K and that's on the stack in FreeRTOS. :-) Blink on! |
@egnor – I think that captures (in great detail) the situation surrounding this issue, thanks! And @robertlipe, I had no idea about the drama/issues going on with Platformio (ignorance is bliss!). I agree that the amount of buffer space per NeoPixel is problematic, the primary motivation for my work on this issue was to be able to break the unexpectedly low limit of ~73 pixels when I was using the new ESP-IDF RMT driver for other purposes in my project as well. Initially, I attempted to use the low-level solution of translating the pixel buffer data "on-the-fly", but getting the timing right to where that translation code was called with consistently low enough latency to keep the RMT’s buffers fed proved challenging. If I remember, it also caused problems with the 5.x RMT library’s buffer allocation, which otherwise seems to work very well as long as you stick to the 5.x API. I think until a better solution comes along, the 1000ish pixel limit will get most users of this library where they want to go, and if it doesn’t then they will need to select a different platform to achieve their goals (just like you would need to step up from the ATmega328 if you wanted to drive more than a few hundred pixels). I might be a bit biased about getting my changes merged into this library, but I think given the state of things at the moment, this change allows a large percentage of users to get on with their development and worry about other problems. Thank you for prodding this issue, I was beginning to wonder if it was destined to end up as a footnote on archive.org 😉 |
im somewhat sleep deprived - is there a fix anyone can PR that will address this or do y'all need me to research it? the IDF5 fix was a patch to keep things going, wasn't rigorously tested. |
@ladyada as a quick-ish fix take a look at the two PRs (pick one or the other, not both) by @teknynja
((My guesswork: Probably you'd like to keep the behavior of initializing and deinitializing RMT entirely within If you'd like someone to take a look at "properly" porting to the new RMT API, I/we could take a crack at that...? I think it's mostly mechanical, albeit detailed. (There's also @robertlipe's proposal to use Espressif's example code.) p s. get some sleep @ladyada, I can't believe you run the company AND dive into the weeds of RMT APIs on the ESP32 lol! surely you have an army of Adaminions? |
@ladyada - RMT Buffer Allocation Fix (Thread-Safe) for Issue #375 #394 is the PR I recommend as it includes tweaks to improve the thread-safety of the fix. The other PR tries to keep all the changes within the And yes @ladyada, get some rest! |
+1. I may have no credibility here yet, but I have some. I reviewed and
coached this PR (the coaching was in the other, bit the result is here) and
would approve it into my own project. It should at least be reviewed,
please.
We may wish the RMT DMACs worked differently or that the espressif driver
API would get their act together or that code were massively structured
differently or such, but this is a solid fix with low risc that doesn't
perturb other platforms and is independently confirmed to work.
Oh, and since psram is so common on S3 -and this doesn't actually use more
ram, it just uses it more responsibly - I didn't much blink at the ram
footprint. Twiddling buffers on the fly is the opposite of handing a block
to a DMAC and then running radios, compressing things, reading other
sensors and doing other Real Work. Reliable blinkage counts for a lot.
Please do considerer merging this PR or something very close to it. The
author is capable and willing to help fix a pretty bad problem.
I'm just awakening myself at 1530. Sleep disorders stink. Take care of
yourself, LadyAda.
…On Mon, Aug 5, 2024, 9:20 AM teknynja ***@***.***> wrote:
@ladyada <https://github.com/ladyada> - RMT Buffer Allocation Fix
(Thread-Safe) for Issue #375 #394
<#394> is the PR I
recommend as includes tweaks to improve the thread-safety of the fix. The
other PR tries to keep all the changes within the show() method at the
expense of some thread safety.
And yes @ladyada <https://github.com/ladyada>, get some rest!
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD37PLHNMJHZQSUYUIZLZP6C2NAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENRZGE4TONBWGI>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
For future generations reading this on archivedotorg... Please remember
that 281xs are write-only and there's no ack. You don't NEED a thousand
LEDs, power supplies and all that to TEST multiple channels of a thousand
LEDs. Build code and test at the edges. Just let the electrons fall into
the floor. No recycling necessary. 😉
…On Mon, Aug 5, 2024, 3:50 PM Robert Lipe ***@***.***> wrote:
+1. I may have no credibility here yet, but I have some. I reviewed and
coached this PR (the coaching was in the other, bit the result is here) and
would approve it into my own project. It should at least be reviewed,
please.
We may wish the RMT DMACs worked differently or that the espressif driver
API would get their act together or that code were massively structured
differently or such, but this is a solid fix with low risc that doesn't
perturb other platforms and is independently confirmed to work.
Oh, and since psram is so common on S3 -and this doesn't actually use more
ram, it just uses it more responsibly - I didn't much blink at the ram
footprint. Twiddling buffers on the fly is the opposite of handing a block
to a DMAC and then running radios, compressing things, reading other
sensors and doing other Real Work. Reliable blinkage counts for a lot.
Please do considerer merging this PR or something very close to it. The
author is capable and willing to help fix a pretty bad problem.
I'm just awakening myself at 1530. Sleep disorders stink. Take care of
yourself, LadyAda.
On Mon, Aug 5, 2024, 9:20 AM teknynja ***@***.***> wrote:
> @ladyada <https://github.com/ladyada> - RMT Buffer Allocation Fix
> (Thread-Safe) for Issue #375 #394
> <#394> is the PR I
> recommend as includes tweaks to improve the thread-safety of the fix. The
> other PR tries to keep all the changes within the show() method at the
> expense of some thread safety.
>
> And yes @ladyada <https://github.com/ladyada>, get some rest!
>
> —
> Reply to this email directly, view it on GitHub
> <#375 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/ACCSD37PLHNMJHZQSUYUIZLZP6C2NAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENRZGE4TONBWGI>
> .
> You are receiving this because you were mentioned.Message ID:
> ***@***.***>
>
|
(Knowing what we now know, I believe the PR can be simplified a bit, and I left comments to that effect, which you should feel free to adopt or ignore. As far as I can tell it should work as-is, but with simplification would be easier for @ladyada or her Adaminions to review.) Absolutely agreed with @robertlipe that (1) we should take this PR or something like it as a quick fix, (2) one can test without necessarily having million mile LED strips. (I do have very long LED strips around, for what that's worth. I don't have every ESP32 variant under the Sun, but I have several. Regardless, if anyone is set up to test NeoPixel output, it would be Mama Adafruit.) |
The "bitbucket" test is a valid way to make sure the code compiles and runs w/o crashing, but in order to make sure there aren't any visual glitches or other timing artifacts, we'd at least need to connect a reasonable number of actual devices and watch the result (or pour over a logic analyzer's output ;-) ) |
Building and not crashing would put it in the top quartile of blinky
projects. (I mean, we're having this discussion because it did...) The
ESP32-ish blinky ecosystem right now is a mess.
I'd given some thought to trying to build a WS2812 emulator or receiver
that did exactly that, just for testing. I won't compete with Ms. Ada for
the number of strips at my disposal as I don't have a literal warehouse,
but it's a lot ... and it's a pain. Then I realized that doing that in
software is silly.
One of us (in the blink-osphere) should take the real chip (Adafruit
product 1378 :-) and instead of attaching RGB pins to actual LEDs, route
those as inputs to a reasonably powerful microcontroller. Maybe you also
need DOUT. We could do clock recovery (/waves hands broadly) to know when
to sample/latch the signals into a conveniently large bag of pixel bytes/.
Feed those buffers back to a Real Computer over WiFi or USB after some kind
of trivial compression. Do some kind of reset detection to know where a
frame ends. I think one could detect the common, timing induced, visual
glitches by detecting the equivalent of "runt frames" which have fewer than
the expected number of transitions because you'd see > 50uS (a reset code)
present on the bus.
As a blinkie developer, I'm one of dozens of people on the planet that
would pay hobbyist money (not $10K logic analyzer money) for this just so I
don't HAVE to do that inevitable "stare at the strip" exercise. Honestly,
four of the 16x16 panels is a fast way to get to a thousand pixels and, at
moderate intensity, they're not THAT hard to power, but I'd love to have
something on the bench I could use for automated regression testing or even
effect development. After analysis to know what percentage of frames are
malformed (is the glitch level acceptable?) feed those buffers into ledcat
or dozens of other projects that can render an RGB array into a browser, a
GIF file, window, other display, or whatever.
I got hung up on trying to emulate the 2812 when the epiphany is that we
should just capture the output of the pins of the 2811.
I'd be willing to partner in the development of such a project. Just sayin'
RJL
…On Mon, Aug 5, 2024 at 5:58 PM teknynja ***@***.***> wrote:
The "bitbucket" test is a valid way to make sure the code compiles and
runs w/o crashing, but in order to make sure there aren't any visual
glitches or other timing artifacts, we'd at least need to connect a
reasonable number of actual devices and watch the result (or pour over a
logic analyzer's output ;-) )
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD3YI3EVDGJ3PVIUCAOLZP77QPAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENZQGA2TQMJUGE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
QEMU based esp32 simulator: |
Qemu is awesome, but not the right end of the wire for this, Zach.
Emulating CPU is a road well traveled. I want something to "emulate"
(capture) the LIGHTS. You make FastLED send a pattern and my fanasth board
attaches to the data out pin and captures that frame and sends it to a real
computer for analysis (did it match what we sent? How many fps? What
percentage is mangled), capture (give me a gif for documentation), or
display.
Cheap enough that you can leave one or more ports on it attached to all the
CPUs you care about in your test farm.
Think LED receiver, like a video capture card that gulps DVI and writes PNG
or mov or whatever.
But, yes, you and your peers are FastLED would be good candidates for my
"dozens" of customers.
…On Mon, Aug 5, 2024, 7:36 PM Zachary Vorhies ***@***.***> wrote:
QEMU based esp32 simulator:
https://github.com/mluis/qemu-esp32
@robertlipe <https://github.com/robertlipe>
—
Reply to this email directly, view it on GitHub
<#375 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCSD34KZB7M2F4KRAPOSVDZQALCPAVCNFSM6AAAAABAYU77RSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENZQGE2TCNZVHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I actually did the exercise of writing a new-RMT-interface-based streaming LED strip driver, cribbing from Espressif's led_strip component. There's some subtleties (like state management in the encoder function, or interaction with the Arduino library's peripheral manager) but it's not too complicated. So @robertlipe your idea of wrapping Espressif's component as the "backend" seems reasonable, or maybe adapt that code to something a little more backend-able (e.g. no need to keep its own pixel buffer). But overall this is just "there was a major version bump, some APIs changed, gotta move to the new ones", no need to overthink things too much. As for automated testing, I suspect you're not entirely serious about that particular goose chase, but I would have done it with automation on top of a logic analyzer (or maybe scope). The kind you'd need would be expensive, but... |
On Tue, Aug 6, 2024 at 2:03 PM Daniel Egnor ***@***.***> wrote:
I actually did the exercise of writing a new-RMT-interface-based streaming
LED strip driver, cribbing from Espressif's led_strip component
<https://components.espressif.com/components/espressif/led_strip>.
Funny. I'm actively kibbitzing
<espressif/idf-extra-components#344> that very
project, starting just a few days ago, as I'd had the same idea. :-) I
hadn't decided between trying to "fix" an existing well-known library to
sensibly support modern ESP32 or just to replace it, but that driver model
makes it LOOK easy and if that scales out to a few kPixels without all the
hairy conditional compilation we have now, it certainly seems tempting.
IT's the age-old "fork or fix" debate. It seems many projects outgrow the
maintainers ability to build and test all their inevitably supported
combinations of platforms and targets and then collapse under their own
weight. If you only care about IDF5, lots of things get simpler.
As for automated testing, I suspect you're not *entirely* serious about
that particular goose chase, but I would have done it with automation on
top of a logic analyzer (or maybe scope). The kind you'd need would be
expensive, but...
I've thought about it more than a bit. Expensive is the opposite of what
I'd want as being able to attach multiple pins and leave such things in a
CI style build farm seems like a worthwhile goal. I see things like the $20
Sipeed KVM that eats DVI and puts video frames on an ethernet channel and
think that eating a couple of 800kHz streams sounds easier than that. Yes,
that chip has video decoders, so it's not a fair fight, but WS2811 chips
are cheap. Doing it with those awful little USB logic analyzer and Sigrok
crossed my mind, but again, that's a lot of overhead and setup when all
(I think) I want is a buffer of RGB[WW]s.
Message ID: ***@***.***>
… |
FastLED dev here. I'm looking at the ESP32 documentation on the new RMT changes. Some changes aren't that big of a deal. Others are painful. It's possible that a subset of features that intersect between the esp-idf RMT versions can be merged together into a unified driver that somewhat resembles the legacy driver. What makes this easier is that the led devs only need to be concerned about the TX portion of the RMT protocol. The big changes I see are the following:
I haven't dug too much further than the docs, but it could be possible to create a unified driver where led devs only need to change it minimally once for the 4.x codebase and then it just works in the 5.x codebase. Things like dynamic status and pin changing would be disabled in this unified driver but I don't think that's a big problem and could be bridged with a strategy where if you want to change pins you just re-initialize the whole channel. Just my 2c. |
I have weird problem. I test example code or example with new operator and everything works fine if I have 73 or less RGB leds. If I use 74 or more, everything crash. Can anyone explain it?
Debug code: adafruit_neopixel_crash.txt
The text was updated successfully, but these errors were encountered: