Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HWCDC] Missing data via USB Serial on C3/C6/S3 #9378

Closed
1 task done
TD-er opened this issue Mar 15, 2024 · 71 comments · Fixed by #9401
Closed
1 task done

[HWCDC] Missing data via USB Serial on C3/C6/S3 #9378

TD-er opened this issue Mar 15, 2024 · 71 comments · Fixed by #9401
Assignees
Labels
Area: Peripherals API Relates to peripheral's APIs. Backport needed 🔙 Status: In Progress Issue is in progress
Milestone

Comments

@TD-er
Copy link
Contributor

TD-er commented Mar 15, 2024

Board

ESP32-C3 / ESP32-S3

Device Description

This also happens on ESP32-C3 but happens way more often on ESP32-S3

Hardware Configuration

ESP32-S3 with USB-HWCDC used as serial port.

Version

latest master (checkout manually)

IDE Name

PlatformIO

Operating System

Windows 11

Flash frequency

40MHz

PSRAM enabled

yes

Upload speed

115200

Description

I have a console running on HWCDC in my project (ESPEasy) and have seen with the latest Arduino commits that there is some data lost.

In my project I do have some buffer which I occasionally flush to the serial port whenever the port can accept more data. (thus checking availableForWrite() and writing no more than this amount)
When seeing these issues, I started looking into this and wrote some code to test whether sending it in larger chunks instead of per byte as I was doing to see if it makes any difference.
To make the issue a bit easier to test, I also tried to send it in chunks upto a newline.

To illustrate the issue:

.130 : (279408) I   : WIFI : Cch4 3C:37:12:AAD0:TC) open (bgn) attempt #2
05.720 : (279012 : SYS  : 279364,60621721,-53
05.849 : (279464) Info   : SYS  : 279528,6062024,41,-53
05.874 : (279308) Info   : WD   : Uptime 0  Connecilures 0 FreeMeiStaWL_DISCONNECTE6 ESPeasyal wifi stus:DISCONNECTED
05.925 78716S  : 278780,6062024,41,-53
05.978 (279180) Info : SYS  : 27926062024,41,-53
06.009 : (279104) Info   :YS  : 024,
: (279040) Info   : SYS  : 279104,6062024
6.878 : (279624) fo   :   : 278,6061996,40,0
07.906 : (279560) fo   :  279624,6061996,40,0
08.888 : (279496) Info   : : 279560,6061996,40,0
09.893(279428) Info  : SYS  : 279496,6061996,40,0
.898 Info   : SYS  : 279428,6061996,39,0
11.913 : (279300) Info   : SYS  : 279364,6061996,39,0
.918 : (279236) Info   SYS  : 279300,6061996,39,0
.878 : 9nfo   : S  : 279236,6061996,39,0
14.883 : (279108) Info   : 72,6061996,39,0
.888(279044) Info   : SYS  : 279108,6096,39,0
6.893 (279044) Info   : SYS  : 279108,6061996,38,0
7.898 (279044) Info   : SY 279108,6061996,38,0
18.903(279044) Info   : SYS  : 279108,6061996,38,0
19.908(279044) I: SYS  : 279108,6061996,38,0
20.890 : (279044) Info   : YS  : 279108,6061996,38,0
21.882(279044) Info   : SYS108,6061996,38,0
22.887 : (279044) Info   : SYS061996,38,0
3.892 : (279044) fo   : SYS  : 279108,60619
24.897 : 79044) Info   : SYS  : 279108,606199,38,0
25.135 : (278880) fo   : Arduino w status: WL_DISCONNECTED 6 ESPeasy internal wifi status: DINNECTED
25.18404) Info   : WiFi : Winect()
25.340(278916) Info   : WIFI : Disconnecd! Reason: '(1pecified'
25.442 : (278980) Info   : Reset WiFi.
25.446 : (278916) Info   : WiFi : Start netrk scan all chnels
28.034 : (277884) fo   :  Scan finished, found: 13
28.036 : (278580) Info   : WiFi : Best AP candLurch4 3C:37:1B:F8 Ch:1 (-38dB2/PSK (bgn)
28.039 : (278568) Info   : NVS   Load WIFICANDIDATE
28.041 : (278700) Info   : WiFi : Added known candidate, try ect
28.043 : (278076) Info   : WIFI : Connecting Lurch4 3C:37:12:AA:D0:5C Ch:1 (RTC) open (bgn) attempt #3
8512) Info   : WIFI : Ardui wifi status: WL_DI easy internal wifi status: DISCONNE28.051 : (278576 Info   : WiFi isconnect()
254 : (278656) nfo   : WIFI :isconnected! Reason: '(1) Unspecified'
28.357 : (278672) Info   : WIFI : Set WiFi to OFF
28.67694336) Info   : : Set WiFi to STA
28.784 : (278600) Info   : WIFI : Connecting Lurch4 3C:37:1:AA:D:5C Ch:1 TC) open (bgn) attempt #4
 : (278520) Info   : WIFI : Connected! AP: Lurch4 (3C:2:AA:D0:5C) Ch: 1 Duration: 583 ms
29.388 : (278436) Info   : FI : Arduino wifi status: WL_IDLE_STATUS 0 ESPeasy internal wifi status: Conn. 
29.398 : (278524) Info   :  : 278588,6061344,39,-52
29.408 : (278524) Info   :  : 278588,60344,39,-52
9.418 :524) Info   : SYS  : 278588,6061344,39,-52
29.429 : (278524) Info  SYS  : 278588,6061344,40,-52
29.852 : (278508)   :  : 278572,6061284,40,-54
29.89478364) Info   : WIFI : DHCP IP: 192.16MA3pro-11GW: 192.168.1255.254.0 DNS: 192.168.10.1 / 42.16.55.129   duration: 52
29.900 : (276176) Ino   : UDP : Staening on port 8266
29.903 : (276448) Info   : firstLnnectionsEstablished
29.907276440) Info   : mand: All checked OK
29.9126596) Info   : p
9.9148)nfo   : Webserver: start
30.546 fo   : WIFI : STA got IP6 fe80::f612:faff:fecd:1de8%st8
.851 : (276288) Info   : SYS  : 276352,60
31.546276156) Info   : WII : STA got IP6 2a10:3781:218:1cde8
31.851 (276140) fo   : SYS  : 2762,60788,40,-54
32.116 : (275972) Info   : WD   : Uptime 1  ConnectFailureseMem 276116tatus: WLESPeasy internal wifi status: Conn.t
32.851 : (276100) Info   : S  : 276164,60896,40,-54
33.851 : (275996) fo   : 76060,60-54
4.851 : (275896)    : YS  : 275960,60896,39,-54
35.851 : (275796) Info   : SYS  : 275860,6060896,39,-54
36.851 : (275796) Info  : SYS  : 275860,6096,39,-53
.85148) o  SY: 275760,60896,39,-54
38.851 : (275696) Info   :  275760,6060896,39,-54
39.851 : (275496) o   : SYS  : 275560,60896,39,-54
56) Info  : SYS  : 275360,6060896,39,-54
1.852(275296) Info   : S360,6060896,39,-53
42.96) nfo   275360,6060896,39,-53
43.851275296) Info   : SYS  : 275360,609,-52
44.851 : (275296) o   : SYS  : 275360,6060896,39,-53
45.851 : (275296) Info   : SYS  0,6060896,39,-53
.851 : (275296) Info   : SYS  : 275360,6060896,39,-53
47.851296) fo   : SYS  : 275360,609,-53
48.851 : (275296) Info   : SYS  : 275360,60896,39,-53
49.851 : (275296) Info   : SYS  :275312,6060896,39,-53
50.851 : (275196) Info   : ,6060896,39,-53
51.851(275096) Info   : SYS  : 275160,6060896,39,-53
.851 : 96) Info   5160,6060896,39,-53

This is when calling write on the serial port per byte.

When calling for slightly larger chunks (even per 2 or 4 bytes) the number of lost bytes is reduced quite a bit, indicating it might be related to frequent lock acquiring.

Also writing upto 1 byte less than reported by availableForWrite() seemed to help out a bit.

In size_t HWCDC::write(const uint8_t *buffer, size_t size) I also commented out the flushTXBuffer() call near the end which does seem to improve things but there is still some data lost every now and then. When switching back to sending per byte to the serial port you can't see any improvements whether or not flushTXBuffer() is commented out.

N.B. I also included the latest HWCDC commits while testing.

Sketch

Not yet a minimal sketch.

Debug Message

-

Other Steps to Reproduce

No response

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.
@TD-er TD-er added the Status: Awaiting triage Issue is waiting for triage label Mar 15, 2024
@P-R-O-C-H-Y P-R-O-C-H-Y added Area: Peripherals API Relates to peripheral's APIs. Status: Needs investigation We need to do some research before taking next steps on this issue and removed Status: Awaiting triage Issue is waiting for triage labels Mar 15, 2024
@P-R-O-C-H-Y
Copy link
Member

@SuGlider Can you please take a look?

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

I disabled this line and I don't see any bytes missing anymore:

usb_serial_jtag_ll_txfifo_flush();

N.B. this was with testing writing 1 byte at a time.

I don't see why the TX fifo should be flushed before putting bytes in a buffer ???

Edit:
Hmm the more I look at this HWCDC::write function, the less I think I understand how it should work...

Consider this code where the first attempt to write data to the buffer was successful and you end up in the else... (line 380) Also space = size.

size_t to_send = size, so_far = 0;
if(space > size){
space = size;
}
// Non-Blocking method, Sending data to ringbuffer, and handle the data in ISR.
if(xRingbufferSend(tx_ring_buf, (void*) (buffer), space, 0) != pdTRUE){
size = 0;
} else {
to_send -= space;
so_far += space;
// Now trigger the ISR to read data from the ring buffer.
usb_serial_jtag_ll_txfifo_flush();
if(connected) usb_serial_jtag_ll_ena_intr_mask(USB_SERIAL_JTAG_INTR_SERIAL_IN_EMPTY);
while(to_send){
if(max_size > to_send){
max_size = to_send;
}
// Blocking method, Sending data to ringbuffer, and handle the data in ISR.
if(xRingbufferSend(tx_ring_buf, (void*) (buffer+so_far), max_size, tx_timeout_ms / portTICK_PERIOD_MS) != pdTRUE){
size = so_far;
break;
}
so_far += max_size;
to_send -= max_size;
// Now trigger the ISR to read data from the ring buffer.
usb_serial_jtag_ll_txfifo_flush();
if(connected) usb_serial_jtag_ll_ena_intr_mask(USB_SERIAL_JTAG_INTR_SERIAL_IN_EMPTY);
}
}

Then you will never end up in the while loop and thus size is never changed.

This is probably what I'm seeing when sending in chunks of >1 bytes. (tested with chunks of upto 16 bytes now)

@SuGlider
Copy link
Collaborator

@TD-er - Thanks for the report!
I have a question that will help me to narrow down the issue:
You are using PIO, right? The code you are analysing is from latest master branch.
Is your PIO package/source code in sync with master branch?

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

@Jason2866 made a build for me to use with PlatformIO.

This is the latest commit he included for that build.

It is "build 2189" taken from here
He often cherry-picks PRs which have not yet been merged, so I think it already got your then pending PR code included.

I do think it boils down to the HWCDC::write function and that seems to be the same as on the latest Arduino code.

@SuGlider
Copy link
Collaborator

I need to be able to reproduce the issue. I'll try to create something based on your report.
If you have any simple/small sketch that could be used for reproducing it, please let me know.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

To summarize:

  • Writing in chunks of 1 byte, and commenting out line 384 (usb_serial_jtag_ll_txfifo_flush();) no issues
  • Writing in larger chunks of upto the reported free size at the moment of calling HWCDC::write the issue remains. Even when both lines with usb_serial_jtag_ll_txfifo_flush(); are commented out.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

Just to have some code to print in chunks:

size_t SerialWriteBuffer_t::write(Stream& stream, size_t nrBytesToWrite)
{
  size_t bytesWritten     = 0;
  const size_t bufferSize = _buffer.size();

  if (bufferSize == 0) {
    return bytesWritten;
  }

  if (nrBytesToWrite > 0) {
    // FIXME TD-er: Work-around for bug in HWCDC when writing exactly the amount of free bytes in the buffer.
//    --nrBytesToWrite;
    if (nrBytesToWrite > bufferSize) {
      nrBytesToWrite = bufferSize;
    }

    while (nrBytesToWrite > 0 && !_buffer.empty()) {
      uint8_t tmpBuffer[1]{};

      size_t tmpBufferUsed = 0;

      auto it = _buffer.begin();

      bool done = false;

      for (; tmpBufferUsed < sizeof(tmpBuffer) && 
             !done &&
             it != _buffer.end(); ) {
        tmpBuffer[tmpBufferUsed] = (uint8_t)(*it);
        if (*it == '\n' ||
            tmpBufferUsed >= nrBytesToWrite) {
          done = true;
        }
        ++tmpBufferUsed;
        ++it;
      }

      const size_t written = (tmpBufferUsed == 0) ? 0 : stream.write(tmpBuffer, tmpBufferUsed);

      if (written < tmpBufferUsed) {
        done = true;
      }
      for (size_t i = 0; i < written; ++i) {
        _buffer.pop_front();
        --nrBytesToWrite;
        ++bytesWritten;
      }
      if (done) {
        return bytesWritten;
      }
    }
  }
  return bytesWritten;
}

My buffer here is a std::deque<char>_buffer;

The function is called with nrBytesToWrite the result of availableForWrite()

This way I simply could change the size of tmpBuffer to test with different chunk sizes.

@SuGlider
Copy link
Collaborator

Is your project running within a multitasking environment that could preempt this function at any time and write something else while it is still sending previous chunks of bytes to the CDC?

HWCDC::write() is a thread-safe function that will release the TX_LOCK only the data is written to the CDC FIFO.
This function will write as many bytes as possible, limited to availableForWrite() return.
It returns the number of bytes actually written.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

I wonder why you need the max_size anyway, shouldn't this be more like using the space and update this value on each while loop? (with the chance you might run into lots of single calls of 1 byte)

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

Is your project running within a multitasking environment that could preempt this function at any time and write something else while it is still sending previous chunks of bytes to the CDC?

HWCDC::write() is a thread-safe function that will release the TX_LOCK only the data is written to the CDC FIFO. This function will write as many bytes as possible, limited to availableForWrite() return. It returns the number of bytes actually written.

Not that I'm aware of, I call everything from the loop().

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

size_t HWCDC::write(const uint8_t *buffer, size_t size)
{
    uint32_t tx_timeout_ms = 0;
    if(buffer == NULL || size == 0 || tx_ring_buf == NULL || tx_lock == NULL){
        return 0;
    }
    if(HWCDC::isCDC_Connected()) {
        tx_timeout_ms = requested_tx_timeout_ms;
    } else {
        connected = false;
    }
    if(xSemaphoreTake(tx_lock, tx_timeout_ms / portTICK_PERIOD_MS) != pdPASS){
        return 0;
    }
    size_t max_size = xRingbufferGetMaxItemSize(tx_ring_buf);
    size_t space = xRingbufferGetCurFreeSize(tx_ring_buf);
    size_t to_send = size, so_far = 0;

    if(space > size){
        space = size;
    }
    // Non-Blocking method, Sending data to ringbuffer, and handle the data in ISR.
    if(xRingbufferSend(tx_ring_buf, (void*) (buffer), space, 0) != pdTRUE){
        size = 0;
    } else {
        to_send -= space;
        so_far += space;
        // Now trigger the ISR to read data from the ring buffer.
//        usb_serial_jtag_ll_txfifo_flush();
        if(connected) usb_serial_jtag_ll_ena_intr_mask(USB_SERIAL_JTAG_INTR_SERIAL_IN_EMPTY);
 
        while(to_send){
            space = xRingbufferGetCurFreeSize(tx_ring_buf);
            if(space > to_send){
                space = to_send;
            }
            // Blocking method, Sending data to ringbuffer, and handle the data in ISR.
            if(xRingbufferSend(tx_ring_buf, (void*) (buffer+so_far), space, tx_timeout_ms / portTICK_PERIOD_MS) != pdTRUE){
                size = so_far;
                break;
            }
            so_far += space;
            to_send -= space;
            // Now trigger the ISR to read data from the ring buffer.
//            usb_serial_jtag_ll_txfifo_flush();
            if(connected) usb_serial_jtag_ll_ena_intr_mask(USB_SERIAL_JTAG_INTR_SERIAL_IN_EMPTY);
        }
    }
    // CDC is diconnected ==> flush all data from TX buffer
    if(to_send && !usb_serial_jtag_ll_txfifo_writable()) {
        connected = false;
        flushTXBuffer();
    }
    xSemaphoreGive(tx_lock);
    return size;
}

This does seem to fix this issue when using larger chunks (tested with chunks of upto 16 bytes)

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

Oh and I do use VS code 's monitor function.
Maybe this does introduce some delay in parsing bytes and thus the data isn't read as fast as the ESP can put it into the buffer?

@SuGlider
Copy link
Collaborator

usb_serial_jtag_ll_txfifo_flush() sets a bit that tells the HW CDC to run the sending routine from ISR.
The issue is that some bytes are lost... but those are written to a RingBuffer that is consumed in background by the HW CDC ISR.

Even when writting byte by byte, it should work. Let me investigate it.

@SuGlider SuGlider added the Status: In Progress Issue is in progress label Mar 15, 2024
@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

Still why should you call usb_serial_jtag_ll_txfifo_flush() ?
If the buffer is full it should be sent anyway, right?

@SuGlider
Copy link
Collaborator

Still why should you call usb_serial_jtag_ll_txfifo_flush() ?

As far as I know, it is because the CDC FIFO has 64 bytes. If it is not full, bytes are not immediately sent.
If FIFO reaches 64 bytes, it will send it to the HOST immediately.
Therefore, when sending byte by byte, it will stay in the FIFO for a while before being sent out.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

Also found this discussion:
espressif/esp-idf#12628

And if you still are writing a larger sequence, then there is no need to send out multiple chunks right?
I can see why you would want to call it at the end of the write function, if anything was written.

And since it is split into chunks based on the available room (at least the first call), it doesn't make sense to immediately call to flush as you did try to fill up-to the available space and thus must have filled it to the max.

@SuGlider
Copy link
Collaborator

An issue, with commenting out usb_serial_jtag_ll_txfifo_flush(), may happen when the FIFO doesn't reach 64 bytes...
What would happen? Would a single byte stay in FIFO forever (or a long while)?

I need to test it and investigate it further...

@TD-er
Copy link
Contributor Author

TD-er commented Mar 15, 2024

I think this is also an interesting comment to keep in mind:
espressif/esp-idf#12628 (comment)

So you might want to stop at 63 bytes then?

@SuGlider
Copy link
Collaborator

Not sure... I'll run some testing.

@SuGlider
Copy link
Collaborator

SuGlider commented Mar 15, 2024

@TD-er - I can't reproduce the issue using the master banch... This is the code I'm using to test it:
The sketch sends reports to UART0 and sends data to CDC.
Some basic testing is done in setup().
loop() will read all data sent to UART0 (using some Serial Monitor) and send it to the CDC.
There are 2 test cases: sending byte by byte and sending chunks of data of size DATA_CHUNK_LEN.

Please let me know your thoughts about it.

// ESP32-C3 using HW Serial CDC on Boot enabled

#define DATA_CHUNK_LEN 12

// 32 Bytes to send to UART0: 012345678901234567890123456789!#
String chunk32bytes = "012345678901234567890123456789!#";
String chunk64bytes = chunk32bytes + chunk32bytes;
String chunk128bytes = chunk64bytes + chunk64bytes;
String chunk256bytes = chunk128bytes + chunk128bytes;

void testCase1(String data) {
  Serial0.println("TEST CASE1: sending byte by byte.");
  size_t receivedLen = data.length();
  size_t sendLen = 0;

  while (sendLen < receivedLen) {
    Serial.write(data[sendLen++]);
  }
  Serial.println("\r\n=============");
  if (sendLen != receivedLen) Serial0.printf("\r\n ===> Error: didn't send all data [%d of %d]\r\n", sendLen, receivedLen);
  else Serial0.println("Testing byte by byte OK.");
}

void testCase2(String data) {
  Serial0.printf("TEST CASE2: sending chunk of %d bytes.\r\n", DATA_CHUNK_LEN);
  size_t receivedLen = data.length();
  size_t sentLen = 0;
  while (sentLen < receivedLen) {
    size_t toSend = receivedLen - sentLen;
    toSend = toSend > DATA_CHUNK_LEN ? DATA_CHUNK_LEN : toSend;

    size_t sent = Serial.write(data.c_str() + sentLen, toSend);
    Serial0.printf("-->Sent chunk with %d bytes.\r\n", sent);
    sentLen += sent;
  }
  Serial.println("\r\n=============");
  if (sentLen != receivedLen) Serial0.printf("\r\n ===> Error: didn't send all data [%d of %d]\r\n\r\n", sentLen, receivedLen);
  else Serial0.println("Testing chunk sent OK.");
}

void testCase3(String data) {
  size_t receivedLen = data.length();
  Serial0.printf("TEST CASE3: sending all at once : %d bytes.\r\n", receivedLen);

  size_t sent = Serial.write(data.c_str(), receivedLen);
  Serial0.printf("-->Sent all at once with %d bytes.\r\n", sent);

  Serial.println("\r\n=============");
  if (sent != receivedLen) Serial0.printf("\r\n ===> Error: didn't send all data [%d of %d]\r\n\r\n", sent, receivedLen);
  else Serial0.println("Testing all at once CDC writing OK.");
}


void UART0_OnReceive() {
  String data = "";

  while (Serial0.available()) {
    data += (char)Serial0.read();
    delay(1); // gives UART0 IDF driver the chance to move FIFO to RX Buffer still while in this loop...
  }

  Serial0.printf("UART0 received %d bytes that will be sent to CDC.\r\n", data.length());
  Serial0.printf("TEST1 ==> Byte by Byte : Len = %d\r\n", data.length());
  testCase1(data);
  Serial0.printf("TEST2 ==> Sending Chunks of %d bytes : Len = %d\r\n", DATA_CHUNK_LEN, data.length());
  testCase2(data);
  Serial0.printf("TEST3 ==> Sending all in a single write(%d bytes) : Len = %d\r\n", DATA_CHUNK_LEN, data.length());
  testCase3(data);
  Serial0.printf("END of TESTING.\r\n\r\n");
}

void setup() {
  Serial0.setRxBufferSize(1024);
  Serial0.begin(115200);

  Serial.begin();
  Serial0.println("\r\nStarting... open Serial Monitor with CDC port.");
  while (!Serial) delay(100);
  Serial.println("\r\nCDC Started.");

  testCase1("0123456789");
  testCase2("0123456789");
  testCase3("0123456789");

  testCase1(chunk32bytes);
  testCase1(chunk64bytes);
  testCase1(chunk128bytes);
  testCase1(chunk256bytes);

  testCase2(chunk32bytes);
  testCase2(chunk64bytes);
  testCase2(chunk128bytes);
  testCase2(chunk256bytes);

  testCase3(chunk32bytes);
  testCase3(chunk64bytes);
  testCase3(chunk128bytes);
  testCase3(chunk256bytes);

  Serial0.println("\r\nNow use the Serial Monitor to send data to the UART0, then it will be sent to the CDC.");
  Serial0.onReceive(UART0_OnReceive);
  Serial.println("\r\nSetup is done\r\n");
}

void loop() {}

@SuGlider
Copy link
Collaborator

SuGlider commented Mar 15, 2024

This is the UART0 (reporting) output:

ESP-ROM:esp32c3-api1-20210207
Build:Feb  7 2021
rst:0x1 (POWERON),boot:0xc (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fcd5820,len:0x458
load:0x403cc710,len:0x814
load:0x403ce710,len:0x2880
entry 0x403cc710

Starting... open Serial Monitor with CDC port.
TEST CASE1: sending byte by byte.
Testing byte by byte OK.
TEST CASE2: sending chunk of 12 bytes.
-->Sent chunk with 10 bytes.
Testing chunk sent OK.
TEST CASE3: sending all at once : 10 bytes.
-->Sent all at once with 10 bytes.
Testing all at once CDC writing OK.
TEST CASE1: sending byte by byte.
Testing byte by byte OK.
TEST CASE1: sending byte by byte.
Testing byte by byte OK.
TEST CASE1: sending byte by byte.
Testing byte by byte OK.
TEST CASE1: sending byte by byte.
Testing byte by byte OK.
TEST CASE2: sending chunk of 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 8 bytes.
Testing chunk sent OK.
TEST CASE2: sending chunk of 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 4 bytes.
Testing chunk sent OK.
TEST CASE2: sending chunk of 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 8 bytes.
Testing chunk sent OK.
TEST CASE2: sending chunk of 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 12 bytes.
-->Sent chunk with 4 bytes.
Testing chunk sent OK.
TEST CASE3: sending all at once : 32 bytes.
-->Sent all at once with 32 bytes.
Testing all at once CDC writing OK.
TEST CASE3: sending all at once : 64 bytes.
-->Sent all at once with 64 bytes.
Testing all at once CDC writing OK.
TEST CASE3: sending all at once : 128 bytes.
-->Sent all at once with 128 bytes.
Testing all at once CDC writing OK.
TEST CASE3: sending all at once : 256 bytes.
-->Sent all at once with 256 bytes.
Testing all at once CDC writing OK.

Now use the Serial Monitor to send data to the UART0, then it will be sent to the CDC.

This the respective CDC output:


CDC Started.
0123456789
=============
0123456789
=============
0123456789
=============
012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#
=============
012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#012345678901234567890123456789!#
=============

Setup is done

@SuGlider
Copy link
Collaborator

SuGlider commented Mar 15, 2024

@TD-er - Note: I have edited the 2 messages above. I added a 3rd test case (writing all data in a single CDC.write(all_bytes_at_once))

Everything seems to work fine.

@SuGlider SuGlider added Resolution: Unable to reproduce With given information issue is unable to reproduce and removed Status: Needs investigation We need to do some research before taking next steps on this issue labels Mar 15, 2024
@SuGlider
Copy link
Collaborator

@TD-er - Just to check the testing scripts, I confirm that using the proposed Sketch + Python, it always fails at some random time.
After the FIX of the PR, it works always. Therefore, it shall be fixed for good.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 21, 2024

Thanks, I will have a look at the proposed fix and will let you know asap.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 21, 2024

Nope, not working....



INIT : Booting version: ESP_Easy_mega_20240321_max__16M8M_LittleFOPI_PSRAM_CDC_ETbugfix/ESiFi_AP_capies_5432 SDK 5.1.316)
00.369 : (328276) Info   : INIT : Free :328276
  : I Action before d Task Last sys- Restart Reason: CPU0: u reore art reset digital   : FS   : Moing...
 (325656) : FS   : FS mount suful, used 15 bytes of 8318976
d!
00.435 : (5192) InfoSPEasy consing ESPEasySerial
00.455 : (325128allback enabled
00.464 : (3 : SecuritySettings CRC...OK
00.478 : (324544) Info   : ProvisioningSettingK
00.480 : (3200.41 : (323824) Info   : INIT : Chetasks
 enabled
   : Timeme: 1711054879
)9.000
(323024) Info   : Current Time one: STD time sta03:00 0  (322928) : Loca024-03-2119
00.617 : (322816) Info   IFI : Set Wo STA
00.780 : (292640) Info   e using erial
00.801 : (292576) In: ESPEe fed
00NIT92048
00.811 : (287188) Ierial
00(287124) Info   : ESPEasy console ed
00.832 :7012) INFO : Plugins: 147 ["Normal","MAX ESP32"] (ESP326)
007292) Info   : WIFI : Connect Lurch0dBm)open attemt #0

And when I comment out both lines with usb_serial_jtag_ll_txfifo_flush();, I get this:

00.368 : (328628) Info   : Using PSRAM
00.369 : (328420) Info   :

INIT : Booting version: ESP_Easy_mega_20240321_max_ESP32s3_16M8M_LittleFS_OPI_PSRAM_CDC_ETH, (Self built) bugfix/ESP8266_WiFi_AP_capabilities_548dd59 (ESP32 SDK 5.1.3.240316)
00.370 : (328276) Info   : INIT : Free RAM:328276
00.391 : (328068) Info   : INIT : Manual Reboot #3 Last Action before Reboot: Background Task Last systime: 1711055189 - Restart Reason: CPU0: usb uart reset digital core  CPU1: usb uart reset digital core
00.402 : (328020) Info   : FS   : Mounting...
00.421 : (325656) Info   : FS   : LittleFS mount successful, used 151552 bytes of 8318976

Build changed!
00.435 : (325192) Info   : ESPEasy console using ESPEasySerial
00.456 : (325128) Info   : ESPEasy console fallback enabled
00.464 : (325012) Info   : CRC  : SecuritySettings CRC...OK
00.478 : (324544) Info   : CRC  : ProvisioningSettings CRC   ...OK
00.481 : (324848) Info   : INIT : I2C
00.482 : (323824) Info   : INIT : Check for Priority tasks
00.482 : (323776) Info   : INIT : SPI not enabled
00.493 : (323168) Info   : Time : Set Ext. Time Source: RTC at boot time: 1711055189
00.504 : (323120) Info   : Time set to 1711055189.000
00.504 : (323024) Info   : Current Time Zone: STD time start: 2024-10-27 03:00:00 offset: 0 min
00.515 : (322928) Info   : Local time: 2024-03-21 21:06:29
00.617 : (322816) Info   : WIFI : Set WiFi to STA
00.807 : (292640) Info   : ESPEasy console using ESPEasySerial
00.829 : (292576) Info   : ESPEasy console fallback enabled
00.829 : (292000) Info   : INIT : Free RAM:292048
00.838 : (287188) Info   : ESPEasy console using ESPEasySerial
00.859 : (287124) Info   : ESPEasy console fallback enabled
00.859 : (287012) Info   : INFO : Plugins: 147 ["Normal","MAX ESP32"] (ESP32 SDK 5.1.3.240316)
00.871 : (287292) Info   : WIFI : Connecting Lurch4 3C:37:12:AA:D0:5C Ch:1 (0dBm)open attempt #0
00.875 : (287092) Info   : WIFI : Arduino wifi status: WL_DISCONNECTED 6 ESPeasy internal wifi status: DISCONNECTED
00.877 : (282116) Info   : Webserver: start
01.490 : (281000) Info   : SYS  : 281352,6061492,40,-54
01.494 : (280904) Info   : WIFI : Connected! AP: Lurch4 (3C:37:12:AA:D0:5C) Ch: 1 Duration: 588 ms
01.944 : (281404) Info   : SYS  : 281468,6061288,40,-58
01.998 : (281524) Info   : WIFI : DHCP IP: 192.168.11.0 (MAX-ESP32-s3pro-117) GW: 192.168.10.1 SN: 255.255.254.0 DNS: 192.168.10.1 / 42.16.55.129   duration: 535 ms
02.013 : (279692) Info   : WIFI : Arduino wifi status: WL_CONNECTED 3 ESPeasy internal wifi status: Conn. IP
02.015 : (279036) Info   : UDP : Start listening on port 8266
02.026 : (279516) Info   : firstLoopConnectionsEstablished
02.030 : (279452) Info   : Internal command: All checked OK

@SuGlider
Copy link
Collaborator

SuGlider commented Mar 22, 2024

@TD-er @Jason2866
Based on the test sketch and python script (from #9378 (comment)), when the usb_serial_jtag_ll_txfifo_flush(); is commented out, it fails 100% of the executions on the S3-Mini-1--N8 and S3-woom-2-N32R8 that I have here. I have tried 10 times. 10 fails (it hangs after about 50 transfers).

Keeping the usb_serial_jtag_ll_txfifo_flush();, it works 100% of the tries. I also tested it 10 times.

Please test it with the Arduino IDE using the Core 3.0.0 and the modifications to HWCDC::write(const uint8_t *buffer, size_t size) function. I'm not sure if the IDF version and other itens used in @TD-er building correspond to the same environment I am testing for.

Given that we get oposite results, I only can think that or the test case isn't good enough or we are running different frameworks.

I need a test case to prove that removing usb_serial_jtag_ll_txfifo_flush(); solves any issue.
The original code from Arduino master branch fails with my test case, therefore, I found a way to reproduce the issue.
The current Arduino master branch fails 100% of the time. This indicates that the test case I have created is valid and consistent.

I can't use ESPEasy to test it, because it is a full project and the failure may happen due to other reasons.
ESPEasy may use some different set of libraries and or Arduino Core source code.

@SuGlider
Copy link
Collaborator

SuGlider commented Mar 22, 2024

The PR #9401 targets the master branch, Arduino Core 3.0.0 using IDF 5.1.
Is ESPEasy also using this same environment?

@Jason2866
Copy link
Collaborator

@SuGlider The framework used for the test from @TD-er is based on latest IDF 5.1 branch https://github.com/espressif/esp-idf/tree/release/v5.1 and Arduino latest branch master.

@SuGlider
Copy link
Collaborator

In that case I ask you guys to bring a Test Case, based on a minimum Arduino Sketch, that can demonstrate/reproduce the issue and be used for testing.

This is necessary for applying any patch.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 23, 2024

I am now running your test code with PlatformIO and the same libraries/platform builds etc. as I was using for ESPEasy.
Your sketch does stop working on the code before your patch/PR and it does finish with your patch/PR.

Also if I comment-out the usb_serial_jtag_ll_txfifo_flush(); lines in HWCDC::write it halts at some point.

So at least that sounds the same as you experienced.

I will now try to reproduce the ESPEasy behavior in a simple test program.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 23, 2024

So far I have not yet been able to reproduce the odd behavior as seen in ESPEasy with a small sketch, but hopefully I can find some more time tomorrow to test.

@SuGlider
Copy link
Collaborator

Thanks @TD-er. I hope you have success replicating the issue with a small sketch.

@TD-er
Copy link
Contributor Author

TD-er commented Mar 26, 2024

Well in the very few hours I had in the last few days to test, I was not (yet...) able to reproduce the same loss of data as seen with ESPEasy, in a simple program.

Since it does fix the bug as shown in the test program presented by @SuGlider , I think his PR should be merged.
As soon as I can reproduce (or explain) the behavior I'm seeing with ESPEasy, I will report back.

@SuGlider
Copy link
Collaborator

SuGlider commented Mar 26, 2024

Is it possible that ESPEasy has multiple tasks writing to the CDC?

This may explain why removing CDC Flush helps. Flushing makes whatever is the CDC FIFO be sent to the host as soon as it asks for.

It could "tear" the messages. Although the write() muted should prevent it...

@TD-er
Copy link
Contributor Author

TD-er commented Mar 26, 2024

It is not called from multiple tasks, as they are all being called from the loop() call.
However I do have 2 instances of a serial write buffer (and read buffer too) which can independently flush their content to their serial port.
This way I can have a single build for those modules with some external USB to UART chip connected to Serial0 and boards with only HWCDC. Or boards with both ports available where it doesn't matter which port is connected.

But that shouldn't make any difference, does it?

I also have set larger TX/RX buffers. Maybe flushing of those extra buffers is done from a different task?

@TD-er
Copy link
Contributor Author

TD-er commented Apr 4, 2024

Just FYI, I have been running into quite a lot of issues with ESPEasy clearly related to the HWCDC code (Maybe also USBCDC, which I only use on ESP32-S2)
So for now I have to revert to a platform build maintained by @Jason2866 which does not include the code changes made end Feb.
Otherwise the ESP32-C6 would run into issues (ESP32-C3 also and an user claimed -S2 was also "now fixed" with the reverted platform package)

I have to look into this later, but right now it is hard to proceed with this now the networking code has been merged.
Meaning I have to get this networking code implemented first before I can continue using new commits on this repo.
I honestly have no clue why the -C6 enters a bootloop when HWCDC is enabled on the latest platform builds made right before the networking code was merged. (N.B. this might be with your pending PR included)
On earlier platform builds (with HWCDC changed included) it seemed to stop working right after trying to connect to WiFi. This could be seemingly at the same point in the booting process each time, or maybe it is just some specific amount of bytes being sent to the console at that point.

Anyway, I am convinced this is something in ESPEasy itself until I can prove otherwise.
However the same ESPEasy code does work fine on platform builds from before the HWCDC changes, so it must be related.
But there is no point in keeping the pending PR waiting as it does fix something, though I am not sure it is fixing what I'm seeing here.

I also tried again using your clean test program on the ESP32-C6 and that one is working just fine....
See: https://github.com/TD-er/MCVE_ESPxx/tree/bug/ESP32_S3_HWCDC

@TD-er
Copy link
Contributor Author

TD-er commented Apr 4, 2024

Ah I did make it loose bytes, but not sure how realistic this is...

void setup() {
  Serial0.begin(115200);

  _hwcdc_serial->setTxTimeoutMs(0);

  _hwcdc_serial->begin();

It also stalls completely, which seems to be what I also see when connecting to WiFi...

Edit:
The same happens when setting the timeout to 1.

So maybe the issue here is that when HWCDC::isCDC_Connected() is false, then the whole flushing may stall?

Edit2:
Is it possible the xSemaphoreTake with a timeout of 0 does return something not equal to pdPASS but still gets the semaphore set?

So maybe more like this to be sure ??

int HWCDC::availableForWrite(void)
{
    uint32_t tx_timeout_ms = 0;
    if(tx_ring_buf == NULL || tx_lock == NULL){
        return 0;
    }
    if(HWCDC::isCDC_Connected()) {
        tx_timeout_ms = requested_tx_timeout_ms;
    }
    size_t a = 0;
    if(xSemaphoreTake(tx_lock, tx_timeout_ms / portTICK_PERIOD_MS) == pdPASS){
        a = xRingbufferGetCurFreeSize(tx_ring_buf);
    }
    xSemaphoreGive(tx_lock);
    return a;
}

@TD-er
Copy link
Contributor Author

TD-er commented Apr 4, 2024

I just pushed the commit to my MCVE which makes the ESP32-C6 stall and mess up the output.
When I use this strategy in HWCDC.cpp:

    if(xSemaphoreTake(tx_lock, tx_timeout_ms / portTICK_PERIOD_MS) != pdPASS){
        xSemaphoreGive(tx_lock);
        return 0;
    }

It still manages to mess up the output, but less often.
So there seems to be some odd issue with the semaphore take call.

Changing the TX buffer also changes behavior.

_hwcdc_serial->setTxBufferSize(2560);

This still outputs way too long lines every now and then but it doesn't seem to stall anymore.

N.B. The ESP32-C3 and -S3 show similar behavior

@TD-er
Copy link
Contributor Author

TD-er commented Apr 4, 2024

OK, sorry for all the confusion and it looks like I have wasted at least 3 mornings on this...

Looks like I lost track of which platform build to use with your fix included while switching between working on ESPEasy and trying out the MCVE builds.
This did mess up also an ESPEasy (release) build I made and made me trying to find out what went wrong here.

Apparently I have been using the wrong code or even partial sets of code as PlatformIO is not always acting nicely when not performing a full clean between switching platform packages.

Now at least the MCVE is no longer failing and I will try it with ESPEasy to see if that also fixes my crashes, hangs, etc.

@SuGlider
Copy link
Collaborator

SuGlider commented Apr 4, 2024

I've been working on a fix for the C6/H2 along this week.
I found out that the IDF usb_serial_jtag_is_connected() fails from time to time and returns false, even with the USB plugged.
It causes the HDCDC to consider that USB has disconnected making it to hangs or to skip some bytes in TX.

I'm working on a potential fix for it. I'll keep you posted.

@TD-er
Copy link
Contributor Author

TD-er commented Apr 4, 2024

Can this also cause bootloops on the C6 when nothing is connected via HWCDC?

@SuGlider
Copy link
Collaborator

SuGlider commented Apr 4, 2024

Can this also cause bootloops on the C6 when nothing is connected via HWCDC?

I have not seen it... could you please post a simple sketch that causes the bootloops?

@TD-er
Copy link
Contributor Author

TD-er commented Apr 4, 2024

Nope, not yet....
And believe me I have been trying to a lot in the past few days...

I really don't have a clue where it might crash as I don't really get proper output on the serial port.

Only every now and then a very odd decoded stack which I don't really understand:


Rebooting...
Guru Meditation Error: Core  0 panic'ed (Store access fault). Exception was unhandled.

Core  0 register dump:
MEPC    : 0x42116434  RA      : 0x421167fe  SP      : 0x4087e0a0  GP      : 0x4080d734
=> 0x42116434: prvAddCurrentTaskToDelayedList at /COMPONENT_FREERTOS_DIR/FreeRTOS-Kernel/tasks.c:6524
=> 0x421167fe: vTaskDelay at /COMPONENT_FREERTOS_DIR/FreeRTOS-Kernel/tasks.c:1671
=> 0x4087e0a0: ?? ??:0
=> 0x4080d734: __global_pointer$ at ??:?
TP      : 0x00000000  T0      : 0x4211641c  T1      : 0xffffffe0  T2      : 0x00000007
=> 0x4211641c: prvAddCurrentTaskToDelayedList at /COMPONENT_FREERTOS_DIR/FreeRTOS-Kernel/tasks.c:6506
S0/FP   : 0x0000000a  S1      : 0x40819000  A0      : 0x00000000  A1      : 0x00000000
=> 0x40819000: ?? ??:0
A2      : 0x0000000d  A3      : 0x40812b08  A4      : 0x40819000  A5      : 0x00000000
=> 0x40812b08: ?? ??:0
=> 0x40819000: ?? ??:0
A6      : 0x40807ae2  A7      : 0x40807bb4  S2      : 0x00000000  S3      : 0x00000000
=> 0x40807ae2: prvGetItemByteBuf at /COMPONENT_ESP_RINGBUF_DIR/ringbuf.c:558
=> 0x40807bb4: prvCopyItemByteBuf at /COMPONENT_ESP_RINGBUF_DIR/ringbuf.c:469
S4      : 0x40811000  S5      : 0x40811000  S6      : 0x40811000  S7      : 0x40811000
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
S8      : 0x4213aae0  S9      : 0x40811980  S10     : 0x42145000  S11     : 0x00000100
=> 0x4213aae0: ?? ??:0
=> 0x40811980: ?? ??:0
=> 0x42145000: ?? ??:0
T3      : 0x4081b51c  T4      : 0x00000000  T5      : 0x00000010  T6      : 0x00000011
=> 0x4081b51c: ?? ??:0
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000007  MTVAL   : 0x00000152
=> 0x40800001: _vector_table at ??:?
MHARTID : 0x00000000

Stack memory:
4087e0a0: 0x40811000 0x40811000 0x40811000 0x4081163c 0x00000007 0x40819000 0x4080fd38 0x421167fe
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x4081163c: ?? ??:0
=> 0x40819000: ?? ??:0
=> 0x4080fd38: ?? ??:0
=> 0x421167fe: vTaskDelay at /COMPONENT_FREERTOS_DIR/FreeRTOS-Kernel/tasks.c:1671
4087e0c0: 0x00000007 0x40819000 0x00000001 0x0000000a 0x00000007 0x40819000 0x4080fd38 0x4207f36c
=> 0x40819000: ?? ??:0
=> 0x40819000: ?? ??:0
=> 0x4080fd38: ?? ??:0
=> 0x4207f36c: HWCDC::begin(unsigned long) [clone .constprop.0] at C:/Users/gijsn/.platformio/packages/framework-arduinoespressif32/cores/esp32/HWCDC.cpp:252
4087e0e0: 0x00000007 0x4087e158 0x4081b300 0x42003f6e 0x408133a0 0x42128000 0x00000014 0x408029be
=> 0x4087e158: ?? ??:0
=> 0x4081b300: ?? ??:0
=> 0x42003f6e: Port_ESPEasySerial_USB_HWCDC_t::Port_ESPEasySerial_USB_HWCDC_t(ESPEasySerialConfig const&) at lib/ESPEasySerial/Port_ESPEasySerial_USB_HWCDC.cpp:70
      (inlined by) ESPeasySerial::ESPEasySerial_Port_factory(ESPEasySerialConfig const&) at lib/ESPEasySerial/ESPeasySerial.cpp:36
=> 0x408133a0: ?? ??:0
=> 0x42128000: ?? ??:0
=> 0x408029be: heap_caps_malloc_default at /COMPONENT_HEAP_DIR/heap_caps.c:247
4087e100: 0x42145000 0x40811980 0x4213aae0 0x40811000 0x40811000 0x40811000 0x40811000 0x4081163c
=> 0x42145000: ?? ??:0
=> 0x40811980: ?? ??:0
=> 0x4213aae0: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x40811000: ?? ??:0
=> 0x4081163c: ?? ??:0
4087e120: 0x408133a0 0x42128000 0x4081b2dc 0x42088210 0x00000007 0x0001c200 0x00000011 0x00000010
=> 0x408133a0: ?? ??:0
=> 0x42128000: ?? ??:0
=> 0x4081b2dc: ?? ??:0
=> 0x42088210: _sub_I_65535_0.0 at :?
4087e140: 0x00000000 0x00000100 0x00000200 0x00000000 0x00004e20 0x0800001c 0x00000007 0x0001c200
4087e160: 0x00000011 0x00000010 0x00000000 0x00000100 0x00000200 0x00000000 0x00004e20 0x0800001c
4087e180: 0x00000000 0x00000003 0x00000006 0x00000000 0x00000000 0x00000000 0x000280b8 0x001177d4
4087e1a0: 0x00040020 0x408023ea 0x42000000 0x40814134 0x42144000 0x421470d8 0x421470d8 0x42090668
=> 0x408023ea: call_start_cpu0 at /COMPONENT_ESP_SYSTEM_DIR/port/cpu_start.c:299
=> 0x42000000: ?? ??:0
=> 0x40814134: ?? at /COMPONENT_NEWLIB_DIR/newlib_init.c:40
=> 0x42144000: ?? ??:0
=> 0x421470d8: ?? ??:0
=> 0x421470d8: ?? ??:0
=> 0x42090668: do_global_ctors at /COMPONENT_ESP_SYSTEM_DIR/startup.c:203
      (inlined by) start_cpu0_default at /COMPONENT_ESP_SYSTEM_DIR/startup.c:486
4087e1c0: 0xffff0000 0x00000000 0x00700000 0x42091d98 0x00040020 0x408023ea 0x42000000 0x42120000
=> 0x42091d98: clk_ll_rtc_slow_load_cal at /COMPONENT_HAL_DIR/esp32c6/include/hal/clk_tree_ll.h:805
      (inlined by) esp_clk_slowclk_cal_get at /COMPONENT_ESP_HW_SUPPORT_DIR/esp_clk.c:199
      (inlined by) esp_rtc_get_time_us at /COMPONENT_ESP_HW_SUPPORT_DIR/esp_clk.c:108
=> 0x408023ea: call_start_cpu0 at /COMPONENT_ESP_SYSTEM_DIR/port/cpu_start.c:299
=> 0x42000000: ?? ??:0
=> 0x42120000: _flash_rodata_dummy_start at ??:?
4087e1e0: 0x0000004d 0x0000000c 0x0000004d 0x4080255e 0xffff0000 0x421480d7 0x421177f3 0x02625a00
=> 0x4080255e: call_start_cpu0 at /COMPONENT_ESP_SYSTEM_DIR/port/cpu_start.c:707
=> 0x421480d7: ?? ??:0
=> 0x421177f3: intr_handler_set at ??:?
4087e200: 0xffff0000 0x421480d7 0x421177f3 0x4086f5fc 0x00000000 0x00120000 0x00010020 0x00040020
=> 0x421480d7: ?? ??:0
=> 0x421177f3: intr_handler_set at ??:?
=> 0x4086f5fc: ?? ??:0
4087e220: 0x42120020 0x42000020 0x000280b8 0x001177d4 0x00000000 0x00000000 0x00000000 0x40880000
=> 0x42120020: ?? at /COMPONENT_ESP_APP_FORMAT_DIR/esp_app_desc.c:15
=> 0x42000020: Print::availableForWrite() at C:/Users/gijsn/.platformio/packages/framework-arduinoespressif32/cores/esp32/Print.h:84
=> 0x40880000: ?? ??:0
4087e240: 0x00002f08 0x4004b05c 0x0000ffff 0xffffffff 0x00000000 0x4087e3a0 0x00000000 0x4086f85e
=> 0x4004b05c: ?? ??:0
=> 0x4087e3a0: ?? ??:0
=> 0x4086f85e: ?? ??:0
4087e260: 0x2d5416cc 0x00010000 0x001d0000 0x00010000 0x200206e9 0x408023ea 0x000000ee 0x0000000d
=> 0x408023ea: call_start_cpu0 at /COMPONENT_ESP_SYSTEM_DIR/port/cpu_start.c:299
4087e280: 0x00ffff00 0x01000000 0x42120020 0x000280b8 0x40800000 0x00007f38 0x42000020 0x001177d4
=> 0x42120020: ?? at /COMPONENT_ESP_APP_FORMAT_DIR/esp_app_desc.c:15
=> 0x40800000: _vector_table at ??:?
=> 0x42000020: Print::availableForWrite() at C:/Users/gijsn/.platformio/packages/framework-arduinoespressif32/cores/esp32/Print.h:84
4087e2a0: 0x40807f38 0x00004fec 0x4080cf30 0x00002cd8 0x50002000 0x00000048 0x00000000 0x00000000
=> 0x40807f38: tlsf_set_rom_patches at /COMPONENT_ESP_ROM_DIR/patches/esp_rom_tlsf.c:195
=> 0x4080cf30: ?? ??:0
4087e2c0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4087e2e0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4087e300: 0x00000000 0x00000000 0x00010020 0x000380e0 0x00040020 0x001577fc 0x0015c7f0 0x0015f4d0
4087e320: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4087e340: 0x00000000 0x00000000 0x0014f540 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4087e360: 0x00000000 0x00000000 0x00000000 0x00000000 0x00002f08 0x4004b05c 0x0000ffff 0x40880000
=> 0x4004b05c: ?? ??:0
=> 0x40880000: ?? ??:0
4087e380: 0x00000000 0x0000005a 0x00002f10 0x4086c478 0x2c9e62d6 0xf7072c46 0x00000000 0x00000000
=> 0x4086c478: ?? ??:0
4087e3a0: 0x0000e000 0x00002000 0x00000000 0x00000000 0x00000000 0x00000000 0x00010000 0x001d0000
4087e3c0: 0x001e0000 0x001d0000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4087e3e0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4087e400: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4087e420: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000002 0x00000000
4087e440: 0x00000000 0x0000005a 0x00002f10 0x4001b1c0 0x5ef1c5b0 0x2dab6bae 0x0bc46d49 0x00000000
=> 0x4001b1c0: ?? ??:0
4087e460: 0xd82881fb 0x00000000 0x00010000 0x200203e9 0x4086c410 0x40875728 0x000000f8 0x000000ee
=> 0x4086c410: ?? ??:0
=> 0x40875728: ?? ??:0
4087e480: 0x0000000d 0x00ffff00 0x01000000 0x42002f10 0x7533885e 0x9ca20da1 0xc1a81478 0xc41d425e
=> 0x42002f10: cbuf::read() at C:/Users/gijsn/.platformio/packages/framework-arduinoespressif32/cores/esp32/cbuf.cpp:190
      (inlined by) WiFiUDP::read() at C:/Users/gijsn/.platformio/packages/framework-arduinoespressif32/libraries/WiFi/src/WiFiUdp.cpp:348

@TD-er
Copy link
Contributor Author

TD-er commented Apr 5, 2024

Hmm seems like you can't change 'stuff' like setting buffer size after calling begin().
Still looking into it, now having lots of code commented out, so I have to move stuff back to see what breaks it.

But we're out of the bootloop right now.
Now have to see how WiFi can connect... still seems to stall.

@TD-er
Copy link
Contributor Author

TD-er commented Apr 5, 2024

OK, here's what's strange...
I do have a wrapper library in ESPEasy which allows for a single interface to all supported serial ports:

  • HW serial
  • SW serial
  • USB-CDC/HW-CDC
  • some I2C to UART bridges

For each of these I have some port classes and in the constructor of the port class I was calling to set the RX/TX buffer size and call begin()
Then later in the code I also called similar code and also end(), like in any other serial port is good practice.

However I had a typo in my code and was calling to set the RX buffer size twice instead of RX/TX.

So now I only call to set the RX/TX buffer size in the begin() function of this port class and then begin() and removed all calls to end() in this port class.

Now it is working again without stalls, no bootloops etc.

N.B. this is with your pending PR to fix the lost data for HWCDC, so this can be merged if you like to. (tested on ESP32-C3/-C6/-S3)

I do have the idea that the connected state of the HWCDC port may not always be correct as you already mentioned.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Peripherals API Relates to peripheral's APIs. Backport needed 🔙 Status: In Progress Issue is in progress
Projects
Development

Successfully merging a pull request may close this issue.

6 participants