Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/stdio_uart: dropped data when received at once #10639

Closed
jcarrano opened this issue Dec 19, 2018 · 13 comments
Closed

sys/stdio_uart: dropped data when received at once #10639

jcarrano opened this issue Dec 19, 2018 · 13 comments
Labels
Area: sys Area: System Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Comments

@jcarrano
Copy link
Contributor

jcarrano commented Dec 19, 2018

Description

When a long stream of characters (greater than 64 bytes, that is the default receive buffer size) is sent all at once to the UART, there is data loss, even though there is enough resources to process all the data.

I could verify this issue on samr21-xpro and pba-d-01-kw2x. Additionally, I used an oscilloscope to make sure that the usb-serial converter was not interfering with communication (i.e, that it did not buffer anything internall.)

This issue replaces #10634 (it turn out Pyterm was not to blame)

Test program

The following application show the issue. Baud rate is set to 1200 to demonstrate that it cannot be a problem related to the CPU being overloaded.

Edit: 600 baud is also failing. I can't set it any lower (i should investigate).

The program should buffer a single line and the echo it back. If it's line buffer is exhausted, the last character is always overwritten.

Makefile

APPLICATION = uart_breakage

# Edited: added stdin module
USEMODULE += stdio_uart stdin 

CFLAGS += -DSTDIO_UART_BAUDRATE=1200

include $(RIOTBASE)/Makefile.include

main.c

/** Illustrate a problem in RIOT's uart */

#include "stdio_uart.h"

#define TEST_BUF_SIZE 256

char test_buffer[TEST_BUF_SIZE];

int main(void)
{
    int received = 0;

    while (1) {
        char c;
        stdio_read(&c, 1);

        test_buffer[received] = c;

        if (received < TEST_BUF_SIZE - 1) {
            received++;
        }

        if (c == '\n' || c == '\r') {
            stdio_write(test_buffer, received);
            received = 0;
        }
    }

    return 0;
}

Steps to reproduce the issue

Set the UART mode manually using stty:

$ stty -F /dev/ttyACM0 line 0 min 0 time 0 -brkint -icrnl \
      -imaxbel -opost -onlcr -isig -icanon -iexten -echo \
      -echoe -echok -echoctl -echoke 1200

Except for the baud rate, it is the same mode as used by pyterm.

In one terminal read the serial port using tail:

$ tail -f /dev/ttyACM0

In another terminal you can perform several experiments

# (1) short line
$ echo hello > /dev/ttyACM0
# (2) very long line
$ echo hhhhhhhhhhhhhhhhhhhhhhhhhklllllllllllllllllllllhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj > /dev/ttyACM0
# (3) try to get the previous line back
$ echo hello > /dev/ttyACM0

Expected results

As soon as the newline is received, whatever is in the line buffer should be printed.

Actual results

Case (1) works. In case (2) nothing gets printed (the newline is never received). Upon sending (3), part of (2), garbage, and finally "hello" get printed.

Versions

Operating System Environment
-----------------------------
       Operating System: "Arch Linux" 
                 Kernel: Linux 4.19.4-arch1-1-ARCH x86_64 unknown

Installed compiler toolchains
-----------------------------
             native gcc: gcc (GCC) 8.2.1 20181127
      arm-none-eabi-gcc: arm-none-eabi-gcc (Arch Repository) 8.2.0
                avr-gcc: missing
       mips-mti-elf-gcc: missing
             msp430-gcc: missing
   riscv-none-embed-gcc: missing
                  clang: clang version 7.0.0 (tags/RELEASE_700/final)

Installed compiler libs
-----------------------
   arm-none-eabi-newlib: "3.0.0"
    mips-mti-elf-newlib: missing
riscv-none-embed-newlib: missing
               avr-libc: missing (missing)

Installed development tools
---------------------------
                  cmake: cmake version 3.13.1
               cppcheck: Cppcheck 1.85
                doxygen: 1.8.14
                 flake8: 3.6.0 (mccabe: 0.6.1, pycodestyle: 2.4.0, pyflakes: 2.0.0) CPython 3.7.1 on Linux
                    git: git version 2.20.0
                   make: GNU Make 4.2.1
                openocd: Open On-Chip Debugger 0.10.0
                 python: Python 3.7.1
                python2: Python 2.7.15
                python3: Python 3.7.1
             coccinelle: missing
@jcarrano jcarrano added the Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) label Dec 19, 2018
@kaspar030 kaspar030 changed the title uart: dropped data when received at once sys/stdio_uart: dropped data when received at once Dec 19, 2018
@kaspar030
Copy link
Contributor

When a long stream of characters (greater than 64 bytes, that is the default receive buffer size) is sent all at once to the UART, there is data loss, even though there is enough resources to process all the data.

Don't understimate the amount of work that the MCU is doing here. Even if the data trickles in at only 1200baud, there'll be an interrupt for every character, with all the context switches that are necessary. There's also no flow control. What probably happens here is that the MCU is busy serving UART interrupts and context switches until the stdio buffer is full, at what point it'll have to drop characters.

Could you, with the osciloscope, graph when the uart ISR is entered and when the main loop runs?

@jcarrano
Copy link
Contributor Author

jcarrano commented Dec 19, 2018

What probably happens here is that the MCU is busy serving UART interrupts and context switches until the stdio buffer is full.

I doubt it. I lowered the baud rate to 600 baud (that's 60 characters a second). The problem persists. I timed the interrupt. I modified

static inline void irq_handler(unsigned uartnum)
{
if (dev(uartnum)->INTFLAG.bit.RXC) {
/* interrupt flag is cleared by reading the data register */
uart_ctx[uartnum].rx_cb(uart_ctx[uartnum].arg,
(uint8_t)(dev(uartnum)->DATA.reg));
}
else if (dev(uartnum)->INTFLAG.bit.ERROR) {
/* clear error flag */
dev(uartnum)->INTFLAG.reg = SERCOM_USART_INTFLAG_ERROR;
}
cortexm_isr_end();
}
to

static inline void irq_handler(unsigned uartnum)
{
    gpio_set(GPIO_PIN(1, 02));

    /* blah blah blah */

    gpio_clear(GPIO_PIN(1, 02));

    cortexm_isr_end();
}

And it's clocking consistently at 7.05µs on samr-21. For reference, there is one reception each 16.67ms.

@kaspar030
Copy link
Contributor

And it's clocking consistently at 7.05µs on samr-21. For reference, there is one reception each 16.67ms.

You mean GPIO_PIN(1, 02) goes high every 7µs? Only if there is a transfer or permanently?

@jcarrano
Copy link
Contributor Author

Sorry, that was unclear. From rise to fall (i.e. the time inside the interrupt) is 7µs. The time between successive interrupts (which of course is equal to 1/10th of the baud rate) is 16.7 ms.

@kaspar030
Copy link
Contributor

From rise to fall (i.e. the time inside the interrupt) is 7µs. The time between successive interrupts (which of course is equal to 1/10th of the baud rate) is 16.7 ms.

Ok, there's indeed something odd. 16.7ms should actually be enough for thousand context switches, and definitely enough for one plus a little ringbuffer read.

@stale
Copy link

stale bot commented Aug 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added State: stale State: The issue / PR has no activity for >185 days and removed State: stale State: The issue / PR has no activity for >185 days labels Aug 10, 2019
@jcarrano
Copy link
Contributor Author

Hey stalebot, the problem is still present.

I created a tgz with the test application: uart-breakage.tar.gz. I updated the makefile to include the stdin module.

@HendrikVE
Copy link
Contributor

I came across the same issue with my nrf52dk while working on #13195. After sending more than 64 bytes at once via UART, my board stopped responding, so I don't even reached garbage output at (3) as you did. @haukepetersen on the other hand sent successfully more than 64 bytes to the nrf52dk. I also tested your code with the nucleo-f334r8 and pba-d-01-kw2x. On both everything is working as expected.

It looks quite like a configuration issue on the local machine sending the data, but I have no idea what component (hard- or software) could cause this. Does somebody has an idea?

@HendrikVE HendrikVE added the State: don't stale State: Tell state-bot to ignore this issue label Feb 7, 2020
@jia200x
Copy link
Member

jia200x commented Apr 16, 2020

this is why the Network CoProcessor of OpenThread was not working :(

@jue89
Copy link
Contributor

jue89 commented May 15, 2020

I'm observing this odd behavior on the samr30-xpro board, as well. Even with stdio completely out of the way.

@kaspar030
Copy link
Contributor

I'm pretty sure this is caused by the usb/uart conversion (within the onboard debuggers). @fjmolinas tested using an external usb/uart on samr21-xpro IIRC and got much better results.

@jia200x
Copy link
Member

jia200x commented Jan 28, 2021

I'm pretty sure this is caused by the usb/uart conversion (within the onboard debuggers). @fjmolinas tested using an external usb/uart on samr21-xpro IIRC and got much better results.

I can confirm this, the Network CoProcessor of OpenThread actually worked using an external usb/uart converter

@aabadie aabadie added the Area: sys Area: System label May 20, 2021
@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 21, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
@maribu
Copy link
Member

maribu commented May 24, 2023

This seems to be a duplicate of #14548.

(Or rather #14548 is a duplicate of this, but since #14548 is more active and has tracked down the cause, I'd rather close this one.)

@maribu maribu closed this as completed May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: sys Area: System Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

No branches or pull requests

8 participants