Corrupt list in ipc_task, causes infinite loop #506

Closed
Ebiroll opened this issue Apr 9, 2017 · 12 comments

@Ebiroll

Ebiroll commented Apr 9, 2017

The watchdog resets the chip because of an infinite loop in vListInsert:
pxNext points to the item itself, so the loop never terminates.

v2.0 works fine.
This was with version f9fba35.

(gdb) where
#0 0x4008529e in vListInsert (pxList=0x3ffae5f4, pxNewListItem=0x3ffaeb88)
at /home/olas/esp/esp-idf/components/freertos/./list.c:188
#1 0x400847cb in vTaskPlaceOnEventList (pxEventList=0x3ffae5f4, xTicksToWait=4294967295)
at /home/olas/esp/esp-idf/components/freertos/./tasks.c:2847
#2 0x400836f3 in xQueueGenericReceive (xQueue=0x3ffae5d0, pvBuffer=0x0, xTicksToWait=4294967295, xJustPeeking=0)
at /home/olas/esp/esp-idf/components/freertos/./queue.c:1586
#3 0x4008108b in ipc_task (arg=0x0) at /home/olas/esp/esp-idf/components/esp32/./ipc.c:52

(gdb) p *pxIterator
$7 = {xItemValue = 1, pxNext = 0x3ffaeb88, pxPrevious = 0x3ffaeb88, pvOwner = 0x3ffaeb6c, pvContainer = 0x3ffae5f4}
(gdb) p pxList->xListEnd
$8 = {xItemValue = 4294967295, pxNext = 0x3ffaeb88, pxPrevious = 0x3ffaeb88}

void vListInsert( List_t * const pxList, ListItem_t * const pxNewListItem )
{
...

	for( pxIterator = ( ListItem_t * ) &( pxList->xListEnd ); pxIterator->pxNext->xItemValue <= xValueOfInsertion; pxIterator = pxIterator->pxNext ) /*lint !e826 !e740 The mini list structure is used as the list end to save RAM.  This is checked and valid. */
	{
		/* There is nothing to do here, just iterating to the wanted
		insertion position. */
	}
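
One way a list can end up in exactly this state is inserting an item that is already on the list. This is an assumption based on the dump above, not a confirmed diagnosis. The sketch below uses simplified, hypothetical stand-ins (Node, SortedList, list_insert_bounded) rather than the real FreeRTOS types, and it bounds the walk so that a corrupt list is reported instead of hanging:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for FreeRTOS's ListItem_t and List_t. */
typedef struct Node {
    uint32_t value;
    struct Node *next;
    struct Node *prev;
} Node;

typedef struct {
    Node end; /* sentinel with the maximum value, playing the role of xListEnd */
} SortedList;

static void list_init(SortedList *l)
{
    l->end.value = UINT32_MAX;
    l->end.next = &l->end;
    l->end.prev = &l->end;
}

/* Same walk as the for loop in vListInsert, but it gives up after max_steps.
 * Returns false when no insertion point is found within the limit, which in
 * a healthy list of fewer than max_steps items means the list is corrupt. */
static bool list_insert_bounded(SortedList *l, Node *item, size_t max_steps)
{
    Node *it = &l->end;
    size_t steps = 0;

    while (it->next->value <= item->value) {
        if (++steps > max_steps) {
            return false; /* probably a cycle that skips the sentinel */
        }
        it = it->next;
    }

    /* Link the item in after 'it', in the same order vListInsert does. */
    item->next = it->next;
    item->next->prev = item;
    item->prev = it;
    it->next = item;
    return true;
}

/* True when the node has become its own successor, as in the gdb dump. */
static bool node_points_to_itself(const Node *n)
{
    return n->next == n;
}
```

Inserting the same node twice with value 1 leaves its next and prev pointing at itself while the sentinel still points at it, which mirrors the pxIterator and xListEnd values printed above. A third insert then hits the step limit instead of spinning until the watchdog fires.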

Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0)
Register dump:
PC : 0x40080b34 PS : 0x00060034 A0 : 0x800842df A1 : 0x3ffaea40
0x40080b34: ipc_task at /home/olas/esp/esp-idf/components/esp32/./ipc.c:106

A2 : 0x3ffae5e0 A3 : 0x3ffaeb74 A4 : 0x00060020 A5 : 0x3ffe3b30
A6 : 0x00000000 A7 : 0x00000000 A8 : 0x3ffaeb74 A9 : 0x3ffaeb74
A10 : 0x00000001 A11 : 0x00000001 A12 : 0x00060020 A13 : 0x3ffe3b10
A14 : 0x3ffb13dc A15 : 0x00000000 SAR : 0x00000000 EXCCAUSE: 0x00000005
EXCVADDR: 0x00000000 LBEG : 0x00000000 LEND : 0x00000000 LCOUNT : 0x00000000

Backtrace: 0x40080b34:0x3ffaea40 0x400842df:0x3ffaea60 0x400832bf:0x3ffaea80 0x40080b63:0x3ffaeac0
0x40080b34: ipc_task at /home/olas/esp/esp-idf/components/esp32/./ipc.c:106
0x400842df: xPortGetCoreID at /home/olas/esp/esp-idf/components/freertos/./tasks.c:4415
(inlined by) vTaskPlaceOnEventList at /home/olas/esp/esp-idf/components/freertos/./tasks.c:2852
0x400832bf: xQueueGenericReceive at /home/olas/esp/esp-idf/components/freertos/./queue.c:2034
0x40080b63: ipc_task at /home/olas/esp/esp-idf/components/esp32/./ipc.c:106

If you find your application is crashing here then likely causes are:

1) Stack overflow - see http://www.freertos.org/Stacks-and-stack-overflow-checking.html
2) Incorrect interrupt priority assignment, especially on Cortex-M parts where numerically high priority values denote low actual interrupt priorities, which can seem counter intuitive. See http://www.freertos.org/RTOS-Cortex-M3-M4.html and the definition of configMAX_SYSCALL_INTERRUPT_PRIORITY on http://www.freertos.org/a00110.html
3) Calling an API function from within a critical section or when the scheduler is suspended, or calling an API function that does not end in "FromISR" from an interrupt.
4) Using a queue or semaphore before it has been initialised or before the scheduler has been started (are interrupts firing before vTaskStartScheduler() has been called?).

#0 0x40084dbe in vListInsert (pxList=0x3ffae5e0, pxNewListItem=0x3ffaeb74)
at /home/olas/esp/esp-idf/components/freertos/./list.c:188
#1 0x400842df in vTaskPlaceOnEventList (pxEventList=0x3ffae5e0, xTicksToWait=4294967295)
at /home/olas/esp/esp-idf/components/freertos/./tasks.c:2847
#2 0x400832bf in xQueueGenericReceive (xQueue=0x3ffae5bc, pvBuffer=0x0, xTicksToWait=4294967295, xJustPeeking=0)
at /home/olas/esp/esp-idf/components/freertos/./queue.c:1586
#3 0x40080b63 in ipc_task (arg=0x0) at /home/olas/esp/esp-idf/components/esp32/./ipc.c:52

@igrr
Member

igrr commented Apr 10, 2017

This looks very much like #496, could you please check the workaround suggested there?

@igrr
Member

igrr commented Apr 10, 2017

Should be fixed in 8915c18. If you are still getting the issue, please reopen.

@igrr igrr closed this as completed Apr 10, 2017
@Ebiroll
Author

Ebiroll commented Apr 10, 2017

It did not solve the problem; it only moved it.
Removing nvs_flash_init(); got me further, but tasks get stuck later. After a few resets it works for a while.
With esp-idf version 47b8f78 my example runs fine.
Are you sure that esp-idf does not do one of the following?
3) Call an API function from within a critical section or when the scheduler is suspended, or call an API function that does not end in "FromISR" from an interrupt.
4) Use a queue or semaphore before it has been initialised or before the scheduler has been started (are interrupts firing before vTaskStartScheduler() has been called?).

@igrr igrr reopened this Apr 11, 2017
@igrr
Member

igrr commented Apr 11, 2017

Are you able to reproduce this with one of the examples? Is this in single core mode?

@Ebiroll
Author

Ebiroll commented Apr 11, 2017

Yes, the problem occurs in both single-core and multicore mode. I use the I2C driver to write to a display, which most likely causes a lot of I2C traffic and a lot of interrupts. I will test again with the I2C example. But please also check the ipc_task and ipc_task init functions.

@Ebiroll
Author

Ebiroll commented Apr 11, 2017

I was able to reproduce with examples/protocols/https_request

Just add a blinky task.

xTaskCreatePinnedToCore(&blink_task, "blink", 4096, NULL, 20, NULL, 0);

#define BLINK_GPIO 5

void blink_task(void *pvParameters)
{
    /* Configure the pin as a plain GPIO output. */
    gpio_pad_select_gpio(BLINK_GPIO);
    gpio_set_direction(BLINK_GPIO, GPIO_MODE_OUTPUT);

    /* Toggle the pin once per second, forever. */
    for (;;) {
        gpio_set_level(BLINK_GPIO, 0);
        vTaskDelay(1000 / portTICK_RATE_MS);
        gpio_set_level(BLINK_GPIO, 1);
        vTaskDelay(1000 / portTICK_RATE_MS);
    }
}

@Ebiroll
Author

Ebiroll commented Apr 11, 2017

Note: I retested on a clean devboard and the example runs fine, so it could be hardware related.
This was the example with problems:
https://github.com/Ebiroll/qemu_esp32/tree/master/examples/06_1306_interactive
I will test some more, but note that when running in QEMU, breaking in ipc_task always gives:
internal-error: inline_frame_this_id: Assertion `!frame_id_eq (*this_id, outer_frame_id)' failed.
Is ipc_task getting enough stack space?
I will investigate more.

@Ebiroll
Author

Ebiroll commented Apr 11, 2017

OK. The example actually worked on the duino. The error might have been my poor soldering skills. :-(
However, I can no longer run in QEMU, in either single-core or multicore mode, because pxIterator is the same as pxIterator->pxNext when vListInsert is called from ipc_task (stuck in a loop). Also, my example works on the poorly soldered board with an earlier version of esp-idf. I will continue to investigate this.

@Ebiroll Ebiroll closed this as completed Apr 11, 2017
@igrr
Member

igrr commented Apr 11, 2017

Is the cross-core interrupt implemented in Qemu? It is currently used quite extensively in FreeRTOS, even in single core mode.

@Ebiroll
Author

Ebiroll commented Apr 11, 2017

Oh, no. You mean this ISR?
static void IRAM_ATTR esp_crosscore_isr(void *arg)
xtensa_irq_init() results in
io write 164,c
So I guess QEMU should generate interrupt 12 (0xc) whenever someone writes to
WRITE_PERI_REG(DPORT_CPU_INTR_FROM_CPU_0_REG, DPORT_CPU_INTR_FROM_CPU_0);
I can try that.

void esp_crosscore_int_send_yield(int coreId) {
    assert(coreId<portNUM_PROCESSORS);
    //Mark the reason we interrupt the other CPU
    portENTER_CRITICAL(&reasonSpinlock);
    reason[coreId]|=REASON_YIELD;
    portEXIT_CRITICAL(&reasonSpinlock);
    //Poke the other CPU.
    if (coreId==0) {
        WRITE_PERI_REG(DPORT_CPU_INTR_FROM_CPU_0_REG, DPORT_CPU_INTR_FROM_CPU_0);
    } else {
        WRITE_PERI_REG(DPORT_CPU_INTR_FROM_CPU_1_REG, DPORT_CPU_INTR_FROM_CPU_1);
    }
}

Thanks for the tip.

@igrr
Member

igrr commented Apr 11, 2017

When DPORT_CPU_INTR_FROM_CPU_0 is written, ETS_FROM_CPU_INTR0_SOURCE is triggered.
When DPORT_CPU_INTR_FROM_CPU_1 is written, ETS_FROM_CPU_INTR1_SOURCE is triggered.
Currently we connect ETS_FROM_CPU_INTR0_SOURCE to CPU0 and ETS_FROM_CPU_INTR1_SOURCE to CPU1 (in crosscore_int.c).

Edit: ah, you probably don't have interrupt matrix implemented in Qemu yet. In this case you have to route to the CPU ISR directly, but that's very fragile because it depends on interrupt allocation order. We'll put interrupt matrix support on our list of things to do in Qemu.
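
A minimal sketch of what interrupt matrix routing could look like inside an emulator, assuming a simple table from peripheral source to (CPU, interrupt line). All names here (IntMatrix, matrix_route, matrix_set_source) are hypothetical and are not QEMU or esp-idf APIs, and line 12 is only the guess made earlier in this thread:

```c
#include <stdint.h>

/* Hypothetical source IDs standing in for ETS_FROM_CPU_INTR0_SOURCE and
 * ETS_FROM_CPU_INTR1_SOURCE; the real numeric values are not used here. */
enum { SRC_FROM_CPU_INTR0, SRC_FROM_CPU_INTR1, SRC_COUNT };
enum { NUM_CPUS = 2 };

typedef struct {
    uint8_t  cpu[SRC_COUNT];    /* which CPU each source is routed to */
    uint8_t  line[SRC_COUNT];   /* which CPU interrupt line it drives */
    uint32_t pending[NUM_CPUS]; /* per-CPU bitmask of asserted lines */
} IntMatrix;

/* Record the routing chosen by the firmware for one source. */
static void matrix_route(IntMatrix *m, int src, int cpu, int line)
{
    m->cpu[src]  = (uint8_t)cpu;
    m->line[src] = (uint8_t)line;
}

/* Called when the emulated FROM_CPU register is written: a non-zero level
 * asserts the routed line, zero deasserts it again. */
static void matrix_set_source(IntMatrix *m, int src, int level)
{
    uint32_t bit = UINT32_C(1) << m->line[src];

    if (level) {
        m->pending[m->cpu[src]] |= bit;
    } else {
        m->pending[m->cpu[src]] &= ~bit;
    }
}
```

Because the target line is looked up in the table at the time of the write, a change in interrupt allocation order only changes the arguments to matrix_route, which avoids the fragility of wiring the register directly to a fixed CPU interrupt.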

@Ebiroll
Author

Ebiroll commented Apr 19, 2017

I have implemented crosscore interrupts in QEMU. I also fixed a QEMU issue that caused spi_flash_mmap to overwrite flash.rodata at address 0x3f400000 (DPORT_PRO_FLASH_MMU_TABLE[0] was used). https://github.com/Ebiroll/qemu_esp32
Anyway, QEMU has never run better, in both single-core and multicore mode. Thanks for the tip.

0xFEEDC0DE64 pushed a commit to 0xFEEDC0DE64/esp-idf that referenced this issue May 5, 2021
* Create new espino32 diretory for ESPino32 board

* Delete wrong create file

* Create pins_arduino.h for espino32

* Update boards.txt to support ThaiEasyElec ESPino32

* Re-configure board name