Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hspi in slave mode with Wifi enabled produce crash dump #5914

Closed
1 task
dumarjo opened this issue Mar 25, 2019 · 16 comments · Fixed by #5922
Closed
1 task

Hspi in slave mode with Wifi enabled produce crash dump #5914

dumarjo opened this issue Mar 25, 2019 · 16 comments · Fixed by #5922

Comments

@dumarjo
Copy link
Contributor

dumarjo commented Mar 25, 2019

Basic Infos

  • [ x] This issue complies with the issue POLICY doc.
  • [ x] I have read the documentation at readthedocs and the issue is not addressed there.
  • [x ] I have tested that the issue is present in current master branch (aka latest git).
  • [x ] I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • [x ] I have filled out all fields below.

Platform

  • Hardware: [ESP-8266-wroom2]
  • Core Version: [latest git]
  • Development Env: [Arduino IDE]
  • Operating System: [Ubuntu]

Settings in IDE

  • Module: [Generic ESP8266 Module]
  • Flash Size: [2MB]
  • lwip Variant: [v2 Lower Memory]
  • Reset Method: [ck|nodemcu]
  • Flash Frequency: [40Mhz]
  • CPU Frequency: [80Mhz]
  • Upload Using: [SERIAL]
  • Upload Speed: [115200] (serial upload only)

Problem Description

Hi all,

We try to get the spi in slave mode working without crash and we cannot get it. We use 2 esp8266 (one in master and one in slave). We use only the read status command to simplify the setup. Here the Master code. We use the Example SPI Safe Master Demo and modify it to only use the readStatus method.

#include <SPI.h>

class ESPSafeMaster {
  private:
    uint8_t _ss_pin;
    void _pulseSS() {
      digitalWrite(_ss_pin, HIGH);
      delayMicroseconds(5);
      digitalWrite(_ss_pin, LOW);
    }
  public:
    ESPSafeMaster(uint8_t pin): _ss_pin(pin) {}
    void begin() {
      pinMode(_ss_pin, OUTPUT);
      _pulseSS();
    }

    uint32_t readStatus() {
      _pulseSS();
      SPI.transfer(0x04);
      uint32_t status = (SPI.transfer(0) | ((uint32_t)(SPI.transfer(0)) << 8) | ((uint32_t)(SPI.transfer(0)) << 16) | ((uint32_t)(SPI.transfer(0)) << 24));
      _pulseSS();
      return status;
    }

    void writeStatus(uint32_t status) {
      _pulseSS();
      SPI.transfer(0x01);
      SPI.transfer(status & 0xFF);
      SPI.transfer((status >> 8) & 0xFF);
      SPI.transfer((status >> 16) & 0xFF);
      SPI.transfer((status >> 24) & 0xFF);
      _pulseSS();
    }

    void readData(uint8_t * data) {
      _pulseSS();
      SPI.transfer(0x03);
      SPI.transfer(0x00);
      for (uint8_t i = 0; i < 32; i++) {
        data[i] = SPI.transfer(0);
      }
      _pulseSS();
    }

    void writeData(uint8_t * data, size_t len) {
      uint8_t i = 0;
      _pulseSS();
      SPI.transfer(0x02);
      SPI.transfer(0x00);
      while (len-- && i < 32) {
        SPI.transfer(data[i++]);
      }
      while (i++ < 32) {
        SPI.transfer(0);
      }
      _pulseSS();
    }

    String readData() {
      char data[33];
      data[32] = 0;
      readData((uint8_t *)data);
      return String(data);
    }

    void writeData(const char * data) {
      writeData((uint8_t *)data, strlen(data));
    }
};
ESPSafeMaster esp(SS);

void setup() {
  Serial.begin(115200);
  SPI.begin();
  esp.begin();
  delay(1000);
}

uint32_t status;

void loop() {
	esp.readStatus();
}

On the slave we use this code.

#include <Arduino.h>
#include <SPISlave.h>
#include <ESP8266WiFi.h>

void setup() {
	Serial.begin(115200);
	Serial.setDebugOutput(true);
	WiFi.mode(WIFI_STA);
	WiFi.begin("MySSID","MyPassword");
        delay(10000); //delay to not crash the ESP. If this delay is not there, the ESP will crash in sys context
	SPISlave.begin();
        SPISlave.setStatus(0xAA);
}

void loop() {
	
}

With this configuration, the data of the status is 0x55 (checked with an MSO) and as soon as the WiFi is connected, the data of the status is 0x00. No more 0x55 is returned.

Does anyone can try the same setup and give feedback ? We have been able to have a reliable communication in SPI mode.

Jonathan

@devyte
Copy link
Collaborator

devyte commented Mar 26, 2019

Core Version: [latest git]

Is that correct? Or are you using 2.5.0?

@JAndrassy
Copy link
Contributor

try the examples of the SPISlave library

@dumarjo
Copy link
Contributor Author

dumarjo commented Mar 26, 2019

@JAndrassy I have used the example of the SPISLave and add the WiFi.mode(WIFI_STA) and the WiFi.begin(). When adding this, sometime you get crash dump and sometime spi status values are not correct.

@devyte Yes we have use both in our test, git latest and 2.5.0. Both have the same behaviour.

@JAndrassy
Copy link
Contributor

and what is the esp object?

@dumarjo
Copy link
Contributor Author

dumarjo commented Mar 26, 2019

@JAndrassy I updated the Master sketch to add the MasterSafe object

@JiriBilek
Copy link
Contributor

Check the timing of the SPI bus. The SPISlave library has bad timing on MOSI.
See #5489

About the crashes - do you have stable enough power supply?

@dumarjo
Copy link
Contributor Author

dumarjo commented Mar 26, 2019

@JiriBilek Yes we have a Laboratory power supply. Not a problem here. I'm looking into the timing right now. Maybe this is related.

@dumarjo
Copy link
Contributor Author

dumarjo commented Mar 26, 2019

Here the decoded crash dump. From what I understand, the WiFi library is reading information from Flash while an ISR of the HSPI is comming. This look like the problem. I'm not really familiar with the tensila MCU to know why this is causing the crash. But maybe related with Caching system of the MCU.

Decoding stack results
0x40201d37: check_poison_all_blocks() at /home/cimeq/Desktop/sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/2.5.0-25-03-2019/cores/esp8266/umm_malloc/umm_malloc.cpp line 892
0x401000bc: _hspi_slave_isr_handler at /home/cimeq/Desktop/sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/2.5.0-25-03-2019/libraries/SPISlave/src/hspi_slave.c line 32
0x401000bc: _hspi_slave_isr_handler at /home/cimeq/Desktop/sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/2.5.0-25-03-2019/libraries/SPISlave/src/hspi_slave.c line 32
0x402012c6: std::function ::operator()() const at /home/cimeq/Desktop/sloeber/arduinoPlugin/packages/esp8266/tools/xtensa-lx106-elf-gcc/2.5.0-3-20ed2b9/xtensa-lx106-elf/include/c++/4.8.2/functional line 2465
0x401000bc: _hspi_slave_isr_handler at /home/cimeq/Desktop/sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/2.5.0-25-03-2019/libraries/SPISlave/src/hspi_slave.c line 32
0x401005dd: __wrap_spi_flash_read(uint32_t, uint32_t*, size_t) at /home/cimeq/Desktop/sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/2.5.0-25-03-2019/cores/esp8266/core_esp8266_phy.cpp line 309
0x402036e0: malloc_loc(size_t, char const*, int) at /home/cimeq/Desktop/sloeber/arduinoPlugin/packages/esp8266/hardware/esp8266/2.5.0-25-03-2019/cores/esp8266/heap.cpp line 126

@JiriBilek
Copy link
Contributor

Every function that is called from an interrupt has to be declared ICACHE_RAM_ATTR.

@dumarjo
Copy link
Contributor Author

dumarjo commented Mar 26, 2019

@JiriBilek You right. The crash is caused by the non RAM function. The first one is the operator() I think. The use of the std::function is not the right choice. We have modified the code to use directly the hspi_slave_onStatusSent() and pass our RAM function to test and we don't have any crash dump. So I'm pretty sure this is the main problem.

here the code that don't crash dump.

void ICACHE_RAM_ATTR SPI_ON_STATUSSENT_CB(void*){
}

void setup() {
	Serial.begin(115200);
	Serial.setDebugOutput(true);
	SPISlave.begin();
	hspi_slave_onStatusSent(SPI_ON_STATUSSENT_CB);
	SPISlave.setStatus(0x55);
	WiFi.mode(WIFI_STA);
	WiFi.begin("MySSID","Password");
}

@devyte
Copy link
Collaborator

devyte commented Mar 26, 2019

CC @hreintke

@devyte
Copy link
Collaborator

devyte commented Mar 26, 2019

@dumarjo can you please provide the MCVE sketch that reproduces the crash, and a detailed explanation on the change you described which you say fixes it?

@dumarjo
Copy link
Contributor Author

dumarjo commented Mar 26, 2019

@devyte both sketch are in the First post. The new code that bypass the std::function is in that post #5914 (comment)

What else do you need ? I can put more information but I think all is there.

From what I understand, all the callback called from the _hspi_slave_isr_handler are stored in a std::function. Since the std::function::operator() is use to dereference the underline function and the std::function::operator is not in RAM buy in Flash, cause the issue. I'm not very familiar with the esp8266 so I just think from the core dump, that 0x402xxxx is in flash section and 0x401 is in RAM section. If I remove the std::function callbacks and using only fonction stored in RAM, like in #5914 (comment) I never seen a crash since.

Let me know if you need more information.

@dumarjo dumarjo changed the title Hspi in slave mode with Wifi enabled Hspi in slave mode with Wifi enabled produce crash dump Mar 26, 2019
@devyte
Copy link
Collaborator

devyte commented Mar 26, 2019

I hadn't seen the edit to the first post. And I now also understand what you meant by code change: you meant just change the slave sketch.
I think it's enough info.

Notes:
Expectation was that std::function::operator()() should be optimized away because inline. That doesn't seem to hold true in this case, maybe others.
Possibilities:

  • move operator()() to IRAM for all cases
  • evaluate some alternative to std::function

@dumarjo
Copy link
Contributor Author

dumarjo commented Mar 27, 2019

@devyte One way to put the operator() in ram is to modify the linker script to put that specitic section into RAM instead of the Flash. The other way is instead of storing the function into an std::function is to use a plain function pointer. This has the draw back to not be able to use lambda with capture, lambda without capture will work. If this is OK, could be a simple fix.

@earlephilhower
Copy link
Collaborator

@dumarjo, that's what we were discussing on the maintainers' channel yesterday. It was a little more complicated than I'd hoped due to GNU ld's limited pattern matching (no negation as far as I could find in MAN and INFO).

Please give #5922 a try. It manually places the single *(.text._ZNKSt8functionIFvvEEclEv) /* std::function<void ()>::operator()() const */ into IRAM. There may be a need to put in the other templated typedefs, though, so keep an eye out while running...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants