Seeed-Studio LoRa-E5-Mini LoRaWAN OTAA timeout #844

Closed
davidfobar opened this issue Oct 11, 2023 · 20 comments
Labels
bug Something isn't working resolved Issue was resolved (e.g. bug fixed, or feature implemented)

Comments

@davidfobar

davidfobar commented Oct 11, 2023

Using the E5 dev board (https://www.seeedstudio.com/LoRa-E5-mini-STM32WLE5JC-p-4869.html), I can get chirpstack to receive a JoinRequest and issue a JoinAccept; however, RadioLib returns a timeout error (-6).

I don't know if I am setting the RFSwitches correctly:

////////////////////////////////////////////////////////////////////////////////////////////

#include <Arduino.h>
#include <RadioLib.h>

// no need to configure pins, signals are routed to the radio internally
STM32WLx radio = new STM32WLx_Module();

// create the node instance on the US-915 band
// using the radio module and the encryption key
// make sure you are using the correct band
// based on your geographical location!
LoRaWANNode node(&radio, &US915);

// using PA0 as a placeholder since the E5-mini does not have the PC3 connection
//ref: https://github.com/Seeed-Studio/LoRaWan-E5-Node
static const uint32_t rfswitch_pins[] = {PA0, PA4, PA5}; //TX received in chirpstack
//static const uint32_t rfswitch_pins[] = {PA0, PA5, PA4}; // no response
//static const uint32_t rfswitch_pins[] = {PA4, PA0, PA5}; //TX received in chirpstack
//static const uint32_t rfswitch_pins[] = {PA4, PA5, PA0}; // no response
//static const uint32_t rfswitch_pins[] = {PA5, PA0, PA4}; //TX received in chirpstack
//static const uint32_t rfswitch_pins[] = {PA5, PA4, PA0}; //TX received in chirpstack

static const Module::RfSwitchMode_t rfswitch_table[] = {
  {STM32WLx::MODE_IDLE,  {LOW,  LOW,  LOW}},
  {STM32WLx::MODE_RX,    {HIGH, HIGH, LOW}},
  {STM32WLx::MODE_TX_LP, {HIGH, HIGH, HIGH}},
  {STM32WLx::MODE_TX_HP, {HIGH, LOW,  HIGH}},
  END_OF_MODE_TABLE,
};

void setup() {
  Serial.begin(115200);

  // set RF switch control configuration
  // this has to be done prior to calling begin()
  radio.setRfSwitchTable(rfswitch_pins, rfswitch_table);

  int state = radio.begin();
  if(state == RADIOLIB_ERR_NONE) {
    Serial.println(F("success!"));
  } else {
    Serial.print(F("failed, code "));
    Serial.println(state);
    while(true);
  }

  // first we need to initialize the device storage
  // this will reset all persistently stored parameters
  // NOTE: This should only be done once prior to first joining a network!
  //       After wiping persistent storage, you will also have to reset
  //       the end device in TTN and perform the join procedure again!
  // node.wipe();

  // application identifier - pre-LoRaWAN 1.1.0, this was called appEUI
  // when adding new end device in TTN, you will have to enter this number
  // you can pick any number you want, but it has to be unique
  uint64_t joinEUI = 0x12AD1011B0C0FFEE;

  // device identifier - this number can be anything
  // when adding new end device in TTN, you can generate this number,
  // or you can set any value you want, provided it is also unique
  uint64_t devEUI = 0x70B3D57ED005E120;

  // select some encryption keys which will be used to secure the communication
  // there are two of them - network key and application key
  // because LoRaWAN uses AES-128, the key MUST be 16 bytes (or characters) long

  // network key is the ASCII string "topSecretKey1234"
  uint8_t nwkKey[] = { 0x74, 0x6F, 0x70, 0x53, 0x65, 0x63, 0x72, 0x65,
                       0x74, 0x4B, 0x65, 0x79, 0x31, 0x32, 0x33, 0x34 };
                       //746F705365637265744B657931323334

  // application key is the ASCII string "aDifferentKeyABC"
  uint8_t appKey[] = { 0x61, 0x44, 0x69, 0x66, 0x66, 0x65, 0x72, 0x65,
                       0x6E, 0x74, 0x4B, 0x65, 0x79, 0x41, 0x42, 0x43 };
                       //61446966666572656E744B6579414243

  // prior to LoRaWAN 1.1.0, only a single "nwkKey" is used
  // when connecting to LoRaWAN 1.0 network, "appKey" will be disregarded
  // and can be set to NULL

  // some frequency bands only use a subset of the available channels
  // you can set the starting channel and their number
  // for example, the following corresponds to US915 FSB2 in TTN
  /*
    node.startChannel = 8;
    node.numChannels = 8;
  */

  // now we can start the activation
  // this can take up to 20 seconds, and requires a LoRaWAN gateway in range
  Serial.print(F("[LoRaWAN] Attempting over-the-air activation ... "));
  state = node.beginOTAA(joinEUI, devEUI, nwkKey, appKey);
  if(state == RADIOLIB_ERR_NONE) {
    Serial.println(F("success!"));
  } else {
    Serial.print(F("failed, code "));
    Serial.println(state);
    while(true);
  }

///////////////////////////////////////////////////////////////////////////////////////////
The remainder of the STM32WLx_Transmit_Interrupt.ino code is unchanged.

Serial terminal output:

[LoRa-E5] Initializing ... success!
[LoRaWAN] Attempting over-the-air activation ... failed, code -6

///////////////////////////////////////////////////////////////////////////////////////////

Is this a RF Switch issue?

@jgromes
Owner

jgromes commented Oct 11, 2023

Is this a RF Switch issue?

Seems likely, considering that the E5 board only seems to have two RF switch control pins instead of the 3 used by Nucleo STM32WL. So it would suggest it does not have the high-power/low-power transmit modes of the original STM32WL (and therefore you would have to modify the rfswitch_table), but that's just a guess on my side, probably best to clarify with the manufacturer.

Another thing is that I haven't tested against chirpstack, just TTN. Should be noted that in TTN, for US-915 frequencies you have to select a subset of all the available channels to be used. Is that also the case in chirpstack?

Also, maybe try enabling debug mode to see if there is more information.

@davidfobar
Author

Here is a JoinRequest and JoinAccept from Chirpstack:
[
{
"rxInfo": [
{
"gatewayID": "LPfxEUEQAA4=",
"time": "2023-10-11T17:45:24.901554010Z",
"timeSinceGPSEpoch": null,
"rssi": -113,
"loRaSNR": -7.2,
"channel": 6,
"rfChain": 1,
"board": 0,
"antenna": 0,
"location": {
"latitude": 0,
"longitude": 0,
"altitude": 0,
"source": "UNKNOWN",
"accuracy": 0
},
"fineTimestampType": "NONE",
"context": "MFmmIA==",
"uplinkID": "s+zHL1pyRG2krTT5F10EVg==",
"crcStatus": "CRC_OK"
}
],
"txInfo": {
"frequency": 905100000,
"modulation": "LORA",
"loRaModulationInfo": {
"bandwidth": 125,
"spreadingFactor": 10,
"codeRate": "4/5",
"polarizationInversion": false
}
},
"phyPayload": {
"mhdr": {
"mType": "JoinRequest",
"major": "LoRaWANR1"
},
"macPayload": {
"joinEUI": "12ad1011b0c0ffee",
"devEUI": "70b3d57ed005e120",
"devNonce": 0
},
"mic": "1fad4265"
}
},
{
"txInfo": {
"frequency": 925100000,
"power": 20,
"modulation": "LORA",
"loRaModulationInfo": {
"bandwidth": 500,
"spreadingFactor": 10,
"codeRate": "4/5",
"polarizationInversion": true
},
"board": 0,
"antenna": 0,
"timing": "DELAY",
"delayTimingInfo": {
"delay": "5s"
},
"context": "LpnAIQ=="
},
"phyPayload": {
"mhdr": {
"mType": "JoinAccept",
"major": "LoRaWANR1"
},
"macPayload": {
"bytes": "YxrZt0DgIXMeTdKMoMlTlxc0TYTD0Snbdb3E0w=="
},
"mic": "18242c24"
}
}
]

@davidfobar
Author

Also, from an example provided by STM32CubeIDE I was able to figure out the RFSwitchTable, but it didn't seem to help yet. I am still using PA0 as a placeholder since RadioLib expects the table to have 3 pins - changing RFSWITCH_MAX_PINS to 2 does not compile, there are parts of the library that still expect 3 values.

static const uint32_t rfswitch_pins[] = {PA4, PA5, PA0};

static const Module::RfSwitchMode_t rfswitch_table[] = {
{STM32WLx::MODE_IDLE, {LOW, LOW, LOW}},
{STM32WLx::MODE_RX, {HIGH, LOW, LOW}},
{STM32WLx::MODE_TX_LP, {HIGH, HIGH, LOW}},
{STM32WLx::MODE_TX_HP, {LOW, HIGH, LOW}},
END_OF_MODE_TABLE,
};

@jgromes
Owner

jgromes commented Oct 11, 2023

You're not meant to change the value of RFSWITCH_MAX_PINS. But there's nothing preventing you from only using two pins for the RF switch; RadioLib has a "not connected" macro: RADIOLIB_NC. That's actually used in the default RF switch table:

RadioLib/src/Module.cpp

Lines 492 to 506 in ddcce42

void Module::setRfSwitchPins(uint32_t rxEn, uint32_t txEn) {
  // This can be on the stack, setRfSwitchTable copies the contents
  const uint32_t pins[] = {
    rxEn, txEn, RADIOLIB_NC,
  };
  // This must be static, since setRfSwitchTable stores a reference.
  static const RfSwitchMode_t table[] = {
    { MODE_IDLE, {this->hal->GpioLevelLow, this->hal->GpioLevelLow} },
    { MODE_RX, {this->hal->GpioLevelHigh, this->hal->GpioLevelLow} },
    { MODE_TX, {this->hal->GpioLevelLow, this->hal->GpioLevelHigh} },
    END_OF_MODE_TABLE,
  };
  setRfSwitchTable(pins, table);
}

Regarding the join request, I think this is the issue:

"txInfo": {
"frequency": 925100000,
"power": 20,
"modulation": "LORA",
"loRaModulationInfo": {
"bandwidth": 500,
"spreadingFactor": 10,
"codeRate": "4/5",
"polarizationInversion": true
},

If I'm reading that correctly, it seems like your gateway/application server has decided to send the join accept reply at 925.1 MHz, which is downlink channel 3. However, the node used uplink at 905.1 MHz, channel number 14. So the downlink channel should have been 14 % 8 = 6, not 3.
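
The channel arithmetic above can be checked with a short sketch. It assumes the standard US915 plan for 125 kHz uplinks (uplink channel n at 902.3 MHz + n × 200 kHz, RX1 downlink channel k at 923.3 MHz + k × 600 kHz, with k = n % 8). The constant and function names are illustrative only and are not part of RadioLib.

```cpp
#include <cstdint>

// Assumed US915 plan constants: 125 kHz uplink channels 0-63,
// 500 kHz downlink channels 0-7.
constexpr uint32_t kUplinkBaseHz   = 902300000UL;
constexpr uint32_t kUplinkStepHz   = 200000UL;
constexpr uint32_t kDownlinkBaseHz = 923300000UL;
constexpr uint32_t kDownlinkStepHz = 600000UL;

// Uplink channel index for a 125 kHz uplink frequency given in Hz.
constexpr uint32_t uplinkChannel(uint32_t freqHz) {
  return (freqHz - kUplinkBaseHz) / kUplinkStepHz;
}

// RX1 downlink channel: the uplink channel index modulo 8.
constexpr uint32_t rx1Channel(uint32_t uplinkCh) {
  return uplinkCh % 8;
}

// RX1 downlink frequency in Hz expected for a given uplink frequency.
constexpr uint32_t rx1FrequencyHz(uint32_t uplinkFreqHz) {
  return kDownlinkBaseHz + rx1Channel(uplinkChannel(uplinkFreqHz)) * kDownlinkStepHz;
}
```

Under these assumptions, `uplinkChannel(905100000UL)` is 14 and `rx1FrequencyHz(905100000UL)` is 926.9 MHz, whereas 925.1 MHz corresponds to downlink channel 3.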

@davidfobar
Author

I enabled debug printing and see the following:

///////////////////////////////////////////////////////////////////////////

[SX1278] Initializing ... GPIO pre-transfer timeout, is it connected?
GPIO pre-transfer timeout, is it connected?
GPIO pre-transfer timeout, is it connected?

RadioLib Debug Info
Version: 6.2.0.0
Platform: Arduino STM32 (official)
Compiled: Oct 13 2023 16:05:23

Found SX126x: RADIOLIB_SX126X_REG_VERSION_STRING:
0000320 53 58 31 32 36 31 20 54 4b 46 20 31 41 31 30 00 | SX1261 TKF 1A10.

M SX126x
success!
[LoRaWAN] Attempting over-the-air activation ... Channel frequency UL = MHz
Timeout in 556032 us

//////////////////////////////////////////////////////////////////////////////////////////////////

I cannot find where "GPIO pre-transfer timeout, is it connected?" is printed from, but it may be an initialization artifact, since it eventually clears. What I also cannot figure out is why the channels and frequencies are not printing. My thought is that maybe the RX channel is not being set properly? Any thoughts on where I can add a debug statement to investigate?

@jgromes
Owner

jgromes commented Oct 14, 2023

The GPIO timeouts are checked here:

RadioLib/src/Module.cpp

Lines 311 to 330 in ddcce42

// wait for GPIO to go high and then low
if(waitForGpio) {
  if(this->gpioPin == RADIOLIB_NC) {
    this->hal->delay(1);
  } else {
    this->hal->delayMicroseconds(1);
    uint32_t start = this->hal->millis();
    while(this->hal->digitalRead(this->gpioPin)) {
      this->hal->yield();
      if(this->hal->millis() - start >= timeout) {
        RADIOLIB_DEBUG_PRINTLN("GPIO post-transfer timeout, is it connected?");
        #if !defined(RADIOLIB_STATIC_ONLY)
          delete[] buffOut;
          delete[] buffIn;
        #endif
        return(RADIOLIB_ERR_SPI_CMD_TIMEOUT);
      }
    }
  }
}

It's very strange, since that message usually means the user provided incorrect BUSY pin number or there's a wiring issue. On STM32WL, this should be handled by the SX126x peripheral.

The missing frequency in debug is most likely caused by the STM32 Arduino core not supporting printf for floats (I got so used to ESP32 doing that I completely forgot other platforms might not be able to do so). Anyway I updated the debug printing to fix this, could you try again?
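
For platforms whose printf lacks float support, one common workaround is to build the string from integer parts. The helper below is a hypothetical sketch of that idea (the name `formatMHz` is made up for illustration); it is not the change that was actually pushed to RadioLib.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format a frequency given in Hz as MHz with three
// decimals, using only integer conversions so that no %f support is needed.
std::string formatMHz(uint32_t freqHz) {
  char buf[16];
  unsigned long whole = freqHz / 1000000UL;             // whole MHz
  unsigned long frac  = (freqHz % 1000000UL) / 1000UL;  // kHz remainder, truncated
  std::snprintf(buf, sizeof(buf), "%lu.%03lu", whole, frac);
  return std::string(buf);
}
```

The `%03lu` conversion zero-pads the fractional part, so 902300000 Hz is rendered with a fraction of `300` rather than `3`.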

Another strange thing is that after the uplink, the debug should print downlink frequency, if that doesn't happen then the program never reached that point at all.

@davidfobar
Author

davidfobar commented Oct 14, 2023

The printing of floats now works
////////////////////////////////////////////////////////////////
Channel frequency UL = 904.500 MHz
Timeout in 556032 us
Channel frequency DL = 925.100 MHz
failed, code -6
////////////////////////////////////////////////////////////////
My only guess is that the IRQ is not being correctly implemented for this chip. Is that a HAL thing?

I added some timestamps (ms) and debug messages to the beginOTAA function; here is the result of a single attempt:
0: [LoRaWAN] Starting OTAA join procedure...
2: Setting uplink/downlink frequencies and datarates...
Channel frequency UL = 914.900 MHz
74: Configuring devNonce...
123: Building join request message...
128: Sending join request...
Timeout in 556032 us
510: Configuring Downlink channel...
Channel frequency DL = 927.500 MHz
521: Starting receive...
530: Waiting for join accept...
8522: Join accept timeout!
failed, code -6

//////////////////////////////////////////////////////////
The code for reference:
`int16_t LoRaWANNode::beginOTAA(uint64_t joinEUI, uint64_t devEUI, uint8_t* nwkKey, uint8_t* appKey, bool force) {
// check if we actually need to send the join request
Module* mod = this->phyLayer->getMod();
if(!force && (mod->hal->getPersistentParameter<uint32_t>(RADIOLIB_PERSISTENT_PARAM_LORAWAN_MAGIC_ID) == RADIOLIB_LORAWAN_MAGIC)) {
// the device has joined already, we can just pull the data from persistent storage
return(this->begin());
}

// start a timer to include with the debug print statements
uint32_t debugstart = mod->hal->millis();

//print the timestamp and the action
RADIOLIB_DEBUG_PRINTLN("%lu: [LoRaWAN] Starting OTAA join procedure...", mod->hal->millis() - debugstart);
// set the physical layer configuration
int16_t state = this->setPhyProperties();
RADIOLIB_ASSERT(state);

RADIOLIB_DEBUG_PRINTLN("%lu: Setting uplink/downlink frequencies and datarates...", mod->hal->millis() - debugstart);
// setup uplink/downlink frequencies and datarates
state = this->setupChannels();
RADIOLIB_ASSERT(state);

RADIOLIB_DEBUG_PRINTLN("%lu: Configuring devNonce...", mod->hal->millis() - debugstart);
// get dev nonce from persistent storage and increment it
uint16_t devNonce = mod->hal->getPersistentParameter<uint16_t>(RADIOLIB_PERSISTENT_PARAM_LORAWAN_DEV_NONCE_ID);
mod->hal->setPersistentParameter<uint16_t>(RADIOLIB_PERSISTENT_PARAM_LORAWAN_DEV_NONCE_ID, devNonce + 1);

// build the join-request message
uint8_t joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_LEN];

RADIOLIB_DEBUG_PRINTLN("%lu: Building join request message...", mod->hal->millis() - debugstart);
// set the packet fields
joinRequestMsg[0] = RADIOLIB_LORAWAN_MHDR_MTYPE_JOIN_REQUEST | RADIOLIB_LORAWAN_MHDR_MAJOR_R1;
LoRaWANNode::hton<uint64_t>(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_JOIN_EUI_POS], joinEUI);
LoRaWANNode::hton<uint64_t>(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_DEV_EUI_POS], devEUI);
LoRaWANNode::hton<uint16_t>(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_DEV_NONCE_POS], devNonce);

// add the authentication code
uint32_t mic = this->generateMIC(joinRequestMsg, RADIOLIB_LORAWAN_JOIN_REQUEST_LEN - sizeof(uint32_t), nwkKey);
LoRaWANNode::hton<uint32_t>(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_LEN - sizeof(uint32_t)], mic);

RADIOLIB_DEBUG_PRINTLN("%lu: Sending join request...", mod->hal->millis() - debugstart);
// send it
state = this->phyLayer->transmit(joinRequestMsg, RADIOLIB_LORAWAN_JOIN_REQUEST_LEN);
RADIOLIB_ASSERT(state);

RADIOLIB_DEBUG_PRINTLN("%lu: Configuring Downlink channel...", mod->hal->millis() - debugstart);
// configure for downlink with default configuration
state = this->configureChannel(RADIOLIB_LORAWAN_CHANNEL_DIR_DOWNLINK);
RADIOLIB_ASSERT(state);

// set the function that will be called when the reply is received
this->phyLayer->setPacketReceivedAction(LoRaWANNodeOnDownlink);

// downlink messages are sent with inverted IQ
// TODO use downlink() for this
if(!this->FSK) {
state = this->phyLayer->invertIQ(true);
RADIOLIB_ASSERT(state);
}

RADIOLIB_DEBUG_PRINTLN("%lu: Starting receive...", mod->hal->millis() - debugstart);
// start receiving
uint32_t start = mod->hal->millis();
downlinkReceived = false;
state = this->phyLayer->startReceive();
RADIOLIB_ASSERT(state);

RADIOLIB_DEBUG_PRINTLN("%lu: Waiting for join accept...", mod->hal->millis() - debugstart);
// wait for the reply or timeout
while(!downlinkReceived) {
if(mod->hal->millis() - start >= RADIOLIB_LORAWAN_JOIN_ACCEPT_DELAY_2_MS + 2000) {
RADIOLIB_DEBUG_PRINTLN("%lu: Join accept timeout!", mod->hal->millis() - debugstart);
downlinkReceived = false;
if(!this->FSK) {
this->phyLayer->invertIQ(false);
}
return(RADIOLIB_ERR_RX_TIMEOUT);
}
}`
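
The wait loop above relies on unsigned subtraction of millisecond timestamps. A self-contained sketch of that check follows. The 6000 ms value for the second join-accept delay is an assumption based on the LoRaWAN default JOIN_ACCEPT_DELAY2, and it is consistent with the roughly 8 s wait in the log (521 ms to 8522 ms); the names are illustrative.

```cpp
#include <cstdint>

// Assumed value of the second join-accept delay (LoRaWAN default is 6 s).
constexpr uint32_t kJoinAcceptDelay2Ms = 6000;
// Extra margin added on top of the delay by the loop above.
constexpr uint32_t kMarginMs = 2000;

// True once at least timeoutMs have elapsed since startMs. Unsigned
// subtraction keeps the result correct even if the millisecond counter
// wraps around between startMs and nowMs.
constexpr bool hasTimedOut(uint32_t nowMs, uint32_t startMs, uint32_t timeoutMs) {
  return static_cast<uint32_t>(nowMs - startMs) >= timeoutMs;
}
```

Comparing `nowMs >= startMs + timeoutMs` instead would misbehave near counter rollover, which is why the elapsed time is computed first.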

@davidfobar
Author

I found the issue!!!

// set the function that will be called when the reply is received
this->phyLayer->setPacketReceivedAction(LoRaWANNodeOnDownlink);
STM32WLx* mod2 = (STM32WLx*)this->phyLayer;
mod2->setDio1Action(LoRaWANNodeOnDownlink);

The phyLayer was calling the setDio1Action from SX126x.h rather than the one from STM32WLx.h.

@jgromes
Owner

jgromes commented Oct 14, 2023

@davidfobar you're right, for STM32WLx the interrupt will indeed not work. The problem was that the methods like setPacketReceivedAction did not exist for STM32WL, so it defaulted to those in its superclass, which is SX126x. I pushed a fix addressing this, could you check it solved the issue?
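
The behaviour described here can be reproduced in a simplified model. All class names below are hypothetical stand-ins rather than RadioLib's actual hierarchy: a child class that only redefines a non-virtual helper still gets the parent's helper when the call goes through a virtual method it does not override.

```cpp
#include <string>

// Hypothetical stand-in for a generic physical-layer interface.
struct PhyLayer {
  virtual ~PhyLayer() = default;
  virtual std::string setPacketReceivedAction() = 0;
};

// Hypothetical stand-in for the parent radio class.
struct ParentRadio : PhyLayer {
  // Non-virtual: calls made from ParentRadio code are bound at compile time.
  std::string setDio1Action() { return "parent: attach pin interrupt"; }

  std::string setPacketReceivedAction() override {
    return setDio1Action();  // always resolves to ParentRadio::setDio1Action
  }
};

// Child that only hides setDio1Action; the parent's virtual method still runs.
struct ChildRadioBroken : ParentRadio {
  std::string setDio1Action() { return "child: attach radio IRQ"; }
};

// Child that also overrides the virtual entry point, as the fix is described.
struct ChildRadioFixed : ParentRadio {
  std::string setDio1Action() { return "child: attach radio IRQ"; }

  std::string setPacketReceivedAction() override {
    return setDio1Action();  // resolves to ChildRadioFixed::setDio1Action
  }
};

// Mimics code that only holds a reference to the generic interface.
std::string registerDownlinkHandler(PhyLayer& phy) {
  return phy.setPacketReceivedAction();
}
```

In this model, `registerDownlinkHandler` on a `ChildRadioBroken` ends up in the parent's helper, while `ChildRadioFixed` reaches its own.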

@jgromes jgromes added the bug Something isn't working label Oct 14, 2023
@davidfobar
Author

I confirmed that the STM32WLx.h changes are good. I have a new issue:

Channel frequency DL = 925.700 MHz
downlinkMsg:
0000000 49 00 00 00 00 01 86 78 f3 00 01 00 00 00 00 1d | I......x........
0000010 40 86 78 f3 00 02 01 00 3a be 0a d4 e0 52 23 57 | @.x.....:....R#W
0000020 5e 5b b7 31 6d 8a 5d 30 1f 13 c2 ba 41 38 39 a2 | ^[.1m.]0....A89.
0000030 57 | W
MIC mismatch, expected 19776fd5, got 57a23938
failed, code -7
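
The "got" value in this log can be reproduced by reading the last four bytes of the dumped buffer (38 39 a2 57) as a little-endian integer. The helper below is an illustrative sketch of that byte-order handling, with a hypothetical name; it does not compute the expected MIC, which depends on the AES-CMAC of the message under the session key.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read the trailing 4 bytes of a frame as a
// little-endian 32-bit value. Returns 0 if the frame is too short.
uint32_t trailingMicLE(const uint8_t* frame, size_t len) {
  if (frame == nullptr || len < 4) {
    return 0;
  }
  const uint8_t* p = frame + (len - 4);
  return  static_cast<uint32_t>(p[0])
       | (static_cast<uint32_t>(p[1]) << 8)
       | (static_cast<uint32_t>(p[2]) << 16)
       | (static_cast<uint32_t>(p[3]) << 24);
}
```

Applied to a buffer ending in `38 39 a2 57`, this yields `0x57a23938`, matching the received value printed above.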

///////////////////////////////////////////////////////////////////////////////////////
From the Chirpstack side, this is what is being sent:
[Screenshot: the downlink frame as shown in ChirpStack]

@gpabdo

gpabdo commented Oct 14, 2023

How do you enable DEBUG logging?

@davidfobar
Author

davidfobar commented Oct 14, 2023

@gpabdo
Throw this near the top of BuildOpt.h

#define RADIOLIB_DEBUG

I've then added my own debug statements with
RADIOLIB_DEBUG_PRINTLN("... %d", val);

@gpabdo

gpabdo commented Oct 14, 2023

@gpabdo Throw this near the top of BuildOpt.h

#define RADIOLIB_DEBUG

I've then added my own debug statements with RADIOLIB_DEBUG_PRINTLN("... %d", val);

Thanks so much!

@davidfobar
Author

To help further my MIC mismatch issue, this is the key reported by both the server and end node:
Key: 03 ba e1 46 1f cb b2 a1 57 97 c4 26 fb 51 5e 23

I don't understand why there is a downlink to begin with: chirpstack is reporting the hello world message as unconfirmed, and I am not attempting to send anything either.

@jgromes
Owner

jgromes commented Oct 14, 2023

@davidfobar when exactly is this issue appearing? Is it during the join procedure or afterwards?

The downlink shown in the screenshot seems to be empty apart from a RekeyInd command. But the format of that command is invalid, as it sets the minor version to 0, which is reserved by the LoRaWAN specification.

@davidfobar
Author

It only happens after joining is complete. I have the end node looping, sending incremental hello world messages and then looking for a downlink message. I can get the uplink message out of chirpstack without issue. I am now trying to make sure that I can send messages back to the end node, but I saw this issue first.

@jgromes
Owner

jgromes commented Oct 14, 2023

Which LoRaWAN version is chirpstack configured for? It's sending a RekeyInd MAC command, so I would guess it's 1.1, but you didn't specify it.

@davidfobar
Author

I have the device profile configured for 1.1. Other options are available, but this was the only configuration that would allow RadioLib to connect without a MIC issue using the end node example.

@jgromes
Owner

jgromes commented Oct 15, 2023

I have the device profile configured for 1.1

Then it is rather strange that chirpstack sends RekeyInd with revision set to 0. MIC calculation changed quite a lot across different LoRaWAN versions, so if there is some mismatch there it could cause this issue. I'm confident it's correct on RadioLib end, since I've tested extensively against TTN on v1.1.

I don't understand why there is a downlink to begin with

It's just the MAC command being sent from server to the node without any user data, that's not uncommon.

@davidfobar
Author

davidfobar commented Oct 16, 2023

Updating to chirpstack v4 resolved the RekeyInd MAC command issue. Thank you for adding support for the STM32WLE5 interrupts.

I will continue the discussion of bad downlink MICs in a new issue.

@jgromes jgromes added the resolved Issue was resolved (e.g. bug fixed, or feature implemented) label Oct 16, 2023