Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LoRaWAN] Implement full session persistence & more v1.1 specification #835

Merged
merged 6 commits into from
Oct 23, 2023
Merged

[LoRaWAN] Implement full session persistence & more v1.1 specification #835

merged 6 commits into from
Oct 23, 2023

Conversation

StevenCellist
Copy link
Collaborator

Hi Jan, great work on the LoRaWAN integration. As someone pointed out to me, the first 90% is working, but now we need to add the other 90%. Here's my contribution!
I've got a list of items that still need to be done but which are more closely related to the actual hardware integration regarding frequency, datarate etc. Once we've gone through this PR, I'll open up some issues so we can tackle those as well.

Implemented the following items:

  • check JoinNonce
  • implement downlink frame counters (32-bit), checked as well
  • implement sending multiple MAC commands
  • save all relevant session parameters to persistent storage and load upon session restore
  • only allow FPort = 0 when sending MAC-only payload
  • implement LoRaWAN v1.1 errata (linked below the v1.1 specification download)
  • added new function deleteMacCommand that properly deletes an enqueued MAC command
  • renamed APB to ABP

Minor note about the change in LoRaWANMacCommand_t (size_t -> uint8_t: this saves precious NVM storage)

I just realized that the method popMacCommand did not correctly remove items from the queue - this should solve the problem
@HeadBoffin
Copy link
Collaborator

Smoke test is fine - running it overnight to accumulate some uplinks & subsequent potential downlinks from TTN to process

@HeadBoffin
Copy link
Collaborator

@StevenCellist, we really need to get ADR sorted for this not to break local networks - SF12 isn't good citizenship!

@StevenCellist
Copy link
Collaborator Author

@jgromes what is your plan on implementing the PR? Is it under review (which I sure believe takes some time!) or are you waiting for further development?
I would love to add other things & continue the work, but the sooner essentials are integrated into RadioLib, the less people start working with RF- and LNS-unfriendly devices :)

@jgromes
Copy link
Owner

jgromes commented Sep 25, 2023

@StevenCellist sorry for being a bit less responsive than usual - I will try to review the PR ASAP. However, there is one major point that I think can be addressed right away: not all platforms supported by RadioLib have persistent storage support. How will those be affected? Since this PR includes checking things such as the last used nonce, I would expect those to fail to join now.

Also, in general I think it would be better to have an option to bypass these checks. Even TTN has the option to allow duplicate nonces.

@StevenCellist
Copy link
Collaborator Author

StevenCellist commented Sep 25, 2023

@jgromes no worries, take your time! But a simple heads up as to when you'll likely get around to it is welcome :)

Regarding the use of persistent storage, there are two options, of which I strongly prefer nr.1:

  1. Restrict the use of LoRaWAN to devices with persistent storage; either internal or external. Adding a few (mega)bytes of PSRAM or whatever is simple and low-cost, and is less hacky then disabling all security checks on the network side - which exist for a reason too. I'm sure @HeadBoffin will like to chime in on this as well.
  2. Add a whole range of #defines and un-readability, and comments to explain all caveats.

But yes, can add a LoRaWAN-type godmode that just disables the security checks if that's what you want for RadioLib. However I'd heavily discourage disabling the whole NVM.

BTW - even your code would fail at that point, as you retrieve the frame counter for uplinks from NVM ;)

@HeadBoffin
Copy link
Collaborator

There is providing options to bypass checks and then there are options for bypassing checks.

GodMode that a developer has to turn on with some mild effort is one thing. Providing a magic switch on the front door is likely to end up with the less technically savvy shooting themselves in the foot trying to figure out what's going on and not likely to be well received for those helping them debug when they find out the LNS has been subverted and that the device is behaving more like ABP than OTAA. Or has managed to run totally amok and is now saturating the airwaves at SF12.

But that's more about @jgromes policy / view on the LW implementation. It would be nice to have a very compliant clean stack that can do Arduino or bare metal. And I hear on the grapevine that someone who can run the compliance tests is happy to help out. I doubt there are shed loads of cash awaiting for creating an almost totally certifiable stack, in the LoRaWAN Alliance sense, but there is definitely some kudos to be had.

That said, there are SAMD EEPROM libraries which are the main missing link and a mop up of any missing ARM boards can be done after. It was something I was looking at before the firehose of coding output from @StevenCellist hit my desk and I've not been able to keep up since.

Alternatively supporting v1.0.3 with its random DevNonce's could be reviewed. v1.1 is so much more a super set of the v1.0.x that this shouldn't be too onerous. Another potential is to provide detailed instructions on resetting the DevNonce - via the console or the CLI - so that a device can join and do its job but have to rejoin if power goes out. Mostly power doesn't go out, most of this activity is because ESP32 deep sleep & preserving state in LMIC is a constantly arising issue, but having to spot a device is trying to rejoin wouldn't be great. For production it's likely it will hit the next number in the sequence reasonably quickly anyway, depending on how TR007 is implemented. So at the very least, having a sleepy ESP32 stack available to the world is a good thing.

@jgromes
Copy link
Owner

jgromes commented Sep 25, 2023

@StevenCellist regarding option 1, i.e. adding persistent storage - the major disadvantage is that this isn't portable. Until now, RadioLib has worked on any off-the-shelf development kit that has exposed SPI bus and a couple GPIO pins to connect to the radio.

as you retrieve the frame counter for uplinks from NVM

Agree, but it would be able to join, "just" the frame counters would be always set to the same value; I wonder if the network servers actually check that ...

TL;DR I think the best course of action now to unblock this PR is to disable LoRaWAN on platforms that do not have non-volatile storage provided by their Arduino core.

@HeadBoffin

That said, there are SAMD EEPROM libraries which are the main missing link and a mop up of any missing ARM boards can be done after.

SAMD is the big one, but also the Arduino mbed boards (Nano 33, Portenta H7 and RP2040) don't support any sort of EEPROM emulation. I find that puzzling since ESP32 has had that for years, and for a few bytes like this I definitely do not want to have a filesystem. A simple EEPROM flash emulation with rudimentary wear leveling would be my go-to solution in a less generic environment. But overall I don't like the idea of either relying on external dependencies (that are not part of the Arduino core) or implementing our own mechanism.

On a side note:

It would be nice to have a very compliant clean stack that can do Arduino or bare metal

I don't think a full compliance is the primary goal here; the goal is to have a simple to use LoRaWAN implementation that can be deployed quickly on various hardware.

@HeadBoffin
Copy link
Collaborator

"just" the frame counters would be always set to the same value; I wonder if the network servers actually check that ...

Yup, they definitely do - the #1 gotcha for APB but not an issue for an OTAA join where they are reset.

TL;DR I think the best course of action now to unblock this PR is to disable LoRaWAN on platforms that do not have non-volatile storage provided by their Arduino core.

It moves forward to let Steven do more MAC stuff whilst I look at SAM EEPROM.

That said, there are SAMD EEPROM libraries which are the main missing link and a mop up of any missing ARM boards can be done after.
SAMD is the big one, but also the Arduino mbed boards (Nano 33, Portenta H7 and RP2040) don't support any sort of EEPROM emulation.

It can be emulated with the libraries available. Nano 33 is SAMD and there are NVM implementations for RP2040. I can check in to STM32H747.

I find that puzzling since ESP32 has had that for years, and for a few bytes like this I definitely do not want to have a filesystem.

Various ARM implementations have had NVM for years but for what ever reason they've not surfaced in to the Arduino world. In LoRaWAN land, ESP32 has been a third class citizen until now - LMIC isn't easy to save state as mostly AVR & SAMD boards deep sleep without losing RAM so there's never been an incentive to get a good implementation in to the main library. It's all a matter of perspective. You & Steven have turned this around. No file systems need be harmed in the creation of NVM.

A simple EEPROM flash emulation with rudimentary wear leveling would be my go-to solution in a less generic environment. But overall I don't like the idea of either relying on external dependencies (that are not part of the Arduino core) or implementing our own mechanism.

Theses are libraries that are in the Arduino eco system - they just don't work as an overlay to the original EEPROM library so not as simple as the ESP32 drop in replacement. An abstraction layer, which is what almost all of Arduino is anyway shouldn't be a stretch. The benefits in return are a whole extra range of boards to use.

But as I say, let's get started with the boards you are happy with and take it from there. Rome wasn't built in a day.

It would be nice to have a very compliant clean stack that can do Arduino or bare metal
I don't think a full compliance is the primary goal here; the goal is to have a simple to use LoRaWAN implementation that can be deployed quickly on various hardware.

Very compliant ≠ full compliance. I know the spec. I know what compliance takes. I'm not that keen nor naieve to suggest working towards something where you've done the first 90%, Steven seems happy to work on the next 90% and then we all have to work on the last 90%. It would be a fools errand, some of the edge cases defy belief, but being a good TTN citizen, ie not shredding the local airwaves / network is a reasonable goal.

Copy link
Owner

@jgromes jgromes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally I had a bit of time to review this, I left a couple comments. Coming back to the platforms without NVM support, I realized that we can keep supporting them by forcing the to rejoin every time. The counters can be saved in volatile storage. So after this is merged, I will address that.

src/protocols/LoRaWAN/LoRaWAN.cpp Show resolved Hide resolved

// assume a 16-bit to 32-bit rollover when difference in LSB is smaller than MAX_FCNT_GAP
// if that isn't the case and the received fcnt is smaller or equal to the last heard fcnt, then error
if (fcnt16 <= fcntDownPrev && 0xFFFF - (uint16_t)fcntDownPrev + fcnt16 > RADIOLIB_LORAWAN_MAX_FCNT_GAP) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add parentheses to make the order of operations explicit (this can lead to warning or error for some compiler configurations).

src/protocols/LoRaWAN/LoRaWAN.cpp Show resolved Hide resolved
src/protocols/LoRaWAN/LoRaWAN.cpp Show resolved Hide resolved
@HeadBoffin
Copy link
Collaborator

Coming back to the platforms without NVM support, I realized that we can keep supporting them by forcing the to rejoin every time.

With an incrementing DevNonce, that leads to a potential mess but with the other platforms their sleep current is very low & they save RAM so no need to rejoin.

The counters can be saved in volatile storage

Works for OTAA, works for ABP if counters checks are turned off. I suspect many users deploying on Arduino environment will be OK / not care until it comes to bite them.

. So after this is merged, I will address that.

I have SAMD NVM as work in progress - please let me know what the plan is so we don't duplicate effort.

@jgromes
Copy link
Owner

jgromes commented Sep 29, 2023

@HeadBoffin

I have SAMD NVM as work in progress - please let me know what the plan is so we don't duplicate effort.

That's appreciated, but I don't think it should be directly part of RadioLib. I think that should be work for the SAMD Arduino cores, as their "EEPROM library" emulation.

@HeadBoffin
Copy link
Collaborator

@HeadBoffin

I have SAMD NVM as work in progress - please let me know what the plan is so we don't duplicate effort.

That's appreciated, but I don't think it should be directly part of RadioLib. I think that should be work for the SAMD Arduino cores, as their "EEPROM library" emulation.

I wasn't thinking of doing anything specific, just implementing the best library candidate - which will require some mods for things like:

#if !defined(RADIOLIB_EEPROM_UNSUPPORTED)
#include <EEPROM.h>
#endif

in ArduinoHal.cpp as at first glance not many of the existing libraries use an EEPROM.h header. But easily done by expanding the #if for board definitions.

@StevenCellist
Copy link
Collaborator Author

@jgromes your comments should be resolved now

@StevenCellist
Copy link
Collaborator Author

Ah... Strings. Copied that from the examples but that was not my brightest idea.

@StevenCellist
Copy link
Collaborator Author

@jgromes I am not sure what is causing the checks to fail to install arduino-cli? I assume it is unrelated to my PR; is it something you can look into?

@jgromes
Copy link
Owner

jgromes commented Oct 20, 2023

@StevenCellist I'll look into it, the important thing is that esp-idf and RPi builds succeed, so there is no issue in the PR.

@jgromes jgromes merged commit 556f37f into jgromes:master Oct 23, 2023
29 checks passed
@jgromes
Copy link
Owner

jgromes commented Oct 23, 2023

@StevenCellist PR merged, thank you so much for this contribution! And also thanks to @HeadBoffin for the help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants