RX Timing is wrong for SF12 #442

Closed
terrillmoore opened this issue Sep 9, 2019 · 12 comments
@terrillmoore
Member

terrillmoore commented Sep 9, 2019

LoRaWAN MACs calculate a start time for starting the receiver.

It looks like SF12 is wrong for the LMIC -- too late. I found this with EU RX testing. Here are the times in ms.

| Code | SF12 | SF11 | SF10 | SF9 | SF8 | SF7 |
|---|---|---|---|---|---|---|
| LMIC | 41 | 8 | -8 | -16 | -20 | -22 |
| Semtech reference code | 33 | 17 | 5 | -2 | -8 | -9 |
| mbed | 88 | 39 | 14 | 2 | -4 | -7 |

(Semtech reference: RegionCommon.c RegionCommonComputeRxWindowParameters()).

It's OK if you start early, but not OK if you start late, as you can miss the start of the packet.

In fact, the current LMIC only shows problems at SF12.

Part of the problem is that both use a specific number of symbols as "min syms". But LMIC uses 5 whereas Semtech uses 6. If LMIC uses 6, the table becomes:

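| Code | SF12 | SF11 | SF10 | SF9 | SF8 | SF7 |
|---|---|---|---|---|---|---|
| LMIC | 41 | 8 | -8 | -16 | -20 | -22 |
| Semtech reference code | 33 | 17 | 5 | -2 | -8 | -9 |
| mbed | 88 | 39 | 14 | 2 | -4 | -7 |
| LMIC with min RxSym = 6 | 29 | -3 | -19 | -27 | -32 | -34 |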

We'll test tightening this up, but also will test just changing RxSym to 6.
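For context on the millisecond offsets in the tables above, here is a minimal sketch of LoRa symbol duration per spreading factor. It assumes 125 kHz bandwidth (the usual EU868 setting for SF7 through SF12) and an 8-symbol programmed preamble; the function name is illustrative and is not taken from LMIC or the Semtech code.

```python
def lora_symbol_time_ms(sf, bw_hz=125_000):
    """Duration of one LoRa symbol in milliseconds: 2^SF / BW."""
    return (2 ** sf) / bw_hz * 1000.0


if __name__ == "__main__":
    # Assumption: 8 programmed preamble symbols, 125 kHz bandwidth.
    for sf in range(12, 6, -1):
        t_sym = lora_symbol_time_ms(sf)
        print(f"SF{sf}: symbol {t_sym:.3f} ms, "
              f"8-symbol preamble {8 * t_sym:.1f} ms")
```

Comparing a start-time error against the symbol and preamble durations gives a rough feel for how much of the preamble a late window would miss.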

@terrillmoore
Member Author

It appears that there's more to this than meets the eye. See #483 (comment), #311 and #477.

@terrillmoore
Member Author

More testing revealed that grounding (and ground problems) can mess up the SX1276 front end (no big surprise). See new note at #483 (comment).

However, even with crummy grounds, it was clear that starting RX before the downlink transmission begins works better than starting while the transmission is already in progress. As far as I can tell, there is no power advantage to starting late unless there is a downlink. If there's a downlink, then starting late means you run the receiver for less time just watching the preamble. However, if there's no downlink, it doesn't matter at all. And most of the time, there's no downlink.

So I plan to push a change to the 3.0.99 branch for test, that will move the RX window to start before the TX window. We'll set RXSYMs high enough to allow us to be sure we have some overlap with the preamble. Even at SF7, 1023 symbols is over 1 second, so we have plenty of margin (as long as we apply the fix for #467). I think we should ignore user-specified clock inaccuracy of more than 1%; and we should track the number of times we miss deadlines. (We should schedule assuming a slow clock, and set the number of symbols assuming a fast clock.) But our target should be just before the tx time.
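One way to read "schedule assuming a slow clock, and set the number of symbols assuming a fast clock" is sketched below. This is an interpretation, not the code that was pushed: the function names, the `min_syms` default, and the 2 × error × delay widening are assumptions made for illustration; only the 1% cap and the 1023-symbol limit come from the comment above.

```python
import math

MAX_RX_SYMS = 1023       # largest symbol count mentioned in the thread
MAX_CLOCK_ERROR = 0.01   # proposed cap: ignore clock error above 1%


def symbol_time_s(sf, bw_hz=125_000):
    """LoRa symbol duration in seconds (assumes 125 kHz by default)."""
    return (1 << sf) / bw_hz


def plan_rx_window(delay_s, sf, clock_error, min_syms=6):
    """Return (local start time in s, receive timeout in symbols).

    Hypothetical helper: aims early as if the local clock were slow,
    then widens the window in case the clock is actually fast.
    """
    err = min(clock_error, MAX_CLOCK_ERROR)
    t_sym = symbol_time_s(sf)
    # Slow-clock assumption: the timer fires late in real time, so aim early.
    start_local_s = delay_s * (1.0 - err)
    # Fast-clock assumption: the window may open about 2*err*delay early,
    # so keep listening for that much longer.
    extra_s = 2.0 * err * delay_s
    n_syms = min(MAX_RX_SYMS, min_syms + math.ceil(extra_s / t_sym))
    return start_local_s, n_syms


if __name__ == "__main__":
    print(symbol_time_s(7) * MAX_RX_SYMS)   # 1023 symbols at SF7, in seconds
    print(plan_rx_window(1.0, 7, 0.004))
```

With these assumptions, 1023 symbols at SF7 last a little over one second, which matches the margin claimed above.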

@cyberman54

Sounds promising. Hope we get rid of the EU868 TTN join problem this way.

@terrillmoore
Member Author

@cyberman54 I'll have something for others to test later today, I think.

@cyberman54

@terrillmoore I will be standing by.

@terrillmoore
Member Author

Here's the plan.

  1. I will add a check for "late" window opening (i.e., a call with the target receive time in the past). This will record a few statistics in LMIC.
  2. We'll change strategy to bias towards opening the window promptly.
  3. We'll deprecate the clock error because it was probably compensating for a non-problem. (I'll leave a conditional compile to enable it, but it really should not be used.)
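Step 1 of the plan can be sketched roughly as follows. This is an illustrative model only: the class, field, and function names are hypothetical, and it assumes a monotonic tick counter with no wraparound, which real firmware would have to handle.

```python
from dataclasses import dataclass


@dataclass
class RxLateStats:
    """Hypothetical counters for late receive-window openings."""
    late_count: int = 0
    max_late_ticks: int = 0


def check_rx_deadline(now_ticks, target_ticks, stats):
    """Return ticks left until the target; record a miss if it has passed."""
    remaining = target_ticks - now_ticks
    if remaining < 0:
        # Target receive time is in the past: count it and open immediately.
        stats.late_count += 1
        stats.max_late_ticks = max(stats.max_late_ticks, -remaining)
        return 0
    return remaining


if __name__ == "__main__":
    stats = RxLateStats()
    print(check_rx_deadline(100, 150, stats), stats)
    print(check_rx_deadline(200, 150, stats), stats)
```

An application could then watch the late counter to tell scheduling misses apart from radio or gateway problems.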

@terrillmoore
Member Author

Status: I discovered that the delayMicroseconds() function on at least the MCCI BSP is often too short. After I moved the window forward, this was causing us to open the window too early, breaking SF7. I also discovered that the STM32L0 internal clock can only be calibrated (typical) to +/- 0.4%. This means that a clock error of 4000ppm is needed on those platforms, so I relaxed the constraint. This also revealed a calculation error for the number of syms needed; I was off by roughly a factor of 2 in the time requirement, which means a lot at the higher rates.
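To make the 0.4% figure concrete, the sketch below converts it to ppm and estimates the worst-case drift over a receive delay. The 1 s and 5 s delays are the LoRaWAN default RX1 and join-accept delays; 125 kHz bandwidth is assumed, and the function names are illustrative.

```python
def percent_to_ppm(percent):
    """Convert a percentage to parts per million."""
    return percent * 10_000


def drift_ms(delay_s, ppm):
    """Worst-case timing drift in milliseconds over delay_s seconds."""
    return delay_s * ppm / 1000


def drift_in_symbols(delay_s, ppm, sf, bw_hz=125_000):
    """The same drift, expressed in LoRa symbol durations."""
    t_sym_ms = (2 ** sf) / bw_hz * 1000.0
    return drift_ms(delay_s, ppm) / t_sym_ms


if __name__ == "__main__":
    ppm = percent_to_ppm(0.4)
    for delay_s in (1, 5):   # default RX1 delay and join-accept delay
        for sf in (7, 12):
            print(f"{delay_s} s, SF{sf}: {drift_ms(delay_s, ppm):.1f} ms, "
                  f"{drift_in_symbols(delay_s, ppm, sf):.2f} symbols")
```

The same drift spans many more symbols at SF7 than at SF12, which is one way to see why an error in the symbol count matters most at the higher data rates.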

I have patches for all this, but I want to run the compliance test and review patches before committing. However, I think all these things (plus the 'late window' strategy) explain the problems we've been seeing. On the STM32L0, specifically, we might be better off using LPTIM and the (crystal) LSE oscillator.

Compliance is passing at faster rates. We'll see how things go at the slower rates in an hour or two.

@terrillmoore
Member Author

Looks decent for EU868. I will review and push changes tomorrow.

@terrillmoore
Member Author

Changes are pushed to issue453 branch, I will merge to master once CI tests pass. Looks good for EU868. Passes RWC pre-compliance test.

@cyberman54

cyberman54 commented Dec 30, 2019

@terrillmoore I can't find a branch named issue453?
Edit: oh, I see it was already merged into master. Thank you! I will test and report.

@terrillmoore
Member Author

Thanks in advance for testing.

@cyberman54

Here's my feedback, late due to holidays.

(+) With current head (commit #3ca90f3) I can connect to the EU868 TTNv2 network with all my test boards (all ESP32 based) and my paxcounter application.
(-) I still sometimes encounter "JOIN_WAIT" situations, I guess slightly less often than with the previous MCCI LMIC version.

Unfortunately I don't own suitable measurement equipment to track this down. So this could be a local gateway problem or a general EU868 TTN problem, as well as a timing problem related to my ESP32 multitasking application.

I traced the value of LMIC.radio.rxlate_count in my application: it is always 0.
