Link between srsenb and srsue goes up and down and after a while they both crash #738
Comments
This is an issue in the UHD driver that we also see internally with the X310. There is not much we can do on the srsRAN side, I am afraid.
Hi, could I please get some more clarity before we close this thread? Does this mean that one cannot use the X310 for a stable setup with the latest version of srsRAN? Am I better off using the B210 if I have access to one? I even tried the latest UHD version (4.2) to see whether any of the issues I am having with the X310 have been fixed. It seems that they have removed references to device3 (from the file device3.hpp) and made a generic device.hpp, so compiling srsRAN (version 21) against UHD 4.2 fails because srsRAN looks for device3.hpp. Finally, do you think this issue comes from the combination of the X310, the UHD version and the srsRAN version, or is it a problem with just UHD and the X310 hardware? This would help me decide whether I should pursue a fix in the UHD community. Thanks a ton if you read this far.
Hey, when saying we see similar issues internally I was referring to:
I am not sure what that is or how to solve it, but it is not an srsRAN issue. It is a bug in UHD. Regarding the UHD version, we have been testing with 3.15 and 4.1, and those compile and work reasonably well. We have not tried 4.2 yet. The X310 is, in general, a good device, and once good streaming parameters have been found it works reasonably well. Thanks
Thanks. So what I am hearing is that there is hope of getting the X310 devices working with the latest srsRAN and UHD 3.15 :). I have tried varying the MTU, the send/recv frame sizes, the sampling rates and the network buffer sizes, but I have yet to get something stable. I have even given srsenb and srsue high priority when starting the processes, and I am setting the CPU to performance mode. Could you please tell me whether I am missing any other streaming parameters, so that I have a list of all the knobs to tune? Again, thanks a ton.
Those params are the right ones to play with. But frankly speaking, giving definite advice here is difficult. Also, I am not saying that everything works flawlessly with 3.15. We do see issues there too. To be honest, I would prefer to use the latest stable UHD and ask Ettus to solve the issue there. Here is a set of params we use
This is very helpful. Thanks. I shall try these and then reach out to the UHD community for additional help with getting the X310 to behave. Currently I cannot keep the system up for even 100 s. Last question: I am using the internal clock, and I see that you have an external clock (GPS?). Does this affect the performance of the system in any way? I am not doing any MIMO.
Well, if you don't see clock instabilities or issues when attaching commercial phones, it's probably fine. We use the OctoClock in our setups because some devices have a crazy offset and don't have good oscillators.
I have never connected to a commercial phone. I only use another X310 running srsUE. Sorry to press, but how would one know if they have clock instabilities? Do I need to play around with any additional parameters, since I am connecting to an srsUE running on another X310?
Check the CFO output on the srsUE stdout when attaching to the eNB.
Hmm. Checking the ue_metrics.csv file, I see that the CFO is almost always around -4700. So a 4.7 kHz offset, when each resource block is 180 kHz wide, does not seem bad.
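For anyone reading along, the CFO figure can also be expressed relative to the carrier and to the LTE subcarrier spacing rather than to the resource block width. The sketch below is only an illustration: it assumes the reported CFO is in Hz, that the EARFCN=3350 printed in the UE log falls in band 7 (downlink low edge 2620 MHz, EARFCN offset 2750, as tabulated in 3GPP TS 36.101), and the standard 15 kHz subcarrier spacing. The function names are made up for this example.

```python
# Assumed band 7 downlink constants (3GPP TS 36.101): F_DL_low and N_Offs-DL.
BAND7_DL_LOW_HZ = 2_620_000_000
BAND7_DL_EARFCN_OFFSET = 2750
BAND7_DL_EARFCN_MAX = 3449
SUBCARRIER_SPACING_HZ = 15_000  # standard LTE subcarrier spacing


def band7_dl_freq_hz(earfcn: int) -> int:
    """Convert a band 7 downlink EARFCN to a carrier frequency in Hz."""
    if not BAND7_DL_EARFCN_OFFSET <= earfcn <= BAND7_DL_EARFCN_MAX:
        raise ValueError(f"EARFCN {earfcn} is outside band 7 downlink")
    # The EARFCN raster is 100 kHz.
    return BAND7_DL_LOW_HZ + 100_000 * (earfcn - BAND7_DL_EARFCN_OFFSET)


def cfo_ppm(cfo_hz: float, carrier_hz: float) -> float:
    """Frequency offset as parts per million of the carrier."""
    return cfo_hz / carrier_hz * 1e6


def cfo_subcarrier_fraction(cfo_hz: float) -> float:
    """Frequency offset as a fraction of one subcarrier."""
    return cfo_hz / SUBCARRIER_SPACING_HZ


carrier = band7_dl_freq_hz(3350)      # EARFCN from the UE log
offset_hz = -4700.0                   # value read from ue_metrics.csv
print(f"carrier: {carrier / 1e6:.1f} MHz")
print(f"CFO: {cfo_ppm(offset_hz, carrier):.2f} ppm of the carrier")
print(f"CFO: {cfo_subcarrier_fraction(offset_hz):.2f} subcarriers")
```

Under these assumptions the offset works out to roughly 1.75 ppm of a 2680 MHz carrier, or about a third of one subcarrier. Whether that is acceptable depends on how well the UE tracks and corrects it, which the thread does not settle.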
Issue Description
I can bring up the eNB and connect the UE to it. I start iperf traffic from the enb to the UE. It looks OK for a few seconds then I usually see this message at the eNB
[INFO] [UHD RF] Tx while waiting for EOB, timed out... 67.3145 >= 56.3065. Starting new burst…
But the network still works OK and traffic is going through.
Then I see that the UE disconnects and reconnects many times and each time this happens traffic stops for like 1-10s
@ue I see:
Warning: Detected Radio-Link Failure
RRC Connection Reestablishment to PCI=1, EARFCN=3350 (Cause: "otherFailure")
Random Access Transmission: seq=44, tti=5211, ra-rnti=0x2
Random Access Complete. c-rnti=0x4b, ta=1
Reestablishment OK
RRC Connected
Warning: Detected Radio-Link Failure
RRC Connection Reestablishment to PCI=1, EARFCN=3350 (Cause: "otherFailure")
Random Access Transmission: seq=5, tti=5541, ra-rnti=0x2
Random Access Complete. c-rnti=0x4c, ta=0
Reestablishment OK
RRC Connected
/home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.199: Invalid RBG subset=3 for nof_prb=50 where P=3
/home/bob/srsRAN/lib/src/phy/phch/ra_dl.c.641: Configuring resource allocation
Warning: Detected Radio-Link Failure
RRC Connection Reestablishment to PCI=1, EARFCN=3350 (Cause: "otherFailure")
Random Access Transmission: seq=49, tti=3131, ra-rnti=0x2
Random Access Complete. c-rnti=0x4e, ta=1
Reestablishment OK
RRC Connected
Warning: Detected Radio-Link Failure
RRC Connection Reestablishment to PCI=1, EARFCN=3350 (Cause: "otherFailure")
Selected cell no longer suitable: Going to RRC IDLE
RRC IDLE
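To put a number on how often the link drops, the console output can be tallied with a few lines of Python. This is a minimal sketch that assumes the srsUE stdout has been saved to a text file (the name ue_console.log is hypothetical) and that the messages look exactly like the excerpt above.

```python
import re
from collections import Counter

# Patterns copied from the srsUE console excerpt above.
PATTERNS = {
    "rlf": re.compile(r"Detected Radio-Link Failure"),
    "reestablishment_ok": re.compile(r"Reestablishment OK"),
    "rrc_idle": re.compile(r"RRC IDLE"),
}


def count_events(lines):
    """Count radio-link failures, successful reestablishments and idle drops."""
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if PATTERNS["rlf"].search(line):
            counts["rlf"] += 1
        elif PATTERNS["reestablishment_ok"].fullmatch(line):
            counts["reestablishment_ok"] += 1
        elif PATTERNS["rrc_idle"].fullmatch(line):
            # fullmatch skips "... Going to RRC IDLE", which precedes this line
            counts["rrc_idle"] += 1
    return counts


def count_events_in_file(path):
    """Convenience wrapper for a saved console log (path is hypothetical)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_events(handle)
```

Applied to the excerpt above, this counts 4 radio-link failures, 3 successful reestablishments and 1 drop to RRC IDLE. Calling count_events_in_file("ue_console.log") on a longer capture gives the same tallies for a whole run.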
Finally after around 100s or so I see these error messages at the eNB and then the connection fully breaks down (I check this by seeing that there are no packets in the eth link between the USRP and host)
[ERROR] [X300] 193.10.65.34: x300 fw communication failure #1
EnvironmentError: IOError: x300 fw poke32 - reply timed out
[ERROR] [X300] 193.10.65.34: x300 fw communication failure #2
EnvironmentError: IOError: x300 fw poke32 - reply timed out
[ERROR] [X300] 193.10.65.34: x300 fw communication failure #3
EnvironmentError: IOError: x300 fw poke32 - reply timed out
[ERROR] [UHD] An unexpected exception was caught in a task loop.The task loop will now exit, things may not work.EnvironmentError: IOError: 193.10.65.34: x300 fw communication failure #3
EnvironmentError: IOError: x300 fw poke32 - reply timed out
and I then see many many instances of
/home/data/srsRAN/lib/src/phy/rf/rf_uhd_imp.cc.522: USRP reported the following error: EnvironmentError: IOError: Block ctrl (CE_01_Port_40) packet parse error - EnvironmentError: IOError: Expected packet index: 3495 Received index: 3505
Setup Details
srsRAN version 21.04.0
UHD version 3.15
Ubuntu version 18.04.1 for srsenb, 20.04.2 for srsue
1GbE connection between USRP and host laptop
X310 hardware with UBX daughterboards for both enb and ue
SISO model of operation
I have set CPU to performance
I have set network buffers using
sudo sysctl -w net.core.rmem_max=2426666
sudo sysctl -w net.core.wmem_max=2426666
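Since sysctl -w settings do not survive a reboot, it can help to confirm that the limits are still in place before each run. The sketch below reads the two values back from procfs on Linux and compares them with the 2426666 bytes used above. The function names and the root parameter are introduced here only for illustration.

```python
from pathlib import Path

TARGET_BYTES = 2426666  # value from the sysctl commands above
SYSCTLS = ("net.core.rmem_max", "net.core.wmem_max")


def read_sysctl(name, root="/proc/sys"):
    """Read an integer sysctl such as 'net.core.rmem_max' from procfs."""
    path = Path(root, *name.split("."))
    return int(path.read_text().split()[0])


def check_buffers(target=TARGET_BYTES, root="/proc/sys"):
    """Return {name: (current_value, meets_target)} for both buffer limits."""
    report = {}
    for name in SYSCTLS:
        current = read_sysctl(name, root)
        report[name] = (current, current >= target)
    return report


def format_report(report):
    """Render the report as one line per sysctl."""
    return "\n".join(
        f"{name} = {value} ({'ok' if ok else 'below target'})"
        for name, (value, ok) in report.items()
    )
```

On the host, print(format_report(check_buffers())) shows whether either limit has fallen back to the distribution default.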
Expected Behavior
Continuously send traffic over the network
Remain stable without errors
Actual Behaviour
The UE disconnects and reconnects many times, resulting in several tens of seconds without traffic before it comes back again. Then, after a while (around 200s), errors are thrown by srsenb and srsue
Steps to reproduce the problem
I am using iperf UDP to send a 1Mbps data stream from enb to ue, but the same behaviour is seen irrespective of how I generate the traffic. The errors I see after a while happen even when I don't send any traffic. The required files are attached. The command I used to run enb is
sudo nice -n -15 srsenb --expert.rrc_inactivity_timer 36000000
The command I used to run ue is
sudo nice -n -15 srsue
Additional Information
I have tried playing around with increasing the network buffer size, increasing the send/recv frame size after increasing the MTU, increasing the srate value, reducing the #PRBs from 50 to 25, disabling the rrc_inactivity_timer that I set in the command, changing TX_gain at eNB and UE, but none of these have resulted in a stable setup.
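As a rough sanity check on whether the 1GbE link itself could be a limit when changing srate or the number of PRBs, the raw sample payload can be estimated as below. This is only a back-of-the-envelope sketch: it assumes UHD's default sc16 over-the-wire format (4 bytes per complex sample), ignores packet header overhead, and the sample rates listed are illustrative values, so substitute the srate actually configured.

```python
BYTES_PER_SAMPLE = 4      # assumes sc16 on the wire: 16-bit I + 16-bit Q
LINK_CAPACITY_MBPS = 1000  # nominal 1GbE rate, per direction


def stream_mbps(srate_hz, channels=1):
    """Payload rate of one sample stream in Mbit/s, per direction."""
    return srate_hz * BYTES_PER_SAMPLE * 8 * channels / 1e6


def link_utilisation(srate_hz, channels=1, capacity_mbps=LINK_CAPACITY_MBPS):
    """Fraction of the nominal link capacity used by the sample payload."""
    return stream_mbps(srate_hz, channels) / capacity_mbps


# Illustrative sample rates only (assumed, not taken from the attached configs).
for srate in (5.76e6, 11.52e6, 23.04e6):
    print(
        f"{srate / 1e6:6.2f} Msps -> {stream_mbps(srate):7.2f} Mbit/s "
        f"({link_utilisation(srate) * 100:4.1f}% of 1GbE)"
    )
```

With these assumptions a single SISO stream at 11.52 Msps needs about 369 Mbit/s in each direction, and doubling the sample rate doubles that figure, so raising srate eats into the 1GbE headroom quickly.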
The error messages that I see are also very variable and I do not always see the same errors.
I have looked up several of the errors I see in the GitHub issues and in the srsran users group, but those reports have not received any responses.
I can provide more information and a list of all the different errors I see, some of which are infrequent in case that helps.
Thanks!
enb_conf.txt
epc_conf.txt
ue_conf.txt