Slow on RPi: manage retrievecontentpack download #5575
Comments
Thanks a lot for the report @holta - it sounds really weird. What download speeds are you producing if you just fetch the package directly from here? http://pantry.learningequality.org/downloads/ka-lite/0.17/content/contentpacks/ |
wget and browser[*] all download the 0.9GB en.zip completely in less than 5 min using the above-mentioned Time Warner/Spectrum cable/business line, and in less than 10 min with the above-mentioned AT&T DSL/home line. This held regardless of whether the user(s) in question used Chrome, Edge or Firefox on Windows 10 for comparison, and the very rapid download of en.zip was also confirmed with wget from Raspbian Lite. Whereas the equivalent download when running "kalite manage retrievecontentpack download en" takes overnight (or days, yep!) |
Typo fixed in above bug report...what I meant was...and the problematic command in question is:
(I had accidentally pasted in "kalite manage help retrievecontentpack", apologies!) |
Yikes, but did this use to work? Or was it always slow? |
Thanks a lot for the thorough description @holta |
Good question! I wish I knew. FWIW: AT&T and Time Warner/Spectrum are both very large, but also regional, ISPs used in different parts of the USA. It's certainly possible that prior implementers/users in those areas just gave up when facing this issue. Conversely, it's also possible that major ISPs are implementing severe new bandwidth shaping / bandwidth throttling techniques that are hitting us very hard now, but that might not have existed in prior years. Certainly the USA's net neutrality regulation ended on 2018-06-11; whether or not this is related I have no idea :/ |
I'm unable to reproduce this on a 100 Mbit connection. It took only 3 min 21 seconds. I do smell some kind of bandwidth throttling. Would you perhaps be able to do some tests, like trying with a VPN? See: https://www.highspeedinternet.com/resources/how-can-i-tell-if-my-internet-is-being-throttled-by-my-isp/ |
Maybe they red-flagged you somehow?
|
Highly doubtful when the exact same thing happens across different ISPs and different network environments in different places. Note we've reconfirmed since Tuesday that the implementation of "kalite manage retrievecontentpack download en" is indeed the problem: running "wget http://pantry.learningequality.org/downloads/ka-lite/0.17/content/contentpacks/en.zip" and then "kalite manage retrievecontentpack local en en.zip" works flawlessly within ~10 min across these ISPs. To be clear: "kalite manage retrievecontentpack download en" fails to complete in these exact same situations. |
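For scripted installs, the two commands quoted above could be wrapped roughly as follows. This is a minimal sketch, not the script Internet-in-a-Box actually uses: the helper names `workaround_commands` and `run_workaround` are made up for illustration, and only the two command lines and the URL come from this thread.

```python
import subprocess

# URL of the English content pack, as quoted in this thread.
EN_PACK_URL = (
    "http://pantry.learningequality.org/downloads/ka-lite/0.17/"
    "content/contentpacks/en.zip"
)


def workaround_commands(url=EN_PACK_URL):
    """Return the two workaround commands as argv lists (hypothetical helper)."""
    # wget saves the file under the last path component of the URL (en.zip).
    zip_name = url.rsplit("/", 1)[-1]
    return [
        # Step 1: fetch the pack with wget instead of KA Lite's own downloader.
        ["wget", url],
        # Step 2: install the pack from the local file.
        ["kalite", "manage", "retrievecontentpack", "local", "en", zip_name],
    ]


def run_workaround(runner=subprocess.run):
    """Run both steps in order; check=True stops at the first failing step."""
    for argv in workaround_commands():
        runner(argv, check=True)
```

Calling `run_workaround()` from the directory where en.zip should land runs both steps. The `runner` parameter exists only so the command sequence can be inspected without touching the network.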
I would be. But realistically, it's not at all easy to ask distant users to modify their networks. Particularly when they already have a 2-line workaround (yes they were completely stuck — and of course are very happy now with the ~100X speed improvement ;-) |
@holta I wasn't proposing VPN as a workaround, just as a means of identifying the issue :) If you are a victim of throttling, then I think a lot of activists downloading data from various places will find themselves at risk. Not just from using a Learning Equality server for fetching a KA Lite content pack; I'm sure there are many other resources that would end up in such filters :/ Anyway, if you could catch the error where, at the same moment, you have limited bandwidth without a VPN and a VPN then gives you bandwidth, I think we would at least have found the cause, although a VPN might of course also be throttled :) |
It's not the Learning Equality server. It's the Learning Equality client (downloader) that's the problem here. Again (just FYI, among other choices you face!) if you chose to reimplement "kalite manage retrievecontentpack download en" as follows, all the problems would go away:
|
RECAP: the proof that https://github.com/learningequality/ka-lite/blob/master/kalite/distributed/management/commands/retrievecontentpack.py is failing is that Internet-in-a-Box has replaced it (if this is what "kalite manage retrievecontentpack download en" calls) with the two-step wget workaround described above. The approx 40 million families/businesses who use AT&T and Time Warner/Spectrum as their ISP are now back in business, according to all tests so far. Whereas every known download to date (over those ISPs) slowed to an absolute crawl right around 20% of the way into the download of http://pantry.learningequality.org/downloads/ka-lite/0.17/content/contentpacks/en.zip when using "kalite manage retrievecontentpack download en". Again, I apologize that I don't know the reasons. But if other KA Lite users can benefit from this workaround IIAB has developed, the more the merrier. |
@holta sorry that I'm not re-iterating and clearly acknowledging what you have already stated.
But I would really appreciate if you can try For instance, we have changed the But I can't really troubleshoot this through guessing. I need to know that something in the client makes the network throttle the traffic or something else in the client makes it fetch data in a slow way (that I cannot reproduce). |
@holta - okay, I understood from your email that VPN is not an option for this particular deployment because it's difficult to set up and test. I'll try on my own RPi 3+ to measure bandwidth of On your RPi device with the low network speed, is it connected through WIFI or Ethernet? And are there any other specific settings that I should be aware of? |
I tried but they don't have the Linux skills or VPN contract to make this happen. If 1-line alternatives later emerge, for these folk to test "kalite manage retrievecontentpack download en" over VPN later, please LMK.
The fact that it consistently stalls/freezes/slows-to-a-crawl 20% of the way into the download of this 0.9GB file is a major hint. But so far we are unable to reproduce the problem with ISPs other than AT&T and Time Warner/Spectrum.
Most all Internet-in-a-Box implementers (like these here) use Raspbian Lite on RPi 3 B+ with Ethernet gateway to Internet.
FYI here is the install script they run: http://download.iiab.io/6.6/load.txt (see the very final lines at the bottom of this bash script!) |
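To pin down where a download collapses (for example, around the 20% mark mentioned above), a small probe that logs throughput per time window may help. The following is a sketch under stated assumptions: it is not KA Lite's downloader, the function name `download_with_rate_log` and its parameters are hypothetical, and it uses only Python's standard `urllib.request`.

```python
import time
import urllib.request


def download_with_rate_log(url, dest, chunk_size=1 << 20, report_every=5.0,
                           log=print):
    """Stream url to dest, sampling throughput every report_every seconds.

    Returns a list of (percent_done, bytes_per_second) samples. percent_done
    is None when the server does not send a Content-Length header.
    """
    samples = []

    def record(done, total, window_bytes, elapsed):
        # Guard against a zero-length window on coarse clocks.
        rate = window_bytes / max(elapsed, 1e-9)
        pct = 100.0 * done / total if total else None
        pct_text = f"{pct:.1f}%" if pct is not None else "?%"
        log(f"{pct_text} done, {rate / 1e6:.2f} MB/s in last window")
        samples.append((pct, rate))

    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        length = resp.headers.get("Content-Length")
        total = int(length) if length else None
        done = 0
        window_bytes = 0
        window_start = time.monotonic()
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            done += len(chunk)
            window_bytes += len(chunk)
            now = time.monotonic()
            if now - window_start >= report_every:
                record(done, total, window_bytes, now - window_start)
                window_bytes, window_start = 0, now
        if window_bytes:
            # Flush the final, partial window.
            record(done, total, window_bytes, time.monotonic() - window_start)
    return samples
```

Running this against the en.zip URL on the affected RPi, next to a plain wget of the same file, might show whether the slowdown also appears in a bare Python streaming loop or only inside KA Lite's own download code.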
Data Point and very preliminary theory that this might be a wider problem, seemingly causing slow(er) downloads even when the ISP is not AT&T or Time Warner/Spectrum: Prelim Evidence: Installing a BIG-sized Internet-in-a-Box (by running http://d.iiab.io/6.6/load-big.txt) onto RPi 3 B+ just now completed in less than 2 hours, which is (about) an hour faster than the usual 3 hours. Some of this speedup appears to be the result of avoiding "kalite manage retrievecontentpack download en" (which generally takes Many Minutes to download the 0.9GB en.zip, even on ISPs unrelated to AT&T and Time Warner/Spectrum). Whereas in this case "wget http://pantry.learningequality.org/downloads/ka-lite/0.17/content/contentpacks/en.zip" completed in 81 seconds, saving many minutes in comparison to what we are used to. |
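To put the timings reported in this thread on a common scale, the average throughput can be worked out from the file size given in the issue summary (929,916,955 bytes). A rough sketch: the helper name `rate_mbit_per_s` is illustrative, and the "1% per hour" figure is the approximate stall rate from the summary, not a measurement.

```python
EN_ZIP_BYTES = 929_916_955  # size of en.zip as stated in the issue summary


def rate_mbit_per_s(num_bytes, seconds):
    """Average throughput in megabits per second (1 Mbit = 10**6 bits)."""
    return num_bytes * 8 / seconds / 1e6


# wget on the RPi 3 B+: the whole file in 81 seconds.
wget_rate = rate_mbit_per_s(EN_ZIP_BYTES, 81)

# The maintainer's test on a 100 Mbit connection: 3 min 21 s.
laptop_rate = rate_mbit_per_s(EN_ZIP_BYTES, 3 * 60 + 21)

# Stalled "kalite manage retrievecontentpack download en": about 1% per hour.
stalled_rate = rate_mbit_per_s(EN_ZIP_BYTES * 0.01, 3600)

print(f"wget on RPi:     {wget_rate:.1f} Mbit/s")
print(f"100 Mbit test:   {laptop_rate:.1f} Mbit/s")
print(f"stalled on RPi:  {stalled_rate * 1000:.1f} kbit/s")
```

On these numbers, wget averages roughly 92 Mbit/s, while the stalled download averages on the order of 20 kbit/s.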
@holta thanks - and right, it does sound like there could be a memory leak or some I/O thing going wrong on RPi. |
Yes, this is easy to reproduce! Working fine on a laptop, horrible on RPi. Same network connection. Sorry for all the fuss about network throttling @holta! |
Just to be fair: This KA Lite release has been used on Raspberry Pi before, and I sense that something in the latest Raspbian Stretch release has changed in a negative way for our poor download method. |
Interesting. FWIW this particular downloader has had problems in the past (beyond its failure to resume!) Doesn't make it easy to diagnose when some people experience this nearly-complete-collapse in download speeds 20% of the way thru the 0.9GB en.zip (whereas others do not, even with the same version of Raspbian!) |
Feel free to close if this has been tested! (Internet-in-a-Box currently uses a different technique to obtain en.zip online or preposition it offline, to avoid the massive downloading costs in places like Haiti) |
Summary
Running "kalite manage retrievecontentpack download en" takes many hours/days to complete on KA Lite 0.17.4 (repeatedly over past weeks, on 2 different ISPs: a Time Warner/Spectrum business line and an AT&T DSL home line).
It sometimes begins quickly, but then progress takes about an hour for each additional 1% step, such that the 0.9GB (929,916,955 bytes) en.zip file never realistically finishes downloading.
The command-line workaround is to fetch en.zip with wget and then run "kalite manage retrievecontentpack local en en.zip". These 2 steps complete about 100X faster, in a few short minutes instead of hours/days.
System information
How to reproduce
Screenshots
Real-life consequences (anything community should be aware, for instance how it affects your deployment)
Internet-in-a-Box install scripts (http://download.iiab.io/6.6) have become unusable across major US ISPs (such as AT&T and Time Warner/Spectrum) that have this problem.
Time Warner/Spectrum has about 22 million broadband customers and AT&T has about 16 million broadband customers. It is possible that other US and international ISPs are affected as well. We have confirmed that Comcast customers (another major US ISP) are not affected to date.
Internet-in-a-Box (https://github.com/iiab) has documented the workaround for command-line-savvy users (at http://FAQ.IIAB.IO #27) under "Mandatory English Pack taking too long to download?"
However, sadly, many/most all others are giving up in frustration :(
@benjaoming, please contact me if there are more specific diagnostics we can ask implementers to provide, towards resolving this tedious-but-very-real issue that is preventing people from using KA Lite. Thanks!