Individual content download is very slow at the first launch of the app #739

Closed
Tracked by #750
vanessa-chang opened this issue Aug 9, 2023 · 16 comments

@vanessa-chang

vanessa-chang commented Aug 9, 2023

On the first launch of the app, once the content pack has been downloaded and you reach the Discovery page, downloading an individual content item takes a long time (>5 mins).
After you restart the app, the download time drops to about 1 min.

Steps to reproduce:

  1. Install the app and select a content pack to download
  2. Go to the main Discovery page
  3. Open a channel page and select a content item to download
  4. Check the download time

Result:
It takes more than 5 mins to download the content.
If you quit and relaunch the app, the download completes in ~1 min.

This issue can be reproduced in both the Chromebook and Windows apps.

Logs:
Chromebook (app version: 6.31-379)
content_download.txt

Windows (app version: 6.32.1.0)
kolibri.txt

@vanessa-chang vanessa-chang added the bug Something isn't working label Aug 9, 2023
@erikos
Contributor

erikos commented Aug 9, 2023

Could this be because we are downloading the metadata in the background? On first launch, if you let the app sit until the background metadata download has finished and then download individual items without restarting the app, is the download still slow?

@vanessa-chang
Author

vanessa-chang commented Aug 10, 2023

I attempted to idle the EK app for 30 minutes and then reran the test. The download speed was normal.

@erikos
Contributor

erikos commented Aug 14, 2023

> I attempted to idle the EK app for 30 minutes and then reran the test. The download speed was normal.

Ok, thanks for this test. This points to an issue with the downloading of the metadata in the background. So this has to be investigated.

@erikos
Contributor

erikos commented Aug 15, 2023

I played with it a bit more. I started downloading an individual item when the thumbnails were already present in the channel I was looking at (Music, in the Artist pack). On the first try the download took a very long time. On the second try the app even became unresponsive.

@erikos
Contributor

erikos commented Aug 15, 2023

We have three options here:

  • (a) don't download thumbnails at all
  • (b) move the thumbnail download into the onboarding
  • (c) separate service to not block the main app

Next step here is to try (c) and have (a) as a fallback.

@dbnicholson
Member

I'm pretty sure this is because all the thumbnails are being downloaded in the background. Right now all the tasks run in the main process and it's probably overwhelming the webview. I'm going to look into splitting the task worker out into a separate service.

@dbnicholson
Member

Oh, I think there might be a much simpler fix for this. Right now we're using the defaults, where Kolibri starts 4 regular priority workers (which can start regular or high priority jobs) and 2 high priority workers (which start only high priority jobs). The thumbnail downloads are done at regular priority while the user requested downloads are done at high priority.

So, I think we can probably just reduce the number of workers quite a bit. Something like 1 regular worker and 2 high priority workers. That way you can get 2 user initiated downloads going in parallel while the background thumbnail tasks will completely serialize. I'll experiment with that.
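
To make the worker arithmetic concrete, here is a toy simulation of the two-pool scheduling described above. It is only a sketch under stated assumptions, not Kolibri's scheduler: `simulate`, `download_start`, the task names, and the 10-second thumbnail duration are all made up for illustration, and every task is assumed to be queued at time 0 with the thumbnails ahead of the user's download.

```python
import heapq

HIGH, REGULAR = 0, 1  # lower value is scheduled first


def simulate(tasks, regular_workers, high_workers):
    """Return {task name: start time} for a toy two-pool scheduler.

    tasks is a list of (name, priority, duration) tuples, all queued at
    time 0 in list order. Regular workers take any task; high priority
    workers take only HIGH tasks.
    """
    pending = [(prio, i, name, dur) for i, (name, prio, dur) in enumerate(tasks)]
    heapq.heapify(pending)
    # Worker entries are (time the worker is free, worker id, accepts_regular).
    # High-only workers get the lowest ids so they are offered work first.
    workers = [(0, i, False) for i in range(high_workers)]
    workers += [(0, high_workers + i, True) for i in range(regular_workers)]
    heapq.heapify(workers)
    started = {}
    while pending and workers:
        free_at, wid, accepts_regular = heapq.heappop(workers)
        prio, _, name, dur = pending[0]
        if prio == REGULAR and not accepts_regular:
            # Only regular tasks remain, so this high-only worker stays idle.
            continue
        heapq.heappop(pending)
        started[name] = free_at
        heapq.heappush(workers, (free_at + dur, wid, accepts_regular))
    return started


def download_start(priority, regular_workers, high_workers):
    """Start time of a user download queued behind 20 thumbnail tasks."""
    tasks = [(f"thumb{i}", REGULAR, 10) for i in range(20)]
    tasks.append(("download", priority, 60))
    return simulate(tasks, regular_workers, high_workers)["download"]


print(download_start(REGULAR, 4, 2))  # 50: waits behind five waves of thumbnails
print(download_start(HIGH, 4, 2))     # 0: picked up by a high priority worker
print(download_start(REGULAR, 1, 2))  # 200: thumbnails fully serialize first
print(download_start(HIGH, 1, 2))     # 0
```

In this model, reducing the regular workers only helps the user's download if that download is actually queued at high priority; at regular priority it waits even longer.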

@erikos
Contributor

erikos commented Aug 16, 2023

That is interesting. It would also be a good test to turn off the thumbnail downloads and see whether that solves all of the issues, to make sure that we are barking up the right tree.

@jofilizola jofilizola mentioned this issue Aug 16, 2023
13 tasks
@dbnicholson
Member

I played with this before the end of the day yesterday and it seems like the app is more responsive with less background downloading. However, I immediately tried to download a document and it just spun until the thumbnails were done downloading. Since it was a high priority task it should have been scheduled immediately. I don't know what was going on there. I'm going to do a bit more debugging of the Kolibri tasks system today to try to understand it better.

@dbnicholson
Member

Hmm, when the download task comes in from the API it's being enqueued at regular priority. This is unlike when the task is enqueued directly from the backend and we specify the priority as high.

Ohhh, unlike the remotechannelimport task, the remotecontentimport task does not specify a default high priority. Can I specify the task priority through the API? I think the default should be changed and I'll send a patch to LE while maybe making a downstream patch in our installer.
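
The difference between the two enqueue paths can be sketched with a toy task registry. This is an illustrative model only, not Kolibri's tasks API: `register_task`, `enqueue`, `enqueue_from_api`, and the `remotecontentimport_high` name are hypothetical stand-ins, and the task bodies are empty placeholders. The assumption is that the backend may pass an explicit priority while the API path can only fall back to the task's registered default.

```python
from enum import IntEnum


class Priority(IntEnum):
    HIGH = 0
    REGULAR = 1


_REGISTRY = {}  # task name -> (function, default priority)


def register_task(name, priority=Priority.REGULAR):
    """Hypothetical decorator that records a task and its default priority."""
    def decorator(func):
        _REGISTRY[name] = (func, priority)
        return func
    return decorator


def enqueue(name, priority=None, **kwargs):
    """Backend path: an explicit priority overrides the registered default."""
    _, default_priority = _REGISTRY[name]
    # Compare against None explicitly, because Priority.HIGH is 0 and falsy.
    chosen = priority if priority is not None else default_priority
    return (chosen, name, kwargs)


def enqueue_from_api(name, **kwargs):
    """API path in this model: the caller cannot pass a priority."""
    return enqueue(name, **kwargs)


@register_task("remotecontentimport")  # default stays REGULAR
def remotecontentimport(**kwargs):
    pass  # placeholder body


@register_task("remotecontentimport_high", priority=Priority.HIGH)
def remotecontentimport_high(**kwargs):
    pass  # placeholder body


backend_job = enqueue("remotecontentimport", priority=Priority.HIGH, node="abc")
api_job = enqueue_from_api("remotecontentimport", node="abc")
api_job_high = enqueue_from_api("remotecontentimport_high", node="abc")

print(backend_job[0].name)   # HIGH
print(api_job[0].name)       # REGULAR
print(api_job_high[0].name)  # HIGH
```

Under these assumptions, the only way to give user-initiated downloads high priority through the API is to change the default the task is registered with.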

@dbnicholson
Member

Sent learningequality/kolibri#11113 upstream. I think we should do this regardless of the rest of the solution here since user requested downloads should take precedence over other tasks.

The downside of reducing the number of workers is that it also slows down the initial download on the welcome screen. Currently you can get 6 tasks going in parallel during initial download. That allows more of the channels from a pack to be downloaded in parallel. It also speeds up the part at the end where all the remaining channels from the collection are downloaded.

The alternative solution involves #592. Currently the way the thumbnail tasks are handled is that they're all created at once and scheduled before the collection downloader completes. Kolibri then goes and runs as many of them as there are available workers. This is really simple since we get out of the collection downloader and let Kolibri deal with it. What would be better would be scheduling the thumbnail tasks one by one so that Kolibri only runs one at a time regardless of how many workers are available.

The reason why the extra channel download is done in the collection downloader is that we need the channel metadata first so that the thumbnail nodes can be determined. Right now the only way to serialize that is to do it in the collection downloader. Instead you'd need something that reacts to completed tasks and queues the next task. You can do that with a StorageHook, but it would also involve storing the outstanding tasks in the database. It's not insurmountable by any means, but certainly more work.
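
The one-at-a-time scheduling idea can be sketched as a small chain that queues the next task only when the previous one completes. This is a minimal sketch with hypothetical names (`SerialTaskChain`, `on_task_completed`), not the plugin's code or Kolibri's `StorageHook` interface: the outstanding tasks live in an in-memory deque here, whereas the approach described above would store them in the database, and the completion callback stands in for whatever hook reacts to finished tasks.

```python
from collections import deque


class SerialTaskChain:
    """Hand tasks to a scheduler one at a time, in order."""

    def __init__(self, enqueue, tasks):
        self._enqueue = enqueue            # callable that queues a single task
        self._outstanding = deque(tasks)   # would be persisted in a real design
        self._in_flight = None

    def start(self):
        self._queue_next()

    def on_task_completed(self, task):
        """Call this when the scheduler reports that a task has finished."""
        if task != self._in_flight:
            return  # not the task this chain is waiting on
        self._in_flight = None
        self._queue_next()

    def _queue_next(self):
        if self._outstanding:
            self._in_flight = self._outstanding.popleft()
            self._enqueue(self._in_flight)


# Usage: a plain list stands in for the scheduler's queue.
queued = []
chain = SerialTaskChain(queued.append, ["thumbs-a", "thumbs-b", "thumbs-c"])
chain.start()
after_start = list(queued)             # only the first task is queued
chain.on_task_completed("unrelated")   # ignored
chain.on_task_completed("thumbs-a")
after_first = list(queued)             # the second task is queued now
```

However many workers are free, at most one thumbnail task from the chain is ever queued or running, which leaves the remaining workers available for user-initiated downloads.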

@dbnicholson
Member

learningequality/kolibri#11113 was rejected. I think we'll need to carry a downstream patch until the better alternative solution can be implemented.

@dbnicholson
Member

Actually, the better way would be to restore our custom remotecontentimport task we've used before in the explore plugin. The only difference is that it will be specified with high priority.

dbnicholson added a commit that referenced this issue Aug 16, 2023
When we enqueue the `remotecontentimport` task from
`CollectionDownloadManager`, we can specify the priority of the jobs as
high. However, when a user initiates a content download from the
frontend using the tasks API, that's not possible. Since the upstream
default priority is regular, it can get blocked behind any other queued
tasks such as the background thumbnail downloads.

This adds a copy of upstream's `remotecontentimport` with the default
priority set to high and arranges for it to be used throughout the
plugin.

Helps: #739
@dbnicholson
Member

I think #754 is an easy win here. It's not perfect since all the task workers are likely occupied with thumbnail tasks, so you still have to wait for one of them to complete. But it's better than before, where you'd have to wait for all of them to complete because the content download task had the same priority as any outstanding thumbnail tasks.

dylanmccall pushed a commit that referenced this issue Aug 17, 2023
@erikos
Contributor

erikos commented Aug 17, 2023

Available in https://github.com/endlessm/kolibri-explore-plugin/releases/tag/v6.35.0 and in internal testing in the Android app.

@vanessa-chang
Author

Verified: passed on Chromebook Manatee 6.38-389 and Windows 6.38.5.0.
