Make the requirements archive to install at init #1022

Open
whitphx opened this issue Jul 19, 2024 · 7 comments

Comments

@whitphx
Owner

whitphx commented Jul 19, 2024

Downloading and installing the packages may be one of the most time-consuming parts of the init process (we must measure it first!).
If so, using the archive-loading feature, which is currently available only in @stlite/desktop, may speed it up.
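
As a rough illustration of the "measure it first" step, the sketch below times named phases and reports each phase's share of the total. The helper names `timed` and `share_of_total` and the phase labels are hypothetical, not part of Stlite or Pyodide; it only assumes that each init phase can be wrapped in a `with` block.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(phase, results):
    """Add the wall-clock duration of the wrapped block to results[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[phase] = results.get(phase, 0.0) + (time.perf_counter() - start)


def share_of_total(results):
    """Return each phase's fraction of the summed durations."""
    total = sum(results.values())
    return {phase: (d / total if total else 0.0) for phase, d in results.items()}


if __name__ == "__main__":
    durations = {}
    with timed("install_packages", durations):  # hypothetical phase label
        time.sleep(0.05)  # stand-in for the real work
    with timed("mount_files", durations):  # hypothetical phase label
        time.sleep(0.01)
    print(share_of_total(durations))
```

Wrapping the package installation and the remaining init steps separately would show whether installation really dominates the boot-up time.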

@whitphx changed the title from "Make the requirements" to "Make the requirements archive to install at init" on Jul 19, 2024
@Abdelgha-4

Interesting! In most apps I've built with Stlite, installing packages definitely took more than 70% of the boot-up time.

I'm no expert in the underlying technology behind Stlite, but I wonder what other similar work could be done at build time.
I believe that Stlite currently does some work each time you open the app to mount the Python source code, data, etc. (and maybe some sort of compilation? Not sure). Is it possible to do all the repetitive work (from source to boot-up) at build time?

@whitphx
Owner Author

whitphx commented Aug 7, 2024

This is the network profile captured during the initialization phase of Stlite Sharing.
[Screenshot: CleanShot 2024-08-07 at 14 18 29@2x]

In my environment, the most time-consuming part is loading the packages into memory rather than the network access itself.
It corresponds to the blank areas between the network accesses.
We can't tackle that part at the moment, though it might be solved once Pyodide introduces memory snapshots.
Optimizing the network access still seems worthwhile: it could save about 1s in my environment, and more on slower networks.

In particular, the last batch of network accesses includes many small requests that cause the long idle time shown below, so there looks to be room for improvement with this approach.
[Screenshot: CleanShot 2024-08-07 at 14 31 04@2x]
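
To put a number on the "blank areas" between network accesses, one option is to sum the gaps during which no request is in flight. The sketch below assumes the request timings have already been extracted as `(start, end)` pairs in seconds, for example from an exported network profile; the function name `idle_time` and the sample numbers are made up for illustration.

```python
def idle_time(requests):
    """Sum the gaps during which no request is in flight.

    `requests` is an iterable of (start, end) times in seconds.
    Overlapping requests are merged so shared time is not counted as idle.
    """
    intervals = sorted(requests)
    if not intervals:
        return 0.0
    idle = 0.0
    _, covered_until = intervals[0]
    for start, end in intervals[1:]:
        if start > covered_until:
            # Nothing was being fetched between covered_until and start.
            idle += start - covered_until
        covered_until = max(covered_until, end)
    return idle


if __name__ == "__main__":
    # Made-up timings: two overlapping requests, then two isolated ones.
    sample = [(0.0, 1.0), (0.5, 2.0), (3.0, 4.0), (4.5, 5.0)]
    print(idle_time(sample))
```

Comparing this figure with the total init time gives a rough split between time spent on the network and time spent loading packages into memory.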

@whitphx
Owner Author

whitphx commented Aug 7, 2024

This approach can't be introduced to Stlite Sharing, because the user updates the requirements and the dependencies are resolved at runtime, so we can't package a fixed initial site-packages.
In contrast, #901 can take this strategy, for example, since all the requirements can be resolved at the packaging phase and no additional requirements are ever added.

@whitphx
Owner Author

whitphx commented Aug 7, 2024

However, the prebuilt packages must be installed from the Pyodide distribution, so their versions are fixed anyway.
So we can bundle the initially required prebuilt package wheels into a single file to avoid the head-of-line (HOL) blocking problem. This is different from the site-packages snapshot mechanism used in the desktop package.
The desktop app doesn't have the HOL blocking problem, so its wheel files are simply shipped as separate files in the app package. In the web app, however, bundling them into one file can be an advantage.

Moreover, when there are no user-specified requirements, only two packages are loaded from PyPI (blinker and tenacity), so we can ignore them; dealing with the prebuilt package wheels should have more impact.

Because of this problem (#572, and maybe #833), the prebuilt packages can't be pre-installed in the site-packages. We should consider another way, e.g. package the wheels into a single archive, download and unpack it at runtime, and install them one by one with micropip.install().
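
The archive idea could look roughly like the sketch below. It is not the implementation from any PR here: the function names, the tar format, and the file layout are all assumptions made for illustration. The install step takes the installer as a parameter so the sketch also runs outside Pyodide; inside Pyodide one would presumably pass `micropip.install`, and the `emfs:` URL prefix for wheels on the Emscripten file system is assumed from micropip's documentation.

```python
import asyncio
import tarfile
from pathlib import Path


def bundle_wheels(wheel_paths, archive_path):
    """Build step: pack wheel files into one tar archive.

    Wheels are already zip-compressed, so the tar itself is left uncompressed.
    """
    with tarfile.open(archive_path, "w") as tar:
        for wheel in sorted(map(Path, wheel_paths)):
            tar.add(wheel, arcname=wheel.name)


def unpack_wheels(archive_path, dest_dir):
    """Runtime step: extract the archive and return the wheel paths, sorted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r") as tar:
        for member in tar.getmembers():
            # Accept only regular files with flat names (no directories).
            if member.isfile() and Path(member.name).name == member.name:
                source = tar.extractfile(member)
                (dest / member.name).write_bytes(source.read())
    return sorted(dest.glob("*.whl"))


async def install_wheels(wheels, install):
    """Install the wheels one by one.

    `install` is an async callable taking a URL string. The "emfs:" prefix
    is an assumption about how micropip addresses local wheel files.
    """
    for wheel in wheels:
        await install(f"emfs:{wheel}")


if __name__ == "__main__":
    async def fake_install(url):  # stand-in for micropip.install
        print("would install", url)

    asyncio.run(install_wheels([Path("/tmp/example-1.0-py3-none-any.whl")], fake_install))
```

In this sketch, `bundle_wheels` would run in the packaging script, while `unpack_wheels` and `install_wheels` would run in the browser after the single archive has been downloaded.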

Update: There is no HOL blocking. The CDN we are using supports HTTP/2 and HTTP/3. The green bars indicate "waiting for the server response", not "queued".
For example, when the server responds to each request quickly, the network waterfall can look like this, where most of the time is spent downloading the resource.
[Screenshot: CleanShot 2024-08-07 at 20 30 57@2x]

@whitphx
Owner Author

whitphx commented Aug 7, 2024

#1053 is an experimental PR that implements loading a single archive of the required prebuilt packages.
It is a rough implementation and lacks some parts needed for a production-level release, such as the packaging script, so we can't merge it, but we can see how this idea works on the deployed pages.

@whitphx
Owner Author

whitphx commented Aug 8, 2024

Comparing them:

[Screenshot: CleanShot 2024-08-08 at 18 56 55@2x]
[Screenshot: CleanShot 2024-08-08 at 18 57 25@2x]

It looks like this approach doesn't bring much improvement.

Also, the implementation includes a kind of hack that uses Pyodide's hidden API.
We should give it up for now.

@whitphx
Owner Author

whitphx commented Sep 2, 2024

TODO: Measure the loading performance again with Chrome's network throttling enabled.
