Activation and execution of cells is slower when using Conda Run #8580

Closed
DonJayamanne opened this issue Dec 20, 2021 · 10 comments · Fixed by #8674

@DonJayamanne
Contributor

Conda run seems to be very slow. Possible causes:

  • conda run itself is slow
  • The Python extension is running conda run for all conda environments while we are running it as well
  • Something else

@rchiodo @IanMatthewHuff You might recall that running conda activate on CI can cause issues, especially when run in parallel.
And we have code that retries the activation (basically, conda isn't designed to activate multiple environments at the same time, due to some file-locking issue).

Hence I believe using conda run in parallel (in the Python extension and then also in Jupyter) could be causing issues.

This is all hypothetical.
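
The retry idea mentioned above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the helper name retryWithBackoff, the attempt count, and the delays are all assumptions made for the example.

```typescript
// Hypothetical helper (not from the extension): retry an async operation that
// can fail transiently, such as an activation that hits a file lock.
async function retryWithBackoff<T>(
    operation: () => Promise<T>,
    maxAttempts: number = 3,
    initialDelayMs: number = 100
): Promise<T> {
    let delayMs = initialDelayMs;
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await operation();
        } catch (error) {
            lastError = error;
            if (attempt < maxAttempts) {
                // Wait before the next attempt, doubling the delay each time.
                await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
                delayMs *= 2;
            }
        }
    }
    // Every attempt failed: surface the last error to the caller.
    throw lastError;
}
```

Retrying only hides the contention. If two extensions really do run conda at the same time, each retry still pays the full startup cost.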

@DonJayamanne added the bug label (Issue identified by VS Code Team member as probable bug) on Dec 20, 2021
@greazer added the perf (Performance issues), conda, notebook-getting-started, and notebook-execution (Kernels issues: start/restart/switch/execution, install ipykernel) labels on Jan 3, 2022
@greazer added this to the January 2022 milestone on Jan 3, 2022
@rchiodo self-assigned this on Jan 11, 2022
@rchiodo
Contributor

rchiodo commented Jan 11, 2022

Conda is definitely slower for me (on first run). Takes almost 10 seconds to start up.

@rchiodo
Contributor

rchiodo commented Jan 11, 2022

Note to self: this command was super useful for debugging this stuff:

[image: screenshot of the command, not preserved]

It clears the memento storage, forcing all the environment caching to rerun.

@rchiodo
Contributor

rchiodo commented Jan 12, 2022

Conda activation itself takes 10 seconds on my machine for a specific environment.

  • Calling 'C:/Users/aku91/miniconda3/Scripts/activate && conda activate golden_scenario_env && echo 'e8b39361-0157-4923-80e1-22d70d46dee6' && conda info -s' took 10 seconds
  • Calling 'C:/Users/aku91/miniconda3/conda run -n golden_scenario_env python printVariables.py' took 9 seconds

There's no way around this, although it should be cached. Sometimes it seems like it isn't.

I've also proven (on my machine at least) that conda run is as fast as anything else.
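
Timings like the ones above can be collected with a small wrapper around any async call. The sketch below is only an illustration: the helper name timed and the log format are assumptions, and it measures wall-clock time with Date.now, so it is only good to about a millisecond.

```typescript
// Hypothetical helper: run an async piece of work and report how long it took.
async function timed<T>(
    label: string,
    work: () => Promise<T>
): Promise<{ result: T; elapsedMs: number }> {
    const start = Date.now();
    const result = await work();
    const elapsedMs = Date.now() - start;
    console.log(`Calling '${label}' took ${elapsedMs}ms`);
    return { result, elapsedMs };
}
```

Wrapping both the activate path and the conda run path with the same helper keeps the two measurements comparable.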

@rchiodo
Contributor

rchiodo commented Jan 12, 2022

Slowdown isn't entirely getting variables though. Takes 15 seconds to get the activated environment variables in total.

@rchiodo
Contributor

rchiodo commented Jan 12, 2022

Getting activated environment variables requires the following:

  • Get conda location (searches the registry) = Takes 9.5 seconds
  • Get conda version (runs conda to get info) = Takes 10ms
  • Get conda environment (runs conda run with python process) = Takes 8.5 seconds

None of these can go any faster than that (for the machine I'm on). So minimum is 18 seconds.

We do cache it however.
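
As a rough illustration of why caching removes most of this cost on later runs, here is a minimal in-memory sketch. It is not the extension's implementation (which, per the earlier comment, uses memento storage); the class name EnvVarCache and the fetcher signature are assumptions for the example.

```typescript
type EnvVars = Record<string, string>;

// Hypothetical cache: stores the pending promise per environment name so the
// slow lookup runs at most once, even if several callers ask concurrently.
class EnvVarCache {
    private readonly cache = new Map<string, Promise<EnvVars>>();
    private readonly fetcher: (envName: string) => Promise<EnvVars>;

    constructor(fetcher: (envName: string) => Promise<EnvVars>) {
        this.fetcher = fetcher;
    }

    get(envName: string): Promise<EnvVars> {
        let pending = this.cache.get(envName);
        if (!pending) {
            pending = this.fetcher(envName).catch((error) => {
                // Do not keep failures in the cache; allow a later retry.
                this.cache.delete(envName);
                throw error;
            });
            this.cache.set(envName, pending);
        }
        return pending;
    }
}
```

Caching the promise rather than the resolved value means that two callers asking at the same time share one slow lookup.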

@rchiodo
Contributor

rchiodo commented Jan 12, 2022

Yeah on rerun it takes 14ms to get all the same information.

@rchiodo
Contributor

rchiodo commented Jan 12, 2022

Looking at the kernel execution now.

@rchiodo
Contributor

rchiodo commented Jan 12, 2022

Kernel execution (if environment variable activation is cached) takes 5 seconds to start the process and 2 seconds to run all of the startup code, for a total of 7 seconds before the code actually starts executing.

The 5 seconds for launching can be broken down into:

  • 2 seconds to check dependencies and start daemon
  • 3 seconds for kernel to be ready (kernel startup time)

The 2 seconds to run all the startup code might be shortened if we combined all of the startup code into one cell.
The 2 seconds for the starting of daemon might be shortened if we auto started daemons in the background, but that seems like people might be pissed to have all of these python processes running.
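
The idea of combining the startup code into one cell can be sketched as below. This assumes the startup code is available as a list of independent source snippets; the function name combineStartupCode and the sample snippets are hypothetical, not taken from the extension.

```typescript
// Hypothetical helper: merge several startup snippets into one block of source
// so it can be sent to the kernel as a single execution request.
function combineStartupCode(snippets: readonly string[]): string {
    return snippets
        .map((snippet) => snippet.replace(/\s+$/, "")) // drop trailing whitespace
        .filter((snippet) => snippet.length > 0) // skip empty snippets
        .join("\n");
}

// Illustrative snippets only.
const startupSnippets = ["import sys\n", "", "import os", "print(sys.version)\n\n"];
const combinedCell = combineStartupCode(startupSnippets);
```

A single request avoids paying the per-execution round trip once per snippet. The cost is that an error in one snippet stops the ones after it.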

@IanMatthewHuff
Member

Thanks for the breakdown. Those unavoidable items are slow :(. Kinda a bummer that our best case scenario would still have to take that hit.

@rchiodo
Contributor

rchiodo commented Jan 12, 2022

I did find a typo that may be causing the caching to be skipped for some things (this line here should read await Promise.race([cacheInfo.promise, latestInfo])).

Effectively that race always returns immediately but then the cacheInfo isn't completed so we always wait for the latestInfo promise. That should be cached on the python side though, so not sure it makes much of a difference.
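
The original line is not shown in this thread, so the following is a sketch under an assumption: that the code raced a deferred wrapper object (cacheInfo) instead of its promise. Promise.race treats a non-promise value as already resolved, which would explain the "returns immediately" behavior. The Deferred shape and createDeferred helper here are hypothetical stand-ins.

```typescript
// Hypothetical deferred wrapper: holds a promise but is not itself thenable.
interface Deferred<T> {
    promise: Promise<T>;
    resolve: (value: T) => void;
}

function createDeferred<T>(): Deferred<T> {
    let resolve!: (value: T) => void;
    const promise = new Promise<T>((r) => {
        resolve = r;
    });
    return { promise, resolve };
}

async function demoRace(): Promise<{ buggyIsWrapper: boolean; fixed: string }> {
    const cacheInfo = createDeferred<string>();
    const latestInfo = new Promise<string>((resolve) =>
        setTimeout(() => resolve("latest"), 50)
    );

    // Assumed buggy form: the wrapper is a plain object, so the race settles
    // at once with the wrapper itself, before any cached value exists.
    const buggy: unknown = await Promise.race([cacheInfo, latestInfo]);

    // Corrected form from the comment: race the wrapper's promise, so the
    // result is whichever of the cached or latest values arrives first.
    setTimeout(() => cacheInfo.resolve("cached"), 1);
    const fixed = await Promise.race([cacheInfo.promise, latestInfo]);

    return { buggyIsWrapper: buggy === cacheInfo, fixed };
}
```

In the assumed buggy form the caller receives an unfinished wrapper and has to fall back to waiting on latestInfo, which matches the behavior described above.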

I'm going to try moving all of the kernel warmup code into a single execution to see if I can speed that up a little.

@DonJayamanne removed the bug (Issue identified by VS Code Team member as probable bug) and notebook-regression labels on Jan 26, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 27, 2023