Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return a default controller as fast as possible to enable run button on native notebooks #5861

Closed
joyceerhl opened this issue May 12, 2021 · 10 comments
Labels

Comments

@joyceerhl
Copy link
Contributor

joyceerhl commented May 12, 2021

Background

  • Kernel push currently doesn't show run button until a controller is available for the notebook
  • 90% of users experience kernel search time taking 5 seconds or less
  • Our goal is to allow notebooks to load ASAP and allow users to run code ASAP
  • Listing interpreters is the slow part of kernel search at the moment

Ideas for making first-time controller search as fast as possible

  • Generate a dummy preferred controller and hotswap it out for the real preferred kernel when the user attempts to execute code
    • If we can't find a kernel that matches what the notebook metadata says, we should ask the user if the one we found instead is OK. Classic Jupyter doesn't prompt--it picks randomly
    • If Python extension has an active interpreter and notebook metadata only says 'Python 3', use that one
    • Scenario: notebook metadata says 'Python 3' --> when we find an interpreter e.g. Python 3.9, the kernel picker will change from 'Python 3' to 'Python 3.9'
    • We only put up our dummy kernel when there aren't any other extensions which might be able to provide a better kernel, e.g. we won't show our dummy kernel for .NET interactive notebook
  • Generate a partial list of kernels vs doing a full search

For subsequent activation we can also cache interpreters/kernelspecs (we currently don't do any caching)

  • However there are potential issues with when we invalidate the cache e.g. for users whose cache is months old

Other ideas

  • Get Python extension to move all its interpreter discovery into an npm package
    • Assumption here is that Python extension activation is the bottleneck
      • But we need the active interpreter which relies on VS Code settings so npm package might not work either
@DonJayamanne
Copy link
Contributor

DonJayamanne commented May 19, 2021

Note: This is also imporatnt for getting started scneario:

  • User install Juptyer extension for first time
  • User opens a notebook & immediately goes into kernel pcker
  • At this point we should have added the Python kernel so they can see it (instead of not having anything)

@claudiaregio /cc (added this to account for scenarios we've been discussing)

@DonJayamanne
Copy link
Contributor

Blocked here microsoft/vscode#124535

@DonJayamanne DonJayamanne added the candidate Issue identified as probable candidate for fixing in the next release label Jun 10, 2021
@greazer greazer removed the candidate Issue identified as probable candidate for fixing in the next release label Jun 10, 2021
@DonJayamanne DonJayamanne self-assigned this Jun 10, 2021
@DonJayamanne
Copy link
Contributor

DonJayamanne commented Jun 21, 2021

Why?

  • Users can click Run button ASAP for Python Notebooks
  • New users who don't have Python but install this extension and want to run Python notebooks
    • Add placeholder Python, and when clicked they'll be prompted to install Python

Note

  • As a result of the delay search, its possible we could have new non-python kernels as well. E.g. after we discover Conda environments, we could have discovered conda specific non-python kernel specs.

Details

  • (5) Add a placeholder Python Kernel
    • Whether we have Python or not
    • If no python, then running cells will prompt users to install Python
    • Once Python is installed, we can update this to point to a real kernel (see further below)
  • (1) Fetch non-python kernels first & list them as controllers
    • (2) Fetch preferred kernel from this list (for non-Python kernels)
  • (3) Fetch Python kernels
    • (4) Fetch preferred kernel from this list (for non-Python kernels)
    • (5) If Python kernel is not used (not associated with anything), then dispose that
    • (5) Update the Python kernel with information of the preferred kernel
  • (5) How we manage the place holder kernel will require some work

The order in which the work will be done is numbered.
Today we load the Python kernels slowly, hence loading non-python kernels first doesn't introduce any problems.

Problems

  • One moment a user will have an entry named Python and the next it will be gone
  • Our code as an assumption of 1-1 mapping between controllers & Kernels
    • With this change, it can change & that causes issues.
  • Jupyter will need to keep track of the actual kernels used against each Notebook (similar to what VSC does)

@DonJayamanne
Copy link
Contributor

TODO:

  • Ensure Python can improve the interpreter discovery (there were a few issues)
  • Add telemetry to see how long a cold start takes to get active interpreter & all interpreters
  • Once we have the telemetry re-vist

Solutions:

  • Can we improve interpreter discovery even more
  • Will splitting interpreter discovery into separate package help
  • Will discovering interpreters one at a time help (instead of waiting for all to be discovered)
    • Does user need to see everything, is it not possible they want to just pick active interpreter or a specific conda environment
  • Can we update the VS Code API/UX to work around this.

@IanMatthewHuff
Copy link
Member

@DonJayamanne I feel like this was resolved by the work that you did with interpreter searching and cleanup. If that's off, just reactivate.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 10, 2021
@DonJayamanne
Copy link
Contributor

DonJayamanne commented Sep 13, 2021

@DonJayamanne DonJayamanne reopened this Sep 13, 2021
@DonJayamanne
Copy link
Contributor

Updates

  • Python extension will be adding an API to list the interpreters as they are discovered
    We will not necessarily have to wait for all interpreters to be discovered & returned by Python extension.
    We could use this API to add the controllers one by one as the interpreters are discovered.

@greazer greazer added engineering debt Code quality issues and removed needs-triage labels Sep 16, 2021
@greazer greazer added this to the September 2021 milestone Sep 16, 2021
@greazer greazer removed this from the September 2021 milestone Sep 23, 2021
@DonJayamanne DonJayamanne removed their assignment Oct 11, 2021
@DonJayamanne
Copy link
Contributor

Summary of issues

  • New virtual environments not discovered
  • Slow to load environments (we still wait for everything to be discovered)

Solutions

@DonJayamanne
Copy link
Contributor

TODO:

  • Review (& provide feedback) the new Python API
    • What will it take to implement this
    • Do we need changes in the API
  • Add active interpreter as a controller (while discovering other interpreters)
  • Add telemetry (if we don't already have this) to determine how quickly we return controllers
  • Add telemetry (if we don't already have this) to determine how quickly we get interpreters from Python extension

@DonJayamanne
Copy link
Contributor

Closing in favor of #9628

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants