Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression compared to ptvsd: debugging python running in a bazel runtime sandbox causes duplicate tabs to open #743

Closed
koenlek opened this issue Oct 6, 2021 · 13 comments
Labels
enhancement New feature or request
Milestone

Comments

@koenlek
Copy link

koenlek commented Oct 6, 2021

Context

When using python in a bazel sandbox (bazel run //python/target), bazel symlinks the required files into a 'runfiles' sandbox. My approach to debugging is to run it under the debugger and make it wait for client to connect (formerly ptvsd, now I'm exploring switching to debugpy).

Once attached, breakpoints trigger, but open a copy of the file at the sandbox path (with some visual quirks, such as that breakpoints don't visually sync between both copies, see screenshot). This pollutes tabs and makes debugging messy, especially when doing complex debugging sessions that involve many files.

image
This screenshot shows both the double tabs issue, as well as they breakpoints visually not synchronizing between both copies of the same file.

With ptvsd, there was a simple workaround for this issue, just add this to the launch.json debug config:

"pathMappings": [
    {"localRoot": "${workspaceFolder}", "remoteRoot": "${workspaceFolder}"}
],

This seemingly no-op mapping effectively solved this issue, presumably because as part of the path mapping, it canonicalizes the path (like os.path.realpath in python, or readlink -f in bash for macos and linux) in the process, causing breakpoints to cleanly trigger in the original files.

You might as yourself: why not just do this:

"pathMappings": [
    {"localRoot": "${workspaceFolder}", "remoteRoot": "<sandbox_path>"}
],

This isn't tractable for 2 reasons:

  1. The bazel sandbox path has a random hash in it, so updating the path mapping accordingly for each workspace would be annoying
  2. The bazel sandbox symlinks individual files and the sandbox doesn't share the same folder structure as the workspace, i.e. files might be in some folder structure in the workspace, but are symlinked into a different folder structure in the sandbox. This means we'd need many pathmappings to get this to work and those will vary from bazel target to target. This is not manually doable in practice (unless solved programmatically perhaps).

I'd like to switch from ptvsd to debugpy, as debugpy has significantly better multiprocess handling. But this issue is a major blocker.

Environment data

  • debugpy version: 1.5.0
  • OS and version: Ubuntu 18.04 (accessed via remote ssh plugin from MacOS)
  • Python version (& distribution if applicable, e.g. Anaconda): 3.9 (from apt: python3.9)
  • Using VS Code or Visual Studio: VS Code

Actual behavior

  • With ptvsd the pathmapping trick works to avoid opening sandbox copies of files.
  • With dedubpy the pathmapping trick does not work to avoid opening sandbox copies of files.

Expected behavior

I'd expect both to work. Either through the pathmappings trick, or through some other way such as introducing a new that canonicalizes paths.

As I believe debugpy was forked from ptvsd, we could maybe compare code and see what tweak is needed to make the pathmappings trick work again.

Steps to reproduce:

I've created a small example environment in which this issue can easily be reproduced. Please open this folder as a workspace in vscode and follow the instructions in the readme.

debugpy_issue.zip

@int19h int19h added the enhancement New feature or request label Oct 6, 2021
@int19h
Copy link
Contributor

int19h commented Oct 6, 2021

We have to consider this an enhancement, since the symlink-resolution side effect of path mapping was an implementation detail that was not meant to be relied on specifically.

Since path mapping is not really suitable here due to all the issues that you've mentioned, I think we just need a flag that would make the debug server flow all code paths via os.path.realpath before comparing. It'll have to be a flag, since the default behavior should match that of VSCode (in case some symlinked path is opened in the editor - if the debuggee resolves it, but the editor does not, we get breakpoint mismatch again).

@koenlek
Copy link
Author

koenlek commented Oct 6, 2021

@int19h, agreeing with all you said. Having that flag would be awesome. I'm not sure if it would better go in the launch.json debug config or as a flag to debugpy. Either would work fine for me.

@fabioz
Copy link
Collaborator

fabioz commented Oct 7, 2021

@koenlek since you're already using a workflow where you stop the debugger and then wait for the client to connect, one approach which could already work is using the API to set the path mappings instead of specifying it from the launch json.

i.e.: You should be able to use something as:

import pydevd_file_utils
paths = [(r'c:\translate\fom\my_project\src', '/translate/to/my_project/src')]
pydevd_file_utils.setup_client_server_paths(paths)

With that, you should be able to provide the paths variable above using logic in python to compute the paths in your sandbox.

-- note: this can be called before the connection if don't specify any pathMappings in the launch json (if you do that it must be called after the connection is done, otherwise it'd be overridden by the launch json mappings).

@koenlek
Copy link
Author

koenlek commented Oct 14, 2021

@fabioz, thanks, that's a useful pointer. It actually looks like the vendored pydevd already provides tooling to map things to their canonical path, see def canonical_normalized_path here.

For someone with a bit more knowledge of debugpy, I imagine it could be pretty easy to inject this into the breakpoint triggering logic such that we canonicalize paths before sending them off to the VS Code frontend. We can then expose this with a --canonicalize-paths argument for debugpy.

@judej judej added this to the Dev 17.x milestone Oct 20, 2021
@koenlek
Copy link
Author

koenlek commented Nov 6, 2021

@judej, I noticed this was added to Dev 17.x milestone. Does this mean it's being considered for implementation? And if so, within what time frame? Would be great if this get's addressed!

@koenlek
Copy link
Author

koenlek commented Dec 14, 2021

Bump on this request. This is a major quality of UX issue when using debugpy with bazel sandboxes, so would be great if someone can shed light on any (tentative) roadmap for this 🙏

@fabioz
Copy link
Collaborator

fabioz commented Apr 28, 2022

I'll start to take a look at this.

@fabioz
Copy link
Collaborator

fabioz commented Apr 28, 2022

@int19h what do you think should be the name of the flag?

I'm thinking about having an environment variable:

PYDEVD_RESOLVE_SYMLINKS: true|false

which may be overridden in specific launch configurations with a:

resolveSymlinks: true|false

property.

@int19h
Copy link
Contributor

int19h commented Apr 28, 2022

Sounds good to me!

@koenlek
Copy link
Author

koenlek commented Apr 28, 2022

SGTM too!

@fabioz
Copy link
Collaborator

fabioz commented Apr 29, 2022

@koenlek As a note, the actual fix (without the work related to the options/env vars) is mostly a one-line change at: fabioz@a0783bb#diff-82a2e0a4303124b25a3c6acce599deec12927c414ac18a226a3c6d8dc34384dcR405 (in the pydevd_file_utils.py file).

You can probably do that change locally to check if it works until a new release is done.

If you're running in the debugger you can do:

import pydevd_file_utils to know the location of that file

and then you can hardcode the change to do:

os_path_abspath = os_path_real_path

inside the pydevd_file_utils._abs_and_canonical_path function.

@fabioz fabioz closed this as completed in 0a40cab May 2, 2022
@koenlek
Copy link
Author

koenlek commented May 9, 2022

Thanks for implementing this, @fabioz! When do you expect this to be released? Or can I use a pre-release to test it?

@koenlek
Copy link
Author

koenlek commented Jul 28, 2022

@fabioz, I'm able to use this by setting the env var PYDEVD_RESOLVE_SYMLINKS, but it doesn't seem to be exposed as a setting in the vscode python extension (https://github.com/Microsoft/vscode-python) yet. Would it make sense to open a ticket there to expose this setting?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants