-
-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix site-packages identification. #2262
Conversation
In the process of working a new feature, it was discovered that `sys.path` manipulations leaked into `PythonInterpreter` when they were performed before the `__pex__` import hook. This is fixed by always going through the interpreter disk cache path, which has properly isolated `sys.path` for quite some time, eliminating the one exception for the current interpreter. Fixing this case caused several tests to fail that exposed several eager uses of `PythonInterpreter.get()` that would execute in import of various modules. Those cases are fixed to either be lazy or use `sys.executable`. Along the way, fix up the ~TODO for migrating from `distutils.sysconfig` to `sysconfig` for determining the site-packages directories. This should preempt issues with vendor (e.g.: Debian) supplied Python 3.12+.
@@ -979,9 +998,6 @@ def _spawn_from_binary(cls, binary): | |||
cached_interpreter = cls._PYTHON_INTERPRETER_BY_NORMALIZED_PATH.get(canonicalized_binary) | |||
if cached_interpreter is not None: | |||
return SpawnedJob.completed(cached_interpreter) | |||
if canonicalized_binary == cls.canonicalize_path(sys.executable): | |||
current_interpreter = cls(PythonIdentity.get()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the bug. Here PythonIdentity.get()
slurps up the ambient sys.path
which may have been mucked with. Totally classic premature optimization. Don't optimize, instead just don't pessimize. Do the simplest thing, you're not smart enough to do more.
Ok, and the CI failure was real. The vendoring process is very titchy, very bootstrappy. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the moving parts in detail, but the code certainly seems to match the PR description, AFAIK. 👍
pex/pex.py
Outdated
for path in paths: | ||
if path in seen: | ||
continue | ||
seen.add(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this set just an optimisation, or is trying to be a guarantee that all output paths are unique?
If it's meant to be a guarantee, as written, it seems possible that this could be violated, e.g.:
- the
abspath
codepath:paths == ["a", "/path/to/cwd/a"]
while running with/path/to/cwd
and thus get a sequence like["a", "/path/to/cwd/a", "/path/to/cwd/a"]
(where the first two correspond to the first input element, and the last corresponds to the second input element) - the
realpath
codepath:paths == ["/a", "/link-to-a"]
whereos.path.realpath("/link-to-a") == "/a"
- the combination of the two: AFAIK,
os.path.realpath
will absolutise a relative path, and thusnot isabs(path)
impliesos.path.realpath(path) != path
and both extrayield
s will happen
Maybe this private function is not ever actually called with problematic args, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The questions the code raised are enough here to justify simplifying at the expense of slightly more work. Fixed.
pex/resolve/target_options.py
Outdated
@@ -61,16 +61,12 @@ def register( | |||
), | |||
) | |||
|
|||
current_interpreter = PythonInterpreter.get() | |||
program = sys.argv[0] | |||
singe_interpreter_info_cmd = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the typo to single_
while you're here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Fix site-packages identification.
In the process of working a new feature, it was discovered that
sys.path
manipulations leaked intoPythonInterpreter
when they wereperformed before the
__pex__
import hook. This is fixed by alwaysgoing through the interpreter disk cache path, which has properly
isolated
sys.path
for quite some time, eliminating the one exceptionfor the current interpreter. Fixing this case caused several tests to
fail that exposed several eager uses of
PythonInterpreter.get()
thatwould execute in import of various modules. Those cases are fixed to
either be lazy or use
sys.executable
.Along the way, fix up the ~TODO for migrating from
distutils.sysconfig
to
sysconfig
for determining the site-packages directories. Thisshould preempt issues with vendor (e.g.: Debian) supplied Python 3.12+.