Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In Bazel pydrake, unittests may have system modules inadvertently shadowed by submodules #8041

Closed
EricCousineau-TRI opened this issue Feb 12, 2018 · 19 comments
Assignees

Comments

@EricCousineau-TRI
Copy link
Contributor

EricCousineau-TRI commented Feb 12, 2018

The following code should not work, but does:

import systems.all

SUMMARY (2018/02/19): Because we put tests next to (sub)module code in Bazel, we can inadvertently shadow standard-but-not-builtin modules (e.g. math) in our tests / binaries. This happens on Mac more frequently because they adhere to using the correct builtin module list, while Ubuntu has a non-compliant builtin list (e.g. adding math).

@RussTedrake is running into this when trying to define a math submodule, which breaks other things.

@EricCousineau-TRI EricCousineau-TRI self-assigned this Feb 12, 2018
@EricCousineau-TRI
Copy link
Contributor Author

Er, nevermind, was just running from bindings/pydrake.

But perhaps this is an issue on Mac?

@EricCousineau-TRI EricCousineau-TRI changed the title In Bazel, pydrake submodules leak in as top-level modules in PYTHONPATH In Bazel, pydrake submodules might leak in as top-level modules in PYTHONPATH on Mac? Feb 12, 2018
@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented Feb 12, 2018

Tested on Mac, and it seems like Python on Mac (or just Homebrew?) has different behavior than on Ubuntu:

When a Python script is executed, the directory of the script is effectively added to PYTHONPATH, meaning that if a script is executed with a neighboring math.py (or math.so) module, then import math will use the local version rather than the system version. (Which TBH, seems like a very poorly made choice on behalf of whoever packaged this behavior for Mac / Homebrew?)

I figured this out by running the script on Russ's pybary branch, getting the traceback, and executing that script directly in a fresh environment and getting the same error:

$ bazel run //bindings/pydrake:math_test
...
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_eacousineau/c3002ebb6ddcc60621648c70ebcb5ec7/execroot/drake/bazel-out/darwin-opt/bin/bindings/pydrake/math_test", line 5, in <module>
    import tempfile
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/tempfile.py", line 35, in <module>
    from random import Random as _Random
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/random.py", line 45, in <module>
    from math import log as _log, exp as _exp, pi as _pi, e as _e, ceil as _ceil
ImportError: cannot import name log

$ /private/var/tmp/_bazel_eacousineau/c3002ebb6ddcc60621648c70ebcb5ec7/execroot/drake/bazel-out/darwin-opt/bin/bindings/pydrake/math_test
<same error>

Looking at directory of math_test, there is math.so, which seems to be used.

@RussTedrake For your test, you can work around this issue by placing the test in a separate directory. As an example: EricCousineau-TRI@60c50b1cf
This just renames math_test to test/math_test, such that you have to run it via bazel run //bindings/pydrake:test/math_test.

@jamiesnape Are you aware if this is intended behavior for Python on Mac?

@jamiesnape
Copy link
Contributor

When a Python script is executed, the directory of the script is effectively added to PYTHONPATH, meaning that if a script is executed with a neighboring math.py (or math.so) module, then import math will use the local version rather than the system version. (Which TBH, seems like a very poorly made choice on behalf of whoever packaged this behavior for Mac / Homebrew?)

You mean when Bazel runs the script, the wrapper it creates adds the directory of the script to PYTHONPATH?

@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented Feb 12, 2018

You mean when Bazel runs the script, the wrapper it creates adds the directory of the script to PYTHONPATH?

It does not appear that way.

This happens even before any of the Bazel wrapper junk happens. Here's the wrapper script, up to line 5 (where the error occurs):

$ head -n 5 /private/var/tmp/_bazel_eacousineau/c3002ebb6ddcc60621648c70ebcb5ec7/execroot/drake/bazel-out/darwin-opt/bin/bindings/pydrake/math_test
#!/usr/bin/env python

import os
import re
import tempfile

Update: If I inject a printout of sys.path, this confirms that the behavior appears to happen with Python itself, not Bazel:

$ head -n 7 /private/var/tmp/_bazel_eacousineau/c3002ebb6ddcc60621648c70ebcb5ec7/execroot/drake/bazel-out/darwin-opt/bin/bindings/pydrake/math_test
#!/usr/bin/env python

import sys; print("\n".join(sys.path))

import os
import re
import tempfile

$ /private/var/tmp/_bazel_eacousineau/c3002ebb6ddcc60621648c70ebcb5ec7/execroot/drake/bazel-out/darwin-opt/bin/bindings/pydrake/math_test                              
/private/var/tmp/_bazel_eacousineau/c3002ebb6ddcc60621648c70ebcb5ec7/execroot/drake/bazel-out/darwin-opt/bin/bindings/pydrake
/Users/eacousineau/Library/Python/2.7/lib/python/site-packages
...
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_eacousineau/c3002ebb6ddcc60621648c70ebcb5ec7/execroot/drake/bazel-out/darwin-opt/bin/bindings/pydrake/math_test", line 7, in <module>
    import tempfile
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/tempfile.py", line 35, in <module>
    from random import Random as _Random
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/random.py", line 45, in <module>
    from math import log as _log, exp as _exp, pi as _pi, e as _e, ceil as _ceil
ImportError: cannot import name log

@jamiesnape
Copy link
Contributor

So if I do python -c "import sys; print('\n'.join(sys.path))", you are expecting that I see the current working directory?

@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented Feb 12, 2018

So if I do python -c "import sys; print('\n'.join(sys.path))", you are expecting that I see the current working directory?

It does appear, but after PYTHONPATH (which is great):

$ mkdir -p /tmp/random && cd /tmp/random
$ python -c "import sys; print('\n'.join(sys.path))"
/private/tmp/random
/Users/eacousineau/Library/Python/2.7/lib/python/site-packages
... # Other PYTHONPATH things.
/private/tmp/random
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python27.zip
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7
...

Rewording it to run from a file:

$ mkdir -p /tmp/random && cd /tmp/random
$ cat > test.py <<EOF
import sys; print("\n".join(sys.path))
EOF
$ python ./test.py
/private/tmp/random  # <-- YUCK
/Users/eacousineau/Library/Python/2.7/lib/python/site-packages
/Users/eacousineau/Library/Python/2.7/lib/python/site-packages
... # Other PYTHONPATH things.
/private/tmp/random
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python27.zip
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7
...

However, it seems that this same behavior happens on Ubuntu... so now I have no idea why Russ's branch does not encounter this problem on Ubuntu... ???

Perhaps the Ubuntu binary has some logic to detect this edge case for system libraries, and ignore it?

Bottom line, though, is to ensure that tests are not run in a directory that neighbors any modules that might collide with system libraries?
(Or avoid naming submodules that may collide with any other upstream libraries, but that seems like a silly constraint?)

TIL: Python prepends a script's directory to sys.path, not just PWD...

@EricCousineau-TRI EricCousineau-TRI changed the title In Bazel, pydrake submodules might leak in as top-level modules in PYTHONPATH on Mac? In Bazel, pydrake submodules may shadow built-in modules for unittests Feb 12, 2018
@jamiesnape
Copy link
Contributor

Minor point (does not affect the conclusion, I checked it is the same) but all your tests above are using the system Python rather than the Homebrew one.

@jamiesnape
Copy link
Contributor

I actually never see the working directory in my output for some reason, but I do see the script directory for the second example.

@rdeits
Copy link
Contributor

rdeits commented Feb 13, 2018

@EricCousineau-TRI the order in which Python modules are found is here: https://docs.python.org/2/tutorial/modules.html#the-module-search-path

Built-ins are always preferred, so if you have a file called math.py in your current directory, import math will still get you Python's built-in math, not your copy. That should be true on every platform.

However, it's possible to shadow built-ins if you really try. For example, if you have foo.py and do:

import foo as math

then you're certainly in trouble when you try to do math.sin(x).

Relative imports within packages are indeed a source of confusion, (so much so that relative import within a package is removed in Python3). Fortunately, it's easy to do the right thing. If you have:

pydrake/
  __init__.py
  math.py

then pydrake/__init__.py can do:

from . import math

or

from .math import sin

and be absolutely certain you're getting pydrake.math and not anything else.

@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented Feb 13, 2018

@rdeits Thanks for the info!

Built-ins are always preferred, so if you have a file called math.py in your current directory, import math will still get you Python's built-in math, not your copy. That should be true on every platform.

This does not seem to be the case for Mac, as math is being shadowed inside of another builtin module (tempfile), which makes it seem like Mac / Homebrew Python is not correctly identifying math as a builtin. On Ubuntu, this is being recognized correctly.

Minor point (does not affect the conclusion, I checked it is the same) but all your tests above are using the system Python rather than the Homebrew one.

I tested the Bazel py_binary-generated script, running with python (system Python) and python2 (Homebrew Python), and both fail, so it seems like a Mac bug in general?

@rdeits
Copy link
Contributor

rdeits commented Feb 13, 2018

That would be very strange. Is there a minimal example that demonstrates this? Something like

pydrake/
  __init__.py
  math.py

?

@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented Feb 14, 2018

Is there a minimal example that demonstrates this?

I'll see if I can whip something up. I couldn't reproduce it with a *.py file; I'm going to try and make a minimal example with a *.so and see if that does it.

What I have so far:
https://github.com/EricCousineau-TRI/repro/blob/bada7fbcb69dd256e7af477ba0317f6ba117cd16/bug/mac_builtin_shadow/test.sh

@EricCousineau-TRI
Copy link
Contributor Author

Yup, Mac does not correctly ignore *.so files.

Here is the line that now fails:
https://github.com/EricCousineau-TRI/repro/blob/63e0ce6/bug/mac_builtin_shadow/test.sh#L80

Note that the other call (where math.so is not in the same PWD) still works. Will file the bug with Python (unless I turn something else up during the search) since this is with both Mac and Homebrew.

@EricCousineau-TRI
Copy link
Contributor Author

Hup, I lied; it fails with *.py files too, I just hadn't set up the path correctly.
Latest incantation: https://github.com/EricCousineau-TRI/repro/blob/b08bc47/bug/mac_builtin_shadow/test.sh#L38

Filed the upstream bug:
https://bugs.python.org/issue32845

@rdeits
Copy link
Contributor

rdeits commented Feb 14, 2018

Wow, that's super weird. Kudos for figuring this out.

@rdeits
Copy link
Contributor

rdeits commented Feb 14, 2018

Yup, just confirmed this for myself too. Having a math.py in the current directory shadows the built-in on mac, but not on Ubuntu. Not cool, python.

@EricCousineau-TRI
Copy link
Contributor Author

EricCousineau-TRI commented Feb 15, 2018

According to Eric V. Smith and Ned Deily on the Python bug tracker, this is expected behavior. math is a standard library, but not a builtin library per the standards. Ubuntu is adding extra modules to the builtins list, while Mac stays true to the "truth".

Ned posted an example showing math being builtin on Ubuntu, but a non-builtin on Mac. If you print sys.builtin_module_names, you'll see that Ubuntu does this for a few other libraries.

So ultimately, Python tests should not have system / standard modules shadowed by neighboring submodules. To this end, I'll submit a PR for drake_py_test which always puts the test in test/{name}, and creates an alias to still make it runnable via :{name}.

Additionally, the linting tests (which were failing on Mac) should also not be sensitive to the module names - to this end, the Python linting tests should also be placed in a subdirectory.

UPDATE: Was sufficiently disappointed in search results that I posted a StackOverflow question + answer:
https://stackoverflow.com/questions/48810317/python-bazel-unittest-in-same-folder-as-submodule-named-math-standard-ma

@EricCousineau-TRI EricCousineau-TRI changed the title In Bazel, pydrake submodules may shadow built-in modules for unittests In Bazel pydrake, unittests may have system modules inadvertently shadowed by submodules Feb 15, 2018
RussTedrake added a commit to RussTedrake/drake that referenced this issue Feb 15, 2018
@EricCousineau-TRI
Copy link
Contributor Author

Solution plan:

  • Ensure that tests won't be influenced by submodules (e.g. testing pydrake.math under //bindings/pydrake shouldn't care that math is a module in that directory, since it's meant to be imported only as pydrake.math).
  • Follow-up on Pymath w underscore2 #8076 and rename pydrake._math to pydrake.math.

EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 19, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 19, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 19, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 19, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 19, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 20, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 20, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 20, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 20, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 20, 2018
EricCousineau-TRI added a commit to EricCousineau-TRI/drake that referenced this issue Feb 20, 2018
jwnimmer-tri added a commit that referenced this issue Feb 20, 2018
 drake_py: Add mechanism for #8041 to prevent submodule shadowing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants