Update build system to allow for building Python wheels #614

Closed

Conversation

dpad
Contributor

@dpad dpad commented Feb 20, 2024

Description

This change modifies the build system to address several shortcomings and bring it in line with more modern Python packaging standards. The results are summarised below:

  • Basilisk now builds and installs like any other Python package.
    • Can build a Python wheel ".whl" to be shared with users to install directly without manual compilation.
    • This ".whl" file can be uploaded automatically to PyPI to allow users to just do pip install Basilisk. I'm leaving this for the maintainers to set up.
  • Inverts the dependency of Conan and pip (pip now calls conan, instead of the opposite):
    • Conan build is no longer dependent on having a Python virtual environment.
    • Conan build no longer assumes the user is running pip or tries to upgrade it or install dependencies.
    • Python dependencies are now handled through PEP-517 compliance (see below).
  • Upgrades the Conan recipe to be compatible with Conan 2.0+, while maintaining compatibility for old (1.60) versions.
  • Full Python PEP-517 compliance:
    • All build and runtime dependencies are installed by the Python front-end tool (i.e. pip) -- thanks to new pyproject.toml file.
    • setup.py takes care of all dynamic information, including calling the underlying Conan build system.
    • When using pip, everything is built automatically within isolated temporary build environments.
  • Full builds (including opNav) and tests are now done in CI for 64-bit Linux, Mac, and Windows.
    • Tests performed across all supported Python versions (3.8~3.11).

Verification

Basilisk is now built by CI on three platforms (Linux, Mac, Windows) and with all supported Python versions (3.8~3.11). In addition, I have personally tested building Basilisk on Windows and Linux, as well as downloading and installing the wheels built by the CI runners. I have also had colleagues test building on Mac.

It all seems to work, but I of course would love it if the maintainers could try building it themselves to iron out any remaining issues. I'm sure there are probably some weird incompatibilities, but I tried to keep the dependencies as close to the old ones as possible. I did remove some old workarounds (mainly in the Conanfile) that I felt were no longer necessary, but I might have missed some special cases where they're still necessary, so might be worth checking that. In any case, being able to build wheels means that we can avoid users having to compile Basilisk at all (so long as their platform matches one of the wheels we build).

I also tested running the build manually using an old Conan (1.60) version, and it works, although I would recommend deprecating this and not supporting it at all going forward. By default, pip will download and use Conan 2.0+ for the build.

Documentation

Need to update the installation instructions for all platforms. I haven't updated the documentation and will leave it for the maintainers for now. The biggest differences are described below:

  • Building and installing Basilisk locally generally only requires a single command: pip install . or pip install -e . (editable)
  • An external module can be compiled simultaneously by setting the environment variable EXTERNAL_MODULES.
    • e.g. EXTERNAL_MODULES=/path/to/my/modules pip install .
  • Arguments can be passed to Conan directly by setting the environment variable CONAN_ARGS.
    • e.g. CONAN_ARGS="-o opNav=True" pip install .
  • (For reference, the reason for using environment variables above is written in setup.py.)
  • It is still possible to run conan 1.60+ or 2.0.5+ manually (e.g. you can just call conan build . --build=missing), but I don't know of an easy way to pip install the resulting artifacts. Might need to generate a fake dist3/Basilisk/pyproject.toml file to support this.
  • Removed some dependencies from having to be installed manually by the user:
    • Conan, CMake, setuptools, parse, and SWIG are now all build-time dependencies installed automatically by pip
    • pip>=22.3 is required to initiate the build (see the check in setup.py for reasons why)
  • I moved the supportData folder into src (to match the recommendations given by the Python packaging tools) and updated all the tests that broke as a result, but some documentation may also need updating.
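
As an illustration of the two environment variables above, here is a minimal sketch of how a build script could turn them into a Conan command line. It is not the actual setup.py code: the function name conan_build_command is hypothetical, and the option name pathToExternalModules is an assumption made for this example.

```python
import os
import shlex


def conan_build_command(conanfile_dir=".", environ=None):
    """Assemble a `conan build` command line from environment variables."""
    env = os.environ if environ is None else environ
    command = ["conan", "build", conanfile_dir, "--build=missing"]

    # CONAN_ARGS holds extra arguments, split the way a shell would split them.
    command += shlex.split(env.get("CONAN_ARGS", ""))

    # EXTERNAL_MODULES names a folder of custom modules to compile alongside.
    # The option name "pathToExternalModules" is an assumption for this sketch.
    external = env.get("EXTERNAL_MODULES")
    if external:
        command += ["-o", f"pathToExternalModules={external}"]
    return command


if __name__ == "__main__":
    print(conan_build_command(environ={"CONAN_ARGS": "-o opNav=True"}))
```

Passing the environment in as a dictionary keeps the function easy to test without touching the real process environment.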

Future work

Bugs

  • Fix support for Python 3.12.
    • Builds for this work and most tests pass, but some testcases run into segmentation faults that I haven't yet tracked down.
  • Fix support for Python 3.8 on MacOS.
    • This mostly works, but sometimes runs into a stack overflow issue when running the test_orb_elem_convert.py tests.
    • "Abort trap: 6, cannot recover from stack overflow" bug, seems to be potentially test-order dependent?
    • Possibly related to the method pythonVariableLogger.__getattr__? Tests seem to pass if I add a simple debug print of the self._variables at the top of that method, and fail if I remove the debug print.
  • Fix support for windows-latest.
    • Builds work, but certain tests crash the pytest runners without any useful error message or crash dump. I was unable to track down the specific issue.
    • For now, I am leaving the CI to build on windows-2019 as per the existing Basilisk CI.

Further Improvements

  • Upload the built wheels to PyPI (e.g. using twine) -- should be prepared by the maintainers.
  • Sort out the MacOS compatible wheel version. I know close to zero about Mac platforms. I had to set CMAKE_OSX_DEPLOYMENT_TARGET = 10.13 to get things to compile, yet the wheels get built with macos_11_0 platform tags, which don't work for some of my colleagues (their Python lists macos_10_9 and complains that the wheel is incompatible). Someone with a Mac should look into this more.
  • Documentation about installation and building custom modules needs to be updated. I am leaving this for the maintainers (update documentation on installing and compiling Basilisk #437).
  • Add support for building multiple external modules together with Basilisk.
    • This is a relatively simple change, I have a private branch as a work-in-progress where this is already possible. I will create a pull request as soon as I can.
  • Add support for building external modules entirely separately from Basilisk.
    • I believe this is possible, but will require more extensive changes to the CMake build system, including decoupling External Modules from the current build system, and exporting the relevant header files / CMake targets as part of the Python wheel.
    • Basically, I envision adding a distutils plugin (and associated CLI script) to Basilisk which would let a user compile their custom module from their own pyproject.toml, something like this:
[build-system]
requires = [
 "Basilisk"  # Includes external module compile tools during build time
]

[tool.Basilisk]
external_modules = [
 "path/to/module1",
 "path/to/module2"
]
  • Use cibuildwheel plus the Python limited API to build a single wheel for each platform (Mac, Windows, Linux) that can be used across any supported Python version.
    • I am currently finalising this change in a private branch, and will provide a pull request as soon as I can.

@dpad
Contributor Author

dpad commented Feb 20, 2024

@patkenneally I heard you were interested in build system related changes, so tagging you here. My apologies if I'm mistaken!

@patkenneally
Contributor

patkenneally commented Feb 20, 2024

Hi Dan @dpad , thank you for your clearly hard work and impressive result here. And yes, thank you for tagging me; I'm very interested in build changes. It's ripe for improvement at the moment. We all look forward to working through a review of it. With that in mind, and in keeping with our process for submitting and reviewing PRs (I admit not always the clearest), my proposal is that we break up this very large PR into many separate ones (where appropriate) that can be reviewed on a per-issue basis. You are likely already familiar with the project's information regarding contribution, but if not it's available at the contributions page. In particular, the three articles linked at the bottom of that page really capture what is being pursued. In short, our goal is for PRs to address a single scoped change. This allows for more thorough and manageable reviews. For each change we will no doubt want to discuss the approach taken, the use cases which the proposed changes support, and their ultimate appropriateness to the project. I have no doubt you've given it all great thought, so I'm happy to help break the pieces up and account for other changes possibly needed to support them, and I'm eager to read through it over the coming weeks.

@patkenneally patkenneally requested review from patkenneally and removed request for patkenneally February 20, 2024 02:12
@schaubh
Contributor

schaubh commented Feb 20, 2024

Howdy @dpad , thanks for sharing this PR. As Patrick says, a lot of work went into this, creating a rather complex PR. You state yourself this PR still is causing some crashes and issues. I second his request to break down this large PR into discrete chunks that can be put on individual branches and associated PRs. This way we can iterate on each PR to ensure we are all happy with that step and no issues have come up.

@schaubh
Contributor

schaubh commented Feb 20, 2024

Regarding making pip wheels, how did you get around the issue that the wheel is compiled against a particular version of python? You mention yourself that not all co-workers were able to use pip install Basilisk?

@dpad
Contributor Author

dpad commented Feb 20, 2024

@patkenneally @schaubh Thanks very much for your comments and your support!

Building Python wheels

For your reference, CI run is available here: https://github.com/dpad/basilisk/actions/runs/7967368431
It's building with full opNav support and running the full C and Python test suites. Note that every build and test passes for all platforms + Python versions.

Basically, I am building wheels for each platform + Python version combination. There are 12 such combinations as per the CI above. This is a pretty standard thing to do for similar packages. For example, numpy and matplotlib also build many different versions like this and upload all of them to PyPI. When users call pip install numpy, pip downloads the most appropriate compatible wheel, or if it can't find any, it downloads the "sdist" (source distribution) and compiles locally. With the changes in this PR, the same should be possible for Basilisk.
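
To make the count of 12 concrete, here is a small sketch that enumerates the combinations (3 platforms times 4 Python versions). The platform labels and the build_matrix helper are illustrative names only, not taken from the CI workflow itself.

```python
from itertools import product

PLATFORMS = ["linux", "macos", "windows"]
PYTHON_VERSIONS = ["3.8", "3.9", "3.10", "3.11"]


def build_matrix(platforms=PLATFORMS, python_versions=PYTHON_VERSIONS):
    """Return one (platform, interpreter tag) pair per wheel to build."""
    return [
        (platform, "cp" + version.replace(".", ""))
        for platform, version in product(platforms, python_versions)
    ]


if __name__ == "__main__":
    for platform, tag in build_matrix():
        print(f"{platform}: {tag}")
```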

To be clear, my CI setup is testing Linux, Mac (x86_64), and Windows (2019) for Python 3.8~3.11. Seemingly at random, the Python 3.8 MacOS build sometimes fails a single testcase (out of ~2700 testcases) due to the stack overflow issue that I mentioned above. In addition, Python 3.12 builds run into a couple of segfaults on a very small subset of testcases (I suspect this is a deeper issue or possibly bugs in Basilisk and/or SWIG), and therefore I do not build for 3.12 in CI. Similarly, the "windows-latest" image also runs into segfaults on a very small subset of testcases -- so I am building on "windows-2019" instead, which is what the current Basilisk CI does.

For comparison, the current Basilisk CI only runs a Python 3.9 build on Linux and Windows, with lots of additional environment set-up necessary, and doesn't run every Python test (only the "not scenarioTest" ones).

Building wheels in the future

By the way, I am also working on a separate branch which leverages the new SWIG 4.2 support (many thanks @schaubh !) to build using the "Python Limited API". This would basically allow us to build a single wheel for each platform, which would then theoretically work for any Python 3 version (at least from Python 3.8+). The main benefit is reducing the total amount of CI resources spent building wheels, and most users using the same single built artifact. I believe we can get this to work, but I'm currently running into some issues mainly on the Windows side.

Issues with installing Python wheels (MacOS)

Regarding the issue of different versions not being installable by my colleagues, this is only an issue I've observed with Mac. I myself have zero experience with Macs in general or building anything for them, but what I've learned is that you need to first set the deployment target version to 10.13 in order to build Basilisk correctly (that seems to be the minimum which I can get to build). This ends up building a Python wheel which is tagged with a deployment version, for example for Python 3.8 it is tagged like py38_macos_11_0. I don't know why the build system upgrades the value to 11_0 instead of 10_13. Some users, however, have a Python installation whose platform is something like py38_macos_10_9, and when they try to install the wheels built above, pip complains that the platform does not match and refuses to install them.
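
The mismatch can be pictured with a deliberately simplified model of the compatibility check: a wheel is accepted only when its architecture matches and the macOS version the interpreter reports is at least the version in the wheel's platform tag. This is a sketch under that assumption, not pip's real logic (which also handles cases such as universal2 builds). The function names are hypothetical, and the tags are written in the full macosx_X_Y_arch form that wheel filenames use.

```python
import re


def parse_macos_tag(platform_tag):
    """Split a tag such as 'macosx_11_0_x86_64' into ((11, 0), 'x86_64')."""
    match = re.fullmatch(r"macosx_(\d+)_(\d+)_(\w+)", platform_tag)
    if match is None:
        raise ValueError(f"not a macOS platform tag: {platform_tag!r}")
    major, minor, arch = match.groups()
    return (int(major), int(minor)), arch


def wheel_is_installable(wheel_tag, system_version, system_arch):
    """Simplified check: the system must be at least as new as the wheel's target."""
    wheel_version, wheel_arch = parse_macos_tag(wheel_tag)
    return wheel_arch == system_arch and system_version >= wheel_version


if __name__ == "__main__":
    # A wheel tagged for 11.0 offered to an interpreter that reports 10.9.
    print(wheel_is_installable("macosx_11_0_x86_64", (10, 9), "x86_64"))
```

Under this model, a wheel tagged for 10.13 would also be rejected by an interpreter reporting 10.9, so getting the tag down to the deployment target might not be enough on its own for those users.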

But in any case, even if they cannot install these pre-built wheels, those users can still build and install Basilisk themselves by running pip install . from the repository root.

Summary of changes for review

Of course, I understand the need for smaller changesets to help ease the burden of reviewing.
Unfortunately, due to the fairly integrated nature of the changes, this was about the smallest I could get it. I don't think I will have the time to make these changesets any smaller. While it looks like a lot of changes (due to touching many files), the main "meat" is in conanfile.py, pyproject.toml, setup.py, and some relatively minor changes to CMake files.

Instead, I can provide a summary of the changes below to aid your review:

conanfile.py

  • Removed all pip-related functionality (updating pip, installing dependencies, checking versions etc.)
  • Updated to Conan 2 compatible format. Still also builds with Conan 1.60, but I recommend not supporting that.
  • Moved setting libcxx=libstdc++11 to legacy section (i.e. only when using Conan 1).
    • I recommend reviewing this specifically.
    • I believe the Conan 2 way is to set the tools.gnu:define_libcxx11_abi to force the use of the ABI. I don't think there is a programmatic way to update the profile anymore.
    • In any case, Basilisk builds and runs without setting this manually on all the platforms I've tried.
  • Moved XCode and MSBuild generators to legacy section (i.e. only when using Conan 1).
    • Conan 2 seems to set up XCode and MSBuild by default with the CMake generators.
  • Removed ability to run Conanfile as a script, since this is not standard.
  • Removed the package_id() function changing the Visual Studio runtime compiler string to "MD/MDd" or "MT/MTd".
    • I don't think it is possible to change the settings programmatically like this in Conan 2.
  • Removed some seemingly unnecessary dependencies (zlib, xz_utils, pcre).
  • Removed the pinned version of opencv (this can be re-added, I started making these changes before seeing that pinned version).
  • Import and call the utility functions directly, instead of calling them as a subprocess.
    • Not a strictly necessary change, just something I thought felt neater. Could be reverted.

pyproject.toml

  • Added this file. This is the PEP-517 compliant way to write Python packages nowadays.
  • Basically this file tells a "front-end tool" (i.e. pip) how to build the package, including setting up build isolation and all build-time dependencies.
  • In this case, it builds the package by calling the "back-end tool" (i.e. setuptools) which in turn calls setup.py.

setup.py

  • Basically, all this does is build a "Conan extension", which I define as running conan build on the given Conanfile and then finding the built package directories.
  • Removed all the old custom commands.
    • Basically, setup.py is not supposed to be run as a script anymore. That's deprecated behaviour in modern packaging systems and is only still supported for backwards compatibility.
    • I added a check to make sure users don't run this as a script.
  • Added a custom command to build "Conan extensions".
    • The only pre-existing alternative I could find was skbuild-conan but I ran into some issues with this initially so opted for a custom solution. It's possible we could use that instead though.
  • Added sanity checks to ensure user's pip version is sufficiently new. Just needs pip>=22.3.
  • Added some environment variables to pass to Conan.
    • The main reason for this was that passing command-line options through the PEP-517 front-end is not well supported yet by the back-end setuptools.
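
For reference, a pip version sanity check along the lines described above could look like the following minimal sketch. It is not the code in setup.py: the helper names are hypothetical, and the version parsing is intentionally naive (a real check would more likely use a proper version parser such as the packaging library).

```python
import re

MINIMUM_PIP = (22, 3)


def parse_version(text):
    """Turn a version string such as '22.3.1' into a tuple of ints, (22, 3, 1)."""
    # Drop any local version suffix ("+...") before collecting the numbers.
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers)


def pip_is_new_enough(pip_version, minimum=MINIMUM_PIP):
    """Return True when pip_version satisfies the minimum requirement."""
    return parse_version(pip_version) >= minimum


if __name__ == "__main__":
    for candidate in ("22.2.2", "22.3", "23.0.1"):
        print(candidate, pip_is_new_enough(candidate))
```

In a real build script the string would come from the running pip, for example from pip.__version__.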

Changes to some CMakeLists.txt files

  • Updated minimum CMake to 3.18 (it gets installed by pip automatically anyway).
  • Added an exact version check to find_package(Python3) to avoid finding any other Python versions installed on the system but not matching the one that's running the installation.
  • Updated related variables for the above, including adding appropriate library targets (Python3::Module instead of PYTHON_LIBRARIES).
  • Added an appropriate SWIG version check -- SWIG 4.2.0 is required for Python 3.12, and I added a lower bound of 4.0 otherwise (can of course be changed as you see fit). pip will install SWIG 4.2.0 automatically.
  • Deleted conan.cmake -- this is no longer necessary.
  • Updated Protobuf to find the correct version and ensure it matches the one expected by Conan (there are cases where it would find a system-installed one).
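
The SWIG version rule above is implemented in CMake, but the logic itself is small enough to restate as a Python sketch. The function names are hypothetical, and applying the 4.2.0 requirement to Python versions newer than 3.12 is an assumption of this example rather than something stated in the PR.

```python
def minimum_swig_version(python_version):
    """Return the lowest acceptable SWIG version for a (major, minor) Python."""
    # Assumption: anything from Python 3.12 onwards needs SWIG 4.2.0.
    if python_version >= (3, 12):
        return (4, 2, 0)
    # Lower bound chosen in the PR for all other supported Python versions.
    return (4, 0, 0)


def swig_is_acceptable(swig_version, python_version):
    """Check a (major, minor, patch) SWIG version against the rule above."""
    return swig_version >= minimum_swig_version(python_version)


if __name__ == "__main__":
    print(swig_is_acceptable((4, 1, 1), (3, 11)))
    print(swig_is_acceptable((4, 1, 1), (3, 12)))
```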

CI Changes

  • Added a new GitHub workflow which runs 12 individual build + test jobs for the Basilisk package.
    • NOTE: MacOS is only being built for x86_64 architecture, not yet for arm64.
  • I am working on a separate branch to do this build using cibuildwheel, which is the standard tool used for building wheels properly with universally-acceptable versions of compilers etc. It may be worth switching to that, but since the CI above works fully at the moment, I leave it as is.
  • I didn't include an "sdist" build (since I'm working on this in a separate branch) but if we want to upload Basilisk to PyPI, that will be necessary. It's very easy to add though.

Misc. changes

  • Added a missing #include in centerRadiusCNN.cpp (otherwise this wasn't building on Windows).
  • Removed the self.spiceObject.SPICELoaded = True line from simIncludeGravBody.py -- I believe this was causing test failures on some platforms, where for some reason SPICE complains of not having loaded any files (spkezr_c SPICE(NOLOADEDFILES) error).

Unnecessary changes that could be reverted but I think are useful anyway

  • Changed the way generatePackageInit.py works. Basically, I wanted to get rid of the dependency on having to look at which messages will be built. Instead, it just looks for which python modules have been built and builds an __init__.py which imports all of them. This change was made while I was testing some other build system stuff but I left it in as I think it's a bit cleaner from a dependency point of view.
  • Reversed the dependencies of the swigtrick target in CMake to support the above.
  • Added a cMsgCInterfacePy.py module that imports all messages and warns the user about its deprecation. Previously this was just a symlink, but I think on some platforms that would end up copying a large number of files and making the wheel file bigger.
  • Moved supportData to src/supportData. This is the recommended way to include data files in Python packages now. I had to update some testcases and CMakes for this to work.
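
The idea behind the generatePackageInit.py change (discover whichever Python modules were built, then generate an __init__.py that imports them all) can be sketched as follows. This is an illustration of the approach only, not the actual script; the function name write_package_init and the flat directory layout are assumptions.

```python
from pathlib import Path


def write_package_init(package_dir):
    """Write an __init__.py that imports every module found in package_dir."""
    package_dir = Path(package_dir)
    # Collect the built modules, skipping the file we are about to generate.
    modules = sorted(
        path.stem
        for path in package_dir.glob("*.py")
        if path.stem != "__init__"
    )
    lines = [f"from . import {name}" for name in modules]
    init_file = package_dir / "__init__.py"
    init_file.write_text("\n".join(lines) + "\n")
    return modules
```

Because the module list comes from the files on disk, the generator does not need to know in advance which messages or modules the build was configured to produce.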

@patkenneally
Contributor

patkenneally commented Feb 20, 2024

hahaha this would explain what nuked our git-lfs quota in the last week. Building for multiple wheels won't be feasible unless we remove git-lfs contents. I have also wanted to matrix the CI across python versions but using git-lfs makes this cost prohibitive. But a clear solution is to remove spice kernels from the repo. I've been a fan of removing kernels from the repo for a while and we can provide download via other means.

I appreciate the integrated nature of some of these changes, however, I can spot a number of places where changes can be broken out and will take a look at doing so. I'm hesitant to approve such a large change set and so will look to chunk it up. I appreciate the write up above.

@dpad
Contributor Author

dpad commented Feb 20, 2024

@patkenneally Oops, I'm not sure if that was due to me but if so, my apologies! I would not have expected any changes I made to my own fork to affect any quotas set on the main repository :/

Regarding having a CI matrix with Git LFS, I have a branch that's still WIP over on dpad to build Basilisk using the Python limited API: https://github.com/dpad/basilisk/tree/basilisk/build-python-limited-api
It's still a bit broken (mostly for Windows, and I can't yet get Mac to build the arm64 platform), but in theory, that would only need 3 builds (1 for each of Linux, Mac, and Windows), so that might be helpful for you to reference if you are interested.

And yes please, if you are willing to extract some changes out please do feel free to take over. I won't be making any further changes on this branch (going to use it as is for our internal purposes for now), but I'm happy to answer any questions if you run into any problems.

@patkenneally
Contributor

patkenneally commented Feb 20, 2024

Sounds good. I will pick out chunks and check in with you as we go. I appreciate you sharing this work back to the community. There are a number of features in your branch that I've been poking at in the scraps of free time. So this is great.

Any time you clone a fork of a repo, the git-lfs contents are pulled from the root repository and count against the root repo's git-lfs bandwidth quota. So any CI running on forks counts against the root repo's quota. No worries though. I think the appropriate solution is to remove the kernels from git-lfs and have them downloaded by conan or cmake from other source/storage locations.

@dpad
Contributor Author

dpad commented Jun 19, 2024

@patkenneally I thought I'd check in on this to see whether there are any updates or progress from Basilisk side?

For your reference, I have been using this proposed system, with only minor modifications, successfully across several different Linux machines with Python 3.8~3.11.

I think we can maybe split up this work to make it a bit easier to merge into individual pieces like this:

  1. Fixes to CMake files and build system Python tools
  2. Clean up and upgrade conanfile.py to Conan 2+
  3. Changes to setup.py and pyproject.toml to allow compilation via pip install
  4. Build in CI using cibuildwheel to prepare Python wheels for all desired platforms

@schaubh
Contributor

schaubh commented Jun 19, 2024

Howdy @dpad , thanks for the follow up and good news that your approach has been running successfully on a range of Python systems on Linux. Patrick's focus has been on his fork of Basilisk and I'm not aware of any work on your PR.

I did take a look at your PR a few weeks ago to see how this could be integrated. I agree with you that this PR needs to be broken up into smaller PRs where we can test and review each update across all platforms, build versions, etc. You mentioned in the original post that there are some issues with some scripts. I'm glad to offer to work with you on testing and reviewing these branches to integrate these features into Basilisk. Doing these on public branches will allow other stakeholders to review each step.

Regarding the order of your proposed branches, I will follow your lead as you know what upgrades are required by subsequent upgrades. Getting conan 2.0 compatibility is nice, but I'm very excited about the prospect of building BSK wheels that can be installed readily by users just looking to use BSK, not develop for it. This will drastically open up access to BSK for students in classes I teach, researchers trying out BSK to evaluate it, and many more. Thanks for working on these improvements.

Let me know if you are ok to take the lead breaking up your earlier PR into individual branches and associated PRs.

@juan-g-bonilla
Contributor

@dpad As Dr. Schaub mentioned, this development is very important and it would be great to split this into multiple PRs. I will be happy to help out, especially with Basilisk-specific errors, since it seems you have a very good handle on the build system itself. I normally use a Windows machine, and Dr. Schaub uses MacOS, so we can help you debug those. Also, feel free to add me as reviewer to these PRs (and ping me, if needed). Very excited for this!

@dpad
Contributor Author

dpad commented Jul 2, 2024

I would like to close this in favour of #737 . There are some additional bits of work in here, but they can and should be done later in separate merge requests.

@dpad dpad closed this Jul 2, 2024
@schaubh
Contributor

schaubh commented Nov 17, 2024

Howdy @dpad , the 2nd part of this branch was compatibility with conan 2.x. Where do you stand on adding this into a separate branch? We just finished CDR review and my current focus is looking for more elegant ways to include the BSK data files without the PyPI wheel exceeding 100 MB.

@dpad
Contributor Author

dpad commented Nov 18, 2024

Hi @schaubh , unfortunately I don't have any plans to continue work on the Conan 2.x port at least for the near-future. As you know, this branch contains a working (at the time) conanfile.py using Conan 2 syntax which can be used as a reference, but given we made some changes since then, it may be easier/better to just follow Conan's migration guide from scratch.

@schaubh
Contributor

schaubh commented Nov 18, 2024

Howdy @dpad , thanks for the quick update on this.
