Switch jaxlib to use nanobind instead of pybind11. #16100

Merged: 1 commit into main from test_534535677 on Aug 24, 2023

@copybara-service copybara-service bot commented May 23, 2023

Switch jaxlib to use nanobind instead of pybind11.

nanobind has a number of advantages (https://nanobind.readthedocs.io/en/latest/why.html), notably faster compilation and dispatch, but the main reason to switch these bindings is that nanobind can target the Python Stable ABI starting with Python 3.12. This means that we will not need to ship a separate CUDA plugin for each Python version starting with Python 3.12.
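To illustrate why targeting the Stable ABI removes the need for per-version builds, here is a minimal sketch of the wheel tag compatibility rule, assuming the usual convention that a wheel tagged `cp312-abi3` is loadable by CPython 3.12 and any later 3.x release, while a wheel tagged `cp312-cp312` is tied to 3.12 only. The helper names `parse_cpython_tag` and `wheel_is_compatible` are hypothetical and not part of jaxlib or any packaging tool; real installers apply more rules than this.

```python
def parse_cpython_tag(tag: str) -> tuple[int, int]:
    """Turn a CPython tag such as 'cp312' into (3, 12)."""
    if not tag.startswith("cp") or len(tag) < 4 or not tag[2:].isdigit():
        raise ValueError(f"not a CPython tag: {tag!r}")
    # The first digit is the major version, the rest is the minor version.
    return int(tag[2]), int(tag[3:])


def wheel_is_compatible(python_tag: str, abi_tag: str,
                        interpreter: tuple[int, int]) -> bool:
    """Simplified check: can `interpreter` load a wheel with these tags?"""
    built_for = parse_cpython_tag(python_tag)
    if abi_tag == "abi3":
        # Stable ABI: forward compatible within the same major version.
        return interpreter[0] == built_for[0] and interpreter >= built_for
    # Version-specific ABI (for example 'cp312'): exact match only.
    return interpreter == built_for


if __name__ == "__main__":
    for version in [(3, 11), (3, 12), (3, 13)]:
        print(version,
              "abi3:", wheel_is_compatible("cp312", "abi3", version),
              "cp312:", wheel_is_compatible("cp312", "cp312", version))
```

Under this rule a single `abi3` build covers 3.12 and every later release, whereas the version-specific build has to be repeated for each Python version.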

@hawkinsp
Collaborator

This is a prototype, not ready for submission yet.

@copybara-service copybara-service bot force-pushed the test_534535677 branch 6 times, most recently from edb03cb to f4deb33 Compare May 23, 2023 23:49
@nicholasjng
Contributor

Please ignore if it's too off-topic, but I would be interested in your take on building nanobind with Bazel.

Specifically, on macOS, the "official" CMake build uses a linker response file to handle unknown Python symbols via chained fixups and reduce the .so output size, which I have tried but failed to replicate in Bazel. (It might be possible with a genrule that runs the symbol collection script in the nanobind/cmake folder and passes its output as a linkopt.)

Is that something that you think would be worth pursuing for Mac builds? For some additional context, see python/cpython#97524 (comment), especially point no. 7.
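As a rough sketch of the genrule idea above: a small script could render the response file and the matching link option. This assumes the response file consists of one `-U _symbol` line per Python C API symbol (the macOS linker flag that permits a symbol to stay undefined until load time) and that it is passed to the linker as `-Wl,@<path>`. The function names, the symbol list, and the file name are hypothetical; this is not the nanobind script itself.

```python
def render_response_file(c_symbols):
    """Render one `-U _name` line per C symbol, sorted and de-duplicated.

    macOS prefixes every C symbol with an underscore, so the C name
    `PyObject_Call` becomes the linker-level name `_PyObject_Call`.
    """
    return "".join(f"-U _{name}\n" for name in sorted(set(c_symbols)))


def response_file_linkopt(path):
    """Build the compiler-driver option that forwards `@path` to the linker."""
    return f"-Wl,@{path}"


if __name__ == "__main__":
    # Hypothetical symbol list; a real build would collect these from the
    # CPython headers or from an existing symbol file.
    symbols = ["PyObject_Call", "PyLong_FromLong", "PyObject_Call"]
    with open("python_symbols.txt", "w") as f:
        f.write(render_response_file(symbols))
    print(response_file_linkopt("python_symbols.txt"))
```

In a Bazel setup the generated file would be an output of the genrule, and the printed option would go into the linker options of the rule that builds the extension.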

Also, from experiments on Benchmark, I saw that passing -Os to nanobind does indeed reduce Python wheel size as advertised (in my case probably ~10%) - but I do not know if that is interesting given that the built XLA extensions are probably the bigger factor (until they too are ported to nanobind eventually?)

@hawkinsp
Collaborator

Thanks for the comment. I wasn't aware of this, but it sounds like the size effects are minor for Python binaries, so perhaps it isn't worth sweating this.

I don't think it would be terribly hard to pass a linker response file. JAX uses this macro from TSL to build python extensions:
https://github.com/openxla/xla/blob/586f729fb47af6b8979b86ff141cc73677e4e78c/third_party/tsl/tsl/tsl.bzl#L530

and note that it already passes a similar linker option: the exported symbols list (https://github.com/openxla/xla/blob/586f729fb47af6b8979b86ff141cc73677e4e78c/third_party/tsl/tsl/tsl.bzl#L638). So I'd imagine you could do something very similar here if you wanted?

(My personal interest in nanobind here is actually the promise of Python C API stability, although we won't get the benefit of that until Python 3.12+.)

@hawkinsp
Collaborator

I should also add: the other issue discussed in that thread relates to targeting macOS 12.0. We currently target 10.14 (in general, when shipping wheels, we want to stay compatible over a long time period).

@nicholasjng
Contributor

Thanks for that link. JAX is doing exactly what I had in mind, but unfortunately, additional_linker_inputs are not available on the cc_library rule. I sent out an FR for that to Bazel when I ran into this during my nanobind experiments on GBM (bazelbuild/bazel#17788), but it stalled.

Would I get in trouble by declaring it a cc_shared_library instead of a cc_library?

@hawkinsp
Collaborator

JAX actually uses the cc_binary path, not the cc_shared_library path. This is fine when building a python extension: it's just a shared library (a .so/.dylib file). You can stick whatever linker options you like on a cc_binary.

@copybara-service copybara-service bot force-pushed the test_534535677 branch 4 times, most recently from 9965dd4 to e7ab697 Compare August 24, 2023 22:35
PiperOrigin-RevId: 559898790
@copybara-service copybara-service bot merged commit 70b7d50 into main Aug 24, 2023
@copybara-service copybara-service bot deleted the test_534535677 branch August 24, 2023 23:08