-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: include numpy._core
imports for NumPy 2.0
#4857
Changes from all commits
0290852
e8d98f8
ee40dab
340afab
45cc2a0
2adef4b
f2f497e
4c39988
8aa97a6
c656925
a89e0a6
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -120,6 +120,20 @@ inline numpy_internals &get_numpy_internals() { | |
return *ptr; | ||
} | ||
|
||
PYBIND11_NOINLINE module_ import_numpy_core_submodule(const char *submodule_name) { | ||
module_ numpy = module_::import("numpy"); | ||
str version_string = numpy.attr("__version__"); | ||
|
||
module_ numpy_lib = module_::import("numpy.lib"); | ||
object numpy_version = numpy_lib.attr("NumpyVersion")(version_string); | ||
int major_version = numpy_version.attr("major").cast<int>(); | ||
|
||
/* `numpy.core` was renamed to `numpy._core` in NumPy 2.0 as it officially | ||
became a private module. */ | ||
std::string numpy_core_path = major_version >= 2 ? "numpy._core" : "numpy.core"; | ||
return module_::import((numpy_core_path + "." + submodule_name).c_str()); | ||
} | ||
|
||
template <typename T> | ||
struct same_size { | ||
template <typename U> | ||
|
@@ -263,9 +277,13 @@ struct npy_api { | |
}; | ||
|
||
static npy_api lookup() { | ||
module_ m = module_::import("numpy.core.multiarray"); | ||
module_ m = detail::import_numpy_core_submodule("multiarray"); | ||
auto c = m.attr("_ARRAY_API"); | ||
void **api_ptr = (void **) PyCapsule_GetPointer(c.ptr(), nullptr); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's an existing problem, but could you please add this fix or similar while you're at it (untested)?
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, I added this |
||
if (api_ptr == nullptr) { | ||
raise_from(PyExc_SystemError, "FAILURE obtaining numpy _ARRAY_API pointer."); | ||
throw error_already_set(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you're right! This needs to be here. How did you notice? Looks like a made a mistake in #4570. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here in documentation: https://pybind11.readthedocs.io/en/stable/advanced/exceptions.html#chaining-exceptions-raise-from
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch (it completely slipped my mind), thanks a lot! |
||
} | ||
npy_api api; | ||
#define DECL_NPY_API(Func) api.Func##_ = (decltype(api.Func##_)) api_ptr[API_##Func]; | ||
DECL_NPY_API(PyArray_GetNDArrayCFeatureVersion); | ||
|
@@ -626,11 +644,8 @@ class dtype : public object { | |
|
||
private: | ||
static object _dtype_from_pep3118() { | ||
static PyObject *obj = module_::import("numpy.core._internal") | ||
.attr("_dtype_from_pep3118") | ||
.cast<object>() | ||
.release() | ||
.ptr(); | ||
module_ m = detail::import_numpy_core_submodule("_internal"); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rwgk I would bet this change right here is causing the deadlocking. Look at how the old code did it all in one static call so the import was only once (even if this function got called a bunch). The new code is importing _internal every time this function is called. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That was my first suspicion, too, but that change by itself did not help. Then I decided to roll back and debug after production is in a healthy state again. Sorry I should have mentioned this before. (I'm currently battling multiple fires.) Almost certainly we should change the code here back, but I want to get to the bottom of the timeouts first. (I only have an internal reproducer at the moment. It involves running a test 100 times, ~1/3 of the tests time out.) |
||
static PyObject *obj = m.attr("_dtype_from_pep3118").cast<object>().release().ptr(); | ||
return reinterpret_borrow<object>(obj); | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you have here looks acceptable, but I wonder if we could make this more robust pretty easily, along the lines of:
import numpy
numpy.__version__
or similar to decide ifcore
or_core
is needed, and try only that one import. (I'm hoping this might be more feasible than usual because you own the numpy API.)This is less likely to be surprising if there is some low-level issue in pybind11 or Python itself (refactoring, new major dev version), or numpy is changed further in the future. The error message then could be the original import error, re-raised using
raise_from
(very easy):pybind11/include/pybind11/pytypes.h
Line 817 in 7e5edbc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I did similar changes for JAX and originally I also decided to go with numpy version checks in an
if
statement and then a single import.But I was advised that
try ... catch
approach is more robust. Here's an explanation from Jake: jax-ml/jax#16972 (review)But I'm happy to switch to a version check if you prefer it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're dealing with 1 try vs 2 tries here, that's similar but not the same.
I'm thinking of it more as a "capability check".
(Version numbers a convenient way to answer "what exactly do I have", that's a different purpose.)
I'm still hoping that we can rise to a level of organization that allows us to only have 1 import and report the original error: does or can numpy provide a simple API that allows us to only try once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! I switched to an
if
statement that selects package path based on numpy version - now(_)core.multiarray
or(_)core._internal
is imported only once.(it might look lengthy, but this is the recommended way to get major version integer: https://numpy.org/doc/stable/release/1.9.0-notes.html#numpyversion-class-added)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I think I somehow lost a comment here that I thought I typed in already. Sorry if something appears twice.)
What you have looks fine though, if you prefer, the only thing I'd do is remove the explicit
std::string()
inside the twoif
branches, i.e. I think simplynumpy_core_path = "numpy._core";
will work and is what people usually do.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can’t really check (at a conference), but what are we using the core module for? If it’s private that means NumPy is free to break us any time. Would be very nice to start moving to whatever is supported instead longer term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it only our tests? Ahh, no, multiarray. Is that available directly? Doesn’t seem to be. Is it supposed to stick around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henryiii
We are using the core module to obtain access to numpy's private C API.
Yes, NumPy can break us at any time as we are relying on a private API. @rwgk and I discuss a long term solution in the main PR thread, which is moving us to Numpy's public Python API instead of their private C API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Numpy has a public C API as well, but that's difficult for us to use because that would complicate our build process quite a bit).