PyCLIF-C-API feature parity with this pybind11 `base.__init__() must be called when overriding __init__` error condition:

* google3/third_party/pybind11/include/pybind11/detail/class.h;l=195-204;rcl=555209523

Motivation: Guard against backsliding (b/296065655).

Situation:

`CppBase` is a PyCLIF-C-API-wrapped C++ object.

What happens when the Python interpreter processes the following code (usually at import time)?

```
class PC(CppBase):
  pass
```

When the native Python `PC` class is built:

* `PC` `tp_new` is set to use `CppBase` `tp_new`, but
* `PC` `tp_init` does NOT in any way involve `CppBase` `tp_init`.

It is the responsibility of `PC.__init__` to call `CppBase.__init__`, but this is not checked.

**This CL adds the missing check.**

The approach is:

* `PC` `tp_init` is replaced with an "intercept" function.
* The intercept function calls the original `PC` `tp_init`.
* After that call finishes (and if it was successful), the intercept function checks if the `CppBase` wrapped C++ object was initialized.

This approach makes the assumption that `PC` `tp_init` is not also modified elsewhere, validated via TGP testing. The practical benefit of guarding against backsliding (e.g. cl/558247087) is assumed to far outweigh the very theoretical risk of that assumption not being true, and even if it is not true, the consequences are unlikely to be harmful. An additional consideration is that the switch to PyCLIF-pybind11 will eliminate this risk entirely.
For easy reviewing, this is the generated code implementing the new mechanism:

```
// Intentionally leak the unordered_map:
// https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables
static auto* derived_tp_init_registry = new std::unordered_map<PyTypeObject*, int(*)(PyObject*, PyObject*, PyObject*)>;

static int tp_init_intercepted(PyObject* self, PyObject* args, PyObject* kw) {
  DCHECK(PyType_Check(self) == 0);
  const auto derived_tp_init = derived_tp_init_registry->find(Py_TYPE(self));
  CHECK(derived_tp_init != derived_tp_init_registry->end());
  int status = (*derived_tp_init->second)(self, args, kw);
  if (status == 0 && reinterpret_cast<wrapper*>(self)->cpp.get() == nullptr) {
    Py_DECREF(self);
    PyErr_Format(PyExc_TypeError,
                 "%s.__init__() must be called when"
                 " overriding __init__", wrapper_Type->tp_name);
    return -1;
  }
  return status;
}

static PyObject* tp_new_impl(PyTypeObject* type, PyObject* args, PyObject* kwds) {
  if (type->tp_init != tp_init_impl &&
      derived_tp_init_registry->count(type) == 0) {
    (*derived_tp_init_registry)[type] = type->tp_init;
    type->tp_init = tp_init_intercepted;
  }
  return PyType_GenericNew(type, args, kwds);
}
```

Background technical pointers:

What happens when a `PC` object is constructed (i.e. `PC()`) in Python?

* `_PyObject_MakeTpCall()`: google3/third_party/python_runtime/v3_10/Objects/call.c;l=215;rcl=491965220
* `type_call` `type->tp_new()`: google3/third_party/python_runtime/v3_10/Objects/typeobject.c;l=1123;rcl=491965220
* `type_call` `type->tp_init()`: google3/third_party/python_runtime/v3_10/Objects/typeobject.c;l=1135;rcl=491965220

Side note:

* `CppBase` `tp_alloc` is NOT called. Instead, `PyObject* self` is allocated and `\0`-initialized in `PyType_GenericAlloc()`. For the wrapped `clif::Instance<CppBase>` this is Undefined Behavior. We are just getting lucky that it works:

  * google3/third_party/python_runtime/v3_10/Objects/typeobject.c;l=1166;rcl=491965220

  This was noted already here:

  * google3/third_party/clif/python/gen.py;l=504-506;rcl=551297023

Additional manual testing (go/py311-upgrade):

```
blaze test --python3_version=3.11 --python_mode=unstable third_party/clif/... devtools/clif/... -k
```

* http://sponge2/7f8c0ecd-48c4-4269-8d71-e2e08802d40a

The only failure is known and unrelated: b/288436695

PiperOrigin-RevId: 559501787