Design a brand new C API with new PyCAPI_ prefix where all functions respect new guidelines #9

Closed
vstinner opened this issue Jun 26, 2023 · 13 comments

Comments

@vstinner

Many functions of the existing CPython C API don't respect the new guidelines (being discussed here): for example, they have ambiguous return values or return borrowed references. The problem is: how can someone write a C extension which avoids these issues? How can someone know whether a C extension only uses "sane" functions? In the issue capi-workgroup/problems#54, I propose adding an opt-in macro to remove functions known to have issues in the existing API.

Here I propose something different: create a brand new C API with a new PyCAPI_ prefix where all functions respect the new guidelines. For example, PyObject* PyDict_GetItem(PyObject *dict, PyObject *key), which has an ambiguous return value and returns a borrowed reference, will be replaced with int PyCAPI_Dict_GetItem(PyObject *dict, PyObject *key, PyObject **pvalue).
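To make the "ambiguous return value" point concrete, here is a minimal, self-contained sketch of the int-plus-output-parameter convention. It does not use the real CPython API: toy_dict, toy_entry and toy_dict_get_item are hypothetical names invented for this illustration, and the 1 / 0 / -1 return values are an assumption about how such a function could report "found", "missing" and "error".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a dict: string keys mapped to int values.
   These names are hypothetical; they only illustrate the convention. */
typedef struct {
    const char *key;
    int value;
} toy_entry;

typedef struct {
    const toy_entry *entries;
    size_t len;
} toy_dict;

/* Returns 1 and stores the value in *pvalue if the key is found,
   returns 0 and sets *pvalue to 0 if the key is missing,
   returns -1 and sets *pvalue to 0 on error (here: NULL arguments). */
static int
toy_dict_get_item(const toy_dict *d, const char *key, int *pvalue)
{
    assert(pvalue != NULL);
    *pvalue = 0;
    if (d == NULL || key == NULL) {
        return -1;  /* error, distinct from "missing" */
    }
    for (size_t i = 0; i < d->len; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            *pvalue = d->entries[i].value;
            return 1;  /* found */
        }
    }
    return 0;  /* missing, not an error */
}
```

With this shape, a key that maps to 0 and a missing key are told apart by the return code, which a single returned value cannot express.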

Goals of such new API:

  • Avoid known issues
  • Be consumed by code generators like Cython and pybind11
  • Consumed by HPy
  • Used by hand written C extensions
  • API should be easier to implement for other Python implementations
  • Smaller than the existing C API: remove functions with known issues, remove functions which are not strictly related
  • Used by CPython itself
  • Slowly replace the existing API
  • (Other goals have to be defined)

Non-goals:

  • Replace HPy: HPy goals are different (stable ABI across CPython versions, PyPy versions, universal ABI working on CPython and PyPy, etc.).
  • Perfect API: we will make new mistakes and regret them, and we will introduce new variants of functions to fix them, one by one.
  • Stable ABI: for now, the stable ABI is only implemented with the existing limited C API
  • (Other non-goals have to be defined)

It would reduce the proliferation of "Ex", "Flags", "2", "Ref", "Object" suffixes in the C API, since we start counting again at 0 (no suffix): see issue capi-workgroup/problems#52 about the naming convention for new functions.

I propose that, at the beginning, this new C API be implemented outside the CPython project. The implementation would consist only of header files using "alias" macros (ex: #define PyCAPI_Dict_GetItem PyDict_GetItem2) and static inline functions. An external project will ease quick experimentation.
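A rough sketch of that header-only layering, using a toy "old" API instead of CPython so that it compiles on its own. All names here (legacy_sum, legacy_get, NEWAPI_Sum, NEWAPI_Get) are hypothetical. The alias macro covers an old function that is already fine, and the static inline function wraps one that is not.

```c
#include <assert.h>
#include <stddef.h>

/* --- hypothetical "old" API --- */

/* Fine as-is: nothing ambiguous about the result. */
static int
legacy_sum(const int *items, size_t len)
{
    int total = 0;
    for (size_t i = 0; i < len; i++) {
        total += items[i];
    }
    return total;
}

/* Problematic: -1 means "index out of range", but -1 can also be
   a stored value. */
static int
legacy_get(const int *items, size_t len, size_t index)
{
    return (index < len) ? items[index] : -1;
}

/* --- header-only "new" layer --- */

/* The old function already follows the guidelines: a plain alias. */
#define NEWAPI_Sum legacy_sum

/* The old function does not: wrap it in a static inline function with
   an unambiguous result (1 = found, 0 = index out of range). */
static inline int
NEWAPI_Get(const int *items, size_t len, size_t index, int *pvalue)
{
    assert(pvalue != NULL);
    if (index >= len) {
        *pvalue = 0;
        return 0;
    }
    *pvalue = legacy_get(items, len, index);
    return 1;
}
```

Neither form needs a new exported symbol, which is what makes an out-of-tree experiment cheap.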

Later, once the API is well designed and tested, the implementation can move into CPython and both APIs will co-exist, for the best and the worst of both worlds!

Such an API doesn't have to be complete, since a C extension can use the old C API for functions which are not yet supported by the new C API.


To accelerate the adoption of this new C API, we should try to support old Python versions. The pythoncapi-compat project is proof that it's doable with static inline functions. For example, PyImport_AddModuleRef() is implemented for Python 3.12 and older as:

// gh-105922 added PyImport_AddModuleRef() to Python 3.13.0a1
#if PY_VERSION_HEX < 0x030D00A0
PYCAPI_COMPAT_STATIC_INLINE(PyObject*)
PyImport_AddModuleRef(const char *name)
{
    return Py_XNewRef(PyImport_AddModule(name));
}
#endif
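The same pattern, reduced to a toy that compiles without Python.h, may help show what the shim does: it is compiled only when the library is too old to provide the function, and it turns a borrowed reference into a strong one. toy_obj, toy_registry_get, toy_xnewref, toy_registry_get_ref and TOY_VERSION_HEX are hypothetical stand-ins, not real CPython or pythoncapi-compat names.

```c
#include <assert.h>
#include <stddef.h>

/* Toy reference-counted object; not a CPython type. */
typedef struct {
    int refcnt;
    int value;
} toy_obj;

static toy_obj toy_registry_entry = {1, 42};

/* "Old" function: returns a borrowed reference (refcnt is untouched). */
static toy_obj *
toy_registry_get(void)
{
    return &toy_registry_entry;
}

/* Same idea as Py_XNewRef(): take a new strong reference, accept NULL. */
static toy_obj *
toy_xnewref(toy_obj *obj)
{
    if (obj != NULL) {
        obj->refcnt++;
    }
    return obj;
}

/* Pretend this is the library version the extension is built against. */
#ifndef TOY_VERSION_HEX
#  define TOY_VERSION_HEX 0x030C0000
#endif

/* Only provide the shim when the library itself does not. */
#if TOY_VERSION_HEX < 0x030D0000
static inline toy_obj *
toy_registry_get_ref(void)
{
    return toy_xnewref(toy_registry_get());
}
#endif
```

The caller of toy_registry_get_ref() owns the returned reference and is responsible for releasing it, whatever version it was built against.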
@vstinner
Author

Prototype project: https://github.com/vstinner/pycapi

Example of implementation: pycapi/_pycapi_dict.h

#define PyCAPI_Dict_New PyDict_New
#define PyCAPI_Dict_GetItem PyDict_GetItemRef
#define PyCAPI_Dict_SetItem PyDict_SetItem

It is implemented on top of the pythoncapi-compat project. I just wrote a quick test on the discussed PyDict_GetItemRef() function: tests/test_pycapi.c.

See the API documentation.

@encukou

encukou commented Jun 27, 2023

Oh, this shares so many goals with the limited API -- although the focus is on different ones :)

Consider structuring this as a “diff” against the limited API: which limited functions should be removed? Added? Renamed? Tweaked?
In capi-workgroup/problems#7 I argue that C headers (+ docs) are a bad way to define an API. Consider auto-generating all declarations (that is, not inline function definitions) from a definition file.
This would help the limited API: it would show the direction in which the limited API needs to go, although the limited API would be much slower to get there.

If you think that's a good direction, I'll be happy to help, though I can't put in much time right now.

@vstinner
Author

I'm not sure if writing a "brand new C API" is a good idea or not. I wanted to create an issue about it to discuss it. I even wrote a proof-of-concept API to see how it goes and what it would imply. @markshannon seems to like this idea a lot.

For me, the main issue with a new API is the expected slow adoption if the old API remains available, tested, documented, and "just works". The Python 2 to Python 3 migration was similar (Python 2 was still available and "just worked", while Python 3 had few benefits for end users) and took 10 years. Nowadays, there is HPy, which is a brand new API: it makes C extensions way faster on PyPy, and the API is less error-prone and has no known issues (like borrowed references). It also provides a "stable ABI" indirectly (universal ABI). Still, I don't see a wide adoption of it. So I'm not convinced that if a 3rd C API for Python appears, people will immediately jump on it and adopt it.

Here, the carrot to motivate users to migrate will not be "2x faster performance", but "fewer bugs", like avoiding race conditions. That's really hard to sell :-( (Python 3 has "fewer bugs" and a "better design"; again, see the slow adoption rate.)

Consider structuring this as a “diff” against the limited API: which limited functions should be removed? Added? Renamed? Tweaked?

Honestly, I'm not sure at this point. It seems like everybody is still in the experimental phase: first listing the issues that we want to avoid, and second checking which existing APIs have these issues. Maybe having a concrete experimental project would help to create such a diff? To take my example again:

#define PyCAPI_Dict_New PyDict_New
#define PyCAPI_Dict_GetItem PyDict_GetItemRef
#define PyCAPI_Dict_SetItem PyDict_SetItem

Here you can see that PyDict_New() and PyDict_SetItem() are just fine: the new names are just aliases, whereas PyDict_GetItem() is not used and is replaced by the (currently discussed) PyDict_GetItemRef().

See also issue capi-workgroup/problems#54 which is closer to that: exclude some functions. Or even replace some functions with others if a macro is defined.

@mattip

mattip commented Jun 27, 2023

Still, I don't see a wide adoption of it (HPy).

It is a bit early to judge that, no? I understand your point was "adopting a new C-API is hard" and not "HPy is a failure", but please do not make such blanket statements since they can be misinterpreted.

@vstinner
Author

@mattip:

It is a bit early to judge that, no? I understand your point was "adopting a new C-API is hard" and not "HPy is a failure", but please do not make such blanket statements since they can be misinterpreted.

Do you have names of C extensions which adopted HPy? Any idea of how many C extensions use it?

I'm not trying to say that HPy is a failure. I'm saying that replacing an old API with a new one of the same size as the C API (more than 1500 functions) and with a long history (30 years) is not easy.

@mattip

mattip commented Jun 27, 2023

Do you have names of C extensions which adopted HPy? Any idea of how many C extensions use it?

Believe me, I understand how hard it is to change the C-API. But using the lack of adoption of HPy as proof of anything is premature and a cheap shot, and does a disservice to people working on developing it. I think this entire discussion is off-topic for this repo, whose stated scope is:

We should focus, for now, on enumerating the problems rather than discussing API redesign proposals. We will be able to discuss new API designs only when we have a comprehensive view of the problems we are trying to solve.

However, we do not need to wait for that before we start fixing problems that can be fixed incrementally in the current C API. Issues can be marked with the "fixable" label, and within those issues it is fine to discuss pointed solutions to the specific problem covered by this issue.

If we have moved past that and are now discussing solutions, I think we could weigh your proposal against other candidates based on merit, not maturity or popularity.

@vstinner
Author

In terms of new Python C API, there is the existing HPy project, which is getting mature. It is based on the concept of opaque handles. The API does not provide direct access to structure members: at the API level, everything goes through function calls, while the implementation can still use direct memory access (without function calls).
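As a generic illustration of the opaque-handle idea (this is not HPy's actual implementation, and every name below is hypothetical), callers hold only a small handle value and read object state through API calls, while the implementation is free to touch memory directly:

```c
#include <assert.h>
#include <stddef.h>

/* Public view: a handle. Callers never see the object layout. */
typedef struct { size_t slot; } toy_handle;   /* slot 0 means "invalid" */

/* Private view: would live in the implementation, not the public header. */
typedef struct { long value; } toy_int_object;

#define TOY_MAX_OBJECTS 8
static toy_int_object toy_objects[TOY_MAX_OBJECTS];
static size_t toy_object_count = 0;

/* Create an integer object; returns the invalid handle if the table is full. */
static toy_handle
toy_int_from_long(long value)
{
    toy_handle h = {0};
    if (toy_object_count < TOY_MAX_OBJECTS) {
        toy_objects[toy_object_count].value = value;
        toy_object_count++;
        h.slot = toy_object_count;   /* slots are 1-based */
    }
    return h;
}

/* Read the value through an API call: 0 on success, -1 on invalid handle. */
static int
toy_int_as_long(toy_handle h, long *pvalue)
{
    assert(pvalue != NULL);
    if (h.slot == 0 || h.slot > toy_object_count) {
        return -1;
    }
    /* Inside the implementation, direct memory access is fine. */
    *pvalue = toy_objects[h.slot - 1].value;
    return 0;
}
```

Because extension code never dereferences a pointer to the object, the implementation can change the layout of toy_int_object without breaking callers.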

It seems like @markshannon is unhappy with the HPy design and proposes a different new Python C API: https://github.com/markshannon/New-C-API-for-Python For example, he wants to use PyDictObject, PyTupleObject, etc. types in the API without runtime type checks, rather than PyObject with runtime checks: see issue capi-workgroup/problems#31.

@markshannon

This repo is specifically for problems with the existing C API.
Maybe we should make a new "solutions" repo?

For example, he wants to use PyDictObject, PyTupleObject, etc. types

I want to use handles as well, not pointers. That way it is easier to use the provided, safe casting mechanism, rather than plain C casts.
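One possible reading of "typed handles with a safe casting mechanism", sketched with hypothetical names and not taken from the linked repository: each handle type is a distinct struct, so C rejects a plain cast between them at compile time, and the only way to narrow a generic handle is a checked conversion function.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_KIND_INT, TOY_KIND_TUPLE } toy_kind;

typedef struct {
    toy_kind kind;
    long payload;
} toy_object;

/* Distinct struct types: C does not allow casting one struct type to
   another, so handles cannot be converted with a plain cast. */
typedef struct { toy_object *obj; } toy_ref;      /* any object */
typedef struct { toy_object *obj; } toy_int_ref;  /* known to be an int */

/* Checked downcast: the single place where a runtime type check happens.
   Returns 1 and fills *out on success, 0 otherwise (*out is untouched). */
static int
toy_as_int(toy_ref r, toy_int_ref *out)
{
    assert(out != NULL);
    if (r.obj == NULL || r.obj->kind != TOY_KIND_INT) {
        return 0;
    }
    out->obj = r.obj;
    return 1;
}

/* Upcast: always valid, so it cannot fail. */
static toy_ref
toy_int_upcast(toy_int_ref r)
{
    toy_ref any = { r.obj };
    return any;
}

/* Typed API: the argument type already proves the kind, no check needed. */
static long
toy_int_value(toy_int_ref r)
{
    return r.obj->payload;
}
```

The type check is paid once at the conversion, and every later call on the typed handle can skip it.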

@vstinner
Author

In terms of APIs to interoperate between Python and C, there are different options:

  • Current Python C API -- each Python version removes a few functions and adds a few functions
  • Python limited C API -- same
  • HPy -- external project, under development
  • Cython -- different language, but used for similar use cases as the C API: extend Python, optimize Python code
  • cffi -- related to libffi, Foreign Function Interface
  • pybind11, nanobind -- for C++

There are also other related projects like:

It's appealing to propose a new C API fixing most known issues. But so far, I'm not convinced by a migration plan. @markshannon's migration plan is in short:

  • 2023: Provide a new API, deprecate the old API
  • 2031: Remove the old API

Well, his full plan is more detailed. For end users, I don't see a strong incentive to migrate to the new API. Is the only reason that "in 8 years, your code will fail to build"?

HPy provides better debugging tooling than the Python C API, promises better performance on PyPy, and provides a "universal ABI" working on CPython and PyPy. What are the advantages of another new C API?


Well, if it wasn't clear yet: I'm more a supporter of incremental changes:

Obviously, there are also disadvantages:

  • Migrating to opaque handles (opaque structures: PyObject, PyTupleObject, etc.) can take a few years, since the existing C API exposes almost all structure members and sometimes advises using them directly.
  • Each Python release breaks an unknown number of C extensions
  • Each Python release requires C extension maintainers to update their project, and backward compatibility has flaws: pythoncapi-compat helps with that (it provides new functions and a tool to automate the work)
  • Issue capi-workgroup/problems#44: Resistance to small improvements

People who target maximum compatibility across CPython and PyPy versions may look to projects like HPy and Cython.

@vstinner
Author

If a new C API is developed, for me, an important question is: should it be developed inside or outside the CPython Git repository? If it's inside, for me, it means that the implementation should be complete and "bug-free" sooner, and it would mean that the CPython core devs endorse this approach. If it's outside, the API can be developed quicker and doesn't have to be endorsed: the problem is that this may slow down its adoption.

@vstinner
Author

See also issue capi-workgroup/problems#62: How can an user access old removed functions? Can a 3rd party project provide them?

@encukou

encukou commented Oct 24, 2023

Moving to the solutions repo.

@encukou encukou transferred this issue from capi-workgroup/problems Oct 24, 2023
@vstinner
Author

vstinner commented Sep 6, 2024

I'm closing this issue.

I don't think that a brand new C API is going to make the situation better. I'm more a supporter of incremental changes.

See PEP 743 – Add Py_COMPAT_API_VERSION to the Python C API.

@vstinner vstinner closed this as completed Sep 6, 2024