Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C API] Add private "CAST" macros to clean up casts in C code #91320

Closed
vstinner opened this issue Mar 30, 2022 · 17 comments
Closed

[C API] Add private "CAST" macros to clean up casts in C code #91320

vstinner opened this issue Mar 30, 2022 · 17 comments
Labels
3.11 only security fixes topic-C-API

Comments

@vstinner
Copy link
Member

BPO 47164
Nosy @gpshead, @vstinner, @markshannon, @corona10, @Fidget-Spinner
PRs
  • bpo-47164: Add _PyCFunctionObject_CAST() macro #32190
  • bpo-47164: Add _PyASCIIObject_CAST() macro #32191
  • bpo-47164: Add _PyCFunction_CAST() macro #32192
  • bpo-47164: Argument Clinic uses _PyCFunction_CAST() #32210
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-03-30.12:28:10.065>
    labels = ['expert-C-API', '3.11']
    title = '[C API] Add private "CAST" macros to clean up casts in C code'
    updated_at = <Date 2022-04-05.06:59:30.735>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-04-05.06:59:30.735>
    actor = 'gregory.p.smith'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['C API']
    creation = <Date 2022-03-30.12:28:10.065>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 47164
    keywords = ['patch']
    message_count = 14.0
    messages = ['416350', '416353', '416354', '416357', '416358', '416361', '416362', '416363', '416365', '416372', '416417', '416418', '416419', '416420']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'Mark.Shannon', 'corona10', 'kj']
    pr_nums = ['32190', '32191', '32192', '32210']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue47164'
    versions = ['Python 3.11']

    @vstinner
    Copy link
    Member Author

    Last years, I started to add "CAST" macros like:

    #define _PyObject_CAST(op) ((PyObject*)(op))
    #define _PyType_CAST(op) (assert(PyType_Check(op)), (PyTypeObject*)(op))

    Example of usage:

    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)

    These macros avoids parenthesis. Example without CAST macro:

    #define PyCFunction_GET_FLAGS(func) \
            (((PyCFunctionObject *)func) -> m_ml -> ml_flags)

    Currently, inline cast requires adding parenthesis:

    ((PyCFunctionObject *)func)

    IMO it's more readable with a CAST macro:

    #define _PyCFunction_CAST(func) ((PyCFunctionObject *)func)
    #define PyCFunction_GET_FLAGS(func) \
            (_PyCFunction_CAST(func)->m_ml->ml_flags)

    I propose to add more CAST macros.

    By the way, the current Python C API is not fully compatible with C++. "(type)expr" C syntax is seen as "old-style cast" in C++ which recommends static_cast<type>(expr), reinterpret_cast<type>(expr), or another kind of cast. But I prefer to discuss C++ in a separated issue ;-) IMO without considering C++, adding CAST macros is worth it for readability.

    I am preparing pull requests for add CAST macros and use existing CAST macros.

    @vstinner vstinner added 3.11 only security fixes topic-C-API labels Mar 30, 2022
    @markshannon
    Copy link
    Member

    I think that adding macros makes readability worse.

    The macro is only more readable if you already know what it does.
    If you don't, then you need to look up the macro, and understand the cast in the macro (which is harder than understanding the original cast).

    In general, I find the excessive use of macros and tiny inline function obscures the meaning of code, and makes it hard to reason about what the code is doing.

    A few well chosen macros (like Py_INCREF(), etc) can definitely help readability, but if there are too many then anyone reading the code ends up having to lookup loads of macro definitions to understand the code.

    Without the macro, the reader needs to parse the cast, which is admittedly a

    @markshannon
    Copy link
    Member

    The problem in the example you give is the need for the cast in the first place. If func were a PyCFunctionObject * instead of a PyObject *, then there would be no cast.

    Making the casts explicit serves as a reminder that a type check is needed.

    PyObject *func = ...;
    int flags = PyCFunction_GET_FLAGS(func);
    

    is dangerous.

    PyObject *obj = ...;
    if (PyCFunction_Check(obj)) {
        PyCFunctionObject *func = (PyCFunctionObject *)obj;
        int flags = func->m_ml->ml_flags;
    

    is safe.

    @Fidget-Spinner
    Copy link
    Member

    tiny inline function obscures the meaning of code

    Sadly I have to agree with Mark for that specific case. I've had to debug a segfault before only because the inline function implicitly cast its arguments, and it was accessing a non-existent member. If it were a macro it would access the struct member directly, and the compiler would be able to catch that and warn me before runtime.

    However, for the case of _PyCFunctionObject_CAST, I'm not against it. The assert(PyCFunction_Check) pattern has genuinely helped me catch problems in the case of similar macros.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 30, 2022

    I think that adding macros makes readability worse.

    See the attached PRs, I can remove multiple layers of parenthesis in macros by using CAST macros.

    PyObject *func = ...;
    int flags = PyCFunction_GET_FLAGS(func);
    

    is dangerous.

    Right.

    PEP-670 proposes to remove the cast, but only in the limited C API version 3.11 or newer. Sadly, I don't think that we can remove the cast in the general Python C API, it would simply add too many compiler warnings in existing C extensions which previously built without warnings. (See also PEP-670 discussions on python-dev).

    Currently, PyCFunction_GET_FLAGS() doesn't check its argument. The macro documentation is quite explicit about it:

    /* Macros for direct access to these values. Type checks are *not*
       done, so use with care. */
    

    My #76371 PR adds a check in debug mode. So using PyCFunction_GET_FLAGS() in debug mode is safer with this PR.

    @vstinner
    Copy link
    Member Author

    By the way, the current Python C API is not fully compatible with C++. (...)

    I created bpo-47165 "[C API] Test that the Python C API is compatible with C++".

    @vstinner
    Copy link
    Member Author

    Hum, there are two things and maybe we are not talking about the same thing.

    • (A) Modifying macros defined in the Python C API (Include/ header files) doing cast on their arguments to use CAST macros
    • (B) Modify C code doing casts to use CAST macros

    I created this issue for (A). I propose to do (B), but this part is optional.

    Mark, Ken: are you fine with (A)? Is your concern about (B)?

    But modifying macros to remove the cast of their argument is out of the scope of this issue. As I wrote, it would add compiler warnings, something that I would like to avoid. See:
    https://peps.python.org/pep-0670/#cast-pointer-arguments

    @vstinner
    Copy link
    Member Author

    #76373 "Add _PyCFunction_CAST() macro" is special. It's used to define functions in PyTypeObject.tp_methods or PyModuleDef.m_methods.

    Casting a function pointer can cause issues with Control Flow Integrity (CFI) implemented in LLVM. The _thread module has an undefined behavior fixed by the commit 9eea6ea of bpo-33015. Previously, a cast ignored that the callback has no return value, whereas pthread_create() requires a function which as a void* return value.

    To fix compiler warnings, the (PyCFunction)func cast was replaced with the (PyCFunction)(void(*)(void))func cast to fix bpo-bpo-33012 GCC 8 compiler warning:

    warning: cast between incompatible function types (...)
    

    @Fidget-Spinner
    Copy link
    Member

    @victor,

    I'm not against (A) or any of the PRs you proposed. They look fine to me. My concern was with certain static inline functions (there are none here :). The _CAST macros have genuinely helped me debug code before.

    @vstinner
    Copy link
    Member Author

    I've had to debug a segfault before only because the inline function implicitly cast its arguments, and it was accessing a non-existent member. If it were a macro it would access the struct member directly, and the compiler would be able to catch that and warn me before runtime.

    This is part of Python C API legacy and as I wrote, it's not going to change soon. The minor enhancement that we can do is to inject an assertion when Python is built in debug mode.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 31, 2022

    After reading Mark's comments, I reworked my #76371 PR to only use the CAST macros in other macros, not in the C code. The CAST macros are not used in such code pattern:

         else if (PyCFunction_Check(func))
    -        return ((PyCFunctionObject*)func)->m_ml->ml_name;
    +        return _PyCFunctionObject_CAST(func)->m_ml->ml_name;
    

    In fact, this change doesn't bring any value.

    @vstinner
    Copy link
    Member Author

    New changeset c14d7e4 by Victor Stinner in branch 'main':
    bpo-47164: Add _PyASCIIObject_CAST() macro (GH-32191)
    c14d7e4

    @vstinner
    Copy link
    Member Author

    New changeset f0bc694 by Victor Stinner in branch 'main':
    bpo-47164: Add _PyCFunction_CAST() macro (GH-32192)
    f0bc694

    @vstinner
    Copy link
    Member Author

    New changeset 7fc39a2 by Victor Stinner in branch 'main':
    bpo-47164: Add _PyCFunctionObject_CAST() macr (GH-32190)
    7fc39a2

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @vstinner
    Copy link
    Member Author

    Concrete problem of METH_NOARG undefined behavior on a function with a single parameter, whereas the ABI requires 2 parameters (pass NULL as the second parameter): https://blog.pyodide.org/posts/function-pointer-cast-handling/

    It's also an issue for CFI.

    vstinner added a commit that referenced this issue Apr 27, 2022
    Fix C++ compiler warnings about "old-style cast"
    (g++ -Wold-style-cast) in the Python C API.  Use C++
    reinterpret_cast<> and static_cast<> casts when the Python C API is
    used in C++.
    
    Example of fixed warning:
    
        Include/object.h:107:43: error: use of old-style cast to
        ‘PyObject*’ {aka ‘struct _object*’} [-Werror=old-style-cast]
        #define _PyObject_CAST(op) ((PyObject*)(op))
    
    Add _Py_reinterpret_cast() and _Py_static_cast() macros.
    @vstinner
    Copy link
    Member Author

    CAST macros now use reinterpret_cast<> on C++: 29e2245 of #91959. I close the issue.

    vstinner added a commit that referenced this issue May 3, 2022
    Replace "(PyCFunction)(void(*)(void))func" cast with
    _PyCFunction_CAST(func).
    vstinner added a commit that referenced this issue May 3, 2022
    Use _Py_CAST(), _Py_STATIC_CAST() and _PyASCIIObject_CAST() in
    static inline functions to fix C++ compiler warnings:
    "use of old-style cast" (clang -Wold-style-cast).
    
    test_cppext now builds the C++ test extension with -Wold-style-cast.
    vstinner added a commit that referenced this issue May 3, 2022
    Replace "(PyCFunction)(void(*)(void))func" cast with
    _PyCFunction_CAST(func).
    
    Change generated by the command:
    
    sed -i -e \
      's!(PyCFunction)(void(\*)(void)) *\([A-Za-z0-9_]\+\)!_PyCFunction_CAST(\1)!g' \
      $(find -name "*.c")
    @vstinner
    Copy link
    Member Author

    vstinner commented May 3, 2022

    Hum, the issue was not fully fixed (especially, related to C++: see issue #91321). I pushed a few more changes:

    vstinner added a commit that referenced this issue Jun 2, 2022
    Use _PyObject_CAST() in the public C API to fix C++ compiler
    warnings: "use of old-style cast" (clang -Wold-style-cast).
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes topic-C-API
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants