
[C API] PEP 674: Disallow using macros (Py_TYPE and Py_SIZE) as l-value #89639

Closed
vstinner opened this issue Oct 14, 2021 · 45 comments
Labels
3.11 (only security fixes), topic-C-API

Comments

@vstinner
Member

BPO 45476
Nosy @malemburg, @rhettinger, @vstinner, @gareth-rees, @erlend-aasland, @arhadthedev
PRs
  • bpo-45476: Convert PyFloat_AS_DOUBLE() to static inline #28961
  • bpo-45476: Disallow using PyFloat_AS_DOUBLE() as l-value #28976
  • bpo-45476: Add _Py_RVALUE() macro #29860
  • bpo-45476: Disallow using asdl_seq_GET() as l-value #29866
  • Files
  • pep674_regex.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-10-14.21:17:31.326>
    labels = ['expert-C-API', '3.11']
    title = '[C API] PEP 674: Disallow using macros (Py_TYPE and Py_SIZE) as l-value'
    updated_at = <Date 2022-01-27.20:29:40.417>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-01-27.20:29:40.417>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['C API']
    creation = <Date 2021-10-14.21:17:31.326>
    creator = 'vstinner'
    dependencies = []
    files = ['50462']
    hgrepos = []
    issue_num = 45476
    keywords = ['patch']
    message_count = 41.0
    messages = ['403950', '403956', '403961', '403965', '403990', '403995', '403999', '404001', '404010', '404039', '406343', '406344', '406345', '406346', '406348', '406996', '407283', '407358', '407374', '407375', '407395', '407410', '407412', '407416', '407417', '407456', '407463', '407528', '407529', '407531', '407532', '407536', '407872', '407875', '411774', '411802', '411803', '411826', '411827', '411831', '411921']
    nosy_count = 6.0
    nosy_names = ['lemburg', 'rhettinger', 'vstinner', '[email protected]', 'erlendaasland', 'arhadthedev']
    pr_nums = ['28961', '28976', '29860', '29866']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue45476'
    versions = ['Python 3.11']

    @vstinner
    Member Author

    The Python C API provides "AS" macros to convert an object to a C value, like PyFloat_AS_DOUBLE(). These macros can be abused to be used as l-value: "PyFloat_AS_DOUBLE(obj) = new_value;". This prevents changing the PyFloat implementation and makes life harder for Python implementations other than CPython.

    I propose to convert these macros to static inline functions to disallow using them as l-value.

    I made a similar change for Py_REFCNT(), Py_TYPE() and Py_SIZE(). For these functions, I added "SET" variants: Py_SET_REFCNT(), Py_SET_TYPE(), Py_SET_SIZE(). Here, I don't think that the l-value case is legit, and so I don't see the need to add a way to *set* a value.

    For example, I don't think that PyFloat_SET_DOUBLE(obj, value) would make sense. A Python float object is supposed to be immutable.

    @vstinner vstinner added 3.11 only security fixes topic-C-API labels Oct 14, 2021
    @rhettinger
    Contributor

    These macros can be abused to be used as l-value

    You could simply document, "don't do that". Also, if there is a need to make an assignment, you're now going to have to create a new setter function to fill it.

    We really don't have to go on thin ice converting to functions that might or might not be inlined depending on compiler specific nuances.

    AFAICT, no one has ever had problems with these being macros. There really isn't a problem to be solved, and the "solution" may in fact introduce new problems that we didn't have before.

    Put me down as a -1 on these blanket macro-to-inline-function rewrites. The premise is almost entirely a matter of opinion, "macros are bad, functions are good", plus the naive assumption that "inline" functions always inline.

    @vstinner
    Member Author

    Raymond:

    AFAICT, no one has ever had problems with these being macros.

    This issue is about the API of PyFloat_AS_DOUBLE(). Implementing it as a macro or a static inline function is an implementation detail which doesn't matter. But I don't know how to disallow "PyFloat_AS_DOUBLE(obj) = value" if it is defined as a macro.

    Have a look at the Facebook "nogil" project which is incompatible with accessing directly the PyObject.ob_refcnt member:
    "Extensions must use Py_REFCNT and Py_SET_REFCNT instead of directly accessing reference count fields"
    https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0/edit

    Raymond:

    You could simply document, "don't do that".

    Documentation doesn't work. Developers easily fall into traps whenever it's possible to fall into them. See bpo-30459 for such a trap with the PyList_SET_ITEM() and PyCell_SET() macros. They were misused by two Python projects.

    Raymond:

    We really don't have to go on thin ice converting to functions that might or might not be inlined depending on compiler specific nuances.

    Do you have a concrete example where a static inline function is not inlined, whereas it was inlined when it was a macro? So far, I'm not aware of any performance issue like that.

    There were attempts to use __attribute__((always_inline)) (Py_ALWAYS_INLINE), but so far, using it was not a clear win.

    @vstinner
    Member Author

    I searched for "PyFloat_AS_DOUBLE.*=" regex in the PyPI top 5000 projects. I couldn't find any project doing that.

    I only found perfectly safe comparisons:

    traits/ctraits.c: if (PyFloat_AS_DOUBLE(value) <= PyFloat_AS_DOUBLE(low)) {
    traits/ctraits.c: if (PyFloat_AS_DOUBLE(value) >= PyFloat_AS_DOUBLE(high)) {
    c/_cffi_backend.c: return PyFloat_AS_DOUBLE(ob) != 0.0;
    pandas/_libs/src/klib/khash_python.h: ( PyFloat_AS_DOUBLE(a) == PyFloat_AS_DOUBLE(b) );

    @malemburg
    Member

    I am with Raymond on this one.

    If "protecting against wrong use" is the only reason to go down the slippery path of starting to rely on compiler optimizations for performance critical operations, the argument is not good enough.

    If people do use macros in l-value mode, it's their problem when their code breaks, not ours. Please don't forget that we are operating under the consenting adults principle: we expect users of the CPython API to use it as documented and expect them to take care of the fallout, if things break when they don't.

    We don't need to police developers into doing so.

    @vstinner
    Member Author

    For PyObject, I converted Py_REFCNT(), Py_TYPE() and Py_SIZE() to static inline functions to enforce the usage of Py_SET_REFCNT(), Py_SET_TYPE() and Py_SET_SIZE(). Only a minority of C extensions are affected by these changes. Also, there is more pressure from recent optimization projects to abstract accesses to PyObject members.

    I agree that it doesn't seem that "AS" functions are abused to *set* the inner value:

    • PyByteArray_AS_STRING()
    • PyBytes_AS_STRING()
    • PyFloat_AS_DOUBLE()

    If "protecting against wrong use" is the only reason to go down the slippery path of starting to rely on compiler optimizations for performance critical operations, the argument is not good enough.

    Again, I'm not aware of any performance issue caused by short static inline functions like Py_TYPE() or the proposed PyFloat_AS_DOUBLE(). If there is a problem, it should be addressed, since Python uses more and more static inline functions.

    Static inline functions are a common feature of the C language. I'm not sure where your doubts about bad performance come from.

    Using static inline functions has other advantages. It helps debugging and profiling, since the function name can be retrieved by debuggers and profilers when analysing the machine code. It also avoids macro pitfalls (like abusing a macro to use it as an l-value ;-)).

    @malemburg
    Member

    On 15.10.2021 11:43, STINNER Victor wrote:

    Again, I'm not aware of any performance issue caused by short static inline functions like Py_TYPE() or the proposed PyFloat_AS_DOUBLE(). If there is a problem, it should be addressed, since Python uses more and more static inline functions.

    static inline functions is a common feature of C language. I'm not sure where your doubts of bad performance come from.

    Inlining is something that is completely under the control of the used compilers. Compilers are free to not inline functions marked for inlining, which can result in significant slowdowns on platforms which are e.g. restricted in RAM and thus emphasize small code size, or where the CPUs have small caches or not enough registers (think micro-controllers).

    The reason why we have those macros is because we want the developers to be
    able to make a conscious decision "please inline this code unconditionally
    and regardless of platform or compiler". The developer will know better
    what to do than the compiler.

    If the developer wants to pass control over to the compiler s/he can use
    the corresponding C function, which is usually available (and then, in many
    cases, also provides error handling).

    Using static inline functions has other advantages. It helps debugging and profiling, since the function name can be retrieved by debuggers and profilers when analysing the machine code. It also avoids macro pitfalls (like abusing a macro to use it as an l-value ;-)).

    Perhaps, but then I never had to profile macro use in the past. Instead,
    what I typically found was that using macros results in faster code when
    used in inner loops, so profiling usually guided me to use macros instead
    of functions.

    That said, the macros you have inlined so far were all really trivial,
    so a compiler will most likely always inline them (the number of machine
    code instructions for the call would be more than needed for
    the actual operation).

    Perhaps we ought to have a threshold for making such decisions, e.g.
    number of machine code instructions generated for the macro or so, to
    not get into discussions every time :-)

    A blanket "static inline is always better than a macro" is not a good
    enough argument, though.

    Esp. in PGO driven optimizations the compiler could opt for using
    the function call rather than inlining if it finds that the code
    in question is not used much and it needs to save space to have
    loops fit into CPU caches.

    @gareth-rees
    Mannequin

    gareth-rees mannequin commented Oct 15, 2021

    If the problem is accidental use of the result of PyFloat_AS_DOUBLE() as an lvalue, why not use the comma operator to ensure that the result is an rvalue?

    The C99 standard says "A comma operator does not yield an lvalue" in §6.5.17; I imagine there is similar text in other versions of the standard.

    The idea would be to define a helper macro like this:

        /* As expr, but can only be used as an rvalue. */
        #define Py_RVALUE(expr) ((void)0, (expr))

    and then use the helper where needed, for example:

        #define PyFloat_AS_DOUBLE(op) Py_RVALUE(((PyFloatObject *)(op))->ob_fval)

    @vstinner
    Member Author

    #define Py_RVALUE(expr) ((void)0, (expr))

    Oh, that's a clever trick!

    I wrote #73162 which uses it.

    @vstinner vstinner changed the title [C API] Convert "AS" functions, like PyFloat_AS_DOUBLE(), to static inline functions [C API] Disallow using PyFloat_AS_DOUBLE() as l-value Oct 15, 2021
    @vstinner
    Member Author

    I created bpo-45490: "[meta][C API] Avoid C macro pitfalls and usage of static inline functions" to discuss macros and static inline functions more generally.

    @arhadthedev
    Member

    Marc-Andre:

    Inlining is something that is completely under the control of the
    used compilers. Compilers are free to not inline function marked for
    inlining [...]

    I checked the following C snippet on gcc.godbolt.org using GCC 4.1.2 and Clang 3.0.0 with <no flags>/-O0/-O1/-Os, and both compilers inline a function marked as static inline:

        static inline int foo(int a)
        {
            return a * 2;
        }
    
        int bar(int a)
        {
            return foo(a) < 0;
        }

    So even with -O0, GCC from 2007 and Clang from 2011 perform inlining. Though, old versions of Clang leave a dangling original copy of foo for some reason. I hope a linker removes it later.

    As for other compilers, I believe that if somebody specifies -O0, that person has a sound reason to do so (like per-line debugging, building precise flame graphs, or another specific scenario where execution performance does not matter), so forced inlining would only interfere there anyway.

    @malemburg
    Member

    On 15.11.2021 08:54, Oleg Iarygin wrote:


    Marc-Andre:
    > Inlining is something that is completely under the control of the
    used compilers. Compilers are free to not inline function marked for
    inlining [...]

    I checked the following C snippet on gcc.godbolt.org using GCC 4.1.2 and Clang 3.0.0 with <no flags>/-O0/-O1/-Os, and both compilers inline a function marked as static inline:

    static inline int foo(int a)
    {
        return a * 2;
    }
    
    int bar(int a)
    {
    return foo(a) < 0;
    }
    

    So even with -O0, GCC from 2007 and Clang from 2011 perform inlining. Though, old versions of Clang leave a dangling original copy of foo for some reason. I hope a linker removes it later.

    That's a great website :-) Thanks for sharing.

    However, even with x86-64 gcc 11.2, I get assembler which does not inline
    foo() without compiler options or with -O0: https://gcc.godbolt.org/z/oh6qnffh7

    Only with -O1, the site reports inlining foo().

    As for other compilers, I believe that if somebody specifies -O0, that person has a sound reason to do so (like per-line debugging, building precise flame graphs, or other specific scenario where execution performance does not matter), so inlining interferes here anyway.

    Sure, but my point was a different one: even with higher optimization
    levels, the compiler can decide whether or not to inline. We expect
    the compiler to inline, but cannot be sure.

    With macros the compiler has no choice and we are in control and even
    when using -O0, you will still want e.g. Py_INCREF() and Py_DECREF()
    inlined.

    @vstinner
    Member Author

    I wrote PEP-670 "Convert macros to functions in the Python C API" for this issue:
    https://www.python.org/dev/peps/pep-0670/

    @vstinner
    Member Author

    I don't understand what you are trying to prove about compilers not inlining when you explicitly ask them... not to inline.

    The purpose of the -O0 option is to minimize the build time, with a trade-off: don't expect the built executable to be fast. If you care about Python performance... well, don't use -O0? Python ./configure --with-pydebug builds Python with -Og, which is not -O0. The -Og level is special: it's a different trade-off between compile time and Python runtime performance.

    If you want a Python debug build (Py_DEBUG macro defined, ./configure --with-pydebug), it's perfectly fine to build it with -O2 or -O3 to make sure that static inline functions are inlined. You can also enable LTO and PGO on a debug build.

    GCC -Og option:
    """
    -Og

    Optimize debugging experience. -Og should be the optimization level of choice for the standard edit-compile-debug cycle, offering a reasonable level of optimization while maintaining fast compilation and a good debugging experience. It is a better choice than -O0 for producing debuggable code because some compiler passes that collect debug information are disabled at -O0.
    
    Like -O0, -Og completely disables a number of optimization passes so that individual options controlling them have no effect. Otherwise -Og enables all -O1 optimization flags except for those that may interfere with debugging:
    
    -fbranch-count-reg  -fdelayed-branch 
    -fdse  -fif-conversion  -fif-conversion2  
    -finline-functions-called-once 
    -fmove-loop-invariants  -fmove-loop-stores  -fssa-phiopt 
    -ftree-bit-ccp  -ftree-dse  -ftree-pta  -ftree-sra
    

    """
    https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

    I prefer to use gcc -O0 when I develop on Python because the build time matters a lot in my very specific use case, and gcc -O0 is the best to debug Python in a debugger. See my article:
    https://developers.redhat.com/articles/2021/09/08/debugging-python-c-extensions-gdb

    On RHEL8, the Python 3.9 debug build is now built with -O0 to be fully usable in gdb (to debug C extensions).

    In RHEL, the main motivation to use -O0 rather than -Og was to get a fully working gdb debugger on C extensions. With -Og, we get too many <optimized out> values which are blocking debugging :-(

    @malemburg
    Member

    On 15.11.2021 10:54, STINNER Victor wrote:

    I don't understand what you are trying to prove about compilers not inlining when you explicitly ask them... not to inline.

    I'm not trying to prove anything, Victor.

    I'm only stating the fact that by switching from macros to inline
    functions we are giving away control to the compilers and should not
    be surprised that Python now suddenly runs a lot slower on systems
    which either have inlining optimizations switched off or where the
    compiler (wrongly) assumes that creating more assembler would result
    in slower code.

    I've heard all your arguments against macros, but don't believe the
    blanket approach to convert to inline functions is warranted in all
    cases, in particular not for code which is private to the interpreter
    and where we know that we need the code inlined to not result in
    unexpected performance regressions.

    I also don't believe that we should assume that Python C extension
    authors will unintentionally misuse Python API macros. If they do,
    it's their business to sort out any issues, not ours. If we document
    that macros may not be used as l-values, that's clear enough. We don't
    need to use compiler restrictions to impose such limitations.

    IMO, conversion to inline functions should only happen, when

    a) the core language implementation has a direct benefit, and

    b) it is very unlikely that compilers will not inline the code
    with -O2 settings, e.g. perhaps using a threshold of LOCs
    or by testing with the website Oleg mentioned.

    Overall, I think PEP-670 should get some more attention from the
    SC, to have a guideline to use as a basis for deciding whether or not
    to use the static inline function approach. That way we could avoid
    these discussions :-)

    BTW: Thanks for the details about -O0 vs. -Og.

    @vstinner
    Member Author

    I decided to exclude macros which can be used as l-value from the PEP-670, since the motivation to disallow using them as l-value is different, and I prefer to restrict PEP-670 scope.

    Disallowing using macros as l-values is more about hiding implementation details and improving compatibility with Python implementations other than CPython, like PyPy or RustPython.

    The PEP-670 is restricted to the advantages and disadvantages of converting macros to functions (static inline or regular functions).

    @vstinner
    Member Author

    PyBytes_AS_STRING() and PyByteArray_AS_STRING() are used to modify string characters, but are not used directly as l-values.

    Search in PyPI top 5000 packages:

    $ ./search_pypi_top_5000.sh '(PyByteArray|PyBytes)_AS_.*[^!<>=]=[^=]'
    pypi-top-5000_2021-08-17/plyvel-1.3.0.tar.gz
    pypi-top-5000_2021-08-17/numexpr-2.7.3.tar.gz
    pypi-top-5000_2021-08-17/Cython-0.29.24.tar.gz

    numexpr-2.7.3, numexpr/numexpr_object.cpp:

    PyBytes_AS_STRING(constsig)[i] = 'b';
    PyBytes_AS_STRING(constsig)[i] = 'i';
    PyBytes_AS_STRING(constsig)[i] = 'l';
    PyBytes_AS_STRING(constsig)[i] = 'f';
    PyBytes_AS_STRING(constsig)[i] = 'd';
    PyBytes_AS_STRING(constsig)[i] = 'c';
    PyBytes_AS_STRING(constsig)[i] = 's';

    plyvel-1.3.0, plyvel/_plyvel.cpp:

    PyByteArray_AS_STRING(string)[i] = (char) v;
    PyByteArray_AS_STRING(string)[i] = (char) v;

    Cython-0.29.24:

    $ grep -E '(PyByteArray|PyBytes)_AS_.*[^!<>=]=[^=]' -R .
    ./Cython/Utility/StringTools.c:            PyByteArray_AS_STRING(string)[i] = (char) v;
    ./Cython/Utility/StringTools.c:        PyByteArray_AS_STRING(string)[i] = (char) v;
    ./Cython/Utility/StringTools.c:            PyByteArray_AS_STRING(bytearray)[n] = value;

    @vstinner
    Member Author

    New changeset c19c3a0 by Victor Stinner in branch 'main':
    bpo-45476: Add _Py_RVALUE() macro (GH-29860)
    c19c3a0

    @vstinner
    Member Author

    I created this issue to disallow macros like PyFloat_AS_DOUBLE() and PyDict_GET_SIZE() as l-value. It seems like this change by itself is controversial.

    I proposed one way to implement this change: convert macros to static inline functions. I didn't expect that this conversion would also be controversial. For now, I abandon the static inline approach to focus on the implementation which keeps macros: modify macros to use _Py_RVALUE() => PR 28976.

    Once PR 28976 is merged and PEP-670 is accepted, we can reconsider converting these macros to static inline functions; then it should be non-controversial.

    @vstinner
    Member Author

    New changeset 4b97d97 by Victor Stinner in branch 'main':
    bpo-45476: Disallow using asdl_seq_GET() as l-value (GH-29866)
    4b97d97

    @vstinner
    Member Author

    I wrote PEP-674 "Disallow using macros as l-value" for this change: https://python.github.io/peps/pep-0674/

    @vstinner vstinner changed the title [C API] Disallow using PyFloat_AS_DOUBLE() as l-value [C API] PEP 674: Disallow using macros as l-value Nov 30, 2021
    @vstinner
    Member Author

    In the PyPI top 5000, I found two projects using PyDescr_TYPE() and PyDescr_NAME() as l-value: M2Crypto and mecab-python3. In both cases, it was code generated by SWIG:

    M2Crypto-0.38.0/src/SWIG/_m2crypto_wrap.c and mecab-python3-1.0.4/src/MeCab/MeCab_wrap.cpp contain the function:

    SWIGINTERN PyGetSetDescrObject *
    SwigPyStaticVar_new_getset(PyTypeObject *type, PyGetSetDef *getset) {

      PyGetSetDescrObject *descr;
      descr = (PyGetSetDescrObject *)PyType_GenericAlloc(SwigPyStaticVar_Type(), 0);
      assert(descr);
      Py_XINCREF(type);
      PyDescr_TYPE(descr) = type;
      PyDescr_NAME(descr) = PyString_InternFromString(getset->name);
      descr->d_getset = getset;
      if (PyDescr_NAME(descr) == NULL) {
        Py_DECREF(descr);
        descr = NULL;
      }
      return descr;
    }

    @vstinner
    Member Author

    I found 4 projects using "Py_TYPE(obj) = new_type;" in the PyPI top 5000:

    mypy-0.910:

    • mypyc/lib-rt/misc_ops.c: Py_TYPE(template_) = &PyType_Type;
    • mypyc/lib-rt/misc_ops.c: Py_TYPE(t) = metaclass;

    recordclass-0.16.3:

    • lib/recordclass/_dataobject.c: Py_TYPE(op) = type;
    • lib/recordclass/_dataobject.c: Py_TYPE(op) = type;
    • lib/recordclass/_litetuple.c: // Py_TYPE(ob) = &PyLiteTupleType_Type;

    pysha3-1.0.2:

    datatable-1.0.0.tar.gz:

    • src/core/python/namedtuple.cc: Py_TYPE(v) = type.v;
    • src/core/python/tuple.cc: Py_TYPE(v_new) = v_type;

    @vstinner
    Member Author

    vstinner commented Dec 1, 2021

    In the PyPI top 5000 projects, I found 32 projects using "Py_SIZE(obj) =
    new_size": 8 of them are written manually, 24 use Cython.

    8 projects using "Py_SIZE(obj) = new_size":

    • guppy3-3.1.2: src/sets/bitset.c and src/sets/nodeset.c
    • mypy-0.910: list_resize() in mypyc/lib-rt/pythonsupport.h
    • pickle5-0.0.12: pickle5/_pickle.c
    • python-snappy-0.6.0: maybe_resize() in snappy/snappymodule.cc
    • recordclass-0.16.3: lib/recordclass/_dataobject.c + code generated by Cython
    • scipy-1.7.3: scipy/_lib/boost/boost/python/object/make_instance.hpp
    • zodbpickle-2.2.0: src/zodbpickle/_pickle_33.c
    • zstd-1.5.0.2: src/python-zstd.c

    24 projects using "Py_SIZE(obj) = new_size" generated by an outdated Cython:

    • Naked-0.1.31
    • Shapely-1.8.0
    • dedupe-hcluster-0.3.8
    • fastdtw-0.3.4
    • fuzzyset-0.0.19
    • gluonnlp-0.10.0
    • hdbscan-0.8.27
    • jenkspy-0.2.0
    • lightfm-1.16
    • neobolt-1.7.17
    • orderedset-2.0.3
    • ptvsd-4.3.2
    • py_spy-0.3.11
    • pyemd-0.5.1
    • pyhacrf-datamade-0.2.5
    • pyjq-2.5.2
    • pypcap-1.2.3
    • python-crfsuite-0.9.7
    • reedsolo-1.5.4
    • tables-3.6.1
    • thriftpy-0.3.9
    • thriftrw-1.8.1
    • tinycss-0.4
    • triangle-20200424

    @vstinner
    Member Author

    vstinner commented Dec 1, 2021

    The attached pep674_regex.py generates a regex to search for code incompatible with the PEP-674.

    To download PyPI top 5000, you can use my script:
    https://github.com/vstinner/misc/blob/main/cpython/download_pypi_top.py

    To grep a regex in tarball and ZIP archives, you can use the rg command:

    $ rg -zl REGEX DIRECTORY/*.{zip,gz,bz2,tgz}

    Or you can try my script:
    https://github.com/vstinner/misc/blob/main/cpython/search_pypi_top.py

    @vstinner
    Member Author

    vstinner commented Dec 1, 2021

    I updated my ./search_pypi_top_5000.py script to ignore files generated by Cython.

    On PyPI top 5000, I only found 16 projects impacted by the PEP-674 (16/5000 = 0.3%):

    • datatable-1.0.0
    • frozendict-2.1.1
    • guppy3-3.1.2
    • M2Crypto-0.38.0
    • mecab-python3-1.0.4
    • mypy-0.910
    • Naked-0.1.31
    • pickle5-0.0.12
    • pycurl-7.44.1
    • PyGObject-3.42.0
    • pysha3-1.0.2
    • python-snappy-0.6.0
    • recordclass-0.16.3
    • scipy-1.7.3
    • zodbpickle-2.2.0
    • zstd-1.5.0.2

    I manually ignored two false positives in 3 projects:

    • "#define __Pyx_SET_SIZE(obj, size) Py_SIZE(obj) = (size)" in Cython
    • "* Py_TYPE(obj) = new_type must be replaced with Py_SET_TYPE(obj, new_type)": comment in psycopg2 and psycopg2-binary

    @vstinner
    Member Author

    vstinner commented Dec 1, 2021

    Oops, sorry, pycurl-7.44.1 and PyGObject-3.42.0 are not affected: they only define the Py_SET_TYPE() macro for backward compatibility. So right now, only 14 projects are affected.

    @vstinner
    Member Author

    vstinner commented Dec 2, 2021

    • zstd-1.5.0.2: src/python-zstd.c

    I proposed a fix upstream: sergey-dryabzhinsky/python-zstd#70

    @vstinner
    Member Author

    vstinner commented Dec 2, 2021

    frozendict-2.1.1

    If I understand correctly, this module is compatible with the PEP-674: it only has to copy the Python 3.11 header files once Python 3.11 is released to port the project to Python 3.11.

    The incompatible code is not part of the "frozendict" implementation, but only in copies of the Python header files (Python 3.6, 3.7, 3.8, 3.9 and 3.10). For example, it contains the frozendict/src/3_10/cpython_src/Include/object.h header: a copy of CPython's Include/object.h file.

    Source code: https://github.com/Marco-Sulla/python-frozendict

    @vstinner
    Member Author

    vstinner commented Dec 2, 2021

    pysha3-1.0.2

    This module must not be used on Python 3.6 and newer, which have built-in support for SHA-3 hash functions. Example:

    $ python3.6
    Python 3.6.15 (default, Sep  5 2021, 00:00:00) 
    >>> import hashlib
    >>> h=hashlib.new('sha3_224'); h.update(b'hello'); print(h.hexdigest())
    b87f88c72702fff1748e58b87e9141a42c0dbedc29a78cb0d4a5cd81

    By the way, building pysha3 on Python 3.11 now fails with:

    Modules/_sha3/backport.inc:78:10: fatal error: pystrhex.h: No such file or directory

    The pystrhex.h header file has been removed in Python 3.11 by bpo-45434. But I don't think that it's worth trying to port pysha3 to Python 3.11 if the module must not be used on Python 3.6 and newer.

    Environment markers can be used to skip the pysha3 dependency on Python 3.6 or newer.

    Example: "pysha3; python_version < '3.6'"

    @vstinner
    Member Author

    vstinner commented Dec 2, 2021

    Naked-0.1.31

    The affected code is only code generated by Cython: the project only has to regenerate its code with a recent Cython version.

    @vstinner
    Member Author

    vstinner commented Dec 2, 2021

    mypy-0.910

    I proposed a fix: python/mypy#11652

    @vstinner
    Member Author

    vstinner commented Dec 6, 2021

    In the PyPI top 5000, I found two projects using PyDescr_TYPE() and PyDescr_NAME() as l-value: M2Crypto and mecab-python3. In both cases, it was code generated by SWIG

    I proposed a first PR for Py_TYPE():
    swig/swig#2116

    @vstinner
    Member Author

    vstinner commented Dec 6, 2021

    python-snappy-0.6.0: maybe_resize() in snappy/snappymodule.cc

    I proposed a fix: intake/python-snappy#114

    @vstinner
    Member Author

    In the PyPI top 5000, I found two projects using PyDescr_TYPE() and PyDescr_NAME() as l-value: M2Crypto and mecab-python3. In both cases, it was code generated by SWIG

    I created bpo-46538 "[C API] Make the PyDescrObject structure opaque" to handle PyDescr_NAME() and PyDescr_TYPE() macros. But IMO it's not really worth it to make the PyDescrObject structure opaque. It's just too much work, whereas PyDescrObject is not performance sensitive. It's ok to continue exposing this structure in public for now.

    I will exclude PyDescr_NAME() and PyDescr_TYPE() from the PEP-674.

    @vstinner
    Member Author

    datatable-1.0.0.tar.gz

    I created h2oai/datatable#3231

    @vstinner
    Member Author

    pickle5-0.0.12: pickle5/_pickle.c

    This project is a backport targeting Python 3.7 and older. I'm not sure if it makes sense to update it for Python 3.11.

    It's the same for pysha3 which targets Python <= 3.5.

    @vstinner
    Member Author

    • guppy3-3.1.2: src/sets/bitset.c and src/sets/nodeset.c

    I created: zhuyifei1999/guppy3#40

    @vstinner
    Member Author

    • scipy-1.7.3: scipy/_lib/boost/boost/python/object/make_instance.hpp

    This is a vendored copy of the Boost.Python module, which has already been fixed in Boost 1.78.0 (commit: January 2021) by:
    boostorg/python@500194e

    scipy should just update its scipy/_lib/boost copy.

    @vstinner
    Member Author

    recordclass-0.16.3: lib/recordclass/_dataobject.c + code generated by Cython

    I created: https://bitbucket.org/intellimath/recordclass/pull-requests/1/python-311-support-use-py_set_size

    @vstinner
    Member Author

    • zodbpickle-2.2.0: src/zodbpickle/_pickle_33.c

    Technically, zodbpickle works on Python 3.11 and is not impacted by the Py_SIZE() change.

    _pickle_33.c redefines the Py_SIZE() macro to continue using it as an l-value:
    zopefoundation/zodbpickle@8d99afc

    I proposed a PR to use Py_SET_SIZE() explicitly:
    zopefoundation/zodbpickle#64

    @vstinner vstinner changed the title [C API] PEP 674: Disallow using macros as l-value [C API] PEP 674: Disallow using macros (Py_TYPE and Py_SIZE) as l-value Jan 27, 2022
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @kigawas

    kigawas commented Jul 27, 2022

    pysha3-1.0.2


    Strictly speaking, pysha3 is not equivalent to hashlib because keccak functions are different. If the improvement in #83720 is done, pysha3 may be completely replaced.

    However, since pystrhex.h was removed in Python 3.11, pycryptodome is the only legit library supporting Ethereum's Keccak.

    @vstinner
    Member Author

    vstinner commented Aug 3, 2022

    However, since pystrhex.h was removed in Python 3.11

    Technically, these functions are still exported and remain available in the internal C API: Include/internal/pycore_strhex.h

    If someone wants a public C API for these functions, they should ask for it. But by default, the private functions must not be used outside Python itself: https://docs.python.org/dev/c-api/stable.html

    It's not hard to copy Python/pystrhex.c (174 lines) or reimplement it.

    @kigawas

    kigawas commented Aug 4, 2022

    @vstinner

    That's not practical because pysha3 is not actively maintained. Many downstream libraries will break on 3.11.

    For anyone searching the errors, just use pycryptodome instead to save your time.

    @vstinner
    Member Author

    vstinner commented Nov 3, 2022

    https://peps.python.org/pep-0674/ was deferred by the SC. For now, I prefer to close this issue.

    @vstinner vstinner closed this as completed Nov 3, 2022