C standard for public headers #30

Open
encukou opened this issue Jun 13, 2024 · 44 comments

Comments

@encukou
Collaborator

encukou commented Jun 13, 2024

Previous discussion: capi-workgroup/problems#42 & capi-workgroup/api-evolution#22

It seems the consensus of the people involved is that Python.h should be usable with:

  • C99 or newer
  • C++11 or newer

Now that there's a report about the 3.13 headers containing a C11 feature (anonymous structs/unions), I guess it's time to make an official guideline.
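For readers unfamiliar with the feature, here is a minimal sketch of what "anonymous structs/unions" means and how the same layout can be written in strict C99. The type and function names (`example_c11_value`, `example_c99_value`, `example_get_int`) are made up for illustration and are not CPython's.

```c
#include <stdint.h>

/* C11 style: the union has no member name, so its fields are accessed
   directly on the enclosing struct (v.as_int). Strict C99 rejects this,
   so the sketch only declares it when the compiler reports C11 or newer. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
typedef struct {
    int kind;
    union {
        int32_t as_int;
        float as_float;
    };
} example_c11_value;
#endif

/* C99-compatible style: the union member is named, so access goes
   through that name (v.u.as_int). */
typedef struct {
    int kind;
    union {
        int32_t as_int;
        float as_float;
    } u;
} example_c99_value;

/* Hypothetical accessor that hides the extra ".u" from callers. */
static int32_t example_get_int(const example_c99_value *v)
{
    return v->u.as_int;
}
```

The named form costs one extra identifier at each access site, which an accessor function or macro can hide.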

@vstinner

The C API is now tested by test_cext and test_cppext: these tests check that including <Python.h> does not issue any compiler warning.

test_cext has 5 tests using -Werror -Werror=declaration-after-statement:

  • C API (with no compiler flags)
  • C API with C99
  • C API with C11
  • Limited C API (with no compiler flags)
  • Limited C API with C11

test_cppext has 4 tests using -Werror:

  • C API (with no compiler flags)
  • C API with C++03
  • C API with C++11
  • C API with C++14 (only on Windows)

Without extra compiler flags, we're good. The problem in issue gh-120293 is the use of -Werror=pedantic and -Werror=cast-qual. I don't think that we want to support these flags.

@encukou
Collaborator Author

encukou commented Jun 14, 2024

This issue is not about the tests, but about the C standard we want the headers to support.
After we decide that, we can test as much of it as we can, and make triage easy for issues about untestable things. But IMO we should write this down explicitly; it shouldn't be “whatever the tests do”.

Again, the tests currently pass even though the headers use a feature that's not in the C99 standard. If the intent is to follow the standard, then the test should change.

@vstinner

Again, the tests currently pass even though the headers use a feature that's not in the C99 standard.

Well, "standard" is one thing, "what's being used in practice" is something else. No C compiler respects the standard by default; they all add "compiler extensions" and require special compiler options to respect the standard.

For example, unnamed unions are supported by all C compilers used by Python (GCC, clang, MSC) in C99 mode.

So the question is if we care about "strict standard" or "what's being used in practice by C compilers supported by Python" (so including compiler extensions).

In practice, the question is if we want to support -Werror=pedantic or not.

I propose not supporting the -Werror=pedantic or -Werror=cast-qual options. It's too much work: Python has more and more static inline functions, and it's tricky to respect the "pedantic" mode.

@encukou
Collaborator Author

encukou commented Jun 14, 2024

So, you're saying that we should only try to be compatible with specific compilers (and compiler options), rather than C standards?

Python has more and more static inline functions

I don't think supporting C89 is on the table, so static inline functions aren't really relevant.

@vstinner

So, you're saying that we should only try to be compatible with specific compilers (and compiler options), rather than C standards?

Yes.

I don't think supporting C89 is on the table, so static inline functions aren't really relevant.

I was referring to the fact that static inline function bodies expose a lot of code, and compiler problems usually come from such code. For example, if the pyatomic functions were regular opaque functions, we wouldn't be having this discussion about -Werror=cast-qual. In the past, we also got many cast issues with C++. I also had to introduce _Py_NULL to use nullptr in C++ in our Python C API :-)
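As a rough illustration of the kind of macro mentioned above, a header can pick the null pointer constant per language. This is only a sketch: `EXAMPLE_NULL` and `example_is_null` are hypothetical names, and the real `_Py_NULL` definition in CPython may differ in its details.

```c
#include <stddef.h>

/* Use nullptr when the header is compiled as C++11 or newer,
   and plain NULL everywhere else (C, or older C++). */
#if defined(__cplusplus) && __cplusplus >= 201103L
#  define EXAMPLE_NULL nullptr
#else
#  define EXAMPLE_NULL NULL
#endif

/* Because this is a static inline function in a header, its body is
   compiled by every consumer, under the consumer's own language mode
   and warning flags. That is why such bodies attract compiler issues
   that an opaque (out-of-line) function would not. */
static inline int example_is_null(const void *p)
{
    return p == EXAMPLE_NULL;
}
```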

@encukou
Collaborator Author

encukou commented Jun 14, 2024

Well, I don't see why you brought up -Werror=cast-qual here. If we support specific compiler options, adding that one should be a separate conversation.

Anyway, I think I understand your opinion now; I just disagree with it. I guess it's time for others to chime in.

@gvanrossum

I am confused though. Can you cleanly summarize the two opinions on the table?

@eli-schwartz

$ cat /tmp/foo.c
#include <Python.h>

With c99:

$ gcc -std=c99 -Werror=pedantic -I/usr/include/python3.13 -c /tmp/foo.c -o /tmp/foo.o
In file included from /usr/include/python3.13/Python.h:92,
                 from /tmp/foo.c:1:
/usr/include/python3.13/cpython/code.h:32:10: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   32 |         };
      |          ^
/usr/include/python3.13/cpython/code.h:34:6: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   34 |     };
      |      ^
/usr/include/python3.13/cpython/code.h:27:9: error: struct has no named members [-Werror=pedantic]
   27 | typedef struct {
      |         ^~~~~~
In file included from /usr/include/python3.13/Python.h:128:
/usr/include/python3.13/cpython/optimizer.h:60:14: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   60 |             };
      |              ^
/usr/include/python3.13/cpython/optimizer.h:62:10: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   62 |         };
      |          ^
/usr/include/python3.13/cpython/optimizer.h:63:6: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   63 |     };
      |      ^
cc1: some warnings being treated as errors

With gnu99:

$ gcc -std=gnu99 -Werror=pedantic -I/usr/include/python3.13 -c /tmp/foo.c -o /tmp/foo.o
In file included from /usr/include/python3.13/Python.h:92,
                 from /tmp/foo.c:1:
/usr/include/python3.13/cpython/code.h:32:10: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   32 |         };
      |          ^
/usr/include/python3.13/cpython/code.h:34:6: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   34 |     };
      |      ^
/usr/include/python3.13/cpython/code.h:27:9: error: struct has no named members [-Werror=pedantic]
   27 | typedef struct {
      |         ^~~~~~
In file included from /usr/include/python3.13/Python.h:128:
/usr/include/python3.13/cpython/optimizer.h:60:14: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   60 |             };
      |              ^
/usr/include/python3.13/cpython/optimizer.h:62:10: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   62 |         };
      |          ^
/usr/include/python3.13/cpython/optimizer.h:63:6: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]
   63 |     };
      |      ^
cc1: some warnings being treated as errors

With c11:

$ gcc -std=c11 -Werror=pedantic -I/usr/include/python3.13 -c /tmp/foo.c -o /tmp/foo.o
$

@vstinner,

Well, "standard" is one thing, "what's being used in practice" is something else. No C compiler respects the standard by default; they all add "compiler extensions" and require special compiler options to respect the standard.

This has nothing to do with "compiler extensions". It is about the fact that some compilers, like GCC, will allow code that is only legal in a later standard to compile under an older standard too, if the alternative is an error. The theory is: "well, you know, your meaning is obvious; you wrote code that is legal in c23 but not c17, so clearly you meant to build with c23". That it compiles by default doesn't actually mean the problem is safe to ignore.

You can get the same issue if you upgrade to future gcc 15, and start using c23 features. As long as there isn't an incompatible alternative meaning in earlier standards, the compiler typically accepts it. Hopefully with a warning that you're doing something which isn't ISO C99.

... which brings us back to using -Werror=pedantic, since that offers a chance to catch cases where you're using code that isn't legal for a std but the compiler is accepting it anyways "to be nice".

If it was a compiler extension it would be allowed when compiling with "GNU C", the collection of GCC vendor extensions on top of standard C. But it is not.

@eli-schwartz

I am confused though. Can you cleanly summarize the two opinions on the table?

If I understand correctly, the options are:

  • require per policy that the headers be compatible with any conforming C99 compiler, and check this as best one can, using the compilers available in CI (-Werror=pedantic can give you advance warning in many cases)
  • require per policy that the headers be compatible with officially supported compilers (gcc, clang, MSVC) when given the -std=c99 option, which tells the compiler to resolve any dealbreaker differences according to the c99 interpretation rather than the c23 interpretation.

Option 2 requires mandating specific compilers, and also specific compiler versions (e.g. require that GCC is at least "whichever version of GCC first implemented c11 unnamed unions").
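To make the difference between the modes concrete, here is a small sketch of what a header can observe about the requested dialect. It assumes GCC or clang behaviour for `__STRICT_ANSI__` (defined for `-std=c99`, not for `-std=gnu99`); the function names are hypothetical.

```c
/* Returns the ISO C version the compiler claims to implement,
   e.g. 199901L for C99 or 201112L for C11. Returns 0 when
   __STDC_VERSION__ is not defined (C89/C90 mode, or C++). */
static long example_c_standard_version(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Returns 1 in a strict ISO mode such as -std=c99, and 0 in a GNU
   dialect such as -std=gnu99. Assumption: the compiler follows the
   GCC/clang convention of defining __STRICT_ANSI__ in strict modes. */
static int example_strict_iso_mode(void)
{
#ifdef __STRICT_ANSI__
    return 1;
#else
    return 0;
#endif
}
```

Neither macro says whether the compiler will accept a newer construct anyway; that is what -Werror=pedantic surfaces.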

@vstinner

code.h:32:10: error: ISO C99 doesn’t support unnamed structs/unions [-Werror=pedantic]

The problem is not specific to code.h. I added __extension__ (GCC/clang) and __pragma(warning(disable: 4201)) (MSC) to object.h to make a similar warning quiet in the Free Threading mode, on the unnamed union:

#ifndef Py_GIL_DISABLED
struct _object {
#if (defined(__GNUC__) || defined(__clang__)) \
        && !(defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L)
    // On C99 and older, anonymous union is a GCC and clang extension
    __extension__
#endif
#ifdef _MSC_VER
    // Ignore MSC warning C4201: "nonstandard extension used:
    // nameless struct/union"
    __pragma(warning(push))
    __pragma(warning(disable: 4201))
#endif
    union {
       Py_ssize_t ob_refcnt;
#if SIZEOF_VOID_P > 4
       PY_UINT32_T ob_refcnt_split[2];
#endif
    };
#ifdef _MSC_VER
    __pragma(warning(pop))
#endif

    PyTypeObject *ob_type;
};
#else
(...)
#endif

So it seems like C11 is what we need/want here.

@vstinner

Now that there's a report (python/cpython#120293) about the 3.13 headers containing a C11 feature (anonymous structs/unions), I guess it's time to make an official guideline.

I suggest to require C11 and C++03 at minimum to use the Python C API.

@encukou
Collaborator Author

encukou commented Jun 17, 2024

@guido, sorry for the delay.

I am confused though. Can you cleanly summarize the two opinions on the table?

1. Document the standards we support.

The C API guidelines will list the C/C++ standards that #include <Python.h> is meant to comply with. The list is open for nitpicking; I propose:

  • C99 or newer
  • C++11 or newer
  • (possibly: C89 with several select C99 features, as in PEP 7)

We'd welcome PRs to improve conformance, and to make the tests better in verifying conformance. (Within reason of course -- the guidelines can always be changed.)

2. Don't.

We support specific compilers, as listed in PEP 11. The supported compiler options are defined in the tests/CI.

(That's my understanding; I'll let Victor clarify.)

@vstinner

  2. Don't.
    We support specific compilers, as listed in PEP 11. The supported compiler options are defined in the tests/CI.
    (That's my understanding; I'll let Victor clarify.)

That's not my intent. I mean that in practice, we can support C99 with compiler extensions, such as __extension__.

@vstinner

I mean that in practice, we can support C99 with compiler extensions, such as __extension__.

Anyway, as I wrote previously, I agree with Petr and I suggest to require C11 to get unnamed unions and other nice C11 features.

@zooba

zooba commented Jun 17, 2024

I suggest to require C11 to get unnamed unions and other nice C11 features

I don't think we can reasonably require it in our headers (any of the headers that we distribute, "internal" or not), until the build backends used by package publishers have released stable versions that automatically add in the compiler options necessary to enable these modes.

This isn't a decision that affects us - it affects our users, who will suddenly find that they cannot compile with our newest releases due to code that they can't change.

I'd prefer to jump through as many hoops as possible to avoid breaking our users like that. I think it's worth the effort to not use these features in public header files (again, whether "internal" or not) so that we don't put the burden on our users. As far as I'm aware, nothing is broken for us, we just have to do a little bit more typing in our own code.

@vstinner

I'd prefer to jump through as many hoops as possible to avoid breaking our users like that.

My attempt to fix the issue for PyCode and PyOptimizer APIs: python/cpython#120643

@vstinner

@encukou: I suggest to declare that the Python C API should respect ISO C99.

My attempt to fix the issue for PyCode and PyOptimizer APIs: python/cpython#120643

I merged my PR, so the Python C API is basically compatible with strict ISO C99.

Only Free Threading build has one exception: PyObject uses pragma/extension for its unnamed union.

@vstinner

By the way, I also fixed the "const qualifier" warnings: python/cpython#120593

@vstinner

I created a PR to update PEP 7: python/peps#3862

The public C API should be compatible with C99 and with C++03.

@colesbury

A clarification: the anonymous union has been there since Python 3.12 and it's part of the default (not free-threaded) build's definition of struct _object.

@zooba

zooba commented Jul 15, 2024

To add on to Sam's clarification, here's the two line anonymous union:

#if (defined(__GNUC__) || defined(__clang__)) \
        && !(defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L)
    // On C99 and older, anonymous union is a GCC and clang extension
    __extension__
#endif
#ifdef _MSC_VER
    // Ignore MSC warning C4201: "nonstandard extension used:
    // nameless struct/union"
    __pragma(warning(push))
    __pragma(warning(disable: 4201))
#endif
    union {
       Py_ssize_t ob_refcnt;
#if SIZEOF_VOID_P > 4
       PY_UINT32_T ob_refcnt_split[2];
#endif
    };
#ifdef _MSC_VER
    __pragma(warning(pop))
#endif

In any case where we don't have the back-compat constraints that we had here, it's definitely going to be less code to just avoid a nameless union.

@encukou
Collaborator Author

encukou commented Jul 18, 2024

Heh, no wonder why compiling with -std=c99 doesn't catch it.

Here's my suggestion for the PEP update:

* The public C API should be compatible with C99 and with C++03.

  The existing API does use a few C11 features which are
  commonly available as compiler extensions to C99.
  New API should not use these features.

@vstinner

Here's my suggestion for the PEP update:

I updated my PEP PR with these changes: python/peps#3862

@vstinner

vstinner commented Aug 5, 2024

I proposed a different PR to require C11 and C++11: python/peps#3896

@encukou
Collaborator Author

encukou commented Aug 19, 2024

This was partially covered in a PR in api-evolution; the resulting guidelines pre-PEP document now says:

[TODO: PEP 7 should be updated to link here once this PEP goes live.]

Public C API must be compatible with:

  • C11, with optional features needed by CPython:
    • IEEE 754 floating point
    • Atomics (!__STDC_NO_ATOMICS__, or MSVC)
  • C99
  • C89 with several select C99 features [...]
  • C++03

This:

  • makes us explicitly target specific standards, and
  • selects standards that roughly match our existing tests.

As I see it, the remaining open question is whether we should drop C89/C99. The latest public discussion around this, from a year ago, suggests that we shouldn't.
IMO, if we want to do that, we need a Discourse thread, not an issue here or on PEP 7. Does anyone want to champion that?
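The atomics condition in the quoted guideline (`!__STDC_NO_ATOMICS__`) can be checked from a header roughly as follows. This is a sketch under stated assumptions: it covers only the standard C11 path, not the MSVC alternative the guideline mentions, and `EXAMPLE_HAVE_C11_ATOMICS`, `example_counter` and `example_counter_increment` are hypothetical names.

```c
/* C11 atomics are optional: a C11 compiler that lacks them must
   define __STDC_NO_ATOMICS__. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
        && !defined(__STDC_NO_ATOMICS__)
#  include <stdatomic.h>
#  define EXAMPLE_HAVE_C11_ATOMICS 1
#else
#  define EXAMPLE_HAVE_C11_ATOMICS 0
#endif

#if EXAMPLE_HAVE_C11_ATOMICS
typedef atomic_int example_counter;

/* Atomically adds 1 and returns the new value. */
static inline int example_counter_increment(example_counter *c)
{
    return atomic_fetch_add(c, 1) + 1;
}
#else
/* Fallback for this sketch only: NOT thread-safe. Real code would
   need compiler intrinsics or a lock here. */
typedef int example_counter;

static inline int example_counter_increment(example_counter *c)
{
    *c += 1;
    return *c;
}
#endif
```

Under `-std=c99` the first branch is skipped even when the compiler itself supports C11 atomics, which is part of why the choice of standard matters for public headers.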

@vstinner

I would prefer to just say:

The public C API should be compatible with C11 (with optional atomic primitives and types) and with C++11.

We can still attempt to remain compatible with C89 and C99 in specific cases, like some compiler options. But IMO it's time to move on and say that C11 is needed.

@zooba

zooba commented Aug 26, 2024

But IMO it's time to move on and say that C11 is needed.

I think we'd need to tie this change to a specific release, and make it fairly public, particularly so that build backends have a chance to update their default settings for that version and to warn their users that their code's behaviour may change.

@encukou
Collaborator Author

encukou commented Aug 27, 2024

Yup. The C89/C99 mess is the status quo, based on the reports we get when we break it.
If we want to change it, I'd also want to go through the new features. For example, we'd probably want to disallow requiring <threads.h> for public API, until we're sure non-C languages can handle it?

@zooba

zooba commented Aug 27, 2024

I'm not sure we can ever be sure that non-C languages can use standard C libraries - they tend not to have external interfaces.

I think we should disallow any dependency on any header other than our own. If users need the functionality, we can wrap and re-export it (e.g. like for the new mutex type).

@gvanrossum

gvanrossum commented Aug 27, 2024

I think we should disallow any dependency on any header other than our own. If users need the functionality, we can wrap and re-export it (e.g. like for the new mutex type).

That seems extreme. Surely we can depend on <stdbool.h>? Python.h itself imports about a dozen standard header files, from <assert.h> to <wchar.h>. (Though some of these are excluded when new enough values of Py_LIMITED_API are set.) I didn't audit the other files, but I'm sure there are more.

@encukou
Collaborator Author

encukou commented Aug 28, 2024

<stdbool.h> is somewhat dangerous for FFI, since memcpy-ing anything but the exact bit-patterns for true and false into a bool has undefined behaviour. (And AFAIK the bit-patterns are unspecified too, so you shouldn't just copy 1 or 0 from an appropriately-sized integer.)

We could allow bool for new API, but again, I'd prefer a more considered approach than just saying “we use C11 now”.
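One way to sidestep the bit-pattern concern at an FFI boundary is to pass a fixed-width integer and convert on the C side. A minimal sketch, with hypothetical function names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a flag received as a raw byte (for example from non-C code)
   into a bool. The comparison produces a valid bool for any input
   value, unlike copying the byte into a bool object with memcpy. */
static inline bool example_bool_from_byte(uint8_t raw)
{
    return raw != 0;
}

/* Going the other way, hand a fixed-width integer across the boundary
   so the foreign side never depends on how bool is represented. */
static inline uint8_t example_byte_from_bool(bool b)
{
    return (uint8_t)(b ? 1 : 0);
}
```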

@zooba

zooba commented Aug 28, 2024

That seems extreme. Surely we can depend on <stdbool.h>? Python.h itself imports about a dozen standard header files, from <assert.h> to <wchar.h>.

Fair. I was thinking of function calls, rather than type or macro definitions. I don't believe we require any of those to be used by users, though (wchar_t on Windows doesn't require the include, I'm not sure whether that's the case elsewhere).

@vstinner

The C standard question arose in issue python/cpython#123747, where static_assert() is used (in an internal C API header) whereas the C API consumer asks for C99. On an old clang version on macOS, static_assert() is not available. I worked around the issue by replacing static_assert() with #if... #error... preprocessor directives.

I suggest to draw a line and require C11 to use the Python C API in Python 3.14.
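For illustration, the `#if`/`#error` style of compile-time check looks roughly like this. The conditions are made-up examples rather than the ones from that issue, and the array-typedef macro is a commonly used C99 trick, not necessarily what CPython does; all `example_` names are hypothetical.

```c
#include <limits.h>

/* C11 would allow: static_assert(sizeof(int) >= 4, "...");
   In C99, a condition the preprocessor can evaluate can be checked
   with #if and #error instead. */
#if INT_MAX < 2147483647
#  error "this sketch assumes int is at least 32 bits"
#endif

/* The preprocessor cannot evaluate sizeof. For such conditions, a
   typedef of an array whose size is negative when the condition is
   false makes compilation fail in C99 too. */
#define EXAMPLE_STATIC_ASSERT(cond, name) \
    typedef char example_static_assert_##name[(cond) ? 1 : -1]

EXAMPLE_STATIC_ASSERT(sizeof(long long) >= 8, long_long_is_64_bits);

/* Helper so the checked property is also observable at run time. */
static int example_int_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

The trade-off against `static_assert()` is a less readable error message when the check fails.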

@encukou
Collaborator Author

encukou commented Sep 10, 2024

IMO, internal code is already C11+. Reverting to C99 there is a friendly gesture for Cython as they work to remove use of private APIs.

And it seems making the public headers C11+ would not be friendly to Cython -- according to that issue, they want C99.

@vstinner

I reread the discussion. While C11 would settle most of these discussions, we don't want to require C11 right now. I understood that C99 would be preferred in practice. The problem is that the C API is not compatible with strict ("pedantic") C99. So I propose:

Public C API must be compatible with:

  • C99 with compiler extensions for atomic variables and unnamed unions.
  • C++03.

@zooba

zooba commented Sep 19, 2024

Can we also add "warning free under /W4 (with inline suppressions)"?

@vstinner

Can we also add "warning free under /W4 (with inline suppressions)"?

I suggest to check warnings using test_cext and test_cppext. Currently on Windows, test_cext uses /WX (treat warnings as errors) with /W3 (flag from setuptools). I will see if we can enable /W4.

@encukou
Collaborator Author

encukou commented Sep 23, 2024

IMO, we should use the same warnings flags for all of the code, not just the public API.
Currently that's in PEP 7, but very underspecified: “No compiler warnings with major compilers (gcc, VC++, a few others).”

@zooba

zooba commented Sep 23, 2024

IMO, we should use the same warnings flags for all of the code, not just the public API.

Yes, but that's fairly aspirational. We have a range of other tests that ensure that our code works, despite /W4 warnings (probably) existing.

But if we trigger warnings in code that's included in other people's projects, we impact their ability to get clean. So it's not that we're checking our own API with tighter flags, but we're allowing our users to check their own code with them without having to suppress our own warnings.

@encukou
Collaborator Author

encukou commented Sep 23, 2024

Makes sense. Still, new compiler versions can add new warnings, so it's probably best to check this in tests, and not make a guarantee for users.

@zooba

zooba commented Sep 23, 2024

We do check in tests now :) Victor was right on it

@vstinner

So what do you think of requiring "C99 with compiler extensions for atomic variables and unnamed unions": #30 (comment) ?

@zooba

zooba commented Sep 24, 2024

Provided they don't raise any warnings, they're fine. (And provided we're open to adding more build configurations to test for warnings, if users need them, though I expect that /W4 covers enough, and the /Za and /permissive- options should get things closer to portable C99 rather than creating new warnings for us.)

@encukou
Collaborator Author

encukou commented Sep 24, 2024

OK, let's treat it as a change from what's now in the pre-PEP.
I went through the Wikipedia pages for C99 and C11 and classified the new features; see below for the details.

Rather than “Unnamed unions”, let's refer to the feature as anonymous structures and unions. (I don't think there is a compiler that supports anonymous unions but not structs.)

By requiring anything from C11, we break -std=c99. (Cython supports that, and I assume there are other projects like Cython.) AFAIK, that is the main compatibility break here. All relevant compilers support C11, they just hide the features in old-standard mode.
At that point, we might as well move to C11 altogether. That will give us alignof and static_assert. AFAIK any compiler that can do anonymous unions can do those too.

C99

Features already needed by Python 3.6:

  • inline functions
  • intermingled declarations and code: variable declaration is no longer restricted to file scope or the start of a compound statement (block)
  • int types
  • // comments
  • designated initializers

Not in C++03:

  • flexible array members
  • compound literals
  • variadic macros
  • restrict qualification
  • keyword static in array indices in parameter declarations

Made optional in C11:

  • variable-length arrays

Not interesting for API:

  • new library functions, such as snprintf
  • new headers, such as <stdbool.h>, <complex.h>, <tgmath.h>, and <inttypes.h>
  • type-generic math (macro) functions, in <tgmath.h>
  • improved support for IEEE floating point
  • universal character names

C11

Proposed here:

  • Anonymous structures and unions
  • Multi-threading support (incl. <stdatomic.h>)

Might be useful:

  • Alignment specification
  • Static assertions

Not in C++:

  • Type-generic expressions using the _Generic keyword

Not necessary (can be #defined to no-op):

  • The _Noreturn function specifier

Not interesting for API:

  • Improved Unicode support
  • Removal of the gets function
  • More macros for querying the characteristics of floating-point types
  • "…x" suffix for fopen
  • quick_exit function
  • timespec_get function
  • Macros for the construction of complex values

Optional, not considered:

  • Bounds-checking interfaces (Annex K).
  • Analyzability features (Annex L).
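As an example of the "can be #defined to no-op" category above, a header could wrap `_Noreturn` like this. It is a sketch only: `EXAMPLE_NORETURN`, `example_fatal` and `example_checked_div` are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>

/* Use _Noreturn where the compiler reports C11 or newer; otherwise
   expand to nothing. The program behaves the same either way; only
   the hint to the compiler is lost. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#  define EXAMPLE_NORETURN _Noreturn
#else
#  define EXAMPLE_NORETURN
#endif

/* Prints a message and aborts; never returns to the caller. */
EXAMPLE_NORETURN static void example_fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    abort();
}

/* Integer division that aborts instead of dividing by zero. */
static int example_checked_div(int a, int b)
{
    if (b == 0) {
        example_fatal("division by zero");
    }
    return a / b;
}
```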


6 participants