Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Py_PACK_VERSION #47

Open
5 of 6 tasks
encukou opened this issue Nov 8, 2024 · 20 comments
Open
5 of 6 tasks

Py_PACK_VERSION #47

encukou opened this issue Nov 8, 2024 · 20 comments

Comments

@encukou
Copy link
Collaborator

encukou commented Nov 8, 2024

Please vote for adding macros/functions for easier version handling, as discussed on Discourse:

  • Py_PACK_FULL_VERSION(x, y, z, level, serial) packs a version number from components
    into the format used by Py_VERSION_HEX and Py_LIMITED_API.
    For example, Py_PACK_FULL_VERSION(3, 14, 0, 0xA, 1) evaluates to 0x030E00A1.

  • Py_PACK_VERSION(x, y) is shorthand for Py_PACK_FULL_VERSION(x, y, 0, 0, 0),
    useful because the first two version components often determine ABI
    compatibility.

  • These are primarily macros, but we will export library functions with the same names and functionality, for use in wrappers for non-C languages – for example, Python with ctypes.
    (The macro-style naming means that we do encourage “serious” wrappers to implement them as compile-time constructs, rather than library calls.)

These should go in the limited API.

See the discussion thread for other considered ideas:

  • Alternate names (much bikeshedding was done, with no clear winner)
  • Leaving out Py_PACK_FULL_VERSION (it's useful to explain this in docs, and sometimes in projects like Cython)
  • Py_VERSION_GE(x, y, z) & Py_VERSION_LE(x, y, z) that would directly compare Py_VERSION_HEX to a given version (but do nothing for Py_LIMITED_API or the macro proposed in PEP 743)
@vstinner
Copy link

vstinner commented Nov 8, 2024

Is it really useful to add Py_PACK_FULL_VERSION(x, y, z, level, serial)? What's the use case?

(it's useful to explain this in docs, and sometimes in projects like Cython)

Cython can use its own macro.

@zooba
Copy link

zooba commented Nov 8, 2024

Is it really useful to add Py_PACK_FULL_VERSION(x, y, z, level, serial)? What's the use case?

I see it more as fully defining our version field. We shouldn't make other people replicate our logic for that if we're going to provide any level of support - I'd argue that Py_PACK_VERSION is the more questionable one (especially if macro arguments were properly optional, which they aren't...)

@serhiy-storchaka
Copy link

serhiy-storchaka commented Nov 8, 2024

I support adding the macros, but I am against adding functions. The main use of such macros is in preprocessor:

#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= Py_PACK_FULL_VERSION(3, 14, 0, 0xA, 1)

And it only works if they are macros.

Wrappers for non-C languages should implement them as native functions. This is not more difficult than create a wrapper.

@zooba
Copy link

zooba commented Nov 8, 2024

Right, but C code is never going to get the function, they'll always get the macro. They can be exported for GetProcAddress and (I assume) dlsym purposes without being in the header files.

It's not trivial for non-C languages to write a function that wraps the macros. That involves adding a C compiler or copy-pasting the logic.

@serhiy-storchaka
Copy link

def Py_PACK_FULL_VERSION(x, y, z, level, serial):
    return (x << 24) | (y << 16) | (z <<  8) | (level <<  4) | (serial << 0)

I am sure that it is easy to write in any language.

@zooba
Copy link

zooba commented Nov 8, 2024

Sure, but it's not easy to infer without being given that information. So either we present it in the docs as "here's how to do this yourself", or we can simply provide a function that does the same job as the macro (which we've agreed elsewhere is a good policy to have for stability).

(There are probably also languages where it's not easy - very strictly typed languages may not let you shift a byte outside of its range without explicit casts/conversions - but this is very much speculative and besides the point.)

@serhiy-storchaka
Copy link

It is already represented in details in the documentation for PY_VERSION_HEX. The Py_PACK_FULL_VERSION should refer to it, or maybe the description should be moved from PY_VERSION_HEX to Py_PACK_FULL_VERSION.

If add function, we should discuss them in details. What types of parameters -- int, int32_t, uint32_t, other? What is the returning type? How to handle errors? What exceptions can be raised?

I do not believe that functions are needed. Macros will be helpful. If authors of wrappers for non-C languages ask us to add functions, then we can add functions that satisfy their needs.

@vstinner
Copy link

vstinner commented Nov 8, 2024

I would prefer to only provide macros. I don't think that it's useful to add functions here.

@encukou
Copy link
Collaborator Author

encukou commented Nov 11, 2024

This goes in the limited API, where, IMO, we should add function equivalents to macros whenever it's possible. There's no pressing reason to omit the functions here.

Wrappers for non-C languages should implement them as native functions. This is not more difficult than create a wrapper.

That's only true if you're writing the wrapper by hand. If you're generating wrappers from headers (which some projects do, and which I'd like to make easier), functions should make it much easier.


As for Py_PACK_FULL_VERSION being easy to write -- yes, this proposal is all about convenience. Writing out bit-shifts isn't hard, but annoying (and the same goes for the tests).

I'd rather add masking (which is no-op in the primary use case of using a macros on literals):

def Py_PACK_FULL_VERSION(x, y, z, level, serial):
    return (
        (x & 0xff) << 24)
        | ((y & 0xff) << 16)
        | ((z & 0xff) <<  8)
        | ((level & 0x0f) << 4)
        | ((serial & 0x0f) << 0)
    )

Thanks for asking about the signatures. My proposal is to match the inputs' domains, except uint8_t for the half-bytes:

  • uint32_t Py_PACK_FULL_VERSION(uint8_t x, uint8_t y, uint8_t z, uint8_t level, uint8_t serial)
  • uint32_t Py_PACK_VERSION(uint8_t x, uint8_t y)

Out-of range level and serial would be masked with & 0x0f.
The function cannot fail; it is defined for all inputs. (This means that if we'd ever need to change their behaviour, we'd need to add new functions instead -- but we'd need that anyway for limited-API macros.)

@zooba
Copy link

zooba commented Nov 11, 2024

Thanks for asking about the signatures. My proposal is to match the inputs' domains, except uint8_t for the half-bytes

Might as well just go int and mask - a good calling convention is going to use registers, and a bad one is going to push machine words onto a stack, so passing bytes is unlikely to save anything at all. Plus if one day the full-year CalVer idea goes through then uint8_t isn't going to be sufficient (but so many copy-pasted macros would be broken that I expect we'd only pretend that Python is using CalVer anyway, rather than actually switching 😉 )

@encukou
Copy link
Collaborator Author

encukou commented Nov 12, 2024

OK, let's go with int input.

@vstinner
Copy link

If you don't want uint8_t, I would prefer unsigned int instead of int.

@serhiy-storchaka
Copy link

Let's split the voting separately for macros and functions.

There is a global variable Py_Version as a runtime version of PY_VERSION_HEX. It has type unsigned long. So, for consistency, new functions should also return unsigned long. Their names should follow common pattern for functions: Py_PackFullVersion() and Py_PackVersion().

@encukou
Copy link
Collaborator Author

encukou commented Nov 12, 2024

Let's split the voting separately for macros and functions.

If we did that, I couldn't express my preference to add both at the same time (following a general guideline discussed in capi-workgroup/api-evolution#18).

There is a global variable Py_Version as a runtime version of PY_VERSION_HEX. It has type unsigned long. So, for consistency, new functions should also return unsigned long.

New API should use the fixed-width types, per discussion here.
I don't see how the difference would matter in practice.

Their names should follow common pattern for functions: Py_PackFullVersion() and Py_PackVersion().

IMO, functions and function-like macros that do the same thing should have the same name.

@encukou
Copy link
Collaborator Author

encukou commented Nov 12, 2024

If you don't want uint8_t, I would prefer unsigned int instead of int.

Why?
Both int types have many values that are invalid uint_8; interpretation of the sign bit doesn't matter.

@vstinner
Copy link

Using unsigned int allows the compiler to detect that you pass an invalid value. For me, it makes no sense to use a negative value to build a version number.

Example:

int f(unsigned int arg) { return 0; }
int main() { return f(-3); }

Warning when using -Wconversion:

$ gcc int.c -o int -Wconversion

int.c: In function 'main':
int.c:2:23: warning: unsigned conversion from 'int' to 'unsigned int' changes value from '-3' to '4294967293' [-Wsign-conversion]
    2 | int main() { return f(-3); }

@zooba
Copy link

zooba commented Nov 12, 2024

There's a level of "someone doesn't know how to use the API" that we can't help any further.

On one hard, we have the argument that everyone should use the macro (which can't really handle error checking like this). On the other, we have the argument that the function version should have excessive amounts of error checking.

This is a simple enough API that I think we can allow garbage-in-garbage-out. No need to overthink it.

@encukou
Copy link
Collaborator Author

encukou commented Nov 13, 2024

Coincidentally: with masking, -1 is a convenient way to spell “maximum”.

@encukou
Copy link
Collaborator Author

encukou commented Nov 18, 2024

The draft guidelines

Avoid static inline functions and macros. All functions must be exported as actual library symbols.
[...]
Once you add API that conforms to this “portable subset”, you can add additional C/C++-specific API. Usually, the additional API will be either more performant, or easier to use from C. For example:

  • A function may be shadowed by a static inline function or macro with the same behavior. [...]

Of course, we can grant an exception if circumstances require it, and we can of course change those guidelines -- they are new and untested. @serhiy-storchaka, do you think we should do one of those?

If not: I see no strong reason to omit adding the functions as library symbols.

@encukou
Copy link
Collaborator Author

encukou commented Nov 25, 2024

@serhiy-storchaka, do you still want to block the proposal to add the functions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants