Support more integer types in PyMemberDef #117031
Comments
Add support for standard C and Posix integer types like Py_T_UINT32, Py_T_PTRDIFF, Py_T_OFF and Py_T_PID. Add Py_T_SSIZE as alias of Py_T_PYSSIZET.
Do we need this in 3.13? |
I like the idea of adding stdint.h types: int8/16/32/64_t, uint8/16/32/64_t, size_t and ssize_t. I'm also fine with intptr_t and uintptr_t since they are the recommended way to store a pointer as an integer in C. I'm less sure about the 128-bit flavor: is it supported by all platforms supported by Python? Currently, we don't use intmax_t. I would prefer to also leave this one aside for now. Also, I'm not sure that we need aliases such as ptrdiff_t. I prefer to not add it yet. For the Unix types "off_t" and "pid_t", can we not add them as PyMemberDef types, but use the macro black magic comparing their SIZEOF to select the corresponding int or uint type with a size, such as int32_t for pid_t? If we add off_t and pid_t, what about uid_t and gid_t? And file descriptors? And socket handles? Etc. System programming uses tons of types. In short, I prefer to start with a bare minimum set of types.
I'm not aware of any type which requires a specific alignment. Usually, it "just" works :-) Endianness: are you aware of an object which requires endianness conversion? Honestly, for "custom" use cases, just use getter and setter functions, no? |
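A minimal sketch of the size-based alias idea from this comment, assuming a POSIX platform. The DEMO_ names and their values are invented for illustration and are not CPython's. In CPython the SIZEOF_PID_T macro comes from pyconfig.h; the fallback value of 4 below is an assumption that the static assertion checks at compile time. Because the preprocessor cannot see signedness, the sketch simply asserts that pid_t is signed, which POSIX requires.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>   /* pid_t (POSIX) */

/* Hypothetical fixed-width member codes; names and values are made up
   for this sketch. */
#define DEMO_T_INT16 101
#define DEMO_T_INT32 102
#define DEMO_T_INT64 103

/* CPython's configure writes SIZEOF_PID_T into pyconfig.h. This standalone
   sketch has no pyconfig.h, so a fallback value is assumed here. */
#ifndef SIZEOF_PID_T
#  define SIZEOF_PID_T 4
#endif

/* Pick the alias purely by size, in the preprocessor. */
#if SIZEOF_PID_T == 2
#  define DEMO_T_PID DEMO_T_INT16
#elif SIZEOF_PID_T == 4
#  define DEMO_T_PID DEMO_T_INT32
#elif SIZEOF_PID_T == 8
#  define DEMO_T_PID DEMO_T_INT64
#else
#  error "unsupported pid_t size"
#endif

/* Fails to compile if the assumed size is wrong on this platform. */
_Static_assert(sizeof(pid_t) == SIZEOF_PID_T, "SIZEOF_PID_T mismatch");

/* Signedness is invisible to #if, so it is checked separately. */
_Static_assert((pid_t)-1 < 0, "pid_t expected to be signed");
```

The same pattern would need one branch per type and a separate signedness decision for types such as off_t or uid_t, which is the configure limitation raised later in the thread.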
I need this to create wrappers of C structures. Several issues are waiting for this feature. How do you specify the custom size/alignment/signedness/endianness? There is no place for this in

    struct PyMemberDef {
        const char *name;
        int type;
        Py_ssize_t offset;
        int flags;
        const char *doc;
    };

You can only encode this as a part of |
Support of

    struct flock {
        ...
        short l_type;    /* Type of lock: F_RDLCK,
                            F_WRLCK, F_UNLCK */
        short l_whence;  /* How to interpret l_start:
                            SEEK_SET, SEEK_CUR, SEEK_END */
        off_t l_start;   /* Starting offset for lock */
        off_t l_len;     /* Number of bytes to lock */
        pid_t l_pid;     /* PID of process blocking our lock
                            (set by F_GETLK and F_OFD_GETLK) */
        ...
    };

(on different platforms the types and the order of fields can be different).

Support of

    struct f_owner_ex {
        int   type;
        pid_t pid;
    };
For me, it would be easier to review if your PR only added intX_t/uintX_t types. Later, we can discuss other types. I understand the need for Unix types such as pid_t. My question is more whether we can implement them as aliases to intX_t or uintX_t:
The problem with this approach is that configure doesn't check whether types are signed or not. I see that your implementation does it partially:

    #ifdef MS_WINDOWS
    #  define Py_T_OFF Py_T_LONGLONG
    #else
    #  define Py_T_OFF 35
    #endif
    #define Py_T_PID 36

and:

    #ifndef MS_WINDOWS
        case Py_T_OFF:
            v = _PyLong_FromByteArray((const unsigned char *)addr,
                                      sizeof(off_t), PY_LITTLE_ENDIAN, 1);
            break;
    #endif
        case Py_T_PID:
            v = PyLong_FromPid(*(pid_t*)addr);
            break;
Let's go with size and signedness only, then. (Ignore alignment/endian, I was too fast to type that.) Let's say that if |
I was thinking about a similar idea when assigning values to the new type codes. It would be nice to encode signedness in the lowest bit, but unfortunately the existing type codes do not follow any system. As for the macro, how do you determine the signedness of the C type? The larger issue is that there are more than two options for signedness. For example, -1 is a special value for Note also that the existing type codes were not consistent with this. This was partially fixed only recently, but there is more work to do, and it will take several releases to finish:
This is why I started with adding type codes for concrete C types. It is easier to specialize code for every concrete type. I am open to the idea of making some of the new type codes aliases of other type codes if they do not need special converters yet, but we should decide what they should use: standard C integer types (short, int, long, etc.) or the new fixed-size C integer types (intXX_t). |
That's fine. There'll always be exceptions, like
More generally, something the C type system doesn't capture is whether negative Python integers can be stored in unsigned fields. Sadly, the current member types don't care about this property. Sometimes that's an error, but sometimes it's useful to allow this -- @zooba likes to give Windows error codes as examples here. My proposed scheme has 7 bits left for details like this. Anyway: IMO, at the very least we should do a systematic approach (with the macro) for the |
What's the use case for that? Existing types don't implement it. I'm not convinced that it's useless. Note: use |
It allows the second part of the proposal: a |
You cannot write a macro which checks the signedness of a type. I tested, the preprocessor cannot test it. Do you have a working implementation (or just a proof-of-concept)? |
You don't need to test this in the preprocessor. The macro only needs to produce a value to put in the memberdef.

    #include <stdio.h>
    #include <stdint.h>

    /* Size in bytes in bits 8 and up; bit 0 is set for unsigned types. */
    #define Py_T_INTEGER(type) ((sizeof(type) << 8) | ((type)-1 < 0 ? 0 : 1))

    int main(void) {
        /* The macro yields a size_t, hence the z length modifier. */
        printf("%04zx\n", Py_T_INTEGER(int8_t));
        printf("%04zx\n", Py_T_INTEGER(uint8_t));
        printf("%04zx\n", Py_T_INTEGER(int16_t));
        printf("%04zx\n", Py_T_INTEGER(uint16_t));
        printf("%04zx\n", Py_T_INTEGER(int32_t));
        printf("%04zx\n", Py_T_INTEGER(uint32_t));
        printf("%04zx\n", Py_T_INTEGER(int64_t));
        printf("%04zx\n", Py_T_INTEGER(uint64_t));
    }
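To illustrate how a getter might consume such a packed code, here is a rough sketch that decodes the size and signedness and reads the field. It is not CPython's implementation: DEMO_T_INTEGER mirrors the Py_T_INTEGER proof-of-concept from the comment above (with a cast to int), and demo_read_member, DEMO_CODE_SIZE, DEMO_CODE_UNSIGNED and DEMO_LOAD are hypothetical helpers. It returns a plain int64_t where a real getter would build a Python int.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same packing as the proof-of-concept: size in bits 8 and up,
   bit 0 set for unsigned types. Hypothetical name. */
#define DEMO_T_INTEGER(type) \
    ((int)((sizeof(type) << 8) | ((type)-1 < 0 ? 0 : 1)))

#define DEMO_CODE_SIZE(code)     ((code) >> 8)
#define DEMO_CODE_UNSIGNED(code) ((code) & 1)

/* Copy a field of C type T out of addr and widen it to int64_t.
   memcpy avoids alignment and aliasing problems. */
#define DEMO_LOAD(T) \
    do { T v; memcpy(&v, addr, sizeof v); *out = (int64_t)v; } while (0)

/* Read the integer member at addr as described by code.
   Returns 0 on success, -1 for an unsupported size.
   Caveat: uint64_t values above INT64_MAX do not fit in int64_t; a real
   getter would create an arbitrary-precision integer instead. */
static int
demo_read_member(const void *addr, int code, int64_t *out)
{
    int is_unsigned = DEMO_CODE_UNSIGNED(code);
    switch (DEMO_CODE_SIZE(code)) {
    case 1:
        if (is_unsigned) { DEMO_LOAD(uint8_t); } else { DEMO_LOAD(int8_t); }
        return 0;
    case 2:
        if (is_unsigned) { DEMO_LOAD(uint16_t); } else { DEMO_LOAD(int16_t); }
        return 0;
    case 4:
        if (is_unsigned) { DEMO_LOAD(uint32_t); } else { DEMO_LOAD(int32_t); }
        return 0;
    case 8:
        if (is_unsigned) { DEMO_LOAD(uint64_t); } else { DEMO_LOAD(int64_t); }
        return 0;
    default:
        return -1;
    }
}
```

With this shape, a member of type pid_t or off_t would need no dedicated case: DEMO_T_INTEGER(pid_t) resolves to whichever size and signedness the platform uses.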
A set of bits that can map into a The "sign" bit needs to be two bits to handle the cases:
(I need to poke on #116053 to figure out how best to handle the extra argument for the second one of these in AsNativeBytes.) An unsigned pid_t that allows assigning -1 would be "allow" for both, which is the case that isn't covered by the basic signed/unsigned definitions. I like Petr's idea of taking the second byte for size and to indicate that the lower byte is flags. |
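One possible reading of the two-bit idea, sketched as a range check. The list of cases did not survive in the text above, so the two flags here are an assumption: one bit permits negative values, the other permits positive values that use the top bit of the field. The names DEMO_ALLOW_NEGATIVE, DEMO_ALLOW_HIGH_BIT and demo_fits are hypothetical, and the sketch is limited to fields of 1 to 4 bytes to keep the arithmetic inside int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for the low byte of a member type code. */
#define DEMO_ALLOW_NEGATIVE 0x1  /* values below zero may be stored */
#define DEMO_ALLOW_HIGH_BIT 0x2  /* positive values may use the top bit */

/* Return 1 if value may be stored in a field of size bytes (1..4)
   under the given flags, 0 otherwise. */
static int
demo_fits(int64_t value, unsigned size, int flags)
{
    if (size < 1 || size > 4) {
        return 0;
    }
    int64_t signed_max = ((int64_t)1 << (8 * size - 1)) - 1;  /* 127 for 1 byte */
    int64_t unsigned_max = ((int64_t)1 << (8 * size)) - 1;    /* 255 for 1 byte */
    if (value < 0) {
        /* Negative values are limited to the two's-complement minimum. */
        return (flags & DEMO_ALLOW_NEGATIVE) && value >= -signed_max - 1;
    }
    return value <= ((flags & DEMO_ALLOW_HIGH_BIT) ? unsigned_max : signed_max);
}
```

Under this reading, a plain signed field sets only the first flag, a plain unsigned field sets only the second, and the "unsigned pid_t that allows assigning -1" case sets both.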
I plan to add more wrappers for Posix structures. Many of them use types not supported in PyMemberDef, like uint32_t or pid_t. So the first step is to add support for many standard C and Posix integer types. gh-114388 and gh-115011 were preparations for this step.
The open question: should we assign new codes for new types, or define them as aliases of existing types if those already exist (for example, Py_T_PID can be an alias of Py_T_INT, Py_T_LONG or Py_T_LONGLONG, depending on platform)?