-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make some PyMutex
functions public
#117511
Comments
What's the private slow path for? I'm a bit puzzled why you'd ever choose that? |
The cpython/Include/internal/pycore_lock.h Lines 70 to 87 in c43f6a4
|
There was a discussion about this late last year on discourse: https://discuss.python.org/t/exposing-atomic-operations-tooling-in-the-public-c-api/39973 I’m now looking at adding support for free-threaded python in NumPy. My current plan is to use PyThreadType_lock or when that is not performant enough, look at using platform-specific synchronization primitives or e.g. C11 threads.h. I think just using PyMutex instead would be better since it’s a simple API and is fast enough for many use cases. I understand there’s a desire to avoid maintaining a suboptimal API forever, but I think CPython providing a migration story for C extension authors would save a lot of time distributed across the community. |
Would it be possible to allocate it on the heap memory, and make the structure members opaque? |
A heap allocated mutex is less useful and harder to use. For example:
I don't think making it opaque and heap allocated provides any concrete advantages in this case. Oftentimes, opaqueness gives us more flexibility in adjusting the implementation, but in this case it restricts the implementation in ways that matter. |
My concern is that tomorrow we might find more efficient mutex and need new more specific mutex (recursive, "adaptative", "error check", etc.), and we might be blocked by the structure which can no longer be modified. There are two aspects: API compatibility (we can change the structure) and ABI compatibility (stable ABI, we cannot change the structure if it's exposed). Maybe we can support the stable ABI by not exposing the structure in the limited C API, and only support heap allocation for the limited C API. The Linux ptread mutex implementation supports heap allocation and static initialization, and the structure is opaque. Extract: #ifdef __x86_64__
# if __WORDSIZE == 64
# define __SIZEOF_PTHREAD_MUTEX_T 40
# else
# define __SIZEOF_PTHREAD_MUTEX_T 32
# endif
#else
# define __SIZEOF_PTHREAD_MUTEX_T 24
#endif
typedef union
{
struct __pthread_mutex_s __data;
char __size[__SIZEOF_PTHREAD_MUTEX_T];
long int __align;
} pthread_mutex_t;
#define PTHREAD_MUTEX_INITIALIZER \
{ { __PTHREAD_MUTEX_INITIALIZER (PTHREAD_MUTEX_TIMED_NP) } }
#ifdef __USE_GNU
# define PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP \
{ { __PTHREAD_MUTEX_INITIALIZER (PTHREAD_MUTEX_RECURSIVE_NP) } }
# define PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP \
{ { __PTHREAD_MUTEX_INITIALIZER (PTHREAD_MUTEX_ERRORCHECK_NP) } }
# define PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP \
{ { __PTHREAD_MUTEX_INITIALIZER (PTHREAD_MUTEX_ADAPTIVE_NP) } }
#endif
// ---- Internal ---------------------
struct __pthread_mutex_s
{
int __lock;
unsigned int __count;
int __owner;
#ifdef __x86_64__
unsigned int __nusers;
#endif
/* KIND must stay at this position in the structure to maintain
binary compatibility with static initializers. */
int __kind;
#ifdef __x86_64__
short __spins;
short __elision;
__pthread_list_t __list;
# define __PTHREAD_MUTEX_HAVE_PREV 1
#else
unsigned int __nusers;
__extension__ union
{
struct
{
short __espins;
short __eelision;
# define __spins __elision_data.__espins
# define __elision __elision_data.__eelision
} __elision_data;
__pthread_slist_t __list;
};
# define __PTHREAD_MUTEX_HAVE_PREV 0
#endif
};
#ifdef __x86_64__
# define __PTHREAD_MUTEX_INITIALIZER(__kind) \
0, 0, 0, 0, __kind, 0, 0, { 0, 0 }
#else
# define __PTHREAD_MUTEX_INITIALIZER(__kind) \
0, 0, 0, __kind, 0, { { 0, 0 } }
#endif I see that pthread_mutex contains way more information: lock, owner, nusers, spins, elision, and are single linked list. Maybe the list is used to solve the hard problem is keeping locks consistent at fork: see issue gh-50970. Do you plan to solve this problem or leave it aside? |
The mutex is already adaptive, both in the sense that it dynamically switches from spinning to blocking and from unfair (barging) to fair (handoff). We should not shoehorn a recursive mutex into Linux's pthread mutex isn't really any more opaque than this 1: note that We already clear the wait entries at fork, which is better than the pthreads implementations. With pthreads, if you unlock a mutex you own after a fork, you might still deadlock due to trying to wake up dead threads. We don't have that problem with the Footnotes
|
This issue is about exposing the existing mutex API, not about designing a new kind of mutex API. Sam, you have a big +1 from me for exposing the existing |
I'm fine with adding proposed API, but not to added it the limited C API. I suggest to rename The limited C API (and stable ABI) would need an alloc/free API similar to PyThread_allocate_lock() / PyThread_free_lock(), hide the structure, and expose PyMutex_Lock() and PyMutex_Unlock() in the limited C API (but still implement them as static inline in the public C API). Pseudo-API for an hypothetical limited C API:
An alternative is to reuse the PyThread API to expose PyMutex there in a stable ABI. |
What about adding an initialization API (albeit trivial)? Right now, this is how I'm initializing a PyMutex, and here how it's currently done in main for dict. Of course it's trivial to set one field to 0, but it sounds like an implementation detail a programmer shouldn't need to worry about. |
No, I think an init routine just adds complexity. There are standard ways to zero (or default-value) initialize a structure in C and C++ that do not require knowledge of the fields:
|
I can't really see how it adds complexity, but that might be my own short sightedness. If not for that, it seems to me more future-proof to have an init routine to call, in case at some point whatever different value is needed at initialization time for the mutex to function. |
It adds API complexity (not computation complexity). It is redundant with standard idioms for zero initialization. It doesn't add real flexibility. If you allow zero-initialization, people will rely on it, so you can't change the representation without breaking backwards compatibility even if you also provide an initialization APIs that does the same thing. You would also need at least two APIs: one for static initialization and one for dynamically allocated objects. |
…ythonGH-117731) (cherry picked from commit 3af7263) Co-authored-by: Sam Gross <[email protected]>
…ythonGH-117731) (cherry picked from commit 3af7263) Co-authored-by: Sam Gross <[email protected]>
Feature or enhancement
Overview
The
PyMutex
APIs are currently internal-only inpycore_lock.h
. This proposes making the type and two functions public in as part of the general, non-limited API inInclude/cpython/lock.h
.The APIs to be included in the public header are:
Motivation
PyMutex
API is easier to use correctly than the existingPyThread_type_lock
. ThePyMutex
APIs release the GIL before blocking, whereas with thePyThread_type_lock
APIs you often want to use a two step approach to locking to release the GIL before blocking.PyMutex
API enables a more efficient implementation: it's faster to acquire when the lock is not contended and does not require allocation.C-API WG issue
PyMutex
public in the non-limited API capi-workgroup/decisions#22cc @ngoldbaum
Linked PRs
The text was updated successfully, but these errors were encountered: