-
-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[subinterpreters] Make the PyGILState API compatible with subinterpreters #59956
Comments
Currently, modules that use the PyGILState* APIs cannot be used with mod_wsgi, as mod_wsgi uses the subinterpreter support. Graham Dumpleton and I spent some time discussing this at PyCon AU 2012, and we believe that the incompatibility can be resolved with a single API in the core: a function that mod_wsgi can call to set the interpreter used by the GIL state APIs to implicitly create new thread states. This is still only a seed of an idea (and it's entirely possible we've missed something), but it's a place to start in resolving this longstanding incompatibility. |
Just to clarify. One can still tell WSGI applications under mod_wsgi to run in the main interpreter and in that case modules using PyGILState* do work. By default though, sub interpreters are used as noted. The mechanism for forcing use of main interpreter is the directive: WSGIApplicationGroup %{GLOBAL} Some details about this issue can be found in: http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API |
How would it work? |
It would twiddle the autoInterpreterState and autoTLSkey entries in the pystate.c global variables to point to a different subinterpreter. As I understand the situation, mod_wsgi doesn't need arbitrary externally created threads to be able to call back into arbitrary subinterpreters, it just needs to be able to direct externally created threads in a process to a subinterpreter other than the main one. Graham, looking at the current impl - have you experimented with just calling _PyGILState_Init() with the interpreter state and current thread state for the desired subinterpreter to see what happens? I think the new method could just be a cleaner combination of _PyGILState_ReInit and _PyGILState_Init. If I'm right, then calling _PyGILState_Init should convert the current crashes and deadlocks into a relatively less harmful memory leak (since the old entry in the TLS won't get deleted properly). |
Graham, even better would be if you could try the following combination: _PyGILState_Fini(); (where si and st are the interpreter state and thread state for the target subinterpreter) If a new PyGILState_SwitchInterpreter API is going to be able to solve this in 3.4, then I believe those private APIs should be enough to make it possible in *current* versions. If those private APIs *aren't* enough, then I'm missing something and this isn't going to be as easy as I thought. |
But how does it know that the right externally created thread will point |
Just as they do today, all externally created threads will still go to *one* interpreter when they hit PyGILState_Ensure(). It's just that interpreter won't be the main one. Since the whole point of the PyGILState API is to support threads that don't have a previously created thread state, there's no getting around the requirement to have a single blessed interpreter that handles all externally created threads in a given process. It will be up to mod_wsgi (and any other embedding application that uses the new function) to make sure it calls this at a time when there aren't any existing calls to PyGILState that would be disrupted. (Assuming we can't figure out a locking scheme that *ensures* no such threads are running when the switch occurs) |
Uh but how does it solve the issue? (unless you create a mod_wsgi app |
My understanding of the mod_wsgi architecture is that it uses subinterpreters to maintain a persistent process, while still providing a relatively pristine interpreter state to handle each new request. This means even when you're using multiple processes with a single request handling thread per process, you're running a subinterpreter unless you explicitly configure mod_wsgi to always run in the main interpreter (which, I believe, will result in additional state persistence across requests). The proposed API change can only fix scenarios where *at a given point in time*, *all* PyGILState_Ensure calls should be directed to a particular subinterpreter. The target subinterpreter may change *later*, but there still cannot be two desired targets within the same process at the same time. However, at the moment, PyGILState doesn't even allow that - all externally created threads are handled in the *main* interpreter even if that isn't what the embedding application wants. |
Sorry, I mischaracterised the way mod_wsgi works slightly. However, my understanding is still that the scope of this particular fix is merely to allow all external threads to be redirected to a different subinterpreter at various times over the life of a process. It does not need to allow different external threads to be redirected to different subinterpreters. (Note: I am assuming that any hooks Apache/mod_wsgi has into external thread creation could already just create an appropriate thread state if that was the desired behaviour. It may be I'm incorrect on this, and what Graham really wants is the ability to change the target interpreter state just for the current thread. However, if that's what he wants, then there's additional background info I need on mod_wsgi and its ability to influence thread creation within a process, because I didn't get the impression on the weekend that that is what he was after) |
I don't think that's true. On hg.python.org, the hglookup application |
s/slightly/completely/ (I believe my description of the core problem was right, but my description of why that problem exists was wrong - it actually has to do with the way mod_wsgi handles virtual hosts and endpoints) If we expose an official way to switch the destination of the PyGILState APIs, then what it means is that mod_wsgi can, over the lifecycle of a single process in the pool, *switch* the virtual host and WSGI endpoint that process is handling by changing the active interpreter. There are still some extension modules where that won't work (because they create persistent threads that periodically call back into Python, so they may still end up calling back in to the wrong interpreter), but it will allow those that just do callbacks from external worker threads (or other operations that are similarly bound by the lifecycle of a request) to start working properly. |
I don't understand what that means. It's the OS which switches threads, (by the way, the real fix to the GILState vs. subinterpreters issue |
Umm, no. The whole point of the GILState API is that you can call it from a thread which knows *nothing* about Python. It will then use the *process global* state in the pystate.c statics to initialise that thread with a Python thread state. Currently, that thread state will always be for the main Python interpreter for the process. All Graham wants is an officially supported way to change which interpreter the pystate.c globals reference *without* shutting down and reinitialising Python completely. There are going to be limitations on how effective this will be - it still won't support *all* extension modules that use the PyGILState APIs. It will, however, support many more of them than the status quo (which is zero, unless you force your WSGI app to use the main interpreter, which has its own problems). And you absolutely can control when the OS switches threads - controlling that is what synchronisation primitives are *for*. |
No to what? Any sane callback API allows to pass it some user data, that
I still don't understand how that allows to "support some extension
I don't think mod_wsgi has access to enough hooks or information to do |
In both embedded mode and daemon mode of mod_wsgi, albeit how thread pool is managed is different, there is a fixed number of threads with those being dedicated to handling web requests. On a request arriving next available thread from the thread pool handles accepting of request at C code level, that thread may then map to any WSGI application and so any sub interpreter, or even the main interpreter. Thus there is no one to one mapping between thread and (sub)interpreter. The way the mod_wsgi code works now is that when it knows it will be calling into the main interpreter, it uses PyGILState_Ensure(). If not, it will use a thread state for that thread specific to the sub interpreter it is calling in to. At the end of the request, the thread state is remembered and not thrown away so that thread locals still work for that thread across requests for that sub interpreter. Thus, there can be more than one thread state per thread, but this is fine so long as it is only used against the sub interpreter it was created for. This is actually an enforced requirement of Python, because if you create more than one thread state for a thread for the same sub interpreter, or even an additional one for the main interpreter when there is also the auto TLS, then Python will die if you compile and run it is in debug mode. Now, since mod_wsgi always knows which interpreter it is calling into, the intent was that there was this single API call so that mod_wsgi could say that at this time, this thread is going to be calling into that interpreter. It could then just call PyGILState_Ensure(). Any third party module then which uses the simplistic calling sequence of calling PyGILState_Release() on exiting Python code and thence within the same thread calling PyGILState_Ensure() when coming back into Python with a callback will work, as mod_wsgi has specified the interpreter context for that thread at that time. As pointed out, if a third party module was creating its own background threads at C level and calling PyGILState_Ensure() when calling back into Python code, this could pose a problem. This could also be an issue for Python created background threads. In the case of the latter, if a Python thread is created in a specific sub interpreter, it should automatically designate for that thread that that is its interpreter context, so if it calls out and does the Release/Ensure dance, that it goes back into the same sub interpreter. The C initiated thread is a bit more complicated though and may not be solvable, but a lot of the main third party modules which don't work in sub interpreters, such as lxml, don't use background threads, so the simplistic approach means that will work at least. So, in summary, saw a single API call which allowed designation of which interpreter a thread is operating against, overriding the implicit default of the main interpreter. PyGILState API will need to manage a set of interpreter states for each interpreter, with potential for more than one thread state for a thread due to a thread being able to call into multiple interpreters at different times. |
Le mardi 21 août 2012 à 22:14 +0000, Graham Dumpleton a écrit :
Why would a module do that, instead of using |
Those macros only work for general GIL releasing and pop straight away, not for the case where released, calls into some non Python C library, which then calls back into Python. My recollection is, and so unless they have changed it, SWIG generated calls use the GILState calls. See: |
I see, so you are right that this new API could be useful. However, we
Well, if SWIG isn't fixed, people should stop using an unmaintained and |
If you have a Ex version of Ensure, then if the interpreter argument is NULL, then it should assume main interpreter. That way the normal version of Ensure can just call PyGILState_EnsureEx(NULL). |
I'm not sure it makes sense to call this new API "PyGILState_EnsureEx". My concern is that the behaviour is quite different in the presence of an existing thread state: Ensure:
New API:
I guess it makes sense if we treat the NULL pointer as the degenerate case meaning "use the interpreter of this thread, or the default interpreter if no interpreter has been declared for this thread". PyGILState_Ensure would then simply call PyGILState_EnsureEx(NULL) internally. So, my question for Graham would be, given this ability, would mod_wsgi still need the ability to change the default interpreter? Or would it be enough for you to be able to register the threads *you* create with a specific interpreter? |
That's not what I'm proposing. What I'm proposing is that the new API So basically: Ensure:
New API:
Graham is merely suggesting for simplification that "global TLS key" == |
And the current "autoTLSkey" could move into the interpreter state object? I like it - that's a lot more flexible than the approach I was thinking of. |
It is past my bed time and not thinking straight, but I believe Antoine is aligned with what I had in mind, as need multiple thread states per OS thread where each is associated with separate interpreter. My main reason for allowing NULL to EnsureEX rather than requiring main_interpreter to be explicitly passed, is that way way back in time, my recollection is that getting access to the main interpreter pointer was a pain as you had to iterate over the list of interpreters and assume it was the last one due to it being created first. I don't remember there being a special global variable or function for getting a pointer to the main interpreter. This may well have changed since and there is an easier way do let me know. So saw it as a convenience. |
The GIL state api was mainly interested in the case of a thread which has (possibly) never been seen before calling into Python. IIUC, the proposal here is so that a thread that *has* been seen before can be associated with a thread-state specified by the embedding application (and the degenerate case would be to assume the thread hasn't been seen, and as such should get the default interpreter) If that isn't too wide of the mark, I agree it sounds workable and worthwhile. |
To clarify, I wrote:
Where I meant to say: Can be associated with an interpreter state and corresponding thread-state ... Or something like that - it's been a while since I've looked at that code. |
Thinking about it, I believe there still needs to be a concept of an "active thread state" TLS key in order to deal with Graham's original problem. Specifically, if PyGILState_EnsureEx is used to associate the thread with a particular interpreter, then subsequent calls to PyGILState_Ensure from *that thread* should get the explicitly associated interpreter, rather than the main interpreter. Then, the lookup sequence for "interpreter=NULL" would be:
A similar approach almost works when requesting a specific interpreter, but where that goes wrong is when the active TLS key is *already set*. You can't just overwrite it, because that will cause problems for subsequent PyGIL_Release calls. You could just make it an error, but I think Graham's original idea makes it possible to do better than that. Specifically, a PyGILState_SwitchInterpreter API could focus solely on the manipulation of the "active thread state" TLS key entry. The sequence of commands in mod_wsgi would then look like: old_interp = PyGILState_SwitchInterpreter(target_interp);
old_gil = PyGILState_Ensure();
/* Call into Python using target_interp */
PyGILState_Release(old_gil);
PyGILState_SwitchInterpreter(old_interp); /* May not be needed in the mod_wsgi case, since it knows it is making the outermost call into the PyGILState_* APIs */ All of the other elements of Antoine's proposal (i.e. the per-interpreter TLS key entries) would still be needed, it's just that the API for associating a thread with an interpreter would remain separated from that of associating the thread with a particular thread state. The big advantage of doing it this way is that it will nest properly, whereas PyGILState_EnsureEx would need a more complicated API to correctly report both the old interpreter state and the old GIL state within that interpreter. |
Le mardi 28 août 2012 à 14:12 +0000, Nick Coghlan a écrit :
Why wouldn't it be simply written: old_gil = PyGILState_EnsureEx(target_interp);
/* Call into Python using target_interp */
PyGILState_Release(old_gil); |
In NumPy ufuncs and datatype casting (iteration) we have the following setup: user-code (releasing GIL) -> NumPy API -> 3rd-party C-function (in the ufunc code, numpy is the one releasing the GIL, although others, such as numba probably hook in and replace that.) One thing which may make this not as problematic is that the all of this can be expected to run within a Python created thread, so I somewhat think the initial proposal here would effectively fix the current NumPy usage (I am not sure). The reason I ask/post this is, that this is all part of public-API which requires a complete re-implementation very soon. While extensions to that new API may be possible, that is always a bit painful, so it would be good to know how that API should be designed right now. At this point, it seems that I should design the 3rd-party C-function API so that it is passed a In the majority of cases, I could already pass this right now, but I have currently no clear idea of what is the best practice. I also need to take such an argument (requiring new API) in at least one place. If we know how this will pan out, adding it sooner rather than later would be good, since it will make future transition/adoption faster or at least easier. As far as I understand, right now PyGILState_Ensure()` not working is probably the single most blocking issue for correct subinterpreter support in NumPy, since this is baked into public API it may take years for true support in the above way, but we may be able to go a large part of the way now. My problem is that I can only do that if I am clear on what is needed. |
See also bpo-40522: Get the current Python interpreter state from Thread Local Storage (autoTSSkey). |
The bpo-46295 was marked as a duplicate of this issue. |
This one still has the 3.10 label attached. Is there any chance to see any improvement scheduled for 3.11? |
The objective of this change is to help make the GILState-related code easier to understand. This mostly involves moving code around and some semantically equivalent refactors. However, there are a also a small number of slight changes in structure and behavior: * tstate_current is moved out of _PyRuntimeState.gilstate * autoTSSkey is moved out of _PyRuntimeState.gilstate * autoTSSkey is initialized earlier * autoTSSkey is re-initialized (after fork) earlier #59956
…te (gh-101209) We've factored out a struct from the two PyThreadState fields. This accomplishes two things: * make it clear that the trashcan-related code doesn't need any other parts of PyThreadState * allows us to use the trashcan mechanism even when there isn't a "current" thread state We still expect the caller to hold the GIL. #59956
A PyThreadState can be in one of many states in its lifecycle, represented by some status value. Those statuses haven't been particularly clear, so we're addressing that here. Specifically: * made the distinct lifecycle statuses clear on PyThreadState * identified expectations of how various lifecycle-related functions relate to status * noted the various places where those expectations don't match the actual behavior At some point we'll need to address the mismatches. (This change also includes some cleanup.) #59956
FYI, I have a PR up that partially fixes the incompatibility: gh-101431. It makes it so the GILState thread state and the "current" thread state (from The PR does not address |
…01308) A PyThreadState can be in one of many states in its lifecycle, represented by some status value. Those statuses haven't been particularly clear, so we're addressing that here. Specifically: * made the distinct lifecycle statuses clear on PyThreadState * identified expectations of how various lifecycle-related functions relate to status * noted the various places where those expectations don't match the actual behavior At some point we'll need to address the mismatches. (This change also includes some cleanup.) python#59956
…ters (gh-101431) The GILState API (PEP 311) implementation from 2003 made the assumption that only one thread state would ever be used for any given OS thread, explicitly disregarding the case of subinterpreters. However, PyThreadState_Swap() still facilitated switching between subinterpreters, meaning the "current" thread state (holding the GIL), and the GILState thread state could end up out of sync, causing problems (including crashes). This change addresses the issue by keeping the two in sync in PyThreadState_Swap(). I verified the fix against gh-99040. Note that the other GILState-subinterpreter incompatibility (with autoInterpreterState) is not resolved here. #59956
It is known that Python3.12 has supported per-interpreter GIL. Considering that running different sub-interpreters in parallel in multiple threads becomes a reality, it is a necessity to make the PyGILState API compatible. Hope this will be fixed later in 3.12.x. ❤ |
Just to be clear, this issue describes two distinct but related concerns:
Presumably you are talking about the bug part.
The bug part should be fixed in 3.12, via gh-1014312 (and mostly tested via gh-101625). There should no longer be any crashes or inconsistent state when using the GILState API with subinterpreters. (The feature part of this issue is still unresolved.) If the PR didn't actually solve the problem then we'll see if we can fix it in 3.12. However, a fix would have to wait until 3.13 if it requires adding new API. Is there a specific case that isn't working for you on 3.12? If so, please provide an example that reproduces the failure.
It shouldn't need any special treatment. Use Footnotes
|
Not speaking for the OP, but if For example, if an extension gives access to a database and allows writing user-defined functions (UDFs) in Python, UDFs would then be always executed in the main interpreter which is not necessarily expected. What would be needed is an additional API such as: PyGILState_STATE_EX PyGILState_EnsureWithInterpreter(PyInterpreterState*);
void PyGILState_ReleaseWithInterpreter(PyGILState_STATE_EX, PyInterpreterState*); ( ... with one open question being: what happens if the given interpreter was destroyed in the meantime? |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
The text was updated successfully, but these errors were encountered: