Improve Interpreter Isolation #100227
Comments
A related analysis of In summary, the following state needs attention:
Moves to
Also note:
old list
The following state needs further investigation and maybe further attention:
|
Hmm, how about introducing a new debug API: Ahh... But it looks like a stable API, so I am not sure it's the proper approach. |
It's OK to deprecate it. It just can't be removed :) |
Before proceeding with these changes, I think we need some up-front design.
Memory allocation
We need some sort of global memory allocator. Is a per-interpreter An alternative design for allocation, that I would like to try, is per-interpreter size-based freelists backed by a global allocator.
Tracemalloc
Tracemalloc data structures should be per-interpreter. Access to global values (like process-wide total memory traced) will need to be done through a function interface.
Efficiency
As more and more of the VM becomes per-interpreter, the performance impact of getting the interpreter with a C-API that was not designed with it in mind becomes larger. We need to decide whether to pass the thread state or interpreter state.
Backwards compatibility
Ownership (runtime or per-interpreter) and constants
Making something Some values are not Nothing should be in the "maybe both" category above. |
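A minimal sketch of the per-interpreter, size-based freelist idea from the comment above, assuming a sized-free interface and using the C library's `malloc`/`free` as a stand-in for the global allocator. All names here (`interp_alloc_state`, `interp_alloc`, `interp_free`, the size-class constants) are hypothetical and are not CPython APIs. Locking inside the global allocator and returning cached blocks when an interpreter is finalized are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_SIZE_CLASSES 8   /* hypothetical: classes of 16, 32, ..., 128 bytes */
#define SIZE_CLASS_SHIFT 4   /* 16-byte granularity */

/* A freed block is reused as a singly linked list node. */
typedef struct free_block {
    struct free_block *next;
} free_block;

/* Hypothetical stand-in for the allocator part of one interpreter's state. */
typedef struct {
    free_block *freelists[NUM_SIZE_CLASSES];
} interp_alloc_state;

/* Map a request size to a size-class index, or -1 if it is too large. */
static int
size_class(size_t nbytes)
{
    if (nbytes == 0) {
        nbytes = 1;
    }
    size_t idx = (nbytes - 1) >> SIZE_CLASS_SHIFT;
    return idx < NUM_SIZE_CLASSES ? (int)idx : -1;
}

/* Allocate from this interpreter's freelist; fall back to the global allocator. */
void *
interp_alloc(interp_alloc_state *st, size_t nbytes)
{
    assert(st != NULL);
    int cls = size_class(nbytes);
    if (cls < 0) {
        return malloc(nbytes);              /* large request: global allocator */
    }
    free_block *blk = st->freelists[cls];
    if (blk != NULL) {
        st->freelists[cls] = blk->next;     /* reuse a cached block, no global call */
        return blk;
    }
    /* Always request the full class size so the block fits any size in the class. */
    return malloc((size_t)(cls + 1) << SIZE_CLASS_SHIFT);
}

/* Return a block to this interpreter's freelist (the caller supplies the size). */
void
interp_free(interp_alloc_state *st, void *ptr, size_t nbytes)
{
    assert(st != NULL);
    if (ptr == NULL) {
        return;
    }
    int cls = size_class(nbytes);
    if (cls < 0) {
        free(ptr);
        return;
    }
    free_block *blk = ptr;
    blk->next = st->freelists[cls];
    st->freelists[cls] = blk;
}
```

In this sketch the common path touches only per-interpreter data, so it needs no cross-interpreter synchronization; only the fallback reaches the shared allocator.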
FYI, I have a plan for
We would update
Py_ssize_t
_Py_GetRefTotal(void)
{
    // _Py_GetLegacyRefTotal() returns the value of the actual `_Py_RefTotal` global.
    Py_ssize_t legacy = _Py_GetLegacyRefTotal();
    _PyRuntimeState.object_state.reftotal += legacy - _PyRuntimeState.object_state.last_legacy_reftotal;
    _PyRuntimeState.object_state.last_legacy_reftotal = legacy;
    return _PyRuntimeState.object_state.reftotal;
}
All this assumes that folks are not using
For a per-interpreter GIL, I see two options:
|
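The delta-folding idea in the `_Py_GetRefTotal()` snippet above can be tried in isolation. The following is a self-contained model, not CPython code: `legacy_reftotal`, `runtime_object_state`, `reftotal_add()` and `get_reftotal()` are hypothetical stand-ins for the legacy global, the runtime state fields, and the accessor. It assumes a single thread, so no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

typedef ptrdiff_t ssize;   /* stand-in for Py_ssize_t */

/* Stand-in for the legacy global that old code may still modify directly. */
ssize legacy_reftotal = 0;

/* Stand-in for the new home of the counter in the runtime state. */
struct object_state {
    ssize reftotal;
    ssize last_legacy_reftotal;
};
struct object_state runtime_object_state = {0, 0};

/* New-style code updates the runtime state instead of the legacy global. */
void
reftotal_add(ssize n)
{
    runtime_object_state.reftotal += n;
}

/* Fold in whatever the legacy global changed by since the last call. */
ssize
get_reftotal(void)
{
    ssize legacy = legacy_reftotal;
    runtime_object_state.reftotal +=
        legacy - runtime_object_state.last_legacy_reftotal;
    runtime_object_state.last_legacy_reftotal = legacy;
    return runtime_object_state.reftotal;
}

/* Usage example: mixes new-style updates with direct writes to the legacy global. */
ssize
reftotal_example(void)
{
    reftotal_add(5);              /* new-style code */
    legacy_reftotal += 3;         /* old code touching the global directly */
    ssize first = get_reftotal();
    ssize second = get_reftotal();
    assert(first == 8);
    assert(second == 8);          /* a second call must not double count */
    legacy_reftotal -= 2;
    return get_reftotal();
}
```

Because only the difference since the previous call is added, direct writes to the legacy global are counted exactly once, regardless of how often the accessor runs.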
FWIW, the 3.10 changes to |
Yes, for any changes that don't make sense without a per-interpreter GIL, we would not merge them before PEP 684 is accepted. All the points you have brought up are worth discussing. Note, though, that I consider the current per-interpreter GIL work to be a minimal approach to getting there, with as little disruption as possible while tackling a variety of improvements that make sense anyway, including improving the isolation of the existing interpreters. Much of what you've brought up are important points, but ones that will require much more discussion, take substantially more effort, and introduce more disruption. I don't think we should hold up the current project for a resolution of those broader matters. That said, in the context of the above checklists, I agree that we must be very conservative in how we bake in any new design/architecture. I just consider isolated interpreters to be an old design (with a less than ideal API and an incomplete implementation) that we are carrying through. Improving the design is a separate project, which I wholly support! |
FWIW, I made a comment suggesting we leave some process-level globals as static variables and not moving them to the interpreter state: More generally, for fields (or variables) that are currently thread-safe due to the GIL we would solve them one of two ways (for a per-interpreter GIL):
Which one we do will depend on the following:
|
@pablogsal, regarding the currently runtime-global state for the perf trampoline: cpython/Include/internal/pycore_ceval_state.h Lines 36 to 46 in 101cfe6
What parts make sense moving to |
@markshannon, in which contexts do any of the version tags (e.g. dict version) make sense as a globally-unique value? Would there be any problem moving them all to |
@pablogsal, would it make sense to move |
I don't think so. These structures are mainly for debugging, so I think it doesn't make sense to worry a lot about making them compatible with multiple interpreters, because you would only use them when debugging the parser. They are not even active in debug mode IIRC, as they are only activated by hand. |
That depends on how we decide that the perf trampoline should behave with multiple interpreters. The easiest thing to do is just activate it globally (which I think makes more sense and is more maintainable). In this case we just need the code index, because code objects are per interpreter IIRC. |
That's fine with me. Would it be okay to restrict FWIW, this is part of a relatively small class of runtime-global state that (conceptually or concretely) manages or utilizes certain process-global resources. We encountered this with the syslog module recently and applied a similar restriction (after some discussion). In fact, I'm growing confident that, in these cases, restricting to the main interpreter is the cleanest solution that does not sacrifice our ability to expand support later or go in a different direction. AFAICS, the main alternative to support a per-interpreter GIL (assuming that is approved), aside from moving to Furthermore, while some currently global state/resources are conceptually better suited as per-interpreter, other such state and resources make the most sense staying global. So far, it has made sense to tie them to the main interpreter in each of the (few) cases we've seen. This aligns with the prevalent idea that the app should be responsible for managing certain global resources, and the "app" in Python is essentially the |
I think it should be activated in all interpreters at the same time. The switch should be global, the same way it is global for all threads. |
|
Having looked through the faulthandler code, it definitely leans on whichever interpreter calls into the module (assuming (We'll address this separately in gh-101509.) |
The same goes for tracemalloc: gh-101520. |
…obal Interned Dict Safe for Isolated Interpreters" (pythongh-103063) This reverts commit 87be8d9. This approach to keeping the interned strings safe is turning out to be too complex for my taste (due to obmalloc isolation). For now I'm going with the simpler solution, making the dict per-interpreter. We can revisit that later if we want a sharing solution.
…ate (pythongh-102339) We can revisit the options for keeping it global later, if desired. For now the approach seems quite complex, so we've gone with the simpler isolation solution in the meantime. python#100227
…Interpreters (pythongh-103084)
Sharing mutable (or non-immortal) objects between interpreters is generally not safe. We can work around that but not easily. There are two restrictions that are critical for objects that break interpreter isolation.
The first is that the object's state be guarded by a global lock. For now the GIL meets this requirement, but a granular global lock is needed once we have a per-interpreter GIL.
The second restriction is that the object (and, for a container, its items) be deallocated/resized only when the interpreter in which it was allocated is the current one. This is because every interpreter has (or will have, see pythongh-101660) its own object allocator. Deallocating an object with a different allocator can cause crashes.
The dict for the cache of module defs is completely internal, which simplifies what we have to do to meet those requirements. To do so, we do the following:
* add a mechanism for re-using a temporary thread state tied to the main interpreter in an arbitrary thread
* add _PyRuntime.imports.extensions.main_tstate
* add _PyThreadState_InitDetached() and _PyThreadState_ClearDetached() (pystate.c)
* add _PyThreadState_BindDetached() and _PyThreadState_UnbindDetached() (pystate.c)
* make sure the cache dict (_PyRuntime.imports.extensions.dict) and its items are all owned by the main interpreter
* add a placeholder for a granular global lock
Note that the cache is only used for legacy extension modules and not for multi-phase init modules.
python#100227
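The first restriction described above (state guarded by a granular global lock rather than the GIL) can be illustrated with a small model. This is only a sketch under stated assumptions: it uses a POSIX mutex and a fixed-size table, and every name in it (`extensions_cache`, `cache_set`, `cache_get`, `CACHE_SLOTS`) is hypothetical rather than taken from CPython. The second restriction, ownership by the main interpreter, is not modeled here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 16   /* hypothetical fixed capacity */

typedef struct {
    const char *name;   /* key; borrowed, must outlive the cache */
    void *def;          /* cached module definition (opaque here) */
} cache_entry;

/* Process-wide cache with its own granular lock. */
static struct {
    pthread_mutex_t lock;
    cache_entry entries[CACHE_SLOTS];
    size_t count;
} extensions_cache = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Insert or replace an entry. Returns 0 on success, -1 if the cache is full. */
int
cache_set(const char *name, void *def)
{
    assert(name != NULL);
    int res = -1;
    pthread_mutex_lock(&extensions_cache.lock);
    size_t i;
    for (i = 0; i < extensions_cache.count; i++) {
        if (strcmp(extensions_cache.entries[i].name, name) == 0) {
            break;
        }
    }
    if (i < CACHE_SLOTS) {
        extensions_cache.entries[i].name = name;
        extensions_cache.entries[i].def = def;
        if (i == extensions_cache.count) {
            extensions_cache.count++;   /* a new entry was appended */
        }
        res = 0;
    }
    pthread_mutex_unlock(&extensions_cache.lock);
    return res;
}

/* Look up an entry. Returns NULL if the name is not cached. */
void *
cache_get(const char *name)
{
    assert(name != NULL);
    void *def = NULL;
    pthread_mutex_lock(&extensions_cache.lock);
    for (size_t i = 0; i < extensions_cache.count; i++) {
        if (strcmp(extensions_cache.entries[i].name, name) == 0) {
            def = extensions_cache.entries[i].def;
            break;
        }
    }
    pthread_mutex_unlock(&extensions_cache.lock);
    return def;
}
```

Every read and write of the shared table happens with the lock held, so callers in different interpreters would not depend on sharing a GIL.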
Decref the key in the right interpreter in _extensions_cache_set(). This is a follow-up to pythongh-103084. I found the bug while working on pythongh-101660.
…it (pythongh-103315) This cleans things up a bit and simplifies adding new granular global locks.
Deep-frozen code objects cannot be shared (currently) by interpreters, due to how adaptive specialization can modify the bytecodes. We work around this by only using the deep-frozen objects in the main interpreter. This does incur a performance penalty for subinterpreters, which we may be able to resolve later.
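A toy model of why in-place specialization and sharing do not mix, and of the workaround of handing the static object only to the main interpreter. This is an illustration, not the CPython implementation: `code_object`, `get_code`, `specialize` and `release_code` are hypothetical names, the byte values are arbitrary placeholders, and the private copy stands in for whatever non-shared loading path a subinterpreter actually takes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char *bytecode;
    size_t size;
    int is_static;   /* 1 for the shared, statically allocated object */
} code_object;

/* Arbitrary placeholder bytes standing in for deep-frozen bytecode. */
static unsigned char frozen_bytes[] = {0x64, 0x00, 0x53, 0x00};
static code_object frozen_code = {frozen_bytes, sizeof(frozen_bytes), 1};

/* The main interpreter gets the static object; others get a private copy. */
code_object *
get_code(int is_main_interpreter)
{
    if (is_main_interpreter) {
        return &frozen_code;
    }
    code_object *co = malloc(sizeof(*co));
    if (co == NULL) {
        return NULL;
    }
    co->bytecode = malloc(frozen_code.size);
    if (co->bytecode == NULL) {
        free(co);
        return NULL;
    }
    memcpy(co->bytecode, frozen_code.bytecode, frozen_code.size);
    co->size = frozen_code.size;
    co->is_static = 0;
    return co;
}

/* Model of adaptive specialization: rewrite an instruction in place. */
void
specialize(code_object *co, size_t offset, unsigned char new_opcode)
{
    assert(co != NULL && offset < co->size);
    co->bytecode[offset] = new_opcode;
}

/* Free a private copy; the static object is never freed. */
void
release_code(code_object *co)
{
    if (co != NULL && !co->is_static) {
        free(co->bytecode);
        free(co);
    }
}
```

The extra allocation and copy for non-main interpreters in this model corresponds loosely to the performance penalty for subinterpreters mentioned above.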
* main: pythongh-100227: Only Use deepfreeze for the Main Interpreter (pythongh-103794) pythongh-103492: Clarify SyntaxWarning with literal comparison (python#103493) pythongh-101100: Fix Sphinx warnings in `argparse` module (python#103289)
* superopt: pythongh-100227: Only Use deepfreeze for the Main Interpreter (pythongh-103794) pythongh-103492: Clarify SyntaxWarning with literal comparison (python#103493) pythongh-101100: Fix Sphinx warnings in `argparse` module (python#103289)
…gh-103460) The lock is unnecessary as long as there's a GIL, but completely necessary with a per-interpreter GIL.
The risk of a race with this state is relatively low, but we play it safe anyway.
…gh-105514) The risk of a race with this state is relatively low, but we play it safe anyway. (cherry picked from commit 7799c8e) Co-authored-by: Eric Snow <[email protected]>
…5514) (gh-105517) The risk of a race with this state is relatively low, but we play it safe anyway. (cherry picked from commit 7799c8e) Co-authored-by: Eric Snow <[email protected]>
…h-105516) The risk of a race with this state is relatively low, but we play it safe anyway. We do avoid using the lock in performance-sensitive cases where the risk of a race is very, very low.
…ate (pythongh-105516) The risk of a race with this state is relatively low, but we play it safe anyway. We do avoid using the lock in performance-sensitive cases where the risk of a race is very, very low. (cherry picked from commit 68dfa49) Co-authored-by: Eric Snow <[email protected]>
…tate (gh-105516) (gh-105532) The risk of a race with this state is relatively low, but we play it safe anyway. We do avoid using the lock in performance-sensitive cases where the risk of a race is very, very low. (cherry picked from commit 68dfa49) Co-authored-by: Eric Snow <[email protected]>
In CPython, we store (persistent) runtime data in several locations. (See https://github.com/ericsnowcurrently/multi-core-python/wiki/3-CPython's-Runtime#cpythons-run-time-data.)
The state of each interpreter (PyInterpreterState) encapsulates the data which is unique to that interpreter and isolated from other interpreters. That isolation is important for the proper operation of multiple interpreters in a process, and will be critical for a per-interpreter GIL, AKA PEP 684 (assuming it gets approved). There are still a number of gaps in isolation which we will address here.
The deficiencies may be categorized by the following:
_PyRuntimeState) conceptually better suited to each interpreter
Linked PRs