bpo-40255: Implement Immortal Instances - Optimization 1 #31488
Closed
Immortalizing Interned Strings
This is an optimization on top of PR19474.
The improvement here uses the assumption that all interned strings are alive during the entire lifecycle of the runtime. Thus, every time that a string is interned, it is automatically immortalized.
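As a rough illustration of the idea, the following is a self-contained toy model, not CPython code. Every name in it (ToyStr, toy_intern_in_place, TOY_IMMORTAL_REFCNT, and so on) is hypothetical, and the sentinel-refcount representation of immortality is an assumption made for the sketch only. It shows a string becoming immortal at the moment it is interned, after which reference counting no longer touches it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel: a refcount at or above this marks an immortal object. */
#define TOY_IMMORTAL_REFCNT (INT64_MAX / 2)
#define TOY_MAX_INTERNED 64

enum toy_state { TOY_NOT_INTERNED = 0, TOY_INTERNED_IMMORTAL = 1 };

typedef struct {
    int64_t refcnt;
    enum toy_state state;
    char *text;
} ToyStr;

/* Toy interned table: a fixed-size array searched linearly. */
static ToyStr *toy_interned[TOY_MAX_INTERNED];
static size_t toy_interned_count = 0;

static int toy_is_immortal(const ToyStr *s) {
    return s->refcnt >= TOY_IMMORTAL_REFCNT;
}

/* Allocate a new mortal string with one reference. Returns NULL on failure. */
ToyStr *toy_new(const char *text) {
    ToyStr *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    size_t n = strlen(text) + 1;
    s->text = malloc(n);
    if (s->text == NULL) { free(s); return NULL; }
    memcpy(s->text, text, n);
    s->refcnt = 1;
    s->state = TOY_NOT_INTERNED;
    return s;
}

/* Reference counting becomes a no-op for immortal objects. */
void toy_incref(ToyStr *s) {
    if (!toy_is_immortal(s)) s->refcnt++;
}

void toy_decref(ToyStr *s) {
    if (toy_is_immortal(s)) return;
    if (--s->refcnt == 0) { free(s->text); free(s); }
}

/* Intern *p in place: either swap in the existing canonical copy,
   or register *p as the canonical copy and immortalize it. */
void toy_intern_in_place(ToyStr **p) {
    ToyStr *s = *p;
    assert(s != NULL);
    if (s->state == TOY_INTERNED_IMMORTAL) return;
    for (size_t i = 0; i < toy_interned_count; i++) {
        if (strcmp(toy_interned[i]->text, s->text) == 0) {
            toy_decref(s);          /* drop the duplicate */
            *p = toy_interned[i];   /* hand back the immortal canonical copy */
            return;
        }
    }
    if (toy_interned_count == TOY_MAX_INTERNED) return; /* table full: stay mortal */
    s->refcnt = TOY_IMMORTAL_REFCNT;
    s->state = TOY_INTERNED_IMMORTAL;
    toy_interned[toy_interned_count++] = s;
}
```

In this model, interning two equal strings leaves both pointers referring to the same object, and later toy_decref calls on it leave its reference count unchanged.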
Benchmark Results
Overall: 1% slower compared to the master branch
pyperformance results
Implementation Details
Any time that PyUnicode_InternInPlace is called, the string is automatically immortalized. This also reuses the string state mechanism by setting the state to SSTATE_INTERNED_IMMORTAL. Given that all interned strings are now immortal, the SSTATE_INTERNED_MORTAL usage has been deprecated.
Interned String Finalization
The current change does not attempt to fully clean up the strings during the runtime shutdown. The main reason is that interned strings include both statically allocated strings (i.e. Py_Identifiers) and dynamically allocated strings (i.e. PyUnicode_New). During the runtime shutdown, it is hard to determine which strings in the interned dictionary are statically allocated and which are dynamically allocated. Blindly calling the deallocation function will end up calling PyMem_RawFree on statically allocated memory, which will crash the program.
This could in theory be fixed by marking the static strings differently from the dynamically allocated strings. If we could distinguish them during the runtime shutdown, we could run the deallocation function on the dynamic strings only. However, all the fields within these instances are already used, so there is no clean way to embed this information. Another option is to use the memory allocation range of _PyRuntime.global_objects.singletons.strings up to sizeof(_Py_global_strings) and treat anything outside of this range as a dynamic string. However, this relies on exposing the internal _Py_global_strings struct. Given there's no clean option, I've decided to leave the interned strings as is and let the OS free the memory after the runtime shuts down.
https://bugs.python.org/issue40255
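The address-range option described under Interned String Finalization can be sketched as follows. This is a standalone toy, not the CPython implementation: toy_global_strings stands in for the internal _Py_global_strings struct, and ToyStaticStr, toy_is_static_string and toy_finalize_string are hypothetical names introduced only for illustration. The sketch frees only those pointers that fall outside the static struct's address range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one statically allocated string. */
typedef struct {
    char text[16];
} ToyStaticStr;

/* Hypothetical stand-in for the struct holding all static strings. */
static struct toy_global_strings_t {
    ToyStaticStr empty;
    ToyStaticStr dunder_init;
    ToyStaticStr dunder_name;
} toy_global_strings = { { "" }, { "__init__" }, { "__name__" } };

/* Return 1 if p points inside the static struct, 0 otherwise.
   Comparing addresses through uintptr_t is implementation-defined in C,
   but behaves as expected on mainstream flat-memory platforms. */
int toy_is_static_string(const void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t start = (uintptr_t)&toy_global_strings;
    uintptr_t end = start + sizeof(toy_global_strings);
    return addr >= start && addr < end;
}

/* Free p only if it was dynamically allocated.
   Returns 1 if the memory was freed, 0 if it was skipped as static. */
int toy_finalize_string(void *p) {
    if (toy_is_static_string(p)) {
        return 0;   /* freeing static memory would crash */
    }
    free(p);
    return 1;
}
```

The drawback noted above is visible here as well: the check only works if the finalization code can see the definition of the static struct.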