Add mimalloc memory allocator #90815
Comments
From https://github.com/microsoft/mimalloc
mimalloc has several interesting properties that make it useful for CPython. Among others, it is fast, thread-safe, and NUMA-aware. It has built-in free lists with multi-sharding and allocation heaps. While Python's obmalloc requires the GIL to protect its data structures, mimalloc relies mostly on thread-local data and atomic instructions (compare-and-swap) for efficiency. Sam Gross' nogil relies on mimalloc's thread safety and uses first-class heaps for heap walking GC.

mimalloc works on the majority of platforms and CPU architectures. However, it requires a compiler with C11 atomics support. CentOS 7's default GCC is slightly too old; a more recent GCC from Developer Toolset is required.

For 3.11 I plan to integrate mimalloc as an optional drop-in replacement for obmalloc. Users will be able to compile CPython without mimalloc or disable mimalloc with the PYTHONMALLOC env var. Since mimalloc will be optional in 3.11, Python won't depend on or expose any of the advanced features yet. The approach enables the community to test and give feedback with minimal risk of breakage.

mimalloc sources will be vendored without any option to use system libraries. Python's mimalloc requires several non-standard compile-time flags. In the future Python may extend or modify mimalloc for heap walking and nogil, too.

(This is a tracking bug until I find time to finish a PEP.)
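To make the compare-and-swap point concrete, here is a minimal sketch of a lock-free free list built on C11 atomics. It is not mimalloc's code: the names `block_t`, `free_list_t`, `free_list_push` and `free_list_take_all` are made up for illustration, and the only assumption is a compiler that ships a working `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical free-list node; not a mimalloc type. */
typedef struct block_s {
    struct block_s *next;
} block_t;

/* Hypothetical list head that several threads may push to. */
typedef struct {
    _Atomic(block_t *) head;
} free_list_t;

/* Push a freed block using a compare-and-swap retry loop (no lock). */
static void free_list_push(free_list_t *list, block_t *block)
{
    assert(list != NULL && block != NULL);
    block_t *old_head = atomic_load_explicit(&list->head, memory_order_relaxed);
    do {
        block->next = old_head;
        /* On failure, old_head is refreshed with the current head. */
    } while (!atomic_compare_exchange_weak_explicit(
                 &list->head, &old_head, block,
                 memory_order_release, memory_order_relaxed));
}

/* Detach the whole list with one atomic exchange and return its old head. */
static block_t *free_list_take_all(free_list_t *list)
{
    assert(list != NULL);
    return atomic_exchange_explicit(&list->head, (block_t *)NULL,
                                    memory_order_acquire);
}
```

A caller would initialise the head with `atomic_init(&list.head, (block_t *)NULL)`, push blocks from any thread, and let the owning thread collect them with `free_list_take_all()`. Taking the whole list at once, rather than popping single nodes, is a deliberate choice in this sketch because it sidesteps the ABA problem of a lock-free pop.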
I am adding Neil to the nosy list since he is one of the people who kicked off this amazing work :)
New features:
Buildbots "PPC64 Fedora PR" and all RHEL 7 build bots provided by David Edelsohn are failing because the compiler is missing support for stdatomic.h.
Thanks, I'm indeed interested. Most credit goes to Christian for advancing this. For the missing stdatomic.h, would it be appropriate to have an autoconf check for it? We can just disable mimalloc if it doesn't exist.
We have an autoconf check for stdatomic.h. The test even verifies that a program with atomic_load_explicit() compiles and links.

How do we want to use mimalloc in the future? Is it going to stay optional in 3.12? Then the default setting for --with-mimalloc should depend on presence of stdatomic.h. Do we want to make it mandatory for GC heap walking and nogil? Then --with-mimalloc should default to "yes" and configure should abort when stdatomic.h is missing.

I'm leaning towards --with-mimalloc=yes. It will make users aware that they need a compiler with atomics:

    configure: error: --with-mimalloc requires stdatomic.h. Update your compiler or rebuild with --without-mimalloc. Python 3.12 will require stdatomic.
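For readers who have not seen such a check, the following is a rough sketch of the kind of code a configure-time probe might compile and link. It is not the actual test from CPython's configure script; the function name `stdatomic_probe` and the two variables are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical probe variables; file scope so they are zero-initialised. */
static atomic_int probe_int_value;
static atomic_uintptr_t probe_ptr_value;

/* Returns 0 when the C11 atomic store/load calls compile, link and work. */
int stdatomic_probe(void)
{
    atomic_store_explicit(&probe_int_value, 42, memory_order_seq_cst);
    atomic_store_explicit(&probe_ptr_value, (uintptr_t)0, memory_order_relaxed);

    int loaded = atomic_load_explicit(&probe_int_value, memory_order_seq_cst);
    uintptr_t ptr = atomic_load_explicit(&probe_ptr_value, memory_order_relaxed);

    return (loaded == 42 && ptr == 0) ? 0 : 1;
}
```

A real probe would wrap this in `main()` and let configure decide based on whether compilation and linking succeed. Linking matters because some toolchains accept the header but need an extra runtime library for atomics that are not lock-free.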
ICC might be a problem. Apparently some versions have an incomplete stdatomic.h, see bpo-37415.
References:
My preference would be for --with-mimalloc=yes in an upcoming release. For platforms without the required stdatomic.h stuff, they can manually specify --with-mimalloc=no. That will make them aware that a future release of Python might no longer build (if mimalloc is no longer optional).

A soft-landing for merging nogil is not a good enough reason to merge mimalloc, IMHO. nogil may never be merged. There should be some concrete and immediate advantage to switch to mimalloc. The idea of using "heap walking" to improve the cyclic GC is not concrete enough. It's just an idea at this point. I think the (small) performance win could be enough of a reason to merge. This seems to be the most recent benchmark: https://gist.github.com/pablogsal/8027937b71cd30f17aaaa5ef7c885d3e

There is also the long-term maintenance issue. So far, mimalloc upstream has been responsive. The mimalloc code is not so huge or complicated that we couldn't maintain it (if for some reason it gets abandoned upstream). However, I think we would prefer to maintain obmalloc rather than mimalloc, all else being equal. Abandonment by the upstream seems fairly unlikely. So, I'm not too concerned about maintenance.
New benchmark:
Benchmark hidden because not significant (9): unpack_sequence, go, raytrace, chameleon, xml_etree_process, fannkuch, sqlite_synth, regex_effbot, unpickle_list
I re-ran the benchmark of d6f5f010b586:
Benchmark hidden because not significant (7): scimark_fft, dulwich_log, python_startup_no_site, regex_effbot, sqlite_synth, nbody, pickle_list
ICC 2021 has full support for stdatomic.h and compiles mimalloc just fine:

    $ CC="icc" ./configure -C --with-pydebug
    $ make
    $ ./python
    Python 3.11.0a5+ (main, Feb 9 2022, 15:57:40) [GCC Intel(R) C++ gcc 7.5 mode] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys._malloc_info
    sys._malloc_info(allocator='mimalloc_debug', with_pymalloc=True, with_mimalloc=True, mimalloc_secure=4, mimalloc_debug=2)

AIX xlc is still a problem. It does not support C11 stdatomic.h, but it comes with the older GCC __sync family of atomic memory access built-ins, https://www.ibm.com/docs/en/xl-c-and-cpp-aix/13.1.3?topic=cbif-gcc-atomic-memory-access-built-in-functions-extension . It might be possible to re-implement mimalloc's atomics with __sync functions (e.g. https://gist.github.com/nhatminhle/5181506). The implementation would be less efficient, though. The __sync functions take no memory-order argument: atomic_load_explicit(v) becomes __sync_fetch_and_add(v, 0), and atomic_store_explicit() requires two full memory barriers.
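The __sync mapping described above could look roughly like the sketch below. The helper names `sync_load`, `sync_store` and `sync_cas` are hypothetical and this is not code from mimalloc or from the linked gist; it assumes a compiler that provides the legacy GCC `__sync` built-ins and a target where an aligned word-sized store is not torn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Load: atomically add zero and return the previous value (full barrier). */
static inline intptr_t sync_load(volatile intptr_t *p)
{
    assert(p != NULL);
    return __sync_fetch_and_add(p, 0);
}

/* Store: a plain write bracketed by two full memory barriers. */
static inline void sync_store(volatile intptr_t *p, intptr_t value)
{
    assert(p != NULL);
    __sync_synchronize();
    *p = value;
    __sync_synchronize();
}

/* Compare-and-swap: nonzero if *p equalled expected and now holds desired. */
static inline int sync_cas(volatile intptr_t *p, intptr_t expected,
                           intptr_t desired)
{
    assert(p != NULL);
    return __sync_bool_compare_and_swap(p, expected, desired);
}
```

Every operation here acts as a full barrier, which is why such a fallback would be slower than C11 atomics with relaxed or acquire/release ordering: even a simple load turns into a read-modify-write.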
FYI, PR #31164 indicates the following 3 blockers:
After a conversation between @tiran, @daanx (the mimalloc creator), and me, we have a good idea on how to move forward. I'm looking into the PGO issue (with some help).
…ythonGH-94790) Fixes the failure of PGO building with `mimalloc` on Windows, ensuring that `test_bpo20891` does not break profiling data (`python31*.pgc`). (cherry picked from commit 4a6bb30) Co-authored-by: neonene <[email protected]>
Fixes the failure of PGO building with `mimalloc` on Windows, ensuring that `test_bpo20891` does not break profiling data (`python31*.pgc`).
Include <unistd.h> to get sbrk() function.
mi_atomic_load_explicit() casts 'p' argument to drop the 'const' qualifier on Windows arm64 platform. Fix the compiler warning: 'function': different 'const' qualifiers (compiling source file ..\Objects\mimalloc\options.c)
I made a similar point wrt the bundling at #109914 (comment).
* Don't include mimalloc .c's in Windows build
* Fix warnings on Windows related to mimalloc
* Add mimalloc v2.12

Modified src/alloc.c to remove include of alloc-override.c and not compile new handler.

Did not include the following files:
- include/mimalloc-new-delete.h
- include/mimalloc-override.h
- src/alloc-override-osx.c
- src/alloc-override.c
- src/static.c
- src/region.c

mimalloc is thread safe and shares a single heap across all runtimes, therefore finalization and getting global allocated blocks across all runtimes is different.

* mimalloc: minimal changes for use in Python:
- remove debug spam for freeing large allocations
- use same bytes (0xDD) for freed allocations in CPython and mimalloc

This is important for the test_capi debug memory tests

* Don't export mimalloc symbol in libpython.
* Enable mimalloc as Python allocator option.
* Add mimalloc MIT license.
* Log mimalloc in Lib/test/pythoninfo.py.
* Document new mimalloc support.
* Use macro defs for exports as done in: python#31164

Co-authored-by: Sam Gross <[email protected]>
Co-authored-by: Christian Heimes <[email protected]>
Co-authored-by: Victor Stinner <[email protected]>
…#111522) Don't declare _PyMem_MimallocEnabled() if WITH_PYMALLOC macro is not defined (./configure --without-pymalloc). Fix also a typo in _PyInterpreterState_FinalizeAllocatedBlocks().
I think this issue could be closed now that mimalloc has already landed in HEAD.
See follow-up issue: