Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in Py_Initialize in non-main thread in free-threading build #123022

Closed
colesbury opened this issue Aug 14, 2024 · 6 comments
Closed

Crash in Py_Initialize in non-main thread in free-threading build #123022

colesbury opened this issue Aug 14, 2024 · 6 comments
Labels
topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@colesbury
Copy link
Contributor

colesbury commented Aug 14, 2024

Py_Initialize will crash if in the free-threaded build if it's not called from the main thread. Additionally, background threads may crash if the program allocates lots of memory in total (>=30 TiB, not necessarily at one time.)

The problem is related to our use of mimalloc in the free-threaded build. We don't set mimalloc's "default heap" for threads, but a few code paths assume that there is a default heap.

Originally posted by @ngoldbaum in #122918 (comment)

I'm seeing similar crashes in my PyO3 dev environment on MacOS. Interestingly, it doesn't happen if I use a debug python 3.13 free-threaded build.

May be completely unrelated to the Windows-specific issues described above.

Here's the traceback from the crash:

running 681 tests
test buffer::tests::test_compatible_size ... ok
test buffer::tests::test_element_type_from_format ... ok
test conversions::std::array::tests::array_try_from_fn ... ok
Process 18512 stopped
* thread #3, name = 'buffer::tests::test_bytes_buffer', stop reason = EXC_BAD_ACCESS (code=2, address=0x1015edb08)
    frame #0: 0x000000010132d048 libpython3.13t.dylib`chacha_block(ctx=0x00000001015edac8) at random.c:67:20 [opt]
   64
   65  	  // add scrambled data to the initial state
   66  	  for (size_t i = 0; i < 16; i++) {
-> 67  	    ctx->output[i] = x[i] + ctx->input[i];
   68  	  }
   69  	  ctx->output_available = 16;
   70
Target 0: (pyo3-71e3a8c0dab6095a) stopped.
warning: libpython3.13t.dylib was compiled with optimization - stepping may behave oddly; variables may not be available.
(lldb) bt
* thread #3, name = 'buffer::tests::test_bytes_buffer', stop reason = EXC_BAD_ACCESS (code=2, address=0x1015edb08)
  * frame #0: 0x000000010132d048 libpython3.13t.dylib`chacha_block(ctx=0x00000001015edac8) at random.c:67:20 [opt]
    frame #1: 0x00000001013218d0 libpython3.13t.dylib`_mi_os_get_aligned_hint [inlined] chacha_next32(ctx=0x00000001015edac8) at random.c:83:5 [opt]
    frame #2: 0x00000001013218bc libpython3.13t.dylib`_mi_os_get_aligned_hint [inlined] _mi_random_next(ctx=0x00000001015edac8) at random.c:149:25 [opt]
    frame #3: 0x00000001013218bc libpython3.13t.dylib`_mi_os_get_aligned_hint [inlined] _mi_heap_random_next(heap=0x00000001015ecf80) at heap.c:258:10 [opt]
    frame #4: 0x00000001013218b8 libpython3.13t.dylib`_mi_os_get_aligned_hint(try_alignment=33554432, size=1073741824) at os.c:118:19 [opt]
    frame #5: 0x000000010132ee50 libpython3.13t.dylib`unix_mmap_prim(addr=0x0000000000000000, size=1073741824, try_alignment=33554432, protect_flags=3, flags=4162, fd=1677721600) at prim.c:190:18 [opt]
    frame #6: 0x0000000101327f68 libpython3.13t.dylib`_mi_prim_alloc [inlined] unix_mmap(addr=0x0000000000000000, size=<unavailable>, try_alignment=<unavailable>, protect_flags=3, large_only=false, allow_large=<unavailable>, is_large=<unavailable>) at prim.c:297:9 [opt]
    frame #7: 0x0000000101327ed4 libpython3.13t.dylib`_mi_prim_alloc(size=1073741824, try_alignment=33554432, commit=<unavailable>, allow_large=<unavailable>, is_large=0x0000000170211bdf, is_zero=<unavailable>, addr=0x0000000170211b78) at prim.c:334:11 [opt]
    frame #8: 0x0000000101321c18 libpython3.13t.dylib`mi_os_prim_alloc(size=1073741824, try_alignment=33554432, commit=true, allow_large=<unavailable>, is_large=<unavailable>, is_zero=<unavailable>, stats=<unavailable>) at os.c:201:13 [opt]
    frame #9: 0x000000010131a648 libpython3.13t.dylib`_mi_os_alloc_aligned [inlined] mi_os_prim_alloc_aligned(size=1073741824, alignment=33554432, commit=true, allow_large=<unavailable>, is_large=0x0000000170211bdf, is_zero=0x0000000170211bde, base=<unavailable>, stats=<unavailable>) at os.c:234:13 [opt]
    frame #10: 0x000000010131a60c libpython3.13t.dylib`_mi_os_alloc_aligned(size=<unavailable>, alignment=33554432, commit=true, allow_large=<unavailable>, memid=0x0000000170211c68, tld_stats=<unavailable>) at os.c:320:13 [opt]
    frame #11: 0x000000010131bb84 libpython3.13t.dylib`mi_reserve_os_memory_ex(size=1073741824, commit=true, allow_large=<unavailable>, exclusive=false, arena_id=0x0000000170211ccc) at arena.c:813:17 [opt]
    frame #12: 0x0000000101319fd0 libpython3.13t.dylib`_mi_arena_alloc_aligned [inlined] mi_arena_reserve(req_size=33554432, allow_large=true, req_arena_id=0, arena_id=0x0000000170211ccc) at arena.c:362:11 [opt]
    frame #13: 0x0000000101319f6c libpython3.13t.dylib`_mi_arena_alloc_aligned(size=33554432, alignment=33554432, align_offset=0, commit=true, allow_large=true, req_arena_id=0, memid=0x0000000170211da8, tld=0x00000001016cc828) at arena.c:383:11 [opt]
    frame #14: 0x000000010132dde8 libpython3.13t.dylib`mi_segment_alloc [inlined] mi_segment_os_alloc(required=0, page_alignment=<unavailable>, eager_delayed=<unavailable>, req_arena_id=0, psegment_slices=<unavailable>, ppre_size=<unavailable>, pinfo_slices=<unavailable>, commit=<unavailable>, tld=0x00000001016cc490, os_tld=<unavailable>) at segment.c:823:42 [opt]
    frame #15: 0x000000010132ddd0 libpython3.13t.dylib`mi_segment_alloc(required=0, page_alignment=<unavailable>, req_arena_id=0, tld=<unavailable>, os_tld=<unavailable>, huge_page=0x0000000000000000) at segment.c:882:27 [opt]
    frame #16: 0x0000000101326a38 libpython3.13t.dylib`mi_segments_page_alloc [inlined] mi_segment_reclaim_or_alloc(heap=0x00000001016cac80, needed_slices=1, block_size=64, tld=0x00000001016cc490, os_tld=0x00000001016cc828) at segment.c:1489:10 [opt]
    frame #17: 0x0000000101326788 libpython3.13t.dylib`mi_segments_page_alloc(heap=0x00000001016cac80, page_kind=<unavailable>, required=64, block_size=64, tld=0x00000001016cc490, os_tld=0x00000001016cc828) at segment.c:1508:9 [opt]
    frame #18: 0x000000010132cc7c libpython3.13t.dylib`mi_page_fresh_alloc(heap=0x00000001016cac80, pq=0x00000001016cb150, block_size=64, page_alignment=<unavailable>) at page.c:284:21 [opt]
    frame #19: 0x0000000101324044 libpython3.13t.dylib`mi_find_page [inlined] mi_page_fresh(heap=0x00000001016cac80, pq=0x00000001016cb150) at page.c:305:21 [opt]
    frame #20: 0x0000000101324030 libpython3.13t.dylib`mi_find_page [inlined] mi_page_queue_find_free_ex(heap=0x00000001016cac80, pq=0x00000001016cb150, first_try=true) at page.c:782:12 [opt]
    frame #21: 0x0000000101323ff0 libpython3.13t.dylib`mi_find_page [inlined] mi_find_free_page(heap=0x00000001016cac80, size=<unavailable>) at page.c:821:10 [opt]
    frame #22: 0x0000000101323e6c libpython3.13t.dylib`mi_find_page(heap=0x00000001016cac80, size=64, huge_alignment=0) at page.c:920:12 [opt]
    frame #23: 0x0000000101316158 libpython3.13t.dylib`_mi_malloc_generic(heap=0x00000001016cac80, size=<unavailable>, zero=<unavailable>, huge_alignment=0) at page.c:946:21 [opt]
    frame #24: 0x0000000101422014 libpython3.13t.dylib`gc_alloc [inlined] _PyObject_MallocWithType(tp=<unavailable>, size=<unavailable>) at pycore_object_alloc.h:46:17 [opt]
    frame #25: 0x0000000101421ff4 libpython3.13t.dylib`gc_alloc(tp=<unavailable>, basicsize=<unavailable>, presize=0) at gc_free_threading.c:1695:17 [opt]
    frame #26: 0x0000000101421ea4 libpython3.13t.dylib`_PyObject_GC_New(tp=0x0000000101657268) at gc_free_threading.c:1716:20 [opt]
    frame #27: 0x00000001012f266c libpython3.13t.dylib`PyDict_New [inlined] new_dict(interp=0x0000000101698e80, keys=<unavailable>, values=0x0000000000000000, used=0, free_values_on_failure=0) at dictobject.c:929:14 [opt]
    frame #28: 0x00000001012f263c libpython3.13t.dylib`PyDict_New at dictobject.c:1026:12 [opt]
    frame #29: 0x0000000101386050 libpython3.13t.dylib`_PyUnicode_InitGlobalObjects [inlined] init_interned_dict(interp=0x0000000101698e80) at unicodeobject.c:284:37 [opt]
    frame #30: 0x000000010138604c libpython3.13t.dylib`_PyUnicode_InitGlobalObjects(interp=0x0000000101698e80) at unicodeobject.c:15016:9 [opt]
    frame #31: 0x00000001014521f0 libpython3.13t.dylib`pycore_interp_init [inlined] pycore_init_global_objects(interp=0x0000000101698e80) at pylifecycle.c:702:14 [opt]
    frame #32: 0x00000001014521dc libpython3.13t.dylib`pycore_interp_init(tstate=0x00000001016c9330) at pylifecycle.c:852:14 [opt]
    frame #33: 0x000000010144f89c libpython3.13t.dylib`Py_InitializeFromConfig [inlined] pyinit_config(runtime=<unavailable>, tstate_p=<unavailable>, config=0x0000000170212130) at pylifecycle.c:933:14 [opt]
    frame #34: 0x000000010144f784 libpython3.13t.dylib`Py_InitializeFromConfig [inlined] pyinit_core(runtime=<unavailable>, src_config=<unavailable>, tstate_p=<unavailable>) at pylifecycle.c:1096:18 [opt]
    frame #35: 0x000000010144f784 libpython3.13t.dylib`Py_InitializeFromConfig(config=0x00000001702123a0) at pylifecycle.c:1398:14 [opt]
    frame #36: 0x000000010144f9b0 libpython3.13t.dylib`Py_InitializeEx(install_sigs=<unavailable>) at pylifecycle.c:1436:14 [opt]
    frame #37: 0x0000000100105b2c pyo3-71e3a8c0dab6095a`pyo3::gil::prepare_freethreaded_python::_$u7b$$u7b$closure$u7d$$u7d$::h2cd40f096e08bff0((null)={closure_env#0} @ 0x00000001702125c7, (null)=0x0000000170212640) at gil.rs:69:13
    frame #38: 0x00000001000654fc pyo3-71e3a8c0dab6095a`std::sync::once::Once::call_once_force::_$u7b$$u7b$closure$u7d$$u7d$::hdab74cb30759684d(p=0x0000000170212640) at once.rs:208:40
    frame #39: 0x0000000100314024 pyo3-71e3a8c0dab6095a`std::sys::sync::once::queue::Once::call::h9dcea807fd617ccf at queue.rs:183:21 [opt]
    frame #40: 0x00000001000652f0 pyo3-71e3a8c0dab6095a`std::sync::once::Once::call_once_force::h6081e1899cf15adf(self=0x00000001004d8ea0, f={closure_env#0} @ 0x000000017021271f) at once.rs:208:9
    frame #41: 0x0000000100000b0c pyo3-71e3a8c0dab6095a`pyo3::gil::prepare_freethreaded_python::hb6ac25f766ef5eb8 at gil.rs:66:5
    frame #42: 0x0000000100000b7c pyo3-71e3a8c0dab6095a`pyo3::gil::GILGuard::acquire::hcdf69afc9ea1299e at gil.rs:174:21
    frame #43: 0x00000001001a1708 pyo3-71e3a8c0dab6095a`pyo3::marker::Python::with_gil::h089ec9aca00e522e(f={closure_env#0} @ 0x000000017021279f) at marker.rs:403:21
    frame #44: 0x0000000100216c2c pyo3-71e3a8c0dab6095a`pyo3::buffer::tests::test_bytes_buffer::h458708c8fd91dd1e at buffer.rs:849:9
    frame #45: 0x0000000100035e40 pyo3-71e3a8c0dab6095a`pyo3::buffer::tests::test_bytes_buffer::_$u7b$$u7b$closure$u7d$$u7d$::h8009c230847e71d5((null)=0x00000001702127fe) at buffer.rs:848:27

Both the debug python build and the optimized, crashing Python are built from source using pyenv.

Linked PRs

@colesbury
Copy link
Contributor Author

The part about this not happening in a debug build makes sense. The crashing code path is disabled in debug builds (where MI_DEBUG=0):

#if (MI_SECURE>0 || MI_DEBUG==0) // security: randomize start of aligned allocations unless in debug mode
uintptr_t r = _mi_heap_random_next(mi_prim_get_default_heap());
init = init + ((MI_SEGMENT_SIZE * ((r>>17) & 0xFFFFF)) % MI_HINT_AREA); // (randomly 20 bits)*4MiB == 0 to 4TiB
#endif

I think this might have to do with re-initializing Python (i.e., Py_Initialize()) multiple times in the same process.

@ngoldbaum
Copy link
Contributor

ngoldbaum commented Aug 14, 2024

This is the relevant code in PyO3 that calls the C API:

https://github.com/PyO3/pyo3/blob/ffeb901db402189f16a8145c94cf024bed773cf6/src/gil.rs#L62-L75

When I run the test binary in a debugger, there's only one call to Py_Initialize before it crashes.

@ZeroIntensity
Copy link
Member

Does this affect all calls to Py_Initialize, or just from PyO3? If it's the former, I would consider marking this as a release blocker, it would be good to get this fixed by the next 3.13 RC. Otherwise, we're limiting embedded projects from early testing.

@ngoldbaum

This comment was marked as outdated.

@lysnikolaou
Copy link
Contributor

But that shouldn't be possible if this is the GIL-disabled build, right, since it's outside of the Py_GIL_DISABLED block?

Line 46 above is not inside an #ifdef Py_GIL_DISABLED. There's an #endif right above, no #else.

@colesbury
Copy link
Contributor Author

The crash occurs when Python is initialized for the first time outside of the main thread. Here's a reproducer:

#include <Python.h>
#include <pthread.h>

void *thread_main(void *unused)
{
    Py_InitializeEx(0); // crashes!
}

int main()
{
    pthread_t thread;
    pthread_create(&thread, NULL, thread_main, NULL);
    pthread_join(thread, NULL);
    return 0;
}

The problem is that mimalloc is calling _mi_heap_random_next(mi_prim_get_default_heap()), but the default heap isn't initialized to a "real" heap (it's still mimalloc's "empty" heap).

This doesn't happen in the main thread because the main thread's default heap is initialized when the process starts.

I think the crash can also occur if the program allocates a lot of memory in total, because the condition is hint == 0 (uninitialized) or hint > MI_HINT_MAX (>30TiB).

uintptr_t hint = mi_atomic_add_acq_rel(&aligned_base, size);
if (hint == 0 || hint > MI_HINT_MAX) { // wrap or initialize

We can fix this by calling mi_thread_init() or ensuring that the default heap is set.

@colesbury colesbury changed the title Crash in Py_Initialize on macOS with PyO3 in free-threading build Crash in Py_Initialize in non-main thread in free-threading build Aug 15, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Aug 15, 2024
Check that the current default heap is initialized in
`_mi_os_get_aligned_hint` and `mi_os_claim_huge_pages`.

The mimalloc function `_mi_os_get_aligned_hint` assumes that there is an
initialized default heap. This is true for our main thread, but not for
background threads. The problematic code path is usually called during
initialization (i.e., `Py_Initialize`), but it may also be called if the
program allocates large amounts of memory in total.

The crash only affected the free-threaded build.
colesbury added a commit that referenced this issue Aug 17, 2024
Check that the current default heap is initialized in
`_mi_os_get_aligned_hint` and `mi_os_claim_huge_pages`.

The mimalloc function `_mi_os_get_aligned_hint` assumes that there is an
initialized default heap. This is true for our main thread, but not for
background threads. The problematic code path is usually called during
initialization (i.e., `Py_Initialize`), but it may also be called if the
program allocates large amounts of memory in total.

The crash only affected the free-threaded build.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 17, 2024
…ythonGH-123052)

Check that the current default heap is initialized in
`_mi_os_get_aligned_hint` and `mi_os_claim_huge_pages`.

The mimalloc function `_mi_os_get_aligned_hint` assumes that there is an
initialized default heap. This is true for our main thread, but not for
background threads. The problematic code path is usually called during
initialization (i.e., `Py_Initialize`), but it may also be called if the
program allocates large amounts of memory in total.

The crash only affected the free-threaded build.
(cherry picked from commit d061ffe)

Co-authored-by: Sam Gross <[email protected]>
colesbury added a commit that referenced this issue Aug 17, 2024
…GH-123052) (#123114)

Check that the current default heap is initialized in
`_mi_os_get_aligned_hint` and `mi_os_claim_huge_pages`.

The mimalloc function `_mi_os_get_aligned_hint` assumes that there is an
initialized default heap. This is true for our main thread, but not for
background threads. The problematic code path is usually called during
initialization (i.e., `Py_Initialize`), but it may also be called if the
program allocates large amounts of memory in total.

The crash only affected the free-threaded build.
(cherry picked from commit d061ffe)

Co-authored-by: Sam Gross <[email protected]>
jeremyhylton pushed a commit to jeremyhylton/cpython that referenced this issue Aug 19, 2024
…ython#123052)

Check that the current default heap is initialized in
`_mi_os_get_aligned_hint` and `mi_os_claim_huge_pages`.

The mimalloc function `_mi_os_get_aligned_hint` assumes that there is an
initialized default heap. This is true for our main thread, but not for
background threads. The problematic code path is usually called during
initialization (i.e., `Py_Initialize`), but it may also be called if the
program allocates large amounts of memory in total.

The crash only affected the free-threaded build.
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
…ython#123052)

Check that the current default heap is initialized in
`_mi_os_get_aligned_hint` and `mi_os_claim_huge_pages`.

The mimalloc function `_mi_os_get_aligned_hint` assumes that there is an
initialized default heap. This is true for our main thread, but not for
background threads. The problematic code path is usually called during
initialization (i.e., `Py_Initialize`), but it may also be called if the
program allocates large amounts of memory in total.

The crash only affected the free-threaded build.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-free-threading type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

5 participants