Python/pystate.c:2218: _PyThreadState_PopFrame: Assertion `tstate->datastack_top >= base' failed. #93252

Closed
The-Compiler opened this issue May 26, 2022 · 50 comments
Labels
3.11 (only security fixes), 3.12 (bugs and security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), release-blocker, type-crash (a hard crash of the interpreter, possibly with a core dump)

Comments

@The-Compiler
Contributor

Crash report

When I run my application (qutebrowser) in a way that spawns and kills a lot of threads, then after around 140 (?) threads have been spawned and killed I get (depending on which commit I'm testing) one of:

Python/pystate.c:2080: _PyThreadState_PopFrame: Assertion `tstate->datastack_top >= locals' failed
Python/pystate.c:2218: _PyThreadState_PopFrame: Assertion `tstate->datastack_top >= base' failed.

Referring to:

assert(tstate->datastack_top >= base);

I've spent hours trying to find a more minimal reproducer. Unfortunately, without some help on what to try next, the best I can offer is this:

  • git clone https://github.com/qutebrowser/qutebrowser.git
  • cd qutebrowser
  • python3.11 -m venv .venv
  • .venv/bin/pip install -r requirements.txt -r misc/requirements/requirements-pyqt.txt
  • Reproducing manually:
    • .venv/bin/python3 -m qutebrowser --temp-basedir -s tabs.last_close blank -s qt.chromium.sandboxing disable-seccomp-bpf
    • Hit d, which closes the current tab
    • Hit u, which reopens it
    • Repeat the above around 47 times
  • Reproducing automatically:
    • Save the code below as crasher.py and make it executable
    • .venv/bin/python3 -m qutebrowser --temp-basedir -s tabs.last_close blank -s qt.chromium.sandboxing disable-seccomp-bpf ":later 1000 spawn -u $PWD/crasher.py"
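#!/usr/bin/python3
"""Repeatedly close and reopen a tab in qutebrowser via its command FIFO."""
import os
import time


def run(cmd):
    # qutebrowser exposes the FIFO path to userscripts through QUTE_FIFO.
    with open(os.environ["QUTE_FIFO"], "w") as f:
        f.write(f"{cmd}\n")


# Each iteration closes the current tab and reopens it, pausing in between.
for _ in range(100):
    run("tab-close")
    time.sleep(0.2)
    run("undo")
    time.sleep(0.2)

run("quit")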

After some time (due to having to skip most commits because of #92112), I was able to bisect this to:

ae0a2b7 ("bpo-44590: Lazily allocate frame objects (GH-27077)", @markshannon)

Error messages

Stacktrace:

#0  0x00007ffff7d1636c in ?? () from /usr/lib/libc.so.6
#1  0x00007ffff7cc6838 in raise () from /usr/lib/libc.so.6
#2  0x00007ffff7cb0535 in abort () from /usr/lib/libc.so.6
#3  0x00007ffff7cb045c in ?? () from /usr/lib/libc.so.6
#4  0x00007ffff7cbf366 in __assert_fail () from /usr/lib/libc.so.6
#5  0x000055555580316e in _PyThreadState_PopFrame (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff7821f70) at Python/pystate.c:2218
#6  0x00005555557aa94e in _PyEvalFrameClearAndPop (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff7821f70) at Python/ceval.c:6442
#7  0x00005555557aa99d in pop_frame (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff7821f70) at Python/ceval.c:1682
#8  0x00005555557b08bc in _PyEval_EvalFrameDefault (tstate=0x555555b31178 <_PyRuntime+166136>, frame=0x7ffff7821f70, throwflag=<optimized out>) at Python/ceval.c:2497
#9  0x00005555557be97d in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff7821de8, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:66
#10 0x00005555557bea8e in _PyEval_Vector (tstate=0x555555b31178 <_PyRuntime+166136>, func=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6468
#11 0x00005555556cfa0f in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:393
#12 0x00005555556cf5e8 in _PyVectorcall_Call (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, func=0x5555556cf9bf <_PyFunction_Vectorcall>, callable=callable@entry=<function at remote 0x7fffe535f750>, 
    tuple=tuple@entry=(<CommandDispatcher(_win_id=0, _tabbed_browser=<TabbedBrowser(widget=<TabWidget() at remote 0x7fffe4fb00f0>, _win_id=0, _tab_insert_idx_left=0, _tab_insert_idx_right=1, is_shutting_down=False, undo_stack=<collections.deque at remote 0x7fffe50b1350>, _filter=<SignalFilter(_win_id=0) at remote 0x7fffe4fb05f0>, _now_focused=<WebEngineTab(is_private=False, win_id=0, tab_id=49, registry=<ObjectRegistry(data={'tab': <...>}, _partial_objs={'tab': <functools.partial at remote 0x7fff85b89320>}, command_only=[]) at remote 0x7fffe52f2a40>, data=<TabData(keep_icon=False, viewing_source=False, inspector=None, open_target=<ClickTarget(_value_=1, _name_='normal', __objclass__=<EnumType(_generate_next_value_=<function at remote 0x7ffff769a780>, __module__='qutebrowser.utils.usertypes', __doc__='How to open a clicked link.', _new_member_=<built-in method __new__ of type object at remote 0x555555a1d800>, _use_args_=False, _member_names_=['normal', 'tab', 'tab_bg', 'window', 'hover'], _member_map_={'normal': <...>, 'tab': <Clic...(truncated), kwargs=kwargs@entry={}) at Objects/call.c:245
#13 0x00005555556cf951 in _PyObject_Call (tstate=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<function at remote 0x7fffe535f750>, 
    args=args@entry=(<CommandDispatcher(_win_id=0, _tabbed_browser=<TabbedBrowser(widget=<TabWidget() at remote 0x7fffe4fb00f0>, _win_id=0, _tab_insert_idx_left=0, _tab_insert_idx_right=1, is_shutting_down=False, undo_stack=<collections.deque at remote 0x7fffe50b1350>, _filter=<SignalFilter(_win_id=0) at remote 0x7fffe4fb05f0>, _now_focused=<WebEngineTab(is_private=False, win_id=0, tab_id=49, registry=<ObjectRegistry(data={'tab': <...>}, _partial_objs={'tab': <functools.partial at remote 0x7fff85b89320>}, command_only=[]) at remote 0x7fffe52f2a40>, data=<TabData(keep_icon=False, viewing_source=False, inspector=None, open_target=<ClickTarget(_value_=1, _name_='normal', __objclass__=<EnumType(_generate_next_value_=<function at remote 0x7ffff769a780>, __module__='qutebrowser.utils.usertypes', __doc__='How to open a clicked link.', _new_member_=<built-in method __new__ of type object at remote 0x555555a1d800>, _use_args_=False, _member_names_=['normal', 'tab', 'tab_bg', 'window', 'hover'], _member_map_={'normal': <...>, 'tab': <Clic...(truncated), kwargs=kwargs@entry={}) at Objects/call.c:328
#14 0x00005555556cf999 in PyObject_Call (callable=callable@entry=<function at remote 0x7fffe535f750>, 
    args=args@entry=(<CommandDispatcher(_win_id=0, _tabbed_browser=<TabbedBrowser(widget=<TabWidget() at remote 0x7fffe4fb00f0>, _win_id=0, _tab_insert_idx_left=0, _tab_insert_idx_right=1, is_shutting_down=False, undo_stack=<collections.deque at remote 0x7fffe50b1350>, _filter=<SignalFilter(_win_id=0) at remote 0x7fffe4fb05f0>, _now_focused=<WebEngineTab(is_private=False, win_id=0, tab_id=49, registry=<ObjectRegistry(data={'tab': <...>}, _partial_objs={'tab': <functools.partial at remote 0x7fff85b89320>}, command_only=[]) at remote 0x7fffe52f2a40>, data=<TabData(keep_icon=False, viewing_source=False, inspector=None, open_target=<ClickTarget(_value_=1, _name_='normal', __objclass__=<EnumType(_generate_next_value_=<function at remote 0x7ffff769a780>, __module__='qutebrowser.utils.usertypes', __doc__='How to open a clicked link.', _new_member_=<built-in method __new__ of type object at remote 0x555555a1d800>, _use_args_=False, _member_names_=['normal', 'tab', 'tab_bg', 'window', 'hover'], _member_map_={'normal': <...>, 'tab': <Clic...(truncated), kwargs=kwargs@entry={}) at Objects/call.c:355
#15 0x00005555557ad14b in do_call_core (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, func=func@entry=<function at remote 0x7fffe535f750>, 
    callargs=callargs@entry=(<CommandDispatcher(_win_id=0, _tabbed_browser=<TabbedBrowser(widget=<TabWidget() at remote 0x7fffe4fb00f0>, _win_id=0, _tab_insert_idx_left=0, _tab_insert_idx_right=1, is_shutting_down=False, undo_stack=<collections.deque at remote 0x7fffe50b1350>, _filter=<SignalFilter(_win_id=0) at remote 0x7fffe4fb05f0>, _now_focused=<WebEngineTab(is_private=False, win_id=0, tab_id=49, registry=<ObjectRegistry(data={'tab': <...>}, _partial_objs={'tab': <functools.partial at remote 0x7fff85b89320>}, command_only=[]) at remote 0x7fffe52f2a40>, data=<TabData(keep_icon=False, viewing_source=False, inspector=None, open_target=<ClickTarget(_value_=1, _name_='normal', __objclass__=<EnumType(_generate_next_value_=<function at remote 0x7ffff769a780>, __module__='qutebrowser.utils.usertypes', __doc__='How to open a clicked link.', _new_member_=<built-in method __new__ of type object at remote 0x555555a1d800>, _use_args_=False, _member_names_=['normal', 'tab', 'tab_bg', 'window', 'hover'], _member_map_={'normal': <...>, 'tab': <Clic...(truncated), kwdict=kwdict@entry={}, use_tracing=0) at Python/ceval.c:7365
#16 0x00005555557bd006 in _PyEval_EvalFrameDefault (tstate=0x555555b31178 <_PyRuntime+166136>, frame=0x7ffff7821d18, throwflag=<optimized out>) at Python/ceval.c:5431
#17 0x00005555557be97d in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff7821bb8, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:66
#18 0x00005555557bea8e in _PyEval_Vector (tstate=0x555555b31178 <_PyRuntime+166136>, func=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6468
#19 0x00005555556cfa0f in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:393
#20 0x00005555556d224c in _PyObject_VectorcallTstate (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<function at remote 0x7fffe61af960>, args=args@entry=0x7fffffffb220, nargsf=nargsf@entry=2, 
    kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:92
#21 0x00005555556d239f in method_vectorcall (method=<optimized out>, args=0x7fffe534d968, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:89
#22 0x00005555556cf5e8 in _PyVectorcall_Call (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, func=0x5555556d22ce <method_vectorcall>, callable=callable@entry=<method at remote 0x7fffe54aad50>, tuple=tuple@entry=('undo',), 
    kwargs=kwargs@entry=0x0) at Objects/call.c:245
#23 0x00005555556cf951 in _PyObject_Call (tstate=0x555555b31178 <_PyRuntime+166136>, callable=<method at remote 0x7fffe54aad50>, args=('undo',), kwargs=0x0) at Objects/call.c:328
#24 0x00005555556cf999 in PyObject_Call (callable=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at Objects/call.c:355
#25 0x00007ffff66b73f0 in PyQtSlot::call(_object*, _object*) const () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#26 0x00007ffff66b7898 in PyQtSlot::invoke(void**, _object*, void*, bool) const () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#27 0x00007ffff66b7a50 in PyQtSlot::invoke(void**, _object*, void*) const () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#28 0x00007ffff66ba3ad in qt_metacall_worker(_sipSimpleWrapper*, _typeobject*, _sipTypeDef*, QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#29 0x00007ffff66ba17c in qt_metacall_worker(_sipSimpleWrapper*, _typeobject*, _sipTypeDef*, QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#30 0x00007ffff669d3cd in sipQObject::qt_metacall(QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#31 0x00007ffff5ed5f97 in void doActivate<false>(QObject*, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#32 0x00007ffff66ba1fe in qt_metacall_worker(_sipSimpleWrapper*, _typeobject*, _sipTypeDef*, QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#33 0x00007ffff66ba17c in qt_metacall_worker(_sipSimpleWrapper*, _typeobject*, _sipTypeDef*, QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#34 0x00007ffff669d3cd in sipQObject::qt_metacall(QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#35 0x00007ffff5ed5f97 in void doActivate<false>(QObject*, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#36 0x00007ffff66b3f6e in pyqtBoundSignal_emit () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#37 0x00005555556db2f0 in method_vectorcall_VARARGS (func=<method_descriptor at remote 0x7ffff73238f0>, args=0x7ffff7821b98, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/descrobject.c:330
#38 0x00005555556cfd83 in _PyObject_VectorcallTstate (tstate=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<method_descriptor at remote 0x7ffff73238f0>, args=args@entry=0x7ffff7821b98, nargsf=9223372036854775810, 
    kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:92
#39 0x00005555556cfe5a in PyObject_Vectorcall (callable=callable@entry=<method_descriptor at remote 0x7ffff73238f0>, args=args@entry=0x7ffff7821b98, nargsf=<optimized out>, kwnames=kwnames@entry=0x0) at Objects/call.c:299
#40 0x00005555557ba3da in _PyEval_EvalFrameDefault (tstate=0x555555b31178 <_PyRuntime+166136>, frame=0x7ffff7821b28, throwflag=<optimized out>) at Python/ceval.c:4826
#41 0x00005555557be97d in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff7821b28, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:66
#42 0x00005555557bea8e in _PyEval_Vector (tstate=0x555555b31178 <_PyRuntime+166136>, func=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6468
#43 0x00005555556cfa0f in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:393
#44 0x00005555556d224c in _PyObject_VectorcallTstate (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<function at remote 0x7fffe5365f40>, args=args@entry=0x7fffffffba38, nargsf=nargsf@entry=1, 
    kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:92
#45 0x00005555556d2467 in method_vectorcall (method=<optimized out>, args=0x555555b16e00 <_PyRuntime+58752>, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:67
#46 0x00005555556cf5e8 in _PyVectorcall_Call (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, func=0x5555556d22ce <method_vectorcall>, callable=callable@entry=<method at remote 0x7fffe4f07ef0>, tuple=tuple@entry=(), 
    kwargs=kwargs@entry=0x0) at Objects/call.c:245
#47 0x00005555556cf951 in _PyObject_Call (tstate=0x555555b31178 <_PyRuntime+166136>, callable=<method at remote 0x7fffe4f07ef0>, args=(), kwargs=0x0) at Objects/call.c:328
#48 0x00005555556cf999 in PyObject_Call (callable=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at Objects/call.c:355
#49 0x00007ffff66b73f0 in PyQtSlot::call(_object*, _object*) const () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#50 0x00007ffff66b7898 in PyQtSlot::invoke(void**, _object*, void*, bool) const () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#51 0x00007ffff66b7a50 in PyQtSlot::invoke(void**, _object*, void*) const () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#52 0x00007ffff66ba3ad in qt_metacall_worker(_sipSimpleWrapper*, _typeobject*, _sipTypeDef*, QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#53 0x00007ffff669d3cd in sipQObject::qt_metacall(QMetaObject::Call, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#54 0x00007ffff5ed5f97 in void doActivate<false>(QObject*, int, void**) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#55 0x00007ffff5ed9f8b in QSocketNotifier::activated(int, QSocketNotifier::QPrivateSignal) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#56 0x00007ffff5eda7c2 in QSocketNotifier::event(QEvent*) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#57 0x00007ffff6684423 in sipQSocketNotifier::event(QEvent*) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#58 0x00007ffff296343c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Widgets.so.5
#59 0x00007ffff2969f20 in QApplication::notify(QObject*, QEvent*) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Widgets.so.5
#60 0x00007ffff346bcd6 in sipQApplication::notify(QObject*, QEvent*) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtWidgets.abi3.so
#61 0x00007ffff5e9d808 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#62 0x00007ffff5ef9d98 in socketNotifierSourceDispatch(_GSource*, int (*)(void*), void*) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#63 0x00007ffff691a163 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#64 0x00007ffff69709e9 in ?? () from /usr/lib/libglib-2.0.so.0
#65 0x00007ffff69176c5 in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#66 0x00007ffff5ef91cc in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#67 0x00007ffff5e9c21a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#68 0x00007ffff5ea51d3 in QCoreApplication::exec() () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#69 0x00007ffff32b65a1 in meth_QApplication_exec () from /home/florian/proj/qutebrowser/git/.venv-thread/lib/python3.11/site-packages/PyQt5/QtWidgets.abi3.so
#70 0x0000555555717472 in cfunction_call (func=<built-in method exec of Application object at remote 0x7fffe523ac10>, args=(), kwargs=0x0) at Objects/methodobject.c:553
#71 0x00005555556cfcc0 in _PyObject_MakeTpCall (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<built-in method exec of Application object at remote 0x7fffe523ac10>, args=args@entry=0x7ffff781e390, 
    nargs=<optimized out>, keywords=keywords@entry=0x0) at Objects/call.c:214
#72 0x00005555556cfe03 in _PyObject_VectorcallTstate (tstate=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<built-in method exec of Application object at remote 0x7fffe523ac10>, args=args@entry=0x7ffff781e390, 
    nargsf=<optimized out>, kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:90
#73 0x00005555556cfe5a in PyObject_Vectorcall (callable=callable@entry=<built-in method exec of Application object at remote 0x7fffe523ac10>, args=args@entry=0x7ffff781e390, nargsf=<optimized out>, kwnames=kwnames@entry=0x0)
    at Objects/call.c:299
#74 0x00005555557ba3da in _PyEval_EvalFrameDefault (tstate=0x555555b31178 <_PyRuntime+166136>, frame=0x7ffff781e338, throwflag=<optimized out>) at Python/ceval.c:4826
#75 0x00005555557be97d in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff781e1b8, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:66
#76 0x00005555557bea8e in _PyEval_Vector (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, func=func@entry=0x7ffff789bd80, 
    locals=locals@entry={'__name__': '__main__', '__doc__': 'Simple launcher for qutebrowser.', '__package__': 'qutebrowser', '__loader__': <SourceFileLoader(name='qutebrowser.__main__', path='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py') at remote 0x7ffff7663f30>, '__spec__': <ModuleSpec(name='qutebrowser.__main__', loader=<...>, origin='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=True, _cached='/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc') at remote 0x7ffff7663da0>, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff78e3050>, '__file__': '/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', '__cached__': '/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc', 'sys': <module at remote 0x7ffff78d3770>, 'qutebrowser': <module at remote 0x7ffff7641190>}, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0) at Python/ceval.c:6468
#77 0x00005555557beb99 in PyEval_EvalCode (co=co@entry=<code at remote 0x7ffff78b9a60>, 
    globals=globals@entry={'__name__': '__main__', '__doc__': 'Simple launcher for qutebrowser.', '__package__': 'qutebrowser', '__loader__': <SourceFileLoader(name='qutebrowser.__main__', path='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py') at remote 0x7ffff7663f30>, '__spec__': <ModuleSpec(name='qutebrowser.__main__', loader=<...>, origin='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=True, _cached='/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc') at remote 0x7ffff7663da0>, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff78e3050>, '__file__': '/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', '__cached__': '/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc', 'sys': <module at remote 0x7ffff78d3770>, 'qutebrowser': <module at remote 0x7ffff7641190>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': 'Simple launcher for qutebrowser.', '__package__': 'qutebrowser', '__loader__': <SourceFileLoader(name='qutebrowser.__main__', path='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py') at remote 0x7ffff7663f30>, '__spec__': <ModuleSpec(name='qutebrowser.__main__', loader=<...>, origin='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=True, _cached='/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc') at remote 0x7ffff7663da0>, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff78e3050>, '__file__': '/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', '__cached__': '/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc', 'sys': <module at remote 0x7ffff78d3770>, 'qutebrowser': <module at remote 0x7ffff7641190>}) at Python/ceval.c:1207
#78 0x00005555557a5d7f in builtin_exec_impl (module=module@entry=<module at remote 0x7ffff78e3050>, source=<code at remote 0x7ffff78b9a60>, 
    globals={'__name__': '__main__', '__doc__': 'Simple launcher for qutebrowser.', '__package__': 'qutebrowser', '__loader__': <SourceFileLoader(name='qutebrowser.__main__', path='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py') at remote 0x7ffff7663f30>, '__spec__': <ModuleSpec(name='qutebrowser.__main__', loader=<...>, origin='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=True, _cached='/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc') at remote 0x7ffff7663da0>, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff78e3050>, '__file__': '/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', '__cached__': '/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc', 'sys': <module at remote 0x7ffff78d3770>, 'qutebrowser': <module at remote 0x7ffff7641190>}, 
    locals={'__name__': '__main__', '__doc__': 'Simple launcher for qutebrowser.', '__package__': 'qutebrowser', '__loader__': <SourceFileLoader(name='qutebrowser.__main__', path='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py') at remote 0x7ffff7663f30>, '__spec__': <ModuleSpec(name='qutebrowser.__main__', loader=<...>, origin='/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=True, _cached='/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc') at remote 0x7ffff7663da0>, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff78e3050>, '__file__': '/home/florian/proj/qutebrowser/git/qutebrowser/__main__.py', '__cached__': '/home/florian/proj/qutebrowser/git/qutebrowser/__pycache__/__main__.cpython-311.pyc', 'sys': <module at remote 0x7ffff78d3770>, 'qutebrowser': <module at remote 0x7ffff7641190>}, closure=0x0) at Python/bltinmodule.c:1075
#79 0x00005555557a5e92 in builtin_exec (module=<module at remote 0x7ffff78e3050>, args=<optimized out>, args@entry=0x7ffff781e180, nargs=nargs@entry=2, kwnames=kwnames@entry=0x0) at Python/clinic/bltinmodule.c.h:465
#80 0x0000555555716d1b in cfunction_vectorcall_FASTCALL_KEYWORDS (func=<built-in method exec of module object at remote 0x7ffff78e3050>, args=0x7ffff781e180, nargsf=<optimized out>, kwnames=0x0) at Objects/methodobject.c:443
#81 0x00005555556cfd83 in _PyObject_VectorcallTstate (tstate=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<built-in method exec of module object at remote 0x7ffff78e3050>, args=args@entry=0x7ffff781e180, 
    nargsf=9223372036854775810, kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:92
#82 0x00005555556cfe5a in PyObject_Vectorcall (callable=callable@entry=<built-in method exec of module object at remote 0x7ffff78e3050>, args=args@entry=0x7ffff781e180, nargsf=<optimized out>, kwnames=kwnames@entry=0x0)
    at Objects/call.c:299
#83 0x00005555557ba3da in _PyEval_EvalFrameDefault (tstate=0x555555b31178 <_PyRuntime+166136>, frame=0x7ffff781e0d8, throwflag=<optimized out>) at Python/ceval.c:4826
#84 0x00005555557be97d in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, frame=frame@entry=0x7ffff781e020, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:66
#85 0x00005555557bea8e in _PyEval_Vector (tstate=0x555555b31178 <_PyRuntime+166136>, func=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6468
#86 0x00005555556cfa0f in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:393
#87 0x00005555556cf5e8 in _PyVectorcall_Call (tstate=tstate@entry=0x555555b31178 <_PyRuntime+166136>, func=0x5555556cf9bf <_PyFunction_Vectorcall>, callable=callable@entry=<function at remote 0x7ffff7646ba0>, 
    tuple=tuple@entry=('qutebrowser', True), kwargs=kwargs@entry=0x0) at Objects/call.c:245
#88 0x00005555556cf951 in _PyObject_Call (tstate=0x555555b31178 <_PyRuntime+166136>, callable=callable@entry=<function at remote 0x7ffff7646ba0>, args=args@entry=('qutebrowser', True), kwargs=kwargs@entry=0x0) at Objects/call.c:328
#89 0x00005555556cf999 in PyObject_Call (callable=callable@entry=<function at remote 0x7ffff7646ba0>, args=args@entry=('qutebrowser', True), kwargs=kwargs@entry=0x0) at Objects/call.c:355
#90 0x0000555555823bf4 in pymain_run_module (modname=<optimized out>, set_argv0=set_argv0@entry=1) at Modules/main.c:300
#91 0x00005555558246d5 in pymain_run_python (exitcode=exitcode@entry=0x7fffffffc9f4) at Modules/main.c:595
#92 0x0000555555824964 in Py_RunMain () at Modules/main.c:680
#93 0x00005555558249de in pymain_main (args=args@entry=0x7fffffffca50) at Modules/main.c:710
#94 0x0000555555824aad in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:734
#95 0x0000555555643732 in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15

Debugging:

(gdb) p base
$1 = (PyObject **) 0x7ffff7821f70
(gdb) p tstate->datastack_top
$2 = (PyObject **) 0x7fffe5872090
(gdb) pp tstate tstate.datastack_top
tstate = 
   autoderefcount="1",[
      prev = <PyThreadState*> = {"0x0"}
      next = <PyThreadState*> = {"0x0"}
      interp = autoderefcount="1",<PyInterpreterState> = {"{...}"}
      _initialized = <int> = {"1"}
      _static = <int> = {"1"}
      recursion_remaining = <int> = {"985"}
      recursion_limit = <int> = {"1000"}
      recursion_headroom = <int> = {"0"}
      tracing = <int> = {"0"}
      tracing_what = <int> = {"0"}
      cframe = autoderefcount="1",<_PyCFrame> = {"{...}"}
      c_profilefunc = <Py_tracefunc> = {"0x0"}
      c_tracefunc = <Py_tracefunc> = {"0x0"}
      c_profileobj = <PyObject*> = {"0x0"}
      c_traceobj = <PyObject*> = {"0x0"}
      curexc_type = <PyObject*> = {"0x0"}
      curexc_value = <PyObject*> = {"0x0"}
      curexc_traceback = <PyObject*> = {"0x0"}
      exc_info = autoderefcount="1",<_PyErr_StackItem> = {"{...}"}
      dict = autoderefcount="1",<PyObject> = {"{...}"}
      gilstate_counter = <int> = {"4"}
      async_exc = <PyObject*> = {"0x0"}
      thread_id = <long unsigned int> = {"140737350489920"}
      native_thread_id = <long unsigned int> = {"2990635"}
      trash_delete_nesting = <int> = {"0"}
      trash_delete_later = <PyObject*> = {"0x0"}
      on_delete = <void (void *)*> = {"0x55555587c8e6 <release_sentinel>"}
      on_delete_data = <void*> = {"0x7ffff18dc750"}
      coroutine_origin_tracking_depth = <int> = {"0"}
      async_gen_firstiter = <PyObject*> = {"0x0"}
      async_gen_finalizer = <PyObject*> = {"0x0"}
      context = <PyObject*> = {"0x0"}
      context_ver = <uint64_t> = {"1"}
      id = <uint64_t> = {"1"}
      trace_info = <PyTraceInfo> = {"{...}"}
      datastack_chunk = autoderefcount="1",<_PyStackChunk> = {"{...}"}
      datastack_top = autoderefcount="2",[
         [class] = "<class 'function'>"
         [super class] = "<class 'object'>"
         [meta type] = "<class 'type'>"
         ob_refcnt = <Py_ssize_t> = {"2"}
         ob_type = autoderefcount="1",<PyTypeObject> = {"{...}"}
      ],<PyObject> = {"{...}"}
      datastack_limit = <PyObject*> = {"0x8fc02fc04"}
      exc_state = <_PyErr_StackItem> = {"{...}"}
      root_cframe = <_PyCFrame> = {"{...}"}
   ],<PyThreadState> = {"{...}"}

Happy to try more or report some debugging information, but I'm afraid I'm stuck at this point.

Your environment

  • CPython versions tested on: 3.11.0b1
  • Operating system and architecture: Arch Linux, x86_64
@The-Compiler The-Compiler added the type-crash A hard crash of the interpreter, possibly with a core dump label May 26, 2022
@markshannon
Member

Spawning and killing lots of threads does suggest a race condition.

Do you have the values of datastack_top and datastack_limit, not what they point to?
Could you also get the contents of the stack chunk when the assert fails?
p *tstate->datastack_chunk
I don't need the data in it, just the previous, size and top fields.

The question I want to answer is this: Has one of the internal pointers exceeded its bounds, or is it pointing to the wrong piece of memory?

Thanks

@AA-Turner AA-Turner added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.11 only security fixes labels May 26, 2022
@The-Compiler
Contributor Author

What's strange is that this always seems to happen at around the same time. I'm not entirely sure it's always the same number of repetitions, but it certainly always seems to be 50 ± 5 so far (spawning 3 threads each, from what I can see in gdb). Might well always be 47 times, it's somewhat hard to count since the process sending the commands runs independently.

#6  0x0000555555800261 in _PyThreadState_PopFrame (tstate=tstate@entry=0x555555b2deb8 <_PyRuntime+166136>, frame=frame@entry=0x7ffff7821f70) at Python/pystate.c:2218
2218	       assert(tstate->datastack_top >= base);
(gdb) p tstate->datastack_top
$1 = (PyObject **) 0x7fffe4c06090
(gdb) p tstate->datastack_limit
$2 = (PyObject **) 0x7fffe4c0a000
(gdb) p *tstate->datastack_chunk
$3 = {previous = 0x7ffff781e000, size = 16384, top = 0, data = {<function at remote 0x7fffe4f58e10>}}

Also, info threads if that's of any value:

(gdb) info threads
  Id   Target Id                                             Frame 
* 1    Thread 0x7ffff7c85740 (LWP 3097731) "python3"         __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
  2    Thread 0x7fffe45ff640 (LWP 3097779) "QXcbEventQueue"  0x00007ffff7d8dfaf in __GI___poll (fds=0x7fffe45feca8, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
  3    Thread 0x7fffe1a0b640 (LWP 3097780) "python3:cs0"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555556718910) at futex-internal.c:57
  4    Thread 0x7fffe120a640 (LWP 3097781) "python3:disk$0"  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555556711058) at futex-internal.c:57
  5    Thread 0x7fffe0a09640 (LWP 3097782) "python3:sh0"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219ac) at futex-internal.c:57
  6    Thread 0x7fffd3a20640 (LWP 3097783) "python3:sh1"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  7    Thread 0x7fffd321f640 (LWP 3097784) "python3:sh2"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219ac) at futex-internal.c:57
  8    Thread 0x7fffd2a1e640 (LWP 3097785) "python3:sh3"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  9    Thread 0x7fffd221d640 (LWP 3097786) "python3:sh4"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  10   Thread 0x7fffd1a1c640 (LWP 3097787) "python3:sh5"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  11   Thread 0x7fffd121b640 (LWP 3097788) "python3:sh6"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  12   Thread 0x7fffd0a1a640 (LWP 3097789) "python3:sh7"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  13   Thread 0x7fffabfff640 (LWP 3097790) "python3:sh8"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  14   Thread 0x7fffab7fe640 (LWP 3097791) "python3:sh9"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  15   Thread 0x7fffaaffd640 (LWP 3097792) "python3:sh10"    __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219a8) at futex-internal.c:57
  16   Thread 0x7fffaa7fc640 (LWP 3097793) "python3:sh11"    __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567219ac) at futex-internal.c:57
  17   Thread 0x7fffa9ffb640 (LWP 3097794) "python3:shlo0"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555556721f18) at futex-internal.c:57
  18   Thread 0x7fffa97fa640 (LWP 3097795) "python3:shlo1"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555556721f18) at futex-internal.c:57
  19   Thread 0x7fffa8ff9640 (LWP 3097796) "python3:shlo2"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555556721f18) at futex-internal.c:57
  20   Thread 0x7fff8bfff640 (LWP 3097797) "python3:shlo3"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555556721f18) at futex-internal.c:57
  21   Thread 0x7fff8b7fe640 (LWP 3097798) "python3:shlo4"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555556721f18) at futex-internal.c:57
  22   Thread 0x7fff8affd640 (LWP 3097799) "python3:gdrv0"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555567e6220) at futex-internal.c:57
  23   Thread 0x7fff7bfff640 (LWP 3097800) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff7bffed00, op=137, expected=0, futex_word=0x555556891080) at futex-internal.c:57
  24   Thread 0x7fff7b7fe640 (LWP 3097801) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff7b7fdd00, op=137, expected=0, futex_word=0x555556891494) at futex-internal.c:57
  25   Thread 0x7fff7affd640 (LWP 3097802) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff7affcd00, op=137, expected=0, futex_word=0x555556890e54) at futex-internal.c:57
  26   Thread 0x7fff7a7fc640 (LWP 3097803) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff7a7fbd00, op=137, expected=0, futex_word=0x55555689f3b4) at futex-internal.c:57
  27   Thread 0x7fff79ffb640 (LWP 3097804) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff79ffad00, op=137, expected=0, futex_word=0x5555568a0414) at futex-internal.c:57
  28   Thread 0x7fff797fa640 (LWP 3097805) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff797f9d00, op=137, expected=0, futex_word=0x5555568960b4) at futex-internal.c:57
  29   Thread 0x7fff78ff9640 (LWP 3097806) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff78ff8d00, op=137, expected=0, futex_word=0x5555568964c4) at futex-internal.c:57
  30   Thread 0x7fff5ffff640 (LWP 3097807) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff5fffed00, op=137, expected=0, futex_word=0x555556896980) at futex-internal.c:57
  31   Thread 0x7fff5f7fe640 (LWP 3097808) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff5f7fdd00, op=137, expected=0, futex_word=0x555556896e60) at futex-internal.c:57
  32   Thread 0x7fff5effd640 (LWP 3097809) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff5effcd00, op=137, expected=0, futex_word=0x555556897340) at futex-internal.c:57
  33   Thread 0x7fff5e7fc640 (LWP 3097810) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff5e7fbd00, op=137, expected=0, futex_word=0x555556897824) at futex-internal.c:57
  34   Thread 0x7fff5dffb640 (LWP 3097811) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff5dffad00, op=137, expected=0, futex_word=0x55555689be80) at futex-internal.c:57
  35   Thread 0x7fff5d7fa640 (LWP 3097812) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff5d7f9d00, op=137, expected=0, futex_word=0x55555689c364) at futex-internal.c:57
  36   Thread 0x7fff5cff9640 (LWP 3097813) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff5cff8d00, op=137, expected=0, futex_word=0x55555689c840) at futex-internal.c:57
  37   Thread 0x7fff3bfff640 (LWP 3097814) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff3bffed00, op=137, expected=0, futex_word=0x55555689cd24) at futex-internal.c:57
  38   Thread 0x7fff3b7fe640 (LWP 3097815) "Thread (pooled)" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff3b7fdd00, op=137, expected=0, futex_word=0x55555689d204) at futex-internal.c:57
  40   Thread 0x7fff3affd640 (LWP 3097817) "sandbox_ipc_thr" 0x00007ffff7d8dfaf in __GI___poll (fds=0x7fff3affcc40, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
  41   Thread 0x7fff39df7640 (LWP 3097823) "python3"         0x00007ffff7d64f9f in __GI___wait4 (pid=3097820, stat_loc=0x7fff39df6d9c, options=0, usage=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:30
  42   Thread 0x7fff395f6640 (LWP 3097824) "ThreadPoolServi" 0x00007ffff7d99f3e in epoll_wait (epfd=46, events=0x7fff1c0024c0, maxevents=32, timeout=39322) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  43   Thread 0x7fff38df5640 (LWP 3097825) "ThreadPoolForeg" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff38df4b20, op=137, expected=0, futex_word=0x7fff38df4cb8) at futex-internal.c:57
  44   Thread 0x7fff1bfff640 (LWP 3097826) "Chrome_IOThread" 0x00007ffff7d99f3e in epoll_wait (epfd=52, events=0x7fff14002480, maxevents=32, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  45   Thread 0x7fff1b0f5640 (LWP 3097827) "ThreadPoolForeg" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff1b0f4b20, op=137, expected=0, futex_word=0x7fff1b0f4cb8) at futex-internal.c:57
  46   Thread 0x7fff1a8f4640 (LWP 3097828) "ThreadPoolForeg" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff1a8f3b20, op=137, expected=0, futex_word=0x7fff1a8f3cb8) at futex-internal.c:57
  47   Thread 0x7fff1a0f3640 (LWP 3097829) "ThreadPoolForeg" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff1a0f2b20, op=137, expected=0, futex_word=0x7fff1a0f2cb8) at futex-internal.c:57
  48   Thread 0x7fff198f2640 (LWP 3097830) "inotify_reader"  0x00007ffff7d907ec in __GI___select (nfds=58, readfds=0x7fff198f1d70, writefds=0x0, exceptfds=0x0, timeout=0x0) at ../sysdeps/unix/sysv/linux/select.c:69
  49   Thread 0x7fff190f1640 (LWP 3097831) "ThreadPoolForeg" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff190f0b20, op=137, expected=0, futex_word=0x7fff190f0cb8) at futex-internal.c:57
  50   Thread 0x7fff188f0640 (LWP 3097832) "CompositorTileW" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55555697aa98) at futex-internal.c:57
  51   Thread 0x7fff13fff640 (LWP 3097833) "Chrome_InProcGp" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff13ffea20, op=137, expected=0, futex_word=0x7fff13ffebb8) at futex-internal.c:57
  52   Thread 0x7fff137fe640 (LWP 3097834) "Chrome_ChildIOT" 0x00007ffff7d99f3e in epoll_wait (epfd=62, events=0x7ffef00024c0, maxevents=32, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  53   Thread 0x7fff12ffd640 (LWP 3097835) "VideoCaptureThr" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fff12ffcb98) at futex-internal.c:57
  56   Thread 0x7fff127fc640 (LWP 3097838) "VizCompositorTh" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x7fff127fba20, op=137, expected=0, futex_word=0x7fff127fbbb8) at futex-internal.c:57
  57   Thread 0x7fff11ffb640 (LWP 3097839) "python3:gdrv0"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7ffef8051680) at futex-internal.c:57
  58   Thread 0x7fff117fa640 (LWP 3097840) "Qt bearer threa" 0x00007ffff7d8dfaf in __GI___poll (fds=0x7ffedc0052d0, nfds=1, timeout=9991) at ../sysdeps/unix/sysv/linux/poll.c:29
  59   Thread 0x7fff10ff9640 (LWP 3097841) "QDBusConnection" 0x00007ffff7d8dfaf in __GI___poll (fds=0x7ffee004c000, nfds=4, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
  60   Thread 0x7ffedbbff640 (LWP 3097846) "ThreadPoolSingl" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7ffedbbfec98) at futex-internal.c:57
  61   Thread 0x7ffedb3fe640 (LWP 3097848) "CacheThread_Blo" 0x00007ffff7d99f3e in epoll_wait (epfd=82, events=0x7ffecc0024c0, maxevents=32, timeout=30000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
  63   Thread 0x7ffeda3fc640 (LWP 3097863) "python3:gdrv0"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55555772edd0) at futex-internal.c:57
  64   Thread 0x7ffed9bfb640 (LWP 3097867) "MemoryInfra"     __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7ffed9bfab98) at futex-internal.c:57
  65   Thread 0x7ffed93fa640 (LWP 3097873) "python3:gdrv0"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555557911dc0) at futex-internal.c:57
  66   Thread 0x7ffed8bf9640 (LWP 3097874) "python3:gdrv0"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7ffef8160394) at futex-internal.c:57
  68   Thread 0x7ffebb7fe640 (LWP 3097876) "ThreadPoolSingl" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7ffebb7fdc98) at futex-internal.c:57
  361  Thread 0x7ffedabfd640 (LWP 3099598) "python3:gdrv0"   __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x5555575d9240) at futex-internal.c:57

Thanks for taking a look despite the lack of a simpler reproducer! Let me know if you need more information.

@markshannon
Member

Sorry, I didn't make myself clear. Could I get all the addresses, so I can see how they are inconsistent?
Could you include the above plus base and tstate->datastack_chunk (the address as well as the contents)?
Could you also include *frame? If the debugger doesn't have frame, then p *(_PyInterpreterFrame *)base

datastack_top should point into the chunk, and datastack_limit should point to the top of the chunk.
The following should be true: datastack_chunk < base < datastack_top < datastack_limit, but it isn't 😞
Having the contents of frame might help as we can then see which parts of the stack it is consistent with.

Thanks for helping.

@The-Compiler
Contributor Author

The-Compiler commented May 26, 2022

Sure:

Name                     Type               Value
tstate->datastack_chunk  (_PyStackChunk *)  0x7fffe4c06000
base                     (PyObject **)      0x7ffff7821f10
tstate->datastack_top    (PyObject **)      0x7fffe4c06098
tstate->datastack_limit  (PyObject **)      0x7fffe4c0a000
>>> datastack_chunk = 0x7fffe4c06000
>>> base = 0x7ffff7821f10
>>> datastack_top = 0x7fffe4c06098
>>> datastack_limit = 0x7fffe4c0a000
>>> 
>>> datastack_chunk < base
True
>>> base < datastack_top
False
>>> datastack_top < datastack_limit
True
(gdb) p *tstate->datastack_chunk
$12 = {previous = 0x7ffff781e000, size = 16384, top = 0, data = {<function at remote 0x7fffe4f58520>}}
(gdb) p *frame
$8 = {f_func = 0x7fffe59ab960, 
  f_globals = {'__name__': 'qutebrowser.browser.browsertab', '__doc__': 'Base class for a wrapper over QWebView/QWebEngineView.', '__package__': 'qutebrowser.browser', '__loader__': <SourceFileLoader(name='qutebrowser.browser.browsertab', path='/home/florian/proj/qutebrowser/git/qutebrowser/browser/browsertab.py') at remote 0x7fffe61e16e0>, '__spec__': <ModuleSpec(name='qutebrowser.browser.browsertab', loader=<...>, origin='/home/florian/proj/qutebrowser/git/qutebrowser/browser/browsertab.py', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=True, _cached='/home/florian/proj/qutebrowser/git/qutebrowser/browser/__pycache__/browsertab.cpython-311.pyc', _initializing=False) at remote 0x7fffe61e1730>, '__file__': '/home/florian/proj/qutebrowser/git/qutebrowser/browser/browsertab.py', '__cached__': '/home/florian/proj/qutebrowser/git/qutebrowser/browser/__pycache__/browsertab.cpython-311.pyc', '__builtins__': {'__name__': 'builtins', '__doc__': "Built-in functions, exceptions, and ...(truncated), 
  f_builtins = {'__name__': 'builtins', '__doc__': "Built-in functions, exceptions, and other objects.\n\nNoteworthy: None is the `nil' object; Ellipsis represents `...' in slices.", '__package__': '', '__loader__': <type at remote 0x555555bbaad0>, '__spec__': <ModuleSpec(name='builtins', loader=<type at remote 0x555555bbaad0>, origin='built-in', loader_state=None, submodule_search_locations=None, _uninitialized_submodules=[], _set_fileattr=False, _cached=None) at remote 0x7ffff78e5d20>, '__build_class__': <built-in method __build_class__ of module object at remote 0x7ffff78e31d0>, '__import__': <built-in method __import__ of module object at remote 0x7ffff78e31d0>, 'abs': <built-in method abs of module object at remote 0x7ffff78e31d0>, 'all': <built-in method all of module object at remote 0x7ffff78e31d0>, 'any': <built-in method any of module object at remote 0x7ffff78e31d0>, 'ascii': <built-in method ascii of module object at remote 0x7ffff78e31d0>, 'bin': <built-in method bin of module object at remote 0x7ffff78e31d0>, ...(truncated), f_locals = 0x0, f_code = 0x7fffe654a040, frame_obj = 0x0, previous = 0x7ffff7821e88, prev_instr = 0x7fffe654a21e, stacktop = 2, 
  is_entry = false, owner = 0 '\000', localsplus = {
    <WebEngineTab(is_private=False, win_id=0, tab_id=52, registry=<ObjectRegistry(data={'tab': <...>}, _partial_objs={'tab': <functools.partial at remote 0x7fffa1f5c8a0>}, command_only=[]) at remote 0x7fffe508b670>, data=<TabData(keep_icon=False, viewing_source=False, inspector=None, open_target=<ClickTarget(_value_=1, _name_='normal', __objclass__=<EnumType(_generate_next_value_=<function at remote 0x7ffff7676570>, __module__='qutebrowser.utils.usertypes', __doc__='How to open a clicked link.', _new_member_=<built-in method __new__ of type object at remote 0x555555a1a800>, _use_args_=False, _member_names_=['normal', 'tab', 'tab_bg', 'window', 'hover'], _member_map_={'normal': <...>, 'tab': <ClickTarget(_value_=2, _name_='tab', __objclass__=<...>, _sort_order_=1) at remote 0x7fffe6d48d80>, 'tab_bg': <ClickTarget(_value_=3, _name_='tab_bg', __objclass__=<...>, _sort_order_=2) at remote 0x7fffe6d48e20>, 'window': <ClickTarget(_value_=4, _name_='window', __objclass__=<...>, _sort_order_=3) at remote 0x7fffe6d48ec0>,...(truncated)}}
(gdb) pp *frame
[Thread 0x7fff1a0f3640 (LWP 3108203) exited]
[Thread 0x7fff1a8f4640 (LWP 3108202) exited]
[Thread 0x7fff5effd640 (LWP 3108183) exited]
[Thread 0x7fff5e7fc640 (LWP 3108184) exited]
[Thread 0x7fff5cff9640 (LWP 3108187) exited]
[Thread 0x7fff7bfff640 (LWP 3108174) exited]
[Thread 0x7fff79ffb640 (LWP 3108178) exited]
[Thread 0x7fff7b7fe640 (LWP 3108175) exited]
[Thread 0x7fff7affd640 (LWP 3108176) exited]
[Thread 0x7fff78ff9640 (LWP 3108180) exited]
[Thread 0x7fff3bfff640 (LWP 3108188) exited]
[Thread 0x7fff7a7fc640 (LWP 3108177) exited]
[Thread 0x7fff5f7fe640 (LWP 3108182) exited]
[Thread 0x7fff5d7fa640 (LWP 3108186) exited]
[Thread 0x7fff5ffff640 (LWP 3108181) exited]
[Thread 0x7fff3b7fe640 (LWP 3108189) exited]
[Thread 0x7fff5dffb640 (LWP 3108185) exited]
[Thread 0x7fff797fa640 (LWP 3108179) exited]
*frame = 
   [
      f_func = autoderefcount="1",<PyFunctionObject> = {"{...}"}
      f_globals = autoderefcount="1",<PyObject> = {"{...}"}
      f_builtins = autoderefcount="1",<PyObject> = {"{...}"}
      f_locals = <PyObject*> = {"0x0"}
      f_code = autoderefcount="1",<PyCodeObject> = {"{...}"}
      frame_obj = <PyFrameObject*> = {"0x0"}
      previous = autoderefcount="1",<struct _PyInterpreterFrame> = {"{...}"}
      prev_instr = autoderefcount="1",<_Py_CODEUNIT> = {"83"}
      stacktop = <int> = {"2"}
      is_entry = <_Bool> = {"0"}
      owner = <char> = {"0"}
      localsplus = <PyObject*[1]> = {"@0x7ffff7821f58"}
   ],<_PyInterpreterFrame> = {"{...}"}
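
As a rough cross-check of the addresses above (a sketch added for illustration, not something run as part of the original report), the pointers can be mapped onto the two stack chunks gdb printed. It assumes the previous chunk has the same 16384-byte size as the current one, which the dump does not confirm; the helper names are hypothetical.

```python
# Addresses copied from the gdb output above.
CHUNK_SIZE = 16384               # `size` field of *tstate->datastack_chunk

current_chunk = 0x7fffe4c06000   # tstate->datastack_chunk
previous_chunk = 0x7ffff781e000  # tstate->datastack_chunk->previous
base = 0x7ffff7821f10            # address of the frame being popped
datastack_top = 0x7fffe4c06098
datastack_limit = 0x7fffe4c0a000


def in_chunk(addr, chunk_start, size=CHUNK_SIZE):
    """Return True if addr lies within [chunk_start, chunk_start + size)."""
    return chunk_start <= addr < chunk_start + size


def locate(addr):
    """Name the chunk that contains addr (labels are hypothetical)."""
    if in_chunk(addr, current_chunk):
        return "current"
    if in_chunk(addr, previous_chunk):
        # Assumption: the previous chunk is also CHUNK_SIZE bytes long.
        return "previous"
    return "neither"


print("base is in the", locate(base), "chunk")
print("datastack_top is in the", locate(datastack_top), "chunk")
print("limit matches end of current chunk:",
      current_chunk + CHUNK_SIZE == datastack_limit)
```

Under that assumption, base falls inside the previous chunk while datastack_top and datastack_limit belong to the current one, which would explain why the comparison base < datastack_top fails even though each pointer looks plausible on its own.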

@markshannon
Member

It looks like the frame and thread's frame stack are both consistent, but that the frame belongs to another threadstate.

Could you check whether the tstate's frame is the same as the frame?
tstate->cframe->current_frame == frame should be true.
Also, tstate should be the correct thread. If you up until you're in _PyEval_EvalFrameDefault, does tstate->cframe == &cframe?

@The-Compiler
Contributor Author

(same run as before, keeping it open as long as I don't need to reboot)

Could you check whether the tstate's frame is the same as the frame?
tstate->cframe->current_frame == frame should be true.

It is:

(gdb) p frame
$21 = (_PyInterpreterFrame *) 0x7ffff7821f10
(gdb) p tstate->cframe->current_frame
$22 = (struct _PyInterpreterFrame *) 0x7ffff7821f10

Also, tstate should be the correct thread. If you up until you're in _PyEval_EvalFrameDefault, does tstate->cframe == &cframe?

Yep!

(gdb) p tstate->cframe
$24 = (_PyCFrame *) 0x7fffffffb7d0
(gdb) p &cframe
$25 = (_PyCFrame *) 0x7fffffffb7d0

@The-Compiler
Contributor Author

@markshannon You (or someone who knows their way around this code more than I do) don't happen to be at PyConIT by any chance (or perhaps Europython, though that's still a bit farther away)? Would be happy to go for a debugging session there if that's an option, might be easier, given that the issue seems rather tricky to track down.

@markshannon
Member

I will be at EuroPython. Hopefully we will have fixed this before then

@The-Compiler
Contributor Author

Unfortunately I had to cancel my Europython attendance thanks to COVID, so I won't be available for any in-person debugging there 😢.

If you'd be interested in a video call or something to debug this together on my machine, I'd be in for that, though. Or I can just continue to provide information here if you tell me what to look at - I have no idea what the involved code even does exactly, so I'm afraid I probably won't be able to find out anything interesting alone.

@kumaraditya303
Contributor

Have you tested the current main branch? Have you tried tuning the gc thresholds to see if it makes it easier to trigger?

@The-Compiler
Contributor Author

The main branch fails in the same way:

python3: Python/pystate.c:2202: _PyThreadState_PopFrame: Assertion `tstate->datastack_top >= base' failed.

I don't think it's related to the gc in any way. I tried a gc.disable() and it still crashes in the same way.
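
For anyone wanting to repeat the two GC experiments mentioned here (very aggressive collection versus none at all), a minimal sketch using only the standard gc module could look like the following. The gc_settings helper and the workload function are hypothetical stand-ins; the actual code under test would be the qutebrowser run or a reproducer.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_settings(enabled, threshold=None):
    """Temporarily enable/disable the cyclic GC and optionally set thresholds."""
    was_enabled = gc.isenabled()
    old_threshold = gc.get_threshold()
    try:
        if threshold is not None:
            gc.set_threshold(*threshold)
        if enabled:
            gc.enable()
        else:
            gc.disable()
        yield
    finally:
        # Restore whatever was configured before entering the block.
        gc.set_threshold(*old_threshold)
        if was_enabled:
            gc.enable()
        else:
            gc.disable()


def workload():
    """Hypothetical stand-in for the code that triggers the crash."""
    return [object() for _ in range(1000)]


# Very low thresholds make automatic collections run far more often,
# which can make GC-related bugs easier to trigger.
with gc_settings(enabled=True, threshold=(1, 1, 1)):
    workload()

# No automatic collection at all, as in the gc.disable() experiment.
with gc_settings(enabled=False):
    workload()
```

If the failure shows up at the same point under both settings, that is at least consistent with the garbage collector not being involved.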

@The-Compiler
Contributor Author

@pablogsal should this perhaps be a deferred/release blocker? I realize I'm currently the only reporter seeing this, but it also happens with a Python 3.11 release build and is a crash happening during normal user usage - thus making qutebrowser unusable for daily usage on Python 3.11.

@pablogsal
Member

Without a reproducer involving only CPython, we unfortunately can't tell whether this comes from one of the dependencies, an extension module, or something similar.

If you or someone else can provide a simpler reproducer only using CPython code, that would help a lot to estimate the severity of this.

@The-Compiler
Contributor Author

Makes sense. Unfortunately I have no idea where to even start with this (other than the bisecting/debugging above). All I can gather so far is that it's something about Python's internal frame objects and perhaps multiple threads...

@pablogsal
Member

I understand, and thanks a lot for reporting this and for the effort bisecting it. Sadly, it is very difficult for me as Release Manager to block on this if we cannot even reproduce it ourselves easily: even if there is enough suspicion that something is amiss, the release would be blocked indefinitely, as we don't even know where the problem could be.

Another possibility is to wait until other simpler code triggers the same kind of error.

@pablogsal
Member

Based on the reproducer, it may be that something in the thread state management is not protected by the GIL and is causing thread states to be mixed up or re-used. But this is just a wild guess.

@The-Compiler
Contributor Author

The good news: I arrived at a much simpler reproducer after lots of tinkering. The bad news: It still requires either PyQt5 or PyQt6 to reproduce. Here it is:

from PyQt5.QtCore import QObject

def run():
    obj = QObject()
    for _ in range(202):
        obj.destroyed.connect(lambda: None)

run()

Stacktrace:

#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007ffff7c8e3d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007ffff7c3e838 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007ffff7c28535 in __GI_abort () at abort.c:79
#4  0x00007ffff7c2845c in __assert_fail_base (fmt=0x7ffff7dbfe70 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55555592532a "tstate->datastack_top >= base", file=0x555555924fc6 "Python/pystate.c", line=2201, 
    function=<optimized out>) at assert.c:92
#5  0x00007ffff7c37366 in __GI___assert_fail (assertion=assertion@entry=0x55555592532a "tstate->datastack_top >= base", file=file@entry=0x555555924fc6 "Python/pystate.c", line=line@entry=2201, 
    function=function@entry=0x555555925360 <__PRETTY_FUNCTION__.0> "_PyThreadState_PopFrame") at assert.c:101
#6  0x0000555555803027 in _PyThreadState_PopFrame (tstate=tstate@entry=0x555555b368e8 <_PyRuntime+166312>, frame=frame@entry=0x7ffff7e6c078) at Python/pystate.c:2201
#7  0x00005555557aa14a in _PyEvalFrameClearAndPop (tstate=tstate@entry=0x555555b368e8 <_PyRuntime+166312>, frame=frame@entry=0x7ffff7e6c078) at Python/ceval.c:6391
#8  0x00005555557aa199 in pop_frame (tstate=tstate@entry=0x555555b368e8 <_PyRuntime+166312>, frame=frame@entry=0x7ffff7e6c078) at Python/ceval.c:1629
#9  0x00005555557b022e in _PyEval_EvalFrameDefault (tstate=0x555555b368e8 <_PyRuntime+166312>, frame=0x7ffff7e6c078, throwflag=<optimized out>) at Python/ceval.c:2447
#10 0x00005555557be6e4 in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b368e8 <_PyRuntime+166312>, frame=frame@entry=0x7ffff7e6c020, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:73
#11 0x00005555557be7e5 in _PyEval_Vector (tstate=tstate@entry=0x555555b368e8 <_PyRuntime+166312>, func=func@entry=0x7ffff76224c0, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0) at Python/ceval.c:6417
#12 0x00005555557be8fa in PyEval_EvalCode (co=co@entry=<code at remote 0x7ffff770fa40>, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}) at Python/ceval.c:1154
#13 0x0000555555803100 in run_eval_code_obj (tstate=tstate@entry=0x555555b368e8 <_PyRuntime+166312>, co=co@entry=0x7ffff770fa40, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}) at Python/pythonrun.c:1714
#14 0x00005555558031bd in run_mod (mod=mod@entry=0x555555c30630, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/repro.py', 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}, flags=flags@entry=0x7fffffffc948, arena=arena@entry=0x7ffff7683100) at Python/pythonrun.c:1735
#15 0x0000555555803285 in pyrun_file (fp=fp@entry=0x555555b7f4f0, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/repro.py', start=start@entry=257, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/repro.py') at remote 0x7ffff7635af0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/repro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cdf450>, 'run': <function at remote 0x7ffff768db20>}, closeit=closeit@entry=1, flags=0x7fffffffc948) at Python/pythonrun.c:1630
#16 0x000055555580604f in _PyRun_SimpleFileObject (fp=fp@entry=0x555555b7f4f0, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/repro.py', closeit=closeit@entry=1, flags=flags@entry=0x7fffffffc948)
    at Python/pythonrun.c:440
#17 0x0000555555806203 in _PyRun_AnyFileObject (fp=fp@entry=0x555555b7f4f0, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/repro.py', closeit=closeit@entry=1, flags=flags@entry=0x7fffffffc948)
    at Python/pythonrun.c:79
#18 0x0000555555823b76 in pymain_run_file_obj (program_name=program_name@entry='/home/florian/tmp/python-crash/qutebrowser/.venv/bin/python', filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/repro.py', 
    skip_source_first_line=0) at Modules/main.c:360
#19 0x0000555555823c94 in pymain_run_file (config=config@entry=0x555555b1c940 <_PyRuntime+59904>) at Modules/main.c:379
#20 0x0000555555824403 in pymain_run_python (exitcode=exitcode@entry=0x7fffffffcaa4) at Modules/main.c:601
#21 0x0000555555824658 in Py_RunMain () at Modules/main.c:680
#22 0x00005555558246d2 in pymain_main (args=args@entry=0x7fffffffcb00) at Modules/main.c:710
#23 0x00005555558247a1 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:734
#24 0x0000555555644732 in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15

Some observations:

  • There is no multithreading involved. info threads only shows * 1 Thread 0x7ffff7e91740 (LWP 368123) "python" __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44.
  • When checking out CPython at ae0a2b7~1, the crash is gone (as expected).
  • When making obj global rather than wrapping it in a function, the crash is gone.
  • With 201 iterations instead of 202, the crash is consistently gone, and with 202 iterations, it's consistently there.
  • The crash happens with both PyQt5 and PyQt6
  • The crash does not happen with PySide2 (different Python bindings for the same C++ source - replace PyQt5 by PySide2 and exec() by exec_())

@pablogsal
Member

pablogsal commented Jul 8, 2022

I'm marking this as a release blocker because ae0a2b7 should not affect C extensions and that's suspicious.

Note that I'm still not sure this is not just some latent bug in PyQt that was just triggered by these changes, but it would be good to double check as this could be a more general issue.

@pablogsal
Member

Btw thanks a lot @The-Compiler for working on a simpler reproducer!

@The-Compiler
Contributor Author

Much appreciated, thanks! I've been trying to figure out more, and there's one more thing that caught my eye: When adjusting the reproducer to print the Python stack:

from PyQt5.QtCore import QObject
import traceback

def fun():
    traceback.print_stack()
    print("========")

def run():
    obj = QObject()
    for _ in range(169):
        obj.destroyed.connect(fun)

run()

then:

  • only 169 connections to the signal are required (i.e. fun gets called 169 times); again, 168 connections reliably work with no issues
  • the Python stack trace looks different from how it looks with Python 3.10, and I believe it doesn't make any sense

From what I understand, fun gets called from C++ when the object is destroyed - in other words, somewhere when obj gets garbage collected. (I've tried to experiment with pure Python objects and __del__, with no success so far). With Python 3.10, that's roughly what the stacktrace shows:

  File "/home/florian/tmp/python-crash/qutebrowser/repro.py", line 13, in <module>
    run()
  File "/home/florian/tmp/python-crash/qutebrowser/repro.py", line 5, in fun
    traceback.print_stack()

but with Python 3.11, the stacktrace seems to point to the for loop:

  File "/home/florian/tmp/python-crash/qutebrowser/repro.py", line 13, in <module>
    run()
  File "/home/florian/tmp/python-crash/qutebrowser/repro.py", line 10, in run
    for _ in range(169):
  File "/home/florian/tmp/python-crash/qutebrowser/repro.py", line 5, in fun
    traceback.print_stack()

However, that might also be a red herring: when I check out ae0a2b7 (the first commit with the issue), I do see the crash as soon as I use range(184), but I do not see the additional frame in the stacktrace.

@pablogsal
Member

pablogsal commented Jul 8, 2022

Can you show the C++ stack at the time fun is called? You could do this by sleeping in fun and attaching with a debugger.

I'm having trouble installing PyQt5 on my M1 Mac laptop unfortunately, so I cannot reproduce so far. I'm getting this error, for the curious:

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [42 lines of output]
      Traceback (most recent call last):
        File "/Users/pgalindo3/.local/lib/python3.11/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 156, in prepare_metadata_for_build_wheel
          hook = backend.prepare_metadata_for_build_wheel
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      AttributeError: module 'sipbuild.api' has no attribute 'prepare_metadata_for_build_wheel'

@The-Compiler
Contributor Author

The-Compiler commented Jul 8, 2022

By pure accident, I found a Python-only reproducer finally! Fingers crossed that it's in fact the same issue, but it does sure look like it:

class Obj:

    def __del__():  # sic!
        pass

def run():
    for _ in range(202):
        obj = Obj()

run()

As before, 201 works fine, 202 fails. And the traceback does again point to the for-loop, though I suppose it actually makes sense here, as that's probably where the old obj gets deleted:

Exception ignored in: <function Obj.__del__ at 0x7f3881485b20>
Traceback (most recent call last):
  File "/home/florian/tmp/python-crash/qutebrowser/repro.py", line 7, in run
    for _ in range(202):
    ^^^^^^^^^^^^^^^^^^^^
TypeError: Obj.__del__() takes 0 positional arguments but 1 was given
python: Python/pystate.c:2201: _PyThreadState_PopFrame: Assertion `tstate->datastack_top >= base' failed.

@pablogsal I believe PyQt5 does not have wheels for M1 macs. You should however be able to simply replace PyQt5 by PyQt6 in the imports and run again with PyQt6 installed instead, which does have M1 wheels.

C++ stack in fun:

#0  __GI___clock_nanosleep (clock_id=clock_id@entry=1, flags=flags@entry=1, req=req@entry=0x7fffffffbe20, rem=rem@entry=0x0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:71
#1  0x000055555587ba21 in pysleep (timeout=<optimized out>) at ./Modules/timemodule.c:2162
#2  0x000055555587bada in time_sleep (self=<optimized out>, timeout_obj=timeout_obj@entry=30) at ./Modules/timemodule.c:383
#3  0x000055555571622d in cfunction_vectorcall_O (func=<built-in method sleep of module object at remote 0x7ffff77f4470>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/methodobject.c:514
#4  0x00005555556cf499 in _PyObject_VectorcallTstate (tstate=0x555555b2df38 <_PyRuntime+166232>, callable=callable@entry=<built-in method sleep of module object at remote 0x7ffff77f4470>, args=args@entry=0x7ffff7e6c1a8, 
    nargsf=9223372036854775809, kwnames=kwnames@entry=0x0) at ./Include/internal/pycore_call.h:92
#5  0x00005555556cf570 in PyObject_Vectorcall (callable=callable@entry=<built-in method sleep of module object at remote 0x7ffff77f4470>, args=args@entry=0x7ffff7e6c1a8, nargsf=<optimized out>, kwnames=kwnames@entry=0x0)
    at Objects/call.c:299
#6  0x00005555557b7ebd in _PyEval_EvalFrameDefault (tstate=0x555555b2df38 <_PyRuntime+166232>, frame=0x7ffff7e6c150, throwflag=<optimized out>) at Python/ceval.c:4773
#7  0x00005555557bc59b in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b2df38 <_PyRuntime+166232>, frame=frame@entry=0x7ffff7e6c150, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:72
#8  0x00005555557bc69c in _PyEval_Vector (tstate=0x555555b2df38 <_PyRuntime+166232>, func=0x7ffff7685bd0, locals=locals@entry=0x0, args=0x555555b13bc0 <_PyRuntime+58848>, argcount=<optimized out>, kwnames=0x0) at Python/ceval.c:6421
#9  0x00005555556cf138 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:393
#10 0x00005555556cecdd in _PyVectorcall_Call (tstate=tstate@entry=0x555555b2df38 <_PyRuntime+166232>, func=0x5555556cf0e8 <_PyFunction_Vectorcall>, callable=callable@entry=<function at remote 0x7ffff7685bd0>, tuple=tuple@entry=(), 
    kwargs=kwargs@entry=0x0) at Objects/call.c:245
#11 0x00005555556cf07a in _PyObject_Call (tstate=0x555555b2df38 <_PyRuntime+166232>, callable=<function at remote 0x7ffff7685bd0>, args=(), kwargs=0x0) at Objects/call.c:328
#12 0x00005555556cf0c2 in PyObject_Call (callable=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at Objects/call.c:355
#13 0x00007ffff7236f70 in PyQtSlot::call(_object*, _object*) const () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#14 0x00007ffff7237418 in PyQtSlot::invoke(void**, _object*, void*, bool) const () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#15 0x00007ffff723770e in PyQtSlotProxy::unislot(void**) () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#16 0x00007ffff72381d7 in PyQtSlotProxy::qt_metacall(QMetaObject::Call, int, void**) () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#17 0x00007ffff6ad5f97 in void doActivate<false>(QObject*, int, void**) () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#18 0x00007ffff6acf50f in QObject::destroyed(QObject*) () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#19 0x00007ffff6ad4082 in QObject::~QObject() () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/Qt5/lib/libQt5Core.so.5
#20 0x00007ffff71a20fd in sipQObject::~sipQObject() () from /home/florian/tmp/python-crash/qutebrowser/.venv/lib/python3.11/site-packages/PyQt5/QtCore.abi3.so
#21 0x00007ffff7e4f24b in forgetObject (sw=sw@entry=0x7ffff759e670) at siplib.c:11418
#22 0x00007ffff7e50553 in sipWrapper_dealloc (self=self@entry=0x7ffff759e670) at siplib.c:11037
#23 0x00005555557301d5 in subtype_dealloc (self=<QObject() at remote 0x7ffff759e670>) at Objects/typeobject.c:1473
#24 0x0000555555719b97 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2384
#25 0x00005555557d845e in Py_DECREF (filename=filename@entry=0x555555899658 "./Include/object.h", lineno=lineno@entry=602, op=<optimized out>) at ./Include/object.h:527
#26 0x00005555557d847d in Py_XDECREF (op=<optimized out>) at ./Include/object.h:602
#27 0x00005555557d89a3 in _PyFrame_Clear (frame=frame@entry=0x7ffff7e6c078) at Python/frame.c:107
#28 0x00005555557a7fc2 in _PyEvalFrameClearAndPop (tstate=tstate@entry=0x555555b2df38 <_PyRuntime+166232>, frame=frame@entry=0x7ffff7e6c078) at Python/ceval.c:6393
#29 0x00005555557a8025 in pop_frame (tstate=tstate@entry=0x555555b2df38 <_PyRuntime+166232>, frame=frame@entry=0x7ffff7e6c078) at Python/ceval.c:1626
#30 0x00005555557ae0ab in _PyEval_EvalFrameDefault (tstate=0x555555b2df38 <_PyRuntime+166232>, frame=0x7ffff7e6c078, throwflag=<optimized out>) at Python/ceval.c:2444
#31 0x00005555557bc59b in _PyEval_EvalFrame (tstate=tstate@entry=0x555555b2df38 <_PyRuntime+166232>, frame=frame@entry=0x7ffff7e6c020, throwflag=throwflag@entry=0) at ./Include/internal/pycore_ceval.h:72
#32 0x00005555557bc69c in _PyEval_Vector (tstate=tstate@entry=0x555555b2df38 <_PyRuntime+166232>, func=func@entry=0x7ffff761e570, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}, args=args@entry=0x0, argcount=argcount@entry=0, 
    kwnames=kwnames@entry=0x0) at Python/ceval.c:6421
#33 0x00005555557bc7b1 in PyEval_EvalCode (co=co@entry=<code at remote 0x7ffff770e640>, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}) at Python/ceval.c:1155
#34 0x0000555555800b66 in run_eval_code_obj (tstate=tstate@entry=0x555555b2df38 <_PyRuntime+166232>, co=co@entry=0x7ffff770e640, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}) at Python/pythonrun.c:1714
#35 0x0000555555800c23 in run_mod (mod=mod@entry=0x555555c28f40, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}, flags=flags@entry=0x7fffffffc948, arena=arena@entry=0x7ffff767b160)
    at Python/pythonrun.c:1735
#36 0x0000555555800ceb in pyrun_file (fp=fp@entry=0x555555b764f0, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', start=start@entry=257, 
    globals=globals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}, 
    locals=locals@entry={'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <SourceFileLoader(name='__main__', path='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py') at remote 0x7ffff7631c30>, '__spec__': None, '__annotations__': {}, '__builtins__': <module at remote 0x7ffff779f230>, '__file__': '/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', '__cached__': None, 'QObject': <sip.wrappertype at remote 0x555555cd5a00>, 'traceback': <module at remote 0x7ffff767bbf0>, 'time': <module at remote 0x7ffff77f4470>, 'fun': <function at remote 0x7ffff7685bd0>, 'run': <function at remote 0x7ffff7412200>}, closeit=closeit@entry=1, flags=0x7fffffffc948)
    at Python/pythonrun.c:1630
#37 0x0000555555803ab5 in _PyRun_SimpleFileObject (fp=fp@entry=0x555555b764f0, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', closeit=closeit@entry=1, flags=flags@entry=0x7fffffffc948)
    at Python/pythonrun.c:440
#38 0x0000555555803c69 in _PyRun_AnyFileObject (fp=fp@entry=0x555555b764f0, filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', closeit=closeit@entry=1, flags=flags@entry=0x7fffffffc948)
    at Python/pythonrun.c:79
#39 0x0000555555821387 in pymain_run_file_obj (program_name=program_name@entry='/home/florian/tmp/python-crash/qutebrowser/.venv/bin/python', filename=filename@entry='/home/florian/tmp/python-crash/qutebrowser/oldrepro.py', 
    skip_source_first_line=0) at Modules/main.c:360
#40 0x00005555558214a5 in pymain_run_file (config=config@entry=0x555555b13f90 <_PyRuntime+59824>) at Modules/main.c:379
#41 0x0000555555821c14 in pymain_run_python (exitcode=exitcode@entry=0x7fffffffcaa4) at Modules/main.c:601
#42 0x0000555555821e69 in Py_RunMain () at Modules/main.c:680
#43 0x0000555555821ee3 in pymain_main (args=args@entry=0x7fffffffcb00) at Modules/main.c:710
#44 0x0000555555821fb2 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:734
#45 0x0000555555643732 in main (argc=<optimized out>, argv=<optimized out>) at ./Programs/python.c:15

@pablogsal
Member

Fantastic, with this reproducer we can start working! This is indeed a release blocker, and it is a bit scary how simple the code that crashes is. Fantastic work producing the reproducer, @The-Compiler 🥇

@brandtbucher
Member

Yeah, something’s wrong with the frame stack logic. Our stack of frames shouldn’t be growing at all, but it is, deeper and deeper, until it’s time to allocate a new chunk. Once we try to pop the next frame off after allocating that new chunk, base and datastack_top are pointing at different chunks, and things blow up.

It appears that we don't pop off the new _PyInterpreterFrame correctly after a failed call. Simpler reproducer without GC:

def f():
    pass

for _ in range(203):
    try:
        f(None)
    except:
        pass

@brandtbucher
Member

(So, in other words, this is really serious.)

@kumaraditya303 kumaraditya303 added the 3.12 bugs and security fixes label Jul 8, 2022
@brandtbucher
Member

I think I found it. _PyEvalFramePushAndInit clears the frame, but does not pop it when initialize_locals fails. I'll try out a fix.

@kumaraditya303
Contributor

I think I found it. _PyEvalFramePushAndInit clears the frame, but does not pop it when initialize_locals fails. I'll try out a fix.

That seems correct.
The following patch fixes the issue for me on Linux.

diff --git a/Python/ceval.c b/Python/ceval.c
index 0176002432..b4e2fee0a4 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -6410,7 +6410,7 @@ _PyEvalFramePushAndInit(PyThreadState *tstate, PyFunctionObject *func,
     }
     if (initialize_locals(tstate, func, localsarray, args, argcount, kwnames)) {
         assert(frame->owner != FRAME_OWNED_BY_GENERATOR);
-        _PyFrame_Clear(frame);
+        _PyEvalFrameClearAndPop(tstate, frame);
         return NULL;
     }
     return frame;

@brandtbucher
Member

@pablogsal, what if you crank up 203 to, like, 1000? Could be that frames are smaller on your build or something.

@pablogsal
Member

@pablogsal, what if you crank up 203 to, like, 1000? Could be that frames are smaller on your build or something.

I tried up to 20000 and I also tried to decrement the recursion limit with no luck :(

@brandtbucher
Member

I wonder if your compiler can prove that the comparison is undefined for those two (different) buffers and skips it. Or something.

@pablogsal
Member

pablogsal commented Jul 8, 2022

(So, in other words, this is really serious.)

Ugh, it is quite unfortunate that we didn't catch this before :(

I suppose there aren't many reasons for the locals allocation to fail (on normal, working code that's passing the correct parameters to functions) other than funky destructors, but it is a bit worrisome that our entire test suite never touched this path.

@brandtbucher
Member

(So, in other words, this is really serious.)

Ugh, it is quite unfortunate that we didn't catch this before :(

I suppose there aren't many reasons for the locals allocation to fail other than funky destructors, but it is a bit worrisome that our entire test suite never touched this path

Thankfully it's easy to write a test for now. I'm testing out fixes on both branches.
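
A regression test along those lines might look like the sketch below. It is not the test that was added to CPython: the helper name reproducer_exit_code and the iteration count of 1000 are assumptions chosen for illustration. It runs the GC-free reproducer from this thread in a child interpreter, so that a failed assertion (which aborts the process on an affected debug build) shows up as a non-zero exit code instead of taking down the test runner.

```python
import subprocess
import sys
import textwrap

# The GC-free reproducer from this thread, with more iterations than the
# 203 reported above in case frame sizes differ between builds.
SCRIPT = textwrap.dedent("""
    def f():
        pass

    for _ in range(1000):
        try:
            f(None)  # argument binding fails, so the call never starts
        except TypeError:
            pass
""")


def reproducer_exit_code():
    """Run the reproducer in a child interpreter and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        capture_output=True,
        text=True,
    )
    return result.returncode
```

On an interpreter that contains the fix, reproducer_exit_code() should return 0. On an affected debug build the child is expected to abort instead, which subprocess reports as a non-zero return code.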

@brandtbucher
Member

brandtbucher commented Jul 8, 2022

but it is a bit worrisome that our entire test suite never touched this path

It's actually a bit tricky to hit. You need to make N failing calls (enough to overflow the current 16KB stack chunk) and recover gracefully from all of them without exiting (successfully or otherwise) from the oldest frame where a failed call occurred. And even then, it doesn't actually crash until that oldest frame finally exits.
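
The conditions above can be illustrated with a toy model. This is only a sketch and not CPython's real data structures: ToyDataStack, CHUNK_SIZE, failing_call and simulate are hypothetical names, and slots stand in for bytes. It models a chunked stack where a failed call reserves space that is never released, and shows that nothing goes wrong until a new chunk has been allocated and the oldest frame is popped.

```python
# Toy model only: slots per chunk is made up; the real chunk is 16KB.
CHUNK_SIZE = 16


class ToyDataStack:
    def __init__(self):
        self.chunks = [0]  # used slots per chunk; the last one is current

    def push(self, size):
        """Reserve `size` slots and return (chunk_index, base_offset)."""
        if self.chunks[-1] + size > CHUNK_SIZE:
            self.chunks.append(0)  # current chunk is full: start a new one
        base = self.chunks[-1]
        self.chunks[-1] += size
        return len(self.chunks) - 1, base

    def pop(self, frame):
        """Release a frame; the check plays the role of the C assertion."""
        chunk_index, base = frame
        if chunk_index != len(self.chunks) - 1 or self.chunks[-1] < base:
            raise AssertionError("frame base is not in the current chunk")
        self.chunks[-1] = base
        if base == 0 and len(self.chunks) > 1:
            self.chunks.pop()  # chunk is empty again: return to the previous one


def failing_call(stack, pop_on_failure):
    frame = stack.push(4)
    # Argument binding fails here, like f(None) for a zero-argument f.
    if pop_on_failure:
        stack.pop(frame)  # behavior with the fix
    # Without the fix, the slots stay reserved.


def simulate(n_calls, pop_on_failure):
    stack = ToyDataStack()
    outer = stack.push(4)  # the oldest frame, where the failed calls occur
    for _ in range(n_calls):
        failing_call(stack, pop_on_failure)
    try:
        stack.pop(outer)  # the oldest frame finally exits
    except AssertionError:
        return "assertion failed"
    return "ok"
```

In this model, simulate(3, False) still returns "ok" because the leaked slots fit in the first chunk, simulate(4, False) returns "assertion failed" because the fourth leak spills into a second chunk, and simulate(1000, True) returns "ok" since every failed call releases its slots.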

@pablogsal
Member

Thanks @brandtbucher, you rock 🤘

@The-Compiler
Contributor Author

I suppose there aren't many reasons for the locals allocation to fail other than funky destructors, but it is a bit worrisome that our entire test suite never touched this path

One thing that's still unclear to me: How does the locals allocation fail in my original reproducer using QObject::destroyed from here? #93252 (comment) (I don't necessarily expect you to answer this, since it might be something PyQt does - but I don't have an answer either...).

FWIW I'm currently trying out the fix proposed by @kumaraditya303 above on top of b3, and the qutebrowser test suite looks much better with it! ✨

(I also saw a weird SIGIOT when trying to run pytest with the latest 3.11 branch, but that's probably something for a new issue, if none exists yet...)

@pablogsal
Member

One thing that's still unclear to me: How does the locals allocation fail in my original reproducer using QObject::destroyed from here?

Likely there is some C call that's failing and is raising unraisable exceptions or is clearing the error indicator. These calls consume stack frames that are never popped, although they are cleared.

If the original error doesn't happen again, it has to be something like this (failing to allocate the locals), because otherwise that would indicate that we are failing to pop in other places.
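
For readers unfamiliar with unraisable exceptions, the sketch below shows how the failing __del__ call from the pure-Python reproducer earlier in this thread is reported. It does not explain what PyQt does internally; the helper name collect_unraisable is an assumption for illustration, and the sketch relies on CPython's reference counting running __del__ as soon as the last reference goes away.

```python
import sys


class Obj:
    def __del__():  # deliberately missing `self`, as in the reproducer
        pass


def collect_unraisable():
    """Return the exception types reported through sys.unraisablehook."""
    seen = []
    old_hook = sys.unraisablehook
    # Record only the type, to avoid keeping the dying object alive.
    sys.unraisablehook = lambda unraisable: seen.append(unraisable.exc_type)
    try:
        obj = Obj()
        del obj  # the failing __del__ call happens here on CPython
    finally:
        sys.unraisablehook = old_hook
    return seen
```

Calling collect_unraisable() on CPython returns a list containing TypeError: the call to __del__ fails while binding arguments, the error cannot propagate to the caller, and it is handed to the hook instead (by default this prints the "Exception ignored in" message seen above).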

@The-Compiler
Contributor Author

@pablogsal Out of curiosity, I tried to build with CXX=clang CC=clang (on Linux), but I could still reproduce after that. Not sure why you can't... I don't have a recent macOS available to test with I'm afraid.

I will give #94693 a try sometime over the next couple of days; I need to take a break after hours of reproducing bugs 😅.

Finally, the pytest issue I have encountered while looking into this one:

@tiran
Member

tiran commented Jul 9, 2022

The bug has been fixed in main and 3.11 branch. Thank you @brandtbucher and @kumaraditya303

@tiran tiran closed this as completed Jul 9, 2022
Repository owner moved this from Todo to Done in Release and Deferred blockers 🚫 Jul 9, 2022
@The-Compiler
Contributor Author

I just tested the 3.11 fix, and can confirm everything seems to work properly now, including the qutebrowser testsuite (minus a few issues which are probably not CPython bugs).

Thanks a lot, @markshannon @pablogsal @kumaraditya303 @brandtbucher @tiran! ❤️
