Description
Bug report
When using sys._current_frames
in a threaded application there is a high risk of a deadlock in the interpreter. Below is my analysis from one such hang. We have never seen this problem with Python 3.8 but experience it regularly with Python 3.11. I know that the sys._current_frames
documentation warns about the function and one solution to this bug report could be to document that the function is not thread safe.
We have a huge number of threads in this application but I'll limit the output to the two threads that cause the deadlock.
We have a thread calling sys._current_frames
with a Python stack trace:
Thread 118868 (idle): "Profile"
_gc_callback (distributed/utils_perf.py:184)
_watch (distributed/profile.py:261)
run (threading.py:975)
_bootstrap_inner (threading.py:1038)
_bootstrap (threading.py:995)
and a corresponding C stack trace:
Thread 325 (Thread 0x7f8b992d6700 (LWP 118868) ""):
#0 0x00007f8ce83adde2 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00000000004eae32 in PyCOND_TIMEDWAIT (us=<optimized out>, mut=<optimized out>, cond=0x879128 <_PyRuntime+392>) at /usr/local/src/conda/python-3.11.4/Python/condvar.h:73
#2 take_gil (tstate=0x7f8b992d54e0) at /usr/local/src/conda/python-3.11.4/Python/ceval_gil.h:250
#3 0x00000000005134bc in eval_frame_handle_pending (tstate=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:1211
#4 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:1754
#5 0x000000000055451f in _PyEval_EvalFrame (throwflag=0, frame=0x7f8bb645e248, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_ceval.h:73 #6 _PyEval_Vector (kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, locals=0x0, func=0x7f8cdc1d2340, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:6439
#7 _PyFunction_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, stack=<optimized out>, func=<function at remote 0x7f8cdc1d2340>) at /usr/local/src/conda/python-3.11.4/Objects/call.c:393
#8 _PyObject_VectorcallTstate (tstate=0x5c60c20, callable=<function at remote 0x7f8cdc1d2340>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_call.h:92#9 0x0000000000554172 in method_vectorcall (method=method@entry=<method at remote 0x7f8bce070c00>, args=args@entry=0x7f8b992d57d0, nargsf=<optimized out>, kwnames=kwnames@entry=0x0) at /usr/local/src/conda/python-3.11.4/Objects/classobject.c:89
#10 0x00000000004fe378 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7f8b992d57d0, callable=<method at remote 0x7f8bce070c00>, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_call.h:92
#11 _PyObject_CallFunctionVa (is_size_t=0, va=0x7f8b992d57b0, format=<optimized out>, callable=<method at remote 0x7f8bce070c00>, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Objects/call.c:536
#12 PyObject_CallFunction (callable=<method at remote 0x7f8bce070c00>, format=<optimized out>) at /usr/local/src/conda/python-3.11.4/Objects/call.c:558
#13 0x00000000005c23d2 in invoke_gc_callback (phase=0x654129 "stop", generation=<optimized out>, collected=<optimized out>, uncollectable=<optimized out>, tstate=<optimized out>, tstate=<optimized out>) at /usr/local/src/conda/python-3.11.4/Modules/gcmodule.c:1378
#14 0x00000000005c230a in gc_collect_with_callback (tstate=0x5c60c20, generation=0) at /usr/local/src/conda/python-3.11.4/Modules/gcmodule.c:1401
#15 0x000000000053b418 in _PyObject_GC_Link (op=set()) at /usr/local/src/conda/python-3.11.4/Modules/gcmodule.c:2270
#16 gc_alloc (presize=<optimized out>, basicsize=<optimized out>) at /usr/local/src/conda/python-3.11.4/Modules/gcmodule.c:2290
#17 _PyObject_GC_NewVar (nitems=<optimized out>, tp=0x869fe0 <PyFrame_Type>) at /usr/local/src/conda/python-3.11.4/Modules/gcmodule.c:2318
#18 _PyFrame_New_NoTrack (code=<optimized out>, code=<optimized out>) at /usr/local/src/conda/python-3.11.4/Objects/frameobject.c:1024
#19 _PyFrame_MakeAndSetFrameObject (frame=0x7f8b90e95a68) at /usr/local/src/conda/python-3.11.4/Python/frame.c:33
#20 0x00000000004e19df in _PyFrame_GetFrameObject (frame=0x7f8b90e95a68) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_frame.h:171 #21 _PyThread_CurrentFrames () at /usr/local/src/conda/python-3.11.4/Python/pystate.c:1420
#22 0x00000000005061c1 in cfunction_vectorcall_NOARGS (func=<built-in method _current_frames of module object at remote 0x7f8ce8746ca0>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.4/Include/cpython/methodobject.h:52
#23 0x000000000051bde1 in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<built-in method _current_frames of module object at remote 0x7f8ce8746ca0>, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_call.h:92
#24 PyObject_Vectorcall (callable=<built-in method _current_frames of module object at remote 0x7f8ce8746ca0>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.4/Objects/call.c:299
#25 0x000000000050ed83 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:4774
#26 0x0000000000534f13 in _PyEval_EvalFrame (throwflag=0, frame=0x7f8bb645e188, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_ceval.h:73
#27 _PyEval_Vector (kwnames=<optimized out>, argcount=0, args=0x7f8bce07e978, locals=0x0, func=<optimized out>, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:6439
#28 _PyFunction_Vectorcall (func=<optimized out>, stack=0x7f8bce07e978, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.4/Objects/call.c:393
#29 0x000000000053ef0d in _PyVectorcall_Call (kwargs=<optimized out>, tuple=<optimized out>, callable=<function at remote 0x7f8cdc121800>, func=0x534da0 <_PyFunction_Vectorcall>, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Objects/call.c:257
#30 _PyObject_Call (kwargs=<optimized out>, args=<optimized out>, callable=<function at remote 0x7f8cdc121800>, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Objects/call.c:328
#31 PyObject_Call (callable=<function at remote 0x7f8cdc121800>, args=<optimized out>, kwargs=<optimized out>) at /usr/local/src/conda/python-3.11.4/Objects/call.c:355
#32 0x000000000051290a in do_call_core (use_tracing=<optimized out>, kwdict={'thread_id': 140243172636480, 'interval': '10ms', 'cycle': '1000ms', 'log': <collections.deque at remote 0x7f8bc008b1f0>, 'omit': ('profile.py', 'selectors.py'), 'stop': <function at remote 0x7f8bc0058220>}, callargs=(), func=<function at remote 0x7f8cdc121800>, tstate=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:7357
#33 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:5381
#34 0x000000000055451f in _PyEval_EvalFrame (throwflag=0, frame=0x7f8bb645e020, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_ceval.h:73
#35 _PyEval_Vector (kwnames=<optimized out>, argcount=<optimized out>, args=<optimized out>, locals=0x0, func=0x7f8ce0e41b20, tstate=0x5c60c20) at /usr/local/src/conda/python-3.11.4/Python/ceval.c:6439
#36 _PyFunction_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, stack=<optimized out>, func=<function at remote 0x7f8ce0e41b20>) at /usr/local/src/conda/python-3.11.4/Objects/call.c:393
#37 _PyObject_VectorcallTstate (tstate=0x5c60c20, callable=<function at remote 0x7f8ce0e41b20>, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at /usr/local/src/conda/python-3.11.4/Include/internal/pycore_call.h:92
#38 0x00000000005541f8 in method_vectorcall (method=<optimized out>, args=0x8875d0 <_PyRuntime+58928>, nargsf=<optimized out>, kwnames=0x0) at /usr/local/src/conda/python-3.11.4/Objects/classobject.c:67
#39 0x0000000000627f6f in thread_run (boot_raw=0x7f8cc9c987b0) at /usr/local/src/conda/python-3.11.4/Modules/_threadmodule.c:1092
#40 0x00000000005f63d4 in pythread_wrapper (arg=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/thread_pthread.h:241
#41 0x00007f8ce83a9ea5 in start_thread () from /lib64/libpthread.so.0
#42 0x00007f8ce79c9b0d in clone () from /lib64/libc.so.6
So, this thread does not hold the GIL but seems to hold the runtime->interpreters.mutex
We have another thread that seems to hold the GIL:
Thread 388 (Thread 0x7f8b62f7f700 (LWP 120217) ""):
#0 0x00007f8ce83afb3b in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1 0x00007f8ce83afbcf in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2 0x00007f8ce83afc6b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 #3 0x00000000004ea969 in PyThread_acquire_lock_timed (intr_flag=<optimized out>, microseconds=<optimized out>, lock=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/thread_pthread.h:497
#4 PyThread_acquire_lock (lock=0x18bd040, waitflag=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/thread_pthread.h:751
#5 0x00000000005f7266 in tstate_delete_common (tstate=0x7f8cc417c250, gilstate=0x8791d8 <_PyRuntime+568>) at /usr/local/src/conda/python-3.11.4/Python/pystate.c:1080
#6 0x00000000005f6d93 in _PyThreadState_DeleteCurrent (tstate=0x7f8cc417c250) at /usr/local/src/conda/python-3.11.4/Python/pystate.c:1132
#7 0x00000000005f63d4 in pythread_wrapper (arg=<optimized out>) at /usr/local/src/conda/python-3.11.4/Python/thread_pthread.h:241
#8 0x00007f8ce83a9ea5 in start_thread () from /lib64/libpthread.so.0
#9 0x00007f8ce79c9b0d in clone () from /lib64/libc.so.6
See the native_thread_id
in the following gdb output.
(gdb) p *PyThreadState_Get()
$10 = {prev = 0x7f8cc4269680, next = 0x7f8c94483340, interp = 0x8875d8 <_PyRuntime+58936>, _initialized = 1, _static = 0, recursion_remaining = 1000, recursion_limit = 1000, recursion_headroom = 0, tracing = 0, tracing_what = 0,
cframe = 0x7f8cc417c3a0, c_profilefunc = 0x0, c_tracefunc = 0x0, c_profileobj = 0x0, c_traceobj = 0x0, curexc_type = 0x0, curexc_value = 0x0, curexc_traceback = 0x0, exc_info = 0x7f8cc417c390, dict = 0x0, gilstate_counter = 1,
async_exc = 0x0, thread_id = 140236637599488, native_thread_id = 120217, trash_delete_nesting = 0, trash_delete_later = 0x0, on_delete = 0x628f10 <release_sentinel>, on_delete_data = 0x7f8b974b1ee0,
coroutine_origin_tracking_depth = 0, async_gen_firstiter = 0x0, async_gen_finalizer = 0x0, context = 0x0, context_ver = 1, id = 1085, trace_info = {code = 0x0, bounds = {ar_start = 0, ar_end = 0, ar_line = 0, opaque = {
computed_line = 0, lo_next = 0x0, limit = 0x0}}}, datastack_chunk = 0x7f8b99522000, datastack_top = 0x7f8b99522020, datastack_limit = 0x7f8b99526000, exc_state = {exc_value = 0x0, previous_item = 0x0}, root_cframe = {
use_tracing = 0 '\000', current_frame = 0x0, previous = 0x0}}
Your environment
Python 3.11.4 on a CentOS 7 machine with conda-forge binaries