Skip to content

Commit ac70c17

Browse files
RihamSelimfacebook-github-bot
authored andcommitted
Fix Crashy PyStack Walking in Python 3.12 (S516843)
Summary: After recent Python 3.12 rollout we started seeing crashes, e.g. T223049180. Crash stack from core dumps points to tstate being null when `_PyErr_GetRaisedException` is called ``` * thread #1, name = 'server#native-m', stop reason = SIGSEGV: address not mapped to object (fault address=0x60) * frame #0: 0x00007f94039e0197 libpython3.12.so.1.0`_PyFrame_MakeAndSetFrameObject [inlined] _PyErr_GetRaisedException(tstate=0x0000000000000000) at errors.c:490 frame #1: 0x00007f94039e0197 libpython3.12.so.1.0`_PyFrame_MakeAndSetFrameObject [inlined] PyErr_GetRaisedException at errors.c:499 frame facebook#2: 0x00007f94039e0184 libpython3.12.so.1.0`_PyFrame_MakeAndSetFrameObject(frame=0x00007f940977f340) at frame.c:32 frame facebook#3: 0x00007f9403b9d3de libpython3.12.so.1.0`PyThreadState_GetFrame [inlined] _PyFrame_GetFrameObject(frame=<unavailable>) (.__uniq.305758459065587366612134685926705492046) at pycore_frame.h:213 frame facebook#4: 0x00007f9403b9d3d0 libpython3.12.so.1.0`PyThreadState_GetFrame(tstate=<unavailable>) at pystate.c:1754 ``` The problem happens when this code is called (https://fburl.com/26dhwv4v) ``` PyObject * PyErr_GetRaisedException(void) { PyThreadState *tstate = _PyThreadState_GET(); return _PyErr_GetRaisedException(tstate); } ``` Where `_PyThreadState_GET` is implemented as follows https://fburl.com/92hrrluf ``` /* Get the current Python thread state. This function is unsafe: it does not check for error and it can return NULL. The caller must hold the GIL See also PyThreadState_Get() and _PyThreadState_UncheckedGet(). */ static inline PyThreadState* _PyThreadState_GET(void) { #if defined(HAVE_THREAD_LOCAL) && !defined(Py_BUILD_CORE_MODULE) return _Py_tss_tstate; #else return _PyThreadState_GetCurrent(); #endif ``` **The caller must hold the GIL** is a problem for us, since the whole idea behind implementing this hook this way is to get the stack when we **don't** have the GIL. This code implements a copy of the current To work around the issue we are copying the Python runtime code, but WITHOUT the GIL checks [_PyFrame_MakeAndSetFrameObject](https://github.com/python/cpython/blob/3.12/Python/frame.c#L28C1-L64C1) is reimplemented to skip the `PyErr_GetRaisedException` call, e.g. [_PyFrame_New_NoTrack](https://github.com/python/cpython/blob/main/Objects/frameobject.c#L2107C1-L2125C2) Changes from ``` PyFrameObject* _PyFrame_New_NoTrack(PyCodeObject *code) { CALL_STAT_INC(frame_objects_created); int slots = code->co_nlocalsplus + code->co_stacksize; PyFrameObject *f = PyObject_GC_NewVar(PyFrameObject, &PyFrame_Type, slots); if (f == NULL) { return NULL; } f->f_back = NULL; f->f_trace = NULL; f->f_trace_lines = 1; f->f_trace_opcodes = 0; f->f_lineno = 0; f->f_extra_locals = NULL; f->f_locals_cache = NULL; f->f_overwritten_fast_locals = NULL; return f; } ``` to ``` thread_local PyFrameObject my_frame; PyFrameObject* MyPyFrame_New_NoTrack(_PyInterpreterFrame* frame) { PyFrameObject* f = &my_frame; if (f == NULL) { return NULL; } f->f_back = NULL; f->f_trace = NULL; f->f_trace_lines = 1; f->f_trace_opcodes = 0; f->f_fast_as_locals = 0; f->f_lineno = 0; f->f_frame = frame; frame->frame_obj = f; Py_SET_TYPE(reinterpret_cast<PyObject*>(frame->frame_obj), &PyFrame_Type); return f; } ``` A thread local variable is being reused since we don't want to keep allocating and deallocating on the heap while doing heap profiling. Reviewed By: fried Differential Revision: D74118996 fbshipit-source-id: 65b9a9a10d6aa26f9c1cd93e30486333f20f4b24
1 parent d0a4481 commit ac70c17

File tree

1 file changed

+3
-0
lines changed
  • third-party/folly/src/folly/python

1 file changed

+3
-0
lines changed

third-party/folly/src/folly/python/Weak.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -79,6 +79,9 @@ Py_Weak(int) Py_IsFinalizing(void);
7979
Py_Weak(int) _Py_IsFinalizing(void);
8080
#endif
8181

82+
// Python Types
83+
Py_Weak(PyTypeObject) PyFrame_Type;
84+
8285
#undef Py_Weak
8386
#endif // NOT_WINDOWS
8487

0 commit comments

Comments
 (0)