You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[DRAFT] pythongh-135358 follow-up: Use PyErr_CheckSignalsDetached in the stdlib.
This patch demonstrates how `PyErr_CheckSignalsDetached` (python#135358; see
also python#133465 and capi-workgroup/decisions#68) might be used for some
(small) efficiency improvements in the stdlib. Almost all existing
uses of `PyErr_CheckSignals` in the stdlib appear in loops with this
general structure:
```c
int res = 0;
int async_err = 0;
do {
Py_BEGIN_ALLOW_THREADS
res = system_call(arg1, arg2);
Py_END_ALLOW_THREADS
} while (res < 0 && errno == EINTR
&& !(async_err = PyErr_CheckSignals()));
if (res < 0) {
return !async_err ? PyErr_SetFromErrno(PyExc_OSError) : NULL;
}
Py_RETURN_NONE;
```
With the addition of `PyErr_CheckSignalsDetached`, this can become
```c
int res = 0;
int async_err = 0;
PyThreadState *tstate = PyEval_SaveThread();
do {
res = system_call(arg1, arg2);
} while (res < 0 && errno == EINTR
&& !(async_err = PyErr_CheckSignalsDetached(tstate)));
PyEval_RestoreThread(tstate);
if (res < 0) {
return !async_err ? PyErr_SetFromErrno(PyExc_OSError) : NULL;
}
Py_RETURN_NONE;
```
There were also a small number of places where
```
Py_BLOCK_THREADS
sig = PyErr_CheckSignals();
Py_UNBLOCK_THREADS
```
could become just
```
sig = PyErr_CheckSignalsDetached(tstate);
```
I chose to use explicit `PyErr_{Save,Restore}Thread` throughout this
patch, in order to give the thread-state parameter to
`PyErr_CheckSignalsDetached` a visible name in the calling code.
This is possibly an argument for adding a wrapper macro
(`PyErr_CHECK_SIGNALS_DETACHED()`, perhaps) that is part of the
`Py_BEGIN_ALLOW_THREADS` family, and is therefore entitled to use the
hidden `_state` variable. Or it could be an argument for adding a new
construct like
```
Py_WITH_DETACHED_THREAD_STATE (tstate) {
do {
res = system_call(arg1, arg2);
} while (res < 0 && errno == EINTR
&& !(async_err = PyErr_CheckSignalsDetached(tstate)));
}
```
(This could be implemented in plain C by off-label use of a for loop.
I’m not sure if that’s a good idea or not. In particular, `break`
and `continue` at the top level of a `Py_WITH_DETACHED_THREAD_STATE`
block would have surprising effects.)
The patch only tackles the most straightforward conversions. Places
in the stdlib that could probably benefit from a conversion, but that
I did not touch, are:
* `signal.sigtimedwait`, `select.select`, `select.poll`, `select.devpoll`,
`select.epoll`, `select.kevent`, `time.sleep`, `_multiprocessing.semaphore`,
and the ssl and socket modules. All of these places need to do arithmetic
on some combination of `struct timespec`, `struct timeval`, and `Py_time_t`
in their loops, and the APIs for doing so are not currently safe to use
without an attached thread state (in particular, they may set the error
indicator).
* The ssl and socket modules also do a bunch of other work in the
same loops that detach the thread state and I’m not familiar enough
with those APIs to know if it’s safe to expand the regions with
detached thread state. This is unfortunate as I think these are the
loops with the biggest chance of _substantial_ efficiency gains from
enlarging those regions.
* `time.sleep` is also a horrible maze of `#ifdef`s and I think I see
a race bug in the Windows-specific code (filed separately as python#135407).
* The tkinter module goes back and forth between holding a Python
thread state and holding a _Tcl_ thread state and I didn’t want to
mess with that.
There are also many places where no conversion is necessary:
* `signal.pause`, `signal.raise`, `signal.signal`,
`signal.pthread_sigmask`, `signal.pthread_kill`, `os.kill`,
`os._getdiskusage`, `PyObject_Print`, `PyObject_Repr`,
`PyObject_Str`, `PyErr_SetFromErrnoWithFilenameObjects`,
`builtin_input_impl`:
These do not contain a loop, so there’s no reason to mess with them.
* The `_sre` module, `longobject.c`, `faulthandler.c`, `traceback.c`:
These don’t detach the thread state at all, so again there’s no
reason to mess with them.
As a final note, it would probably be a good idea to make it a
documented guarantee that `PyEval_{Save,Restore}Thread` and
`PyErr_CheckSignals{,Detached}` preserve the value of `errno`
and all the Windows-specific errno analogues. A lot of this code
is already assuming that this is the case, and the code that isn’t
making that assumption is substantially clunkier for it.
0 commit comments