-
Notifications
You must be signed in to change notification settings - Fork 27
Description
I have found a race condition that can occur when one calls sv up <service>
and sv down <service>
within 1 second while runsv is sleeping. This race-condition will lead to a service staying down even when the most recent call was sv up <service>
. I have outlined the events which create this race-condition below:
- If a service unexpectedly terminates within <1 second of starting, runsv will sleep for 1 second to avoid starving the CPU:
https://github.com/void-linux/runit/blob/2b8000f1ebd07fd68ee0e3c32737d97bcd1687fb/src/runsv.c#L553C1-L573C59
child =wait_nohang(&wstat);
if (!child) break;
if ((child == -1) && (errno != error_intr)) break;
if (child == svd[0].pid) {
svd[0].pid =0;
pidchanged =1;
svd[0].wstat =wstat;
svd[0].ctrl &=~C_TERM;
if (svd[0].state != S_FINISH)
if ((fd =open_read("finish")) != -1) {
close(fd);
svd[0].state =S_FINISH;
update_status(&svd[0]);
continue;
}
svd[0].state =S_DOWN;
taia_uint(&deadline, 1);
taia_add(&deadline, &svd[0].start, &deadline);
taia_now(&svd[0].start);
update_status(&svd[0]);
**if (taia_less(&svd[0].start, &deadline)) sleep(1);**
- While runsv sleeps, let's suppose another process calls
sv down <service>
, which writes 'd' to 'supervise/control'. - Then, while runsv is still sleeping, let's say a process makes another call to
sv up <service>
. Since runsv is sleeping, its s->want (svstatus[17]
) has not been updated toS_DOWN
. In sv.c, we see that svstatus[17] is stillS_UP
, so we do not write 'u' to 'supervise/control', and instead return early.
https://github.com/void-linux/runit/blob/2b8000f1ebd07fd68ee0e3c32737d97bcd1687fb/src/sv.c#L251C1-L254C71
int control(char *a) {
if (svstatus_get() <= 0) return(-1);
if (svstatus[17] == *a)
if (*a != 'd' || svstatus[18] == 1) return(0); /* once w/o term */
- Runsv wakes up and, since 'd' is the only character in 'supervise/control', runsv enters the
S_DOWN
state.
This is undesired behavior because the most recent call was sv up <service>
. Since sv.c is not synchronous with the runsv state-machine, I do not think it should be relying on runsv's internal state in-order to write to the control pipe, else you lead to race-conditions such as this one.
I have created a pull request for my proposed change, which will allow 'up' to be written to 'supervise/control' even when svstatus[17] already shows 'up': #32