sleep/fork/shell/SIGCHLD interaction problem
on 11.11.2007 16:41:34 by Justin Fletcher
Hiya,
I'm having a problem trying to get a simple program to respond the way
that I expect. The basic premise is thus :
1. Fork a child.
2. Sleep for a while.
3. Do other stuff.
This seems pretty simple, and I have a SIGCHLD handler which will catch my
forked process if it exits. I thought everything was fine. Then I found
that if I press ctrl-Z to suspend the parent whilst I'm running the
program and then background it, it hangs. I've reduced the problem to the
simplest I can, as follows :
----
#!/bin/perl
$SIG{'CHLD'} = sub {
print "SIGCHLD\n";
$pid = wait;
print "leave SIGCHLD for pid $pid\n";
};
print "Forking to do some long running task\n";
unless ($pid = fork) {
$SIG{'CHLD'} = 'DEFAULT';
exec "tail -f /dev/null";
die "failed\n";
};
print "Sleeping\n";
sleep 50;
print "Waking\n";
----
The problem is that if I press ctrl-Z whilst the program is sleeping, and
then resume it in the background with 'bg', a SIGCHLD is triggered. The
handler then does a 'wait' to get the PID and hangs because there isn't a
child that's exited. We never leave the SIGCHLD handler (unless the long
running task completes). The use of 'tail -f /dev/null' is purely to
simulate a task which just keeps running.
In the shell, the following sequence is seen:
----
justin@buttercup:~/Root/perltest$ perl testsleep.pl
Forking to do some long running task
Sleeping
[1]+ Stopped perl testsleep.pl
justin@buttercup:~/Root/perltest$ bg
[1]+ perl testsleep.pl &
SIGCHLD
justin@buttercup:~/Root/perltest$
----
I'm running bash 3.1.17, linux kernel 2.6.18, from debian stable, with
perl 5.8.8.
I believe this sort of construct to be normal and even recommended from
the perlipc pages; so... am I doing something wrong ? is bash ? is the
kernel ? is perl ?
I'm hoping I'm just misunderstanding how process control should be done.
--
Gerph
.... And you never see me walking toward you.
Re: sleep/fork/shell/SIGCHLD interaction problem
on 11.11.2007 21:40:37 by Martijn Lievaart
On Sun, 11 Nov 2007 15:41:34 +0000, Justin Fletcher wrote:
> The problem is that if I press ctrl-Z whilst the program is sleeping,
> and then resume it in the background with 'bg', a SIGCHLD is triggered.
> The handler then does a 'wait' to get the PID and hangs because there
> isn't a child that's exited. We never leave the SIGCHLD handler (unless
> the long running task completes). The use of 'tail -f /dev/null' is
> purely to simulate a task which just keeps running.
> I believe this sort of construct to be normal and even recommended from
> the perlipc pages; so... am I doing something wrong ? is bash ? is the
> kernel ? is perl ?
>
> I'm hoping I'm just misunderstanding how process control should be done.
It seems you are getting signals for the stop and start of the child, see
man sigaction and look at the possible CHLD signals.
This is worrying, your code is quite a normal construct and there must be
a lot of production code out there that has this same problem.
Additionally I could not find out how to get at the si_code for the
signal.
The solution seems to me to use (thanks to perldoc perlipc):
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";
sub REAPER {
print "entering reaper\n";
my $child;
# If a second child dies while in the signal handler caused by the
# first death, we won't get another signal. So must loop here else
# we will leave the unreaped child as a zombie. And the next time
# two children die we get another zombie. And so on.
# Also, we can get signals on stopping and continuation of children
# so there is no process to wait on
while (($child = waitpid(-1,WNOHANG)) > 0) {
print "Reaped $child: $?\n";
}
$SIG{CHLD} = \&REAPER; # still loathe sysV
print "Leaving reaper\n";
}
$SIG{CHLD} = \&REAPER;
my $pid;
print "Forking to do some long running task\n";
unless ($pid = fork) {
$SIG{'CHLD'} = 'DEFAULT';
my $i=0;
while (1) {
print $i++, "\n";
sleep 1;
}
}
print "pid=$pid\n";
print "Sleeping\n";
sleep 20;
print "Waking\n";
kill 'INT', $pid;
sleep 2;
Re: sleep/fork/shell/SIGCHLD interaction problem
on 12.11.2007 01:21:05 by Ben Morrow
Quoth Justin Fletcher :
>
> The problem is that if I press ctrl-Z whilst the program is sleeping, and
> then resume it in the background with 'bg', a SIGCHLD is triggered.
This is expected behaviour if your signal handler is installed with
sigaction without specifying the SA_NOCLDSTOP flag, which is what perl
does. See your system's sigaction(2).
> The handler then does a 'wait' to get the PID and hangs because there
> isn't a child that's exited.
You shouldn't simply call wait in a SIGCHLD handler, anyway. You don't
know how many children have exited before you could handle the signal.
The usual idiom is something like
use POSIX qw/:sys_wait_h/;
$SIG{CHLD} = sub { 1 while 0 < waitpid -1, WNOHANG };
which will wait for everything that needs waiting for. See perlipc for
examples which let you get the child pid and exit status, and waitpid(2)
for how to check for children that have stopped/continued.
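To make the last point concrete, here is a minimal sketch of checking for a stopped child, assuming a Linux-like system. The helper name describe_status is hypothetical, and ${^CHILD_ERROR_NATIVE} needs a perl newer than the 5.8.8 mentioned in the thread. Only WUNTRACED is used, since that is the flag the POSIX module is known to export for this purpose.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

# Turn a raw wait status into a readable description.
# describe_status is a hypothetical helper, not part of any module.
sub describe_status {
    my ($status) = @_;
    return 'stopped by signal ' . WSTOPSIG($status)    if WIFSTOPPED($status);
    return 'killed by signal '  . WTERMSIG($status)    if WIFSIGNALED($status);
    return 'exited with code '  . WEXITSTATUS($status) if WIFEXITED($status);
    return 'unknown';
}

my $pid = fork;
die "fork: $!" unless defined $pid;
unless ($pid) {    # child: idle until it is signalled
    sleep 1 while 1;
}

kill 'STOP', $pid;
# WUNTRACED makes waitpid report stopped children as well as dead ones.
my $got   = waitpid($pid, WUNTRACED);
my $first = describe_status(${^CHILD_ERROR_NATIVE});
print "$got: $first\n";

kill 'KILL', $pid;    # SIGKILL terminates even a stopped process
$got = waitpid($pid, 0);
my $second = describe_status(${^CHILD_ERROR_NATIVE});
print "$got: $second\n";
```

Without WUNTRACED the first waitpid would block until the child died, which is the hang described in the original post.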
> In the shell, the following sequence is seen:
>
> ----
> justin@buttercup:~/Root/perltest$ perl testsleep.pl
> Forking to do some long running task
> Sleeping
>
> [1]+ Stopped perl testsleep.pl
How do you think the shell knew its child had stopped? It relies on
SIGCHLD being sent when the process's status changes.
Ben
Re: sleep/fork/shell/SIGCHLD interaction problem
on 12.11.2007 04:30:18 by xhoster
Justin Fletcher wrote:
> Hiya,
>
> I'm having a problem trying to get a simple program to respond the way
> that I expect. The basic premise is thus :
>
> 1. Fork a child.
> 2. Sleep for a while.
> 3. Do other stuff.
>
> This seems pretty simple, and I have a SIGCHLD handler which will catch
> my forked process if it exits. I thought everything was fine. Then I
> found that is I press ctrl-Z to suspend the parent whilst I'm running the
> program and then background it, it hangs.
I find that this only occurs if I hit ctrl-Z from the keyboard. If I
send the process the TSTP signal via some other means, it doesn't happen.
I know that shells often respond to ctrl-Z, ctrl-C, etc, by sending signals
to entire process groups, rather than just the main process. I don't
know exactly how this leads to the observed phenomena, though.
Also, by using "strace", I see that the process actually is getting a
SIGCHLD, (as opposed to some bug in Perl causing it to think that it did
when really it didn't)
> I believe this sort of construct to be normal and even recommended from
> the perlipc pages; so... am I doing something wrong ? is bash ? is the
> kernel ? is perl ?
I see the same or similar behavior under tcsh. So I'm thinking it is the
kernel. I often find that programs which spawn other programs do not behave
well when put into the background after the fact, but yours is the only
simple demonstration of this that I've seen. When using programs that fork
or spawn others, I've learned to try to start such programs in the
background with &, and if I forget then I just kill them and restart them
in the background rather than using ctrl-Z.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
Re: sleep/fork/shell/SIGCHLD interaction problem
on 12.11.2007 15:34:48 by Ben Morrow
Quoth xhoster@gmail.com:
> Justin Fletcher wrote:
> >
> > I'm having a problem trying to get a simple program to respond the way
> > that I expect. The basic premise is thus :
> >
> > 1. Fork a child.
> > 2. Sleep for a while.
> > 3. Do other stuff.
> >
> > This seems pretty simple, and I have a SIGCHLD handler which will catch
> > my forked process if it exits. I thought everything was fine. Then I
> > found that is I press ctrl-Z to suspend the parent whilst I'm running the
> > program and then background it, it hangs.
>
> I find that this only occurs if I hit ctrl-Z from the keyboard. If I
> send the process the TSTP signal via some other means, it doesn't happen.
> I know that shells often respond to ctrl-Z, ctrl-C, etc, by sending signals
> to entire process groups, rather than just the main process. I don't
> exactly how this leads to the observed phenomena, though.
SIGCHLD is sent to the parent whenever a child changes status. So when
you press ctrl-Z, the whole process group is signalled, the child is
stopped, and the parent gets a SIGCHLD. When the process group is
resumed (bg or fg) the parent gets another SIGCHLD: since it hasn't
responded to the first yet (because it was stopped), this is not
usually apparent.
If the OP really doesn't want SIGCHLDs when a child stops, he can
install the signal handler explicitly with sigaction and SA_NOCLDSTOP
(under systems which support that). Since one must assume that any
number of children may have exited when handling SIGCHLD anyway,
including 0 in 'any number' is generally easier.
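A minimal sketch of the sigaction route, assuming a system whose POSIX module provides SA_NOCLDSTOP. The names reaper, $calls and @reaped are only for illustration. The child is stopped and continued first, to show that the handler stays quiet until the child actually dies.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h :sys_wait_h);

my $calls = 0;    # how many times the handler ran
my @reaped;       # [pid, status] pairs collected by the handler

sub reaper {
    $calls++;
    while ((my $child = waitpid(-1, WNOHANG)) > 0) {
        push @reaped, [$child, $?];
    }
}

# Install the handler through sigaction instead of %SIG so that the
# SA_NOCLDSTOP flag can be passed along.
my $action = POSIX::SigAction->new(\&reaper, POSIX::SigSet->new, SA_NOCLDSTOP);
$action->safe(1);    # keep perl's deferred ("safe") signal delivery
defined sigaction(SIGCHLD, $action) or die "sigaction: $!";

my $pid = fork;
die "fork: $!" unless defined $pid;
unless ($pid) {      # child: idle until it is signalled
    sleep 1 while 1;
}

# Stop and continue the child; with SA_NOCLDSTOP no SIGCHLD should arrive.
for my $sig (qw(STOP CONT)) {
    kill $sig, $pid;
    select(undef, undef, undef, 0.3);
}
my $calls_before_exit = $calls;

kill 'TERM', $pid;
select(undef, undef, undef, 0.1) until @reaped;

print "handler calls before exit: $calls_before_exit\n";
print "reaped pid $reaped[0][0], status $reaped[0][1]\n";
```

The waitpid loop with WNOHANG is kept anyway, so the handler still copes with zero or several exited children.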
Ben
Re: sleep/fork/shell/SIGCHLD interaction problem
on 12.11.2007 18:46:44 by xhoster
Ben Morrow wrote:
> Quoth xhoster@gmail.com:
> > Justin Fletcher wrote:
> > >
> > > I'm having a problem trying to get a simple program to respond the
> > > way that I expect. The basic premise is thus :
> > >
> > > 1. Fork a child.
> > > 2. Sleep for a while.
> > > 3. Do other stuff.
> > >
> > > This seems pretty simple, and I have a SIGCHLD handler which will
> > > catch my forked process if it exits. I thought everything was fine.
> > > Then I found that is I press ctrl-Z to suspend the parent whilst I'm
> > > running the program and then background it, it hangs.
> >
> > I find that this only occurs if I hit ctrl-Z from the keyboard. If I
> > send the process the TSTP signal via some other means, it doesn't
> > happen. I know that shells often respond to ctrl-Z, ctrl-C, etc, by
> > sending signals to entire process groups, rather than just the main
> > process. I don't exactly how this leads to the observed phenomena,
> > though.
>
> SIGCHLD is sent to the parent whenever a child changes status. So when
> you press ctrl-Z, the whole process group is signalled, the child is
> stopped, and the parent gets a SIGCHLD. When the process group is
> resumed (bg or fg) the parent gets another SIGCHLD: since it hasn't
> responded to the first yet (because it was stopped), this is not
> usually apparent.
Thanks for the explanation. I did notice sometimes the parent went into
the $SIG{CHLD} code when ctrl-Z was hit. Presumably the child received its
TSTP first, and the parent for some reason got the CHLD from that before it
got the initial TSTP.
> If the OP really doesn't want SIGCHLDs when a child stops, he can
> install the signal handler explicitly with sigaction and SA_NOCLDSTOP
> (under systems which support that).
Oy. That forces me to know more about the system thing than I wish I had
to know, at least for such a conceptually simple thing. Not that that is
surprising--there are limits to how much Perl can do to insulate me. malloc
and free it does a good job of, but signals I guess are harder.
> Since one must assume that any
> number of children may have exitted when handling SIGCHLD anyway,
This is only true if one knows there is more than one child to exit, or one
is writing code that is only a small part of a larger unknown system. If
one knows that there is only one child to exit, because only one has been
started, then one doesn't need to assume that any number greater than one
may have exited. And if it *is* part of a larger system, then all the
other parts need to agree on how to go about doing it. If one part does a
waitpid -1, WNOHANG and comes up with some other part's child, that could
cause problems. Maybe there should be a way to unwait on a child, which
would store the pid and exit status away somewhere, then if the localized
$SIG{CHLD} becomes unlocalized it would fire a fake SIGCHLD and waitpid
could return the stored away value when it is next called.
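One way to sidestep the stolen-child problem, sketched here with hypothetical names (spawn_tracked, reap_mine, %mine, %status), is for each part to remember the pids it forked and poll only those instead of calling waitpid with -1. This is an illustration of the idea, not something proposed elsewhere in the thread.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %mine;      # pid => 1 for children started through spawn_tracked
my %status;    # pid => wait status for children already reaped

# Fork a child that runs $code, and remember its pid.
sub spawn_tracked {
    my ($code) = @_;
    my $pid = fork;
    die "fork: $!" unless defined $pid;
    unless ($pid) {
        $code->();
        POSIX::_exit(0);
    }
    $mine{$pid} = 1;
    return $pid;
}

# Reap only the children recorded in %mine, never anyone else's.
sub reap_mine {
    for my $pid (keys %mine) {
        next unless waitpid($pid, WNOHANG) == $pid;
        $status{$pid} = $?;
        delete $mine{$pid};
    }
}

my $tracked = spawn_tracked(sub { POSIX::_exit(3) });

# Stands in for a child that belongs to some other part of the program.
my $foreign = fork;
die "fork: $!" unless defined $foreign;
POSIX::_exit(0) unless $foreign;

until (exists $status{$tracked}) {
    reap_mine();
    select(undef, undef, undef, 0.05);
}

# The other part can still collect its own child afterwards.
my $foreign_still_waitable = waitpid($foreign, 0) == $foreign;
print "tracked child exit code: ", $status{$tracked} >> 8, "\n";
print "foreign child left alone: ", ($foreign_still_waitable ? "yes" : "no"), "\n";
```

If reap_mine were called from a SIGCHLD handler, the window between fork and recording the pid would have to be covered as well, for instance by blocking SIGCHLD around the fork.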
> including 0 in 'any number' is generally easier.
I find it easier to design/work around the need to ever set $SIG{CHLD} (to
anything other than the default or IGNORE) in the first place. :)
I'm perhaps fortunate in that I've usually been able to do so. Obviously,
not all people will be lucky enough to be able to get away with that.
Xho
Re: sleep/fork/shell/SIGCHLD interaction problem
on 12.11.2007 20:44:16 by Eric Schwartz
xhoster@gmail.com writes:
> Oy. That forces me to know more about the system thing than I wish I had
> to know, at least for such a conceptually simple thing. Not that that is
> surprising--there are limits to how much Perl can do to insulate me. malloc
> and free it does a good job of, but signals I guess are harder.
I am reminded of some commercial Unix kernel hackers who were
responsible for the signal handling code. They had a pole 8 or 10
feet high with a sign on the top saying, "You must be THIS TALL to use
signals." As much as possible, they included themselves in this rule.
-=Eric
Re: sleep/fork/shell/SIGCHLD interaction problem
on 13.11.2007 04:01:25 by Charles DeRykus
On Nov 11, 7:41 am, Justin Fletcher wrote:
> Hiya,
>
> I'm having a problem trying to get a simple program to respond the way
> that I expect. The basic premise is thus :
>
> 1. Fork a child.
> 2. Sleep for a while.
> 3. Do other stuff.
>
> This seems pretty simple, and I have a SIGCHLD handler which will catch my
> forked process if it exits. I thought everything was fine. Then I found
> that is I press ctrl-Z to suspend the parent whilst I'm running the
> program and then background it, it hangs. I've reduced the problem to the
> simplest I can, as follows :
>
> ----
> #!/bin/perl
>
> $SIG{'CHLD'} = sub {
> print "SIGCHLD\n";
> $pid = wait;
> print "leave SIGCHLD for pid $pid\n";
> };
>
> print "Forking to do some long running task\n";
> unless ($pid = fork) {
> $SIG{'CHLD'} = 'DEFAULT';
> exec "tail -f /dev/null";
> die "failed\n";
> };
>
> print "Sleeping\n";
> sleep 50;
> print "Waking\n";
> ----
>
> The problem is that if I press ctrl-Z whilst the program is sleeping, and
> then resume it in the background with 'bg', a SIGCHLD is triggered. The
> handler then does a 'wait' to get the PID and hangs because there isn't a
> child that's exited. We never leave the SIGCHLD handler (unless the long
> running task completes). The use of 'tail -f /dev/null' is purely to
> simulate a task which just keeps running.
>
> In the shell, the following sequence is seen:
>
> ----
> justin@buttercup:~/Root/perltest$ perl testsleep.pl
> Forking to do some long running task
> Sleeping
>
> [1]+ Stopped perl testsleep.pl
> justin@buttercup:~/Root/perltest$ bg
> [1]+ perl testsleep.pl &
> SIGCHLD
> justin@buttercup:~/Root/perltest$
> ----
>
> I'm running bash 3.1.17, linux kernel 2.6.18, from debian stable, with
> perl 5.8.8.
>
> I believe this sort of construct to be normal and even recommended from
> the perlipc pages; so... am I doing something wrong ? is bash ? is the
> kernel ? is perl ?
>
I think you could lose the SIGCHLD handler
as it's not necessary at all here. You're
not spawning multiple processes and SIGTSTP
is problematic as you've seen. A simple
waitpid on the child should eliminate the
problems, eg.,
my $pid = fork;
die "fork: $!" unless defined $pid;
unless ($pid) { # child
exec "tail -f /dev/null"
or die "exec failed: $!\n";
} else { # parent
sleep 50;
waitpid $pid, 0;
}
--
Charles DeRykus
Re: sleep/fork/shell/SIGCHLD interaction problem
on 15.11.2007 09:52:49 by Justin Fletcher
On Mon, 12 Nov 2007, Ben Morrow wrote:
>
> Quoth xhoster@gmail.com:
>> Justin Fletcher wrote:
>>>
>>> I'm having a problem trying to get a simple program to respond the way
>>> that I expect. The basic premise is thus :
>>>
>>> 1. Fork a child.
>>> 2. Sleep for a while.
>>> 3. Do other stuff.
>>>
>>> This seems pretty simple, and I have a SIGCHLD handler which will catch
>>> my forked process if it exits. I thought everything was fine. Then I
>>> found that is I press ctrl-Z to suspend the parent whilst I'm running the
>>> program and then background it, it hangs.
>>
>> I find that this only occurs if I hit ctrl-Z from the keyboard. If I
>> send the process the TSTP signal via some other means, it doesn't happen.
>> I know that shells often respond to ctrl-Z, ctrl-C, etc, by sending signals
>> to entire process groups, rather than just the main process. I don't
>> exactly how this leads to the observed phenomena, though.
>
> SIGCHLD is sent to the parent whenever a child changes status. So when
> you press ctrl-Z, the whole process group is signalled, the child is
> stopped, and the parent gets a SIGCHLD. When the process group is
> resumed (bg or fg) the parent gets another SIGCHLD: since it hasn't
> responded to the first yet (because it was stopped), this is not
> usually apparent.
>
> If the OP really doesn't want SIGCHLDs when a child stops, he can
> install the signal handler explicitly with sigaction and SA_NOCLDSTOP
> (under systems which support that). Since one must assume that any
> number of children may have exitted when handling SIGCHLD anyway,
> including 0 in 'any number' is generally easier.
Thanks for your (everyone on this group) help! I hadn't appreciated that
SIGCHLD was delivered for all the information signals, or that there might
be multiple children present.
The explanations given have helped me resolve the odd hangs I've been
getting. Yay :-)
--
Gerph
.... Over the hills and far away there's a place that's heaven.