Wait for background processes to complete

am 14.01.2008 04:08:01 von pgodfrin

Greetings,
Well - I've spent a bunch of time trying to figure this out - to no
avail.

Here's what I want to do - run several commands in the background and
have the perl program wait for the commands to complete. Fork doesn't
do it, nor does wait nor waitpid.

Any thoughts?

Here's a sample program which starts the processes:

while (<*.txt>)
{
print "Copying $_ \n";
system("cp $_ $_.old &") ;
}
print "End of excercise\n";
exit;

I mean if this were a shell program - this would work:

for x in `ls *.txt`
do
print "Copying $_ \n"
cp $_ $_.old &
done
wait

thanks,
pg

Re: Wait for background processes to complete

am 14.01.2008 04:43:33 von xhoster

pgodfrin wrote:
> Greetings,
> Well - I've spent a bunch of time trying to figure this out - to no
> avail.
>
> Here's what I want to do - run several commands in the background and
> have the perl program wait for the commands to complete. Fork doesn't
> do it, nor does wait nor waitpid.

None of them individually do it, no. You have to use them together.

>
> Any thoughts?
>
> Here's a sample program which starts the processes:
>
> while (<*.txt>)
> {
> print "Copying $_ \n";
> system("cp $_ $_.old &") ;

This starts a shell, which then starts cp in the background. As soon as
the cp is *started*, the shell exits. So Perl has nothing to wait for, as
the shell is already done (and waited for) before system returns. You need
to use fork and system or fork and exec. Or you could use
Parallel::ForkManager, which will wrap this stuff up nicely for you and
also prevent you from fork-bombing your computer if there are thousands of
*.txt

> }
> print "End of excercise\n";
> exit;

1 until -1==wait(); # on my system, yours may differ


Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
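
For reference, a minimal sketch of the Parallel::ForkManager route Xho mentions (untested; the cap of 5 concurrent jobs and the cp invocation are just illustrative, the rest is the module's documented new/start/finish/wait_all_children interface):

use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(5);   # run at most 5 copies at once

for my $file (<*.txt>) {
    $pm->start and next;          # parent: move on to the next file
    # child:
    system("cp", $file, "$file.old") == 0
        or warn "cp $file failed: $?";
    $pm->finish;                  # child exits here
}

$pm->wait_all_children;           # block until every child is done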

Re: Wait for background processes to complete

am 14.01.2008 04:51:38 von Ben Morrow

Quoth pgodfrin :
>
> Here's what I want to do - run several commands in the background and
> have the perl program wait for the commands to complete. Fork doesn't
> do it, nor does wait nor waitpid.
>
> Any thoughts?
>
> Here's a sample program which starts the processes:
>
> while (<*.txt>)
> {
> print "Copying $_ \n";
> system("cp $_ $_.old &") ;

This string contains a shell metachar (&), so system will fork a shell
and wait for it. The shell will run cp in the background, and then exit,
at which point system will return. Unfortunately, the only process which
knew cp's pid was the shell, which has just exited, so you can't wait
for that process at all (cp now has init as its parent, like any other
orphaned process).

You need to either implement the behaviour you want with fork, exec and
waitpid (it's a little complicated, but entirely possible) or use
IPC::Run, something like

use IPC::Run qw/run/;

my @cmds;

while (<*.txt>) {
print "Copying $_\n";
push @cmds, [cp => $_, "$_.old"];
}

run map { ($_, '&') } @cmds;

This is also safer than system STRING in the case where your filenames
have funny characters in them.

> }
> print "End of excercise\n";
> exit;

Falling off the end is a perfectly valid way to end a Perl program. exit
is usually reserved for exceptional circumstances.

> I mean if this were a shell program - this would work:
>
> for x in `ls *.txt`
> do
> print "Copying $_ \n"
> cp $_ $_.old &
> done
> wait

This works because the shell implements '&' directly, rather than using
a different shell, so it can remember the pids to wait for itself.

Ben
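
The run() call above can also be spelled with IPC::Run's start/finish interface, which makes the launch-them-all-then-wait-for-them-all shape explicit. A rough, untested sketch; start() returns a harness object and finish() waits for that harness's child:

use IPC::Run qw/start/;

my @harnesses;
while (<*.txt>) {
    print "Copying $_\n";
    push @harnesses, start [cp => $_, "$_.old"];
}
$_->finish for @harnesses;   # wait for every cp to complete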

Re: Wait for background processes to complete

am 14.01.2008 06:10:00 von pgodfrin

On Jan 13, 9:51 pm, Ben Morrow wrote:
> Quoth pgodfrin :
>
>
>
> > Here's what I want to do - run several commands in the background and
> > have the perl program wait for the commands to complete. Fork doesn't
> > do it, nor does wait nor waitpid.
>
> > Any thoughts?
>
> > Here's a sample program which starts the processes:
>
> > while (<*.txt>)
> > {
> > print "Copying $_ \n";
> > system("cp $_ $_.old &") ;
>
> This string contains a shell metachar (&), so system will fork a shell
> and wait for it. The shell will run cp in the background, and then exit,
> at which point system will return. Unfortunately, the only process which
> knew cp's pid was the shell, which has just exitted, so you can't wait
> for that process at all (cp now has init as its parent, like any other
> orphaned process).
>
> You need to either implement the behaviour you want with fork, exec and
> waitpid (it's a little complicated, but entirely possible) or use
> IPC::Run, something like
>
> use IPC::Run qw/run/;
>
> my @cmds;
>
> while (<*.txt>) {
> print "Copying $_\n";
> push @cmds, [cp => $_, "$_.old"];
> }
>
> run map { ($_, '&') } @cmds;
>
> This is also safer than system STRING in the case where your filenames
> have funny characters in them.
>
> > }
> > print "End of excercise\n";
> > exit;
>
> Falling off the end is a perfectly valid way to end a Perl program. exit
> is usually reserved for exceptional circumstances.
>
> > I mean if this were a shell program - this would work:
>
> > for x in `ls *.txt`
> > do
> > print "Copying $_ \n"
> > cp $_ $_.old &
> > done
> > wait
>
> This works because the shell implements '&' directly, rather than using
> a different shell, so it can remember the pids to wait for itself.
>
> Ben

OK - would you have a good example of the fork-system-waitpid method -
not the same one that's in all the other posts or the camel book?

smiles,
pg

Re: Wait for background processes to complete

am 14.01.2008 06:10:12 von pgodfrin

On Jan 13, 9:43 pm, xhos...@gmail.com wrote:
> pgodfrin wrote:
> > Greetings,
> > Well - I've spent a bunch of time trying to figure this out - to no
> > avail.
>
> > Here's what I want to do - run several commands in the background and
> > have the perl program wait for the commands to complete. Fork doesn't
> > do it, nor does wait nor waitpid.
>
> None of them individually do it, no. You have to use them together.
>
>
>
> > Any thoughts?
>
> > Here's a sample program which starts the processes:
>
> > while (<*.txt>)
> > {
> > print "Copying $_ \n";
> > system("cp $_ $_.old &") ;
>
> This starts a shell, which then starts cp in the background. As soon as
> the cp is *started*, the shell exits. So Perl has nothing to wait for, as
> the shell is already done (and waited for) before system returns. You need
> to use fork and system or fork and exec. Or you could use
> Parallel::ForkManager, which will wrap this stuff up nicely for you and
> also prevent you from fork-bombing your computer if there are thousands of
> *.txt
>
> > }
> > print "End of excercise\n";
> > exit;
>
> 1 until -1==wait(); # on my system, yours may differ
>
> Xho
>

Well - that's beginning to make a little sense - the shell completes
and perl has nothing to wait for. No wonder I'm pulling out what
little of my hair is left! :) I guess the fork process returns the pid
of the process, but - if it's the pid of the shell process, then we're
back to square one.

Re: Wait for background processes to complete

am 14.01.2008 07:09:42 von Ben Morrow

[please trim your quotations]

Quoth pgodfrin :
> On Jan 13, 9:51 pm, Ben Morrow wrote:
> > Quoth pgodfrin :
> >
> > > Here's what I want to do - run several commands in the background and
> > > have the perl program wait for the commands to complete. Fork doesn't
> > > do it, nor does wait nor waitpid.
> >

> >
> > You need to either implement the behaviour you want with fork, exec and
> > waitpid (it's a little complicated, but entirely possible) or use
> > IPC::Run, something like
>
> OK - would you have a good example of the fork-system-waitpid method -
> not the same one that's in all the other posts or the camel book?

It's a more efficient use of everybody's time for you to use IPC::Run,
which has been written and tested by someone who understands these
issues and is prepared to solve them properly and portably, than it is
for a random Usenaut to provide you with a code snippet.

However, off the top of my head, completely untested, probably not
portable to Win32 or other non-POSIX systems, etc. etc.,

use POSIX qw/WNOHANG/;

{
my %kids;

$SIG{CHLD} = sub {
my ($pid, @died);
push @died, $pid while ($pid = waitpid(-1, WNOHANG)) > 0;
delete @kids{@died};
};

sub background {
my (@cmd) = @_;

defined (my $pid = fork)
or die "can't fork for '$cmd[0]': $!";

if ($pid) {
$kids{$pid} = 1;
return;
}
else {
local $" = "' '";
exec @cmd or die "can't exec '@cmd': $!";
}
}

sub finish {
waitpid $_, 0 for keys %kids;
%kids = ();
}
}

while (<*.txt>) {
print "Copying $_\n";
background cp => $_, "$_.old";
}

finish;

This will break if used in conjunction with system or anything else that
relies on SIGCHLD (the handler will reap children before system's own
waitpid sees them, so system's return value and $? become unreliable), and
an OO person would probably say it should be implemented as an object. Use
IPC::Run.

Ben

Re: Wait for background processes to complete

am 14.01.2008 16:37:35 von pgodfrin

On Jan 13, 9:43 pm, xhos...@gmail.com wrote:
> pgodfrin wrote:
> > Greetings,
> > Well - I've spent a bunch of time trying to figure this out - to no
> > avail.
>
> > Here's what I want to do - run several commands in the background and
> > have the perl program wait for the commands to complete. Fork doesn't
> > do it, nor does wait nor waitpid.
>
> None of them individually do it, no. You have to use them together.
>
>
>
> > Any thoughts?
>
> > Here's a sample program which starts the processes:
>
> > while (<*.txt>)
> > {
> > print "Copying $_ \n";
> > system("cp $_ $_.old &") ;
>
> This starts a shell, which then starts cp in the background. As soon as
> the cp is *started*, the shell exits. So Perl has nothing to wait for, as
> the shell is already done (and waited for) before system returns. You need
> to use fork and system or fork and exec. Or you could use
> Parallel::ForkManager, which will wrap this stuff up nicely for you and
> also prevent you from fork-bombing your computer if there are thousands of
> *.txt
>
> > }
> > print "End of excercise\n";
> > exit;
>
> 1 until -1==wait(); # on my system, yours may differ
>
> Xho
>

Hi Xho,
Well - It seems you're on to something. Thanks. I have tested this
using system and fork (also exec) and it appears to be exactly what
you're saying - the pid returned by fork is not the pid of the cp
command.

Frankly I'm amazed that something as basic as this, handled easily by
a plain old shell, is not an easy thing in Perl. (no I don't want to
use a shell script for this - I'm a perl bigot :) ).

I'll take a look at Parallel::ForkManager and any other modules that
might be useful...

cheers,
pg

Re: Wait for background processes to complete

am 14.01.2008 18:11:53 von xhoster

pgodfrin wrote:
> On Jan 13, 9:43 pm, xhos...@gmail.com wrote:
> > pgodfrin wrote:
> > > Greetings,
> > > Well - I've spent a bunch of time trying to figure this out - to no
> > > avail.
> >
> > > Here's what I want to do - run several commands in the background and
> > > have the perl program wait for the commands to complete. Fork doesn't
> > > do it, nor does wait nor waitpid.
> >
> > None of them individually do it, no. You have to use them together.
> >
> >
> >
> > > Any thoughts?
> >
> > > Here's a sample program which starts the processes:
> >
> > > while (<*.txt>)
> > > {
> > > print "Copying $_ \n";
> > > system("cp $_ $_.old &") ;
> >
> > This starts a shell, which then starts cp in the background. As soon
> > as the cp is *started*, the shell exits. So Perl has nothing to wait
> > for, as the shell is already done (and waited for) before system
> > returns. You need to use fork and system or fork and exec. Or you
> > could use Parallel::ForkManager, which will wrap this stuff up nicely
> > for you and also prevent you from fork-bombing your computer if there
> > are thousands of *.txt
> >
> > > }
> > > print "End of excercise\n";
> > > exit;
> >
> > 1 until -1==wait(); # on my system, yours may differ
> >
>
> Well - that's beginning to make a little sense - the shell completes
> and perl has nothing to wait for. No wonder I'm pulling out what
> little of my hair is left! :) I guess the fork process returns the pid
> of the process, but - if it's the pid of the shell process, then we're
> back to square one.

The fork returns (to the parent) the pid of the process forked off.
(but you don't actually need to know the pid if you merely want to wait,
rather than waitpid.) If that forked-off process then itself starts the cp
in the background, of course you are no better off. But if the forked-off
process either becomes cp (using exec) or it starts up cp in the foreground
(using system without a "&"), then you now have something to wait for. In
the first case, you wait for cp itself. In the second case, you wait for
the forked-off perl process which is itself waiting for the cp.

$ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
my $x; do {$x=wait; print $x} until $x==-1'
438
439
440
-1

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
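
The same one-liner written out as a script, with the cp example from the start of the thread in place of sleep (a sketch; the list-form exec is a substitution to keep the shell out of it, and, as in the one-liner, a failed fork is not handled):

use strict;
use warnings;

for my $file (<*.txt>) {
    print "Copying $file\n";
    # child: becomes cp and never returns; parent: fork returns the pid (true) and keeps looping
    fork or exec "cp", $file, "$file.old";
}

# reap children one at a time; wait returns -1 once none are left
my $x;
do { $x = wait; print "$x\n" } until $x == -1;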

Re: Wait for background processes to complete

am 14.01.2008 20:05:26 von pgodfrin

On Jan 14, 11:11 am, xhos...@gmail.com wrote:
> pgodfrin wrote:
> > On Jan 13, 9:43 pm, xhos...@gmail.com wrote:
> > > pgodfrin wrote:
> > > > Greetings,
> > > > Well - I've spent a bunch of time trying to figure this out - to no
> > > > avail.
>
> > > > Here's what I want to do - run several commands in the background and
> > > > have the perl program wait for the commands to complete. Fork doesn't
> > > > do it, nor does wait nor waitpid.
>
> > > None of them individually do it, no. You have to use them together.
>
> > > > Any thoughts?
>
> > > > Here's a sample program which starts the processes:
>
> > > > while (<*.txt>)
> > > > {
> > > > print "Copying $_ \n";
> > > > system("cp $_ $_.old &") ;
>
> > > This starts a shell, which then starts cp in the background. As soon
> > > as the cp is *started*, the shell exits. So Perl has nothing to wait
> > > for, as the shell is already done (and waited for) before system
> > > returns. You need to use fork and system or fork and exec. Or you
> > > could use Parallel::ForkManager, which will wrap this stuff up nicely
> > > for you and also prevent you from fork-bombing your computer if there
> > > are thousands of *.txt
>
> > > > }
> > > > print "End of excercise\n";
> > > > exit;
>
> > > 1 until -1==wait(); # on my system, yours may differ
>
> > Well - that's beginning to make a little sense - the shell completes
> > and perl has nothing to wait for. No wonder I'm pulling out what
> > little of my hair is left! :) I guess the fork process returns the pid
> > of the process, but - if it's the pid of the shell process, then we're
> > back to square one.
>
> The fork returns (to the parent) the pid of the process forked off.
> (but you don't actually need to know the pid if you merely want to wait,
> rather than waitpid.) If that forked-off process then itself starts the cp
> in the background, of course you are no better off. But if the forked-off
> process either becomes cp (using exec) or it starts up cp in the foreground
> (using system without a "&"), then you now have something to wait for. In
> the first case, you wait for cp itself. In the second case, you wait for
> the forked-off perl process which is itself waiting for the cp.
>
> $ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
> my $x; do {$x=wait; print $x} until $x==-1'
> 438
> 439
> 440
> -1
>
> Xho
>

Hi Xho,
Well - your code and concepts work fine when you want to wait
sequentially. My goal here is to fire off x number of process and then
wait for ALL of them to complete (this is basically rudimentary job
control, trying to use the shell concepts and maximize parallelism).
So, that requires the use of the & to send a process to the
background. It looks like exec doesn't recognize the &, which leaves
the system() command - which leaves us back at square one. The shell
completes and the program has nothing to wait for.

What's interesting is by using the system(" ... &") construct, the
background tasks all get the same PGID, which makes me wonder if
waitpid(0, ...) should wait for any process in the process
group of the main perl program...

I wonder...

smiles,
pg

Re: Wait for background processes to complete

am 14.01.2008 21:48:32 von xhoster

pgodfrin wrote:
> On Jan 14, 11:11 am, xhos...@gmail.com wrote:
> >
> > The fork returns (to the parent) the pid of the process forked off.
> > (but you don't actually need to know the pid if you merely want to
> > wait, rather than waitpid.) If that forked-off process then itself
> > starts the cp in the background, of course you are no better off. But
> > if the forked-off process either becomes cp (using exec) or it starts
> > up cp in the foreground (using system without a "&"), then you now have
> > something to wait for. In the first case, you wait for cp itself. In
> > the second case, you wait for the forked-off perl process which is
> > itself waiting for the cp.
> >
> > $ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
> > my $x; do {$x=wait; print $x} until $x==-1'
> > 438
> > 439
> > 440
> > -1
> >
>
> Hi Xho,
> Well - your code and concepts work fine when you want to wait
> sequentially.

Is there an alternative to waiting sequentially? Waiting sequentially
is what the shell does, too, behind the scenes.

> My goal here is to fire off x number of process and then
> wait for ALL of them to complete

That is what my code does.

> (this is basically rudimentary job
> control, trying to use the shell concepts and maximize parallelism).
> So, that requires the use of the & to send a process to the
> background.

No, that is not required. Since Perl is not a shell, there really
isn't such a thing as the "background" in a Perl context. However,
fork will launch another process which runs without blocking the original
process (well, at least not until the original process asks to be blocked)
or blocking sibling processes. That is what you want, no?

> It looks like exec doesn't recognize the &, which leaves
> the system() command - which leaves us back at square one. The shell
> completes and the program has nothing to wait for.

Yes, so don't do that. "&" basically means "Fork and then don't wait".
Since that isn't what you want to do, then don't do that. Do your own
fork, and then do your own wait.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Re: Wait for background processes to complete

am 14.01.2008 23:26:12 von Charles DeRykus

On Jan 13, 7:08 pm, pgodfrin wrote:
> Greetings,
> Well - I've spent a bunch of time trying to figure this out - to no
> avail.
>
> Here's what I want to do - run several commands in the background and
> have the perl program wait for the commands to complete. Fork doesn't
> do it, nor does wait nor waitpid.
>
> Any thoughts?
>
> Here's a sample program which starts the processes:
>
> while (<*.txt>)
> {
> print "Copying $_ \n";
> system("cp $_ $_.old &") ;
> }
> print "End of excercise\n";
> exit;
>
> I mean if this were a shell program - this would work:
>
> for x in `ls *.txt`
> do
> print "Copying $_ \n"
> cp $_ $_.old &
> done
> wait
>

A 'quick 'n dirty' take :)

while (<*.txt>) {
...
system("cp $_ $_.old &") ;
}
my $jobs;
# the [c] keeps grep from matching its own command line
do { sleep 1; $jobs = `ps | grep '[c]p'`; }
while $jobs;


--
Charles DeRykus

Re: Wait for background processes to complete

am 15.01.2008 23:04:56 von pgodfrin

On Jan 14, 2:48 pm, xhos...@gmail.com wrote:
> pgodfrin wrote:
> > On Jan 14, 11:11 am, xhos...@gmail.com wrote:
>
> > > The fork returns (to the parent) the pid of the process forked off.
> > > (but you don't actually need to know the pid if you merely want to
> > > wait, rather than waitpid.) If that forked-off process then itself
> > > starts the cp in the background, of course you are no better off. But
> > > if the forked-off process either becomes cp (using exec) or it starts
> > > up cp in the foreground (using system without a "&"), then you now have
> > > something to wait for. In the first case, you wait for cp itself. In
> > > the second case, you wait for the forked-off perl process which is
> > > itself waiting for the cp.
>
> > > $ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
> > > my $x; do {$x=wait; print $x} until $x==-1'
> > > 438
> > > 439
> > > 440
> > > -1
>
> > Hi Xho,
> > Well - your code and concepts work fine when you want to wait
> > sequentially.
>
> Is there an alternative to waiting sequentially? Waiting sequentially
> is what the shell does, too, behind the scenes.
>
> > My goal here is to fire off x number of process and then
> > wait for ALL of them to complete
>
> That is what my code does.
>
> > (this is basically rudimentary job
> > control, trying to use the shell concepts and maximize parallelism).
> > So, that requires the use of the & to send a process to the
> > background.
>
> No, that is not required. Since Perl is not a shell, there really
> isn't such a thing as the "background" in a Perl context. However,
> fork will launch another process which runs without blocking the original
> process (well, at least not until the original process asks to be blocked)
> or blocking sibling processes. That is what you want, no?
>
> > It looks like exec doesn't recognize the &, which leaves
> > the system() command - which leaves us back at square one. The shell
> > completes and the program has nothing to wait for.
>
> Yes, so don't do that. "&" basically means "Fork and then don't wait".
> Since that isn't what you want to do, then don't do that. Do your own
> fork, and then do your own wait.
>
> Xho
>

Hi Xho,
It seems to me that firing off say 5 processes with the '&' character
to send them to the background is 5 parallel processes, while
executing them off one at a time is sequential. Your code (fork or
exec "sleep" ... ) waits for each sleep process to complete - so that
is what I meant by "waiting sequentially". Technically speaking you're
right - but the idea is to have tasks run in parallel versus
sequentially, which is ostensibly faster.

Insofar as using fork() - your original observation still stands -
system() runs a shell command and then terminates after sending the
command within to run (I watched two different PIDs - the long running
command within the system() statement continued while the shell became
)

To make a long story short, this is how I solved the problem. It seems
that the PIDs of the commands run via system() and the '&' background
thing end up belonging to the same Process Group - seen in the ps
command, plus a little extra:

ps -C cp -o pid,pgid,command
PID PGID COMMAND
29068 29063 cp example01.txt example01.txt.old
29070 29063 cp example02.txt example02.txt.old
29072 29063 cp example03.txt example03.txt.old
29074 29063 cp example04.txt example04.txt.old
29076 29063 cp example05.txt example05.txt.old
29078 29063 cp example06.txt example06.txt.old

So I wrap the ps command and do some looping:

for (;;)
{
open PGRP, "ps -C cp h |" ;
@pidlist= <PGRP> ;
if ($#pidlist<0) {die "\nNo more processes\n" ;}
}

It's not pretty but it works...

But, I believe this is an architectural FLAW with Perl.

regards,
pg

Re: Wait for background processes to complete

am 15.01.2008 23:16:03 von glex_no-spam

pgodfrin wrote:
[...]
> So I wrap the ps command and do some looping:
>
> for (;;)
> {
> open PGRP, "ps -C cp h |" ;
> @pidlist= ;
> if ($#pidlist<0) {die "\nNo more processes\n" ;}
> }
>
> It's not pretty but it works...

Why not use Parallel::ForkManager? It IS pretty and will
use far fewer system resources, compared to running many
'ps' commands.


> But, I believe this is an architectural FLAW with Perl.

Everyone else believes it's due to your level of knowledge.

Re: Wait for background processes to complete

am 16.01.2008 00:49:47 von Ben Morrow

Quoth pgodfrin :
> On Jan 14, 2:48 pm, xhos...@gmail.com wrote:
> > pgodfrin wrote:
> > > On Jan 14, 11:11 am, xhos...@gmail.com wrote:
> >
> > > > $ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
> > > > my $x; do {$x=wait; print $x} until $x==-1'



> It seems to me that firing off say 5 processes with the '&' character
> to send them to the background is 5 parallel processes, while
> executing them off one at a time is sequential. Your code (fork or
> exec "sleep" ... ) waits for each sleep process to complete - so that
> is what I meant by "waiting sequentially".

No, I think you don't understand how fork works. It *is* rather
confusing until you're used to it. Read the documentation again (both
Perl's and your system's), or find a good Unix programming book.

> Technically speaking you're
> right - but the idea is to have tasks run in parallel versus
> sequentially, which is ostensibly faster.

The tasks do run in parallel with Xho's version.

> To make a long story short, this is how I solved the problem. It seems
> that the PIDs of the commands run via system() and the '&' background
> thing end up belonging to the same Process Group - seen in the ps
> command, plus a little extra:
>

> So I wrap the ps command and do some looping:
>
> for (;;)
> {
> open PGRP, "ps -C cp h |" ;

Use lexical filehandles and three-or-more-arg open.
Check the return value.

open my $PGRP, '-|', 'ps', '-C', 'cp', 'h'
or die "couldn't fork ps: $!";

> @pidlist= ;
> if ($#pidlist<0) {die "\nNo more processes\n" ;}

IMHO any use of $#ary is an error; certainly in this case you should be
using @pidlist instead.

@pidlist or die "No more processes\n";

This will run around in a tight loop running probably hundreds of ps
processes per second. This is not an effective use of your system's
resources, to say the least. If you must poll like this you need a sleep
in there somewhere to limit the damage.

> }
>
> It's not pretty but it works...

Yuck. The whole point of the wait syscall is to avoid nastiness like
that. I suggest you learn how it works, or use a module written by
someone who does (you have been given at least two suggestions so far),
or stick to shell.

> But, I believe this is an architectural FLAW with Perl.

No. Firstly, the only flaw here is in your understanding; secondly, if
there was a flaw it would be in Unix, not Perl, since Perl just exposes
the underlying system interfaces.

Ben

Re: Wait for background processes to complete

am 16.01.2008 16:17:02 von it_says_BALLS_on_your forehead

On Jan 14, 2:05 pm, pgodfrin wrote:
> On Jan 14, 11:11 am, xhos...@gmail.com wrote:
>
>
>
> > pgodfrin wrote:
> > > On Jan 13, 9:43 pm, xhos...@gmail.com wrote:
> > > > pgodfrin wrote:
> > > > > Greetings,
> > > > > Well - I've spent a bunch of time trying to figure this out - to no
> > > > > avail.
>
> > > > > Here's what I want to do - run several commands in the background and
> > > > > have the perl program wait for the commands to complete. Fork doesn't
> > > > > do it, nor does wait nor waitpid.
>
> > > > None of them individually do it, no.  You have to use them together.
>
> > > > > Any thoughts?
>
> > > > > Here's a sample program which starts the processes:
>
> > > > >    while (<*.txt>)
> > > > >    {
> > > > >         print "Copying  $_ \n";
> > > > >         system("cp $_ $_.old &") ;
>
> > > > This starts a shell, which then starts cp in the background.  As soon
> > > > as the cp is *started*, the shell exits.  So Perl has nothing to wait
> > > > for, as the shell is already done (and waited for) before system
> > > > returns.  You need to use fork and system or fork and exec. Or you
> > > > could use Parallel::ForkManager, which will wrap this stuff up nicely
> > > > for you and also prevent you from fork-bombing your computer if there
> > > > are thousands of *.txt
>
> > > > >    }
> > > > >    print "End of excercise\n";
> > > > >    exit;
>
> > > > 1 until -1==wait();  # on my system, yours may differ
>
> > > Well - that's beginning to make a little sense - the shell completes
> > > and perl has nothing to wait for. No wonder I'm pulling out what
> > > little of my hair is left! :) I guess the fork process returns the pid
> > > of the process, but - if it's the pid of the shell process, then we're
> > > back to square one.
>
> > The fork returns (to the parent) the pid of the process forked off.
> > (but you don't actually need to know the pid if you merely want to wait,
> > rather than waitpid.)  If that forked-off process then itself starts the cp
> > in the background, of course you are no better off.  But if the forked-off
> > process either becomes cp (using exec) or it starts up cp in the foreground
> > (using system without a "&"), then you now have something to wait for.  In
> > the first case, you wait for cp itself.  In the second case, you wait for
> > the forked-off perl process which is itself waiting for the cp.
>
> > $ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
> >               my $x; do {$x=wait; print $x} until $x==-1'
> > 438
> > 439
> > 440
> > -1
>
> > Xho
>
> Hi Xho,
> Well - your code and concepts work fine when you want to wait
> sequentially. My goal here is to fire off x number of process and then
> wait for ALL of them to complete (this is basically rudimentary job
> control, trying to use the shell concepts and maximize parallelism).

Reading Xho's code, it looks like 3 processes are kicked off. 1 sleeps
3 seconds, 2 sleeps 6 seconds, and 3 sleeps 9 seconds.
If the processes were sequential, you wouldn't see the last pid until
after 18 seconds. Instead, you see it after 9 seconds.

HTH

Re: Wait for background processes to complete

am 16.01.2008 19:19:16 von pgodfrin

On Jan 15, 5:49 pm, Ben Morrow wrote:

>
>
> > So I wrap the ps command and do some looping:
>
> > for (;;)
> > {
> > open PGRP, "ps -C cp h |" ;
>
> Use lexical filehandles and three-or-more-arg open.
> Check the return value.
>
> open my $PGRP, '-|', 'ps', '-C', 'cp', 'h'
> or die "couldn't fork ps: $!";
>
> > @pidlist= ;
> > if ($#pidlist<0) {die "\nNo more processes\n" ;}
>
> IMHO any use of $#ary is an error; certainly in this case you should be
> using @pidlist instead.
>
> @pidlist or die "No more processes\n";
>
> This will run around in a tight loop running probably hundreds of ps
> processes per second. This is not a effective use of your system's
> resources, to say the least. If you must poll like this you need a sleep
> in there somewhere to limit the damage.
>
> > }
>
> > It's not pretty but it works...
>
> Yuck. The whole point of the wait syscall is to avoid nastiness like
> that. I suggest you learn how it works, or use a module written by
> someone who does (you have been given at least two suggestions so far),
> or stick to shell.
>
> > But, I believe this is an architectural FLAW with Perl.
>
> No. Firstly, the only flaw here is in your understanding; secondly, if
> there was a flaw it would be in Unix, not Perl, since Perl just exposes
> the underlying system interfaces.
>
> Ben



Whew! I knew I might get some feathers ruffled with the 'flaw'
comment. Sorry my knowledge is not as extensive as yours
(collectively) - but it appears it is true.

I had an exit statement in the if..elsif construct. By removing that
exit and changing the system() to exec() I am at least getting the
fork() process to kick off multiple tasks.

I'm still having problems making it wait though. I think I'm getting
there. I'll report back later - when my knowledge has increased :)
pg
(sorry Xho...)

Re: Wait for background processes to complete

am 16.01.2008 22:00:14 von pgodfrin

On Jan 15, 5:49 pm, Ben Morrow wrote:
> Quoth pgodfrin :
>
> > On Jan 14, 2:48 pm, xhos...@gmail.com wrote:
> > > pgodfrin wrote:
> > > > On Jan 14, 11:11 am, xhos...@gmail.com wrote:
>
> > > > > $ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
> > > > > my $x; do {$x=wait; print $x} until $x==-1'
>
>
>
> > It seems to me that firing off say 5 processes with the '&' character
> > to send them to the background is 5 parallel processes, while
> > executing them off one at a time is sequential. Your code (fork or
> > exec "sleep" ... ) waits for each sleep process to complete - so that
> > is what I meant by "waiting sequentially".
>
> No, I think you don't understand how fork works. It *is* rather
> confusing until you're used to it. Read the documentation again (both
> Perl's and your system's), or find a good Unix programming book.
>
> > Technically speaking you're
> > right - but the idea is to have tasks run in parallel versus
> > sequentially, which is ostensibly faster.
>
> The tasks do run in parallel with Xho's version.
>
>
>
> > To make a long story short, this is how I solved the problem. It seems
> > that the PIDs of the commands run via system() and the '&' background
> > thing end up belonging to the same Process Group - seen in the ps
> > command, plus a little extra:
>
>
> > So I wrap the ps command and do some looping:
>
> > for (;;)
> > {
> > open PGRP, "ps -C cp h |" ;
>
> Use lexical filehandles and three-or-more-arg open.
> Check the return value.
>
> open my $PGRP, '-|', 'ps', '-C', 'cp', 'h'
> or die "couldn't fork ps: $!";
>
> > @pidlist= ;
> > if ($#pidlist<0) {die "\nNo more processes\n" ;}
>
> IMHO any use of $#ary is an error; certainly in this case you should be
> using @pidlist instead.
>
> @pidlist or die "No more processes\n";
>
> This will run around in a tight loop running probably hundreds of ps
> processes per second. This is not a effective use of your system's
> resources, to say the least. If you must poll like this you need a sleep
> in there somewhere to limit the damage.
>
> > }
>
> > It's not pretty but it works...
>
> Yuck. The whole point of the wait syscall is to avoid nastiness like
> that. I suggest you learn how it works, or use a module written by
> someone who does (you have been given at least two suggestions so far),
> or stick to shell.
>
> > But, I believe this is an architectural FLAW with Perl.
>
> No. Firstly, the only flaw here is in your understanding; secondly, if
> there was a flaw it would be in Unix, not Perl, since Perl just exposes
> the underlying system interfaces.
>
> Ben
darn... still can't make it wait!
pg

Re: Wait for background processes to complete

am 16.01.2008 22:46:44 von pgodfrin

Hi Ben,
Well - it seems I have made some progress. But - I still need some
advice...

Here's my code:

#!/usr/bin/perl
use POSIX ":sys_wait_h";
$SIG{CHLD} = sub { wait }; # an 'installed' signal handler

$f=0 ;
while (<*.txt>)
{
$f += 1;
if ($pid = fork)
{
print "Fork $f pid: $pid\n" ;
print "Copying $_ ($pid)\n";
# exec() NEVER returns...
exec("cp $_ $_.old") ;
} elsif (defined $pid)
{
print "Found child...($pid)\n" ;
} elsif ($! =~ /No more process/)
{
print "Fork returned no more processes\n";
} else
{
die "Fork error.\n";
} # end fork
} # end while
print "\n<<<<< End of exercise >>>>>\n";
exit;

So - this code ends up starting 6 forks, and then falls out of the
while(<*.txt>) loop - printing the 'End of exercise' message. A 'ps -
ef |grep fk' (fktest5 is the name of this program) shows it's no
longer running, but the cp commands indeed are still running.

What I would like to do is wait for the cp processes to finish before
executing the exit statement.

Any ideas?

thanks,
pg

Re: Wait for background processes to complete

am 16.01.2008 23:17:40 von xhoster

pgodfrin wrote:
> Hi Ben,
> Well - it seems I have made some progress.But - I still need some
> advice...
>
> Here's my code:
>
> #!/usr/bin/perl
> use POSIX ":sys_wait_h";
> $SIG{CHLD} = sub { wait }; # an 'installed' signal handler

You probably don't need a signal handler. And if you want it,
you should be aware that that signal can be fired even when there is
no child ready to be waited for, leading to blocking in the wait.
So you would probably want to do a nonblocking waitpid instead.

>
> $f=0 ;
> while ()
> {
> $f += 1;
> if ($pid = fork)
> {
> print "Fork $f pid: $pid\n" ;
> print "Copying $_ ($pid)\n";
> # exec() NEVER returns...
> exec("cp $_ $_.old") ;
> } elsif (defined $pid)
> {
> print "Found child...($pid)\n" ;
> } elsif ($! =~ /No more process/)
> {
> print "Fork returned no more processes\n";
> } else
> {
> die "Fork error.\n";
> } # end fork
> } # end while
> print "\n<<<<< End of exercise >>>>>\n";

## On Linux, wait returns -1 when there are no living children to wait for.
1 until -1==wait();

> exit;

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Re: Wait for background processes to complete

am 16.01.2008 23:29:16 von pgodfrin

On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
> pgodfrin wrote:
> > Hi Ben,
> > Well - it seems I have made some progress.But - I still need some
> > advice...
>
> > Here's my code:
>
> > #!/usr/bin/perl
> > use POSIX ":sys_wait_h";
> > $SIG{CHLD} = sub { wait }; # an 'installed' signal handler
>
> You probably don't need a signal handler. And if you want it,
> you should aware that that signal can be fired even when there is
> no child ready to be waited for, leading to blocking in the wait.
> So you would probably want to do a nonblocking waitpid instead.
>
>
>
>
>
> > $f=0 ;
> > while ()
> > {
> > $f += 1;
> > if ($pid = fork)
> > {
> > print "Fork $f pid: $pid\n" ;
> > print "Copying $_ ($pid)\n";
> > # exec() NEVER returns...
> > exec("cp $_ $_.old") ;
> > } elsif (defined $pid)
> > {
> > print "Found child...($pid)\n" ;
> > } elsif ($! =~ /No more process/)
> > {
> > print "Fork returned no more processes\n";
> > } else
> > {
> > die "Fork error.\n";
> > } # end fork
> > } # end while
> > print "\n<<<<< End of exercise >>>>>\n";
>
> ## On Linux, wait returns -1 when there are no living children to wait for.
> 1 until -1==wait();
>
> > exit;
>
> Xho
>

Thanks Xho - I've removed the signal handler, but it seems wait always
returns -1 so - the loop is a nop? Where in the code should it go?

thanks,
pg

Re: Wait for background processes to complete

am 17.01.2008 01:52:01 von Charles DeRykus

On Jan 16, 2:29 pm, pgodfrin wrote:
> On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
>
>
>
> > pgodfrin wrote:
> > > Hi Ben,
> > > Well - it seems I have made some progress.But - I still need some
> > > advice...
>
> > > Here's my code:
>
> > > #!/usr/bin/perl
> > > use POSIX ":sys_wait_h";
> > > $SIG{CHLD} = sub { wait }; # an 'installed' signal handler
>
> > You probably don't need a signal handler. And if you want it,
> > you should aware that that signal can be fired even when there is
> > no child ready to be waited for, leading to blocking in the wait.
> > So you would probably want to do a nonblocking waitpid instead.
>
> > > $f=0 ;
> > > while ()
> > > {
> > > $f += 1;
> > > if ($pid = fork)
> > > {
> > > print "Fork $f pid: $pid\n" ;
> > > print "Copying $_ ($pid)\n";
> > > # exec() NEVER returns...
> > > exec("cp $_ $_.old") ;
> > > } elsif (defined $pid)
> > > {
> > > print "Found child...($pid)\n" ;
> > > } elsif ($! =~ /No more process/)
> > > {
> > > print "Fork returned no more processes\n";
> > > } else
> > > {
> > > die "Fork error.\n";
> > > } # end fork
> > > } # end while
> > > print "\n<<<<< End of exercise >>>>>\n";
>
> > ## On Linux, wait returns -1 when there are no living children to wait for.
> > 1 until -1==wait();
>
> > > exit;
>

> Thanks Xho - I've removed the signal handler, but it seems wait always
> returns -1 so - the loop is a nop? Where in the code should it go?
>

I'll pipe in here since the 'quick 'n dirty' solution
was mangled and diss'ed.

The safest action is an asynchronous wait with a
tight loop in the handler (perldoc perlipc):

use POSIX ":sys_wait_h";
$SIG{CHLD} = \&REAPER;

# now do something that forks...
...

sub REAPER { 1 while waitpid(-1, WNOHANG) > 0; }


--
Charles DeRykus

Re: Wait for background processes to complete

am 17.01.2008 03:06:25 von grocery_stocker

On Jan 13, 10:09 pm, Ben Morrow wrote:
> [please trim your quotations]
>
> Quoth pgodfrin :
>
> > On Jan 13, 9:51 pm, Ben Morrow wrote:
> > > Quoth pgodfrin :
>
> > > > Here's what I want to do - run several commands in the background and
> > > > have the perl program wait for the commands to complete. Fork doesn't
> > > > do it, nor does wait nor waitpid.
>
>
>
> > > You need to either implement the behaviour you want with fork, exec and
> > > waitpid (it's a little complicated, but entirely possible) or use
> > > IPC::Run, something like
>
> > OK - would you have a good example of the fork-system-waitpid method -
> > not the same one that's in all the other posts or the camel book?
>
> It's a more efficient use of everybody's time for you to use IPC::Run,
> which has been written and tested by someone who understands these
> issues and is prepared to solve them properly and portably, than it is
> for a random Usenaut to provide you with a code snippet.
>
> However, off the top of my head, completely untested, probably not
> portable to Win32 or other non-POSIX systems, etc. etc.,
>
> use POSIX qw/WNOHANG/;
>
> {
> my %kids;
>
> $SIG{CHLD} = sub {
> my ($pid, @died);
> push @died, $pid while $pid = waitpid -1, WNOHANG;
> delete @kids{@died};
> };
>
> sub background {
> my (@cmd) = @_;
>
> defined (my $pid = fork)
> or die "can't fork for '$cmd[0]': $!";
>
> if ($pid) {
> $kids{$pid} = 1;
> return;
> }
> else {
> local $" = "' '";
> exec @cmd or die "can't exec '@cmd': $!";
> }
> }
>
> sub finish {
> waitpid $_, 0 for keys %kids;
> %kids = ();
> }
> }
>
> while (<*.txt>) {
> print "Copying $_\n";
> background cp => $_, "$_.old";
> }
>
> finish;
>
> This will break if used in conjunction with system or anything else that
> relies on SIGCHLD, and an OO person would probably say it should be
> implemented as an object. Use IPC::Run.
>
> Ben

I really don't grasp the significance of having $kids{$pid} equal 1.
Can someone enlighten me on this?

Re: Wait for background processes to complete

am 17.01.2008 06:37:10 von Ben Morrow

Quoth grocery_stocker :
> On Jan 13, 10:09 pm, Ben Morrow wrote:
> >

> > $SIG{CHLD} = sub {
> > my ($pid, @died);
> > push @died, $pid while $pid = waitpid -1, WNOHANG;
> > delete @kids{@died};
> > };

> > if ($pid) {
> > $kids{$pid} = 1;
> > return;
> > }

> > sub finish {
> > waitpid $_, 0 for keys %kids;
> > %kids = ();
> > }
>
> I really don't grasp the significance of having $kids{$pid} equal 1.
> Can some enlighten me o this?

It's one of the standard idioms for using a hash as a set. Every time we
create a child, we add an entry to the hash; every time one dies on its
own, we delete its entry. Then at the end we can use keys %kids to
retrieve the list of pids we still need to wait for. The only thing that
matters about %kids are its keys: we never use the values, so they can
be set to anything. I prefer using 1 since then the values are all true;
you can get slightly better memory use with

$kids{$pid} = ();

which inserts the key but doesn't create a value for it at all, but then
you have to test with exists, which I find annoying. Since in this case
I don't test for existence of keys at all, this doesn't matter: using 1 is
just a habit.

Ben
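
A tiny standalone illustration of the hash-as-set idiom described above (the pids here are made up):

my %kids;
$kids{$_} = 1 for 101, 102, 103;           # three children started
delete $kids{102};                         # one has since been reaped by the handler
print join(", ", sort keys %kids), "\n";   # 101, 103 still need waiting for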

Re: Wait for background processes to complete

am 17.01.2008 16:20:33 von pgodfrin

On Jan 16, 6:52 pm, "comp.llang.perl.moderated" wrote:
> On Jan 16, 2:29 pm, pgodfrin wrote:
>
>
>
> > On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
>
> > > pgodfrin wrote:
> > > > Hi Ben,
> > > > Well - it seems I have made some progress.But - I still need some
> > > > advice...
>
> > > > Here's my code:
>
> > > > #!/usr/bin/perl
> > > > use POSIX ":sys_wait_h";
> > > > $SIG{CHLD} = sub { wait }; # an 'installed' signal handler
>
> > > You probably don't need a signal handler. And if you want it,
> > > you should aware that that signal can be fired even when there is
> > > no child ready to be waited for, leading to blocking in the wait.
> > > So you would probably want to do a nonblocking waitpid instead.
>
> > > > $f=0 ;
> > > > while ()
> > > > {
> > > > $f += 1;
> > > > if ($pid = fork)
> > > > {
> > > > print "Fork $f pid: $pid\n" ;
> > > > print "Copying $_ ($pid)\n";
> > > > # exec() NEVER returns...
> > > > exec("cp $_ $_.old") ;
> > > > } elsif (defined $pid)
> > > > {
> > > > print "Found child...($pid)\n" ;
> > > > } elsif ($! =~ /No more process/)
> > > > {
> > > > print "Fork returned no more processes\n";
> > > > } else
> > > > {
> > > > die "Fork error.\n";
> > > > } # end fork
> > > > } # end while
> > > > print "\n<<<<< End of exercise >>>>>\n";
>
> > > ## On Linux, wait returns -1 when there are no living children to wait for.
> > > 1 until -1==wait();
>
> > > > exit;
>
> > Thanks Xho - I've removed the signal handler, but it seems wait always
> > returns -1 so - the loop is a nop? Where in the code should it go?
>
> I'll pipe in here since the 'quick 'n dirty' solution
> was mangled and diss'ed.
>
> The safest action is an asynchronous wait with a
> tight loop in the handler (perldoc perlipc):
>
> use POSIX ":sys_wait_h";
> $SIG{CHLD} = \&REAPER;
>
> # now do something that forks...
> ...
>
> sub REAPER { 1 while waitpid(-1, WNOHANG)) > 0; }
>
> --
> Charles DeRykus

Hi Charles - still doesn't wait - in fact the REAPER subroutine never
even gets called - I'm beginning to come back to the 'flaw' concept...

pg

Re: Wait for background processes to complete

am 17.01.2008 16:24:58 von it_says_BALLS_on_your forehead

On Jan 17, 10:20 am, pgodfrin wrote:
> On Jan 16, 6:52 pm, "comp.llang.perl.moderated" wrote:
>
>
> > On Jan 16, 2:29 pm, pgodfrin wrote:
>
> > > On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
>
> > > > pgodfrin wrote:
> > > > > Hi Ben,
> > > > > Well - it seems I have made some progress. But - I still need some
> > > > > advice...
>
> > > > > Here's my code:
>
> > > > > #!/usr/bin/perl
> > > > > use POSIX ":sys_wait_h";
> > > > > $SIG{CHLD} = sub { wait };  # an 'installed' signal handler
>
> > > > You probably don't need a signal handler.  And if you want it,
> > > > you should be aware that that signal can be fired even when there is
> > > > no child ready to be waited for, leading to blocking in the wait.
> > > > So you would probably want to do a nonblocking waitpid instead.
>
> > > > > $f=0 ;
> > > > > while (<*.txt>)
> > > > > {
> > > > >    $f += 1;
> > > > >    if ($pid = fork)
> > > > >    {
> > > > >    print "Fork $f pid: $pid\n" ;
> > > > >    print "Copying  $_ ($pid)\n";
> > > > >    # exec() NEVER returns...
> > > > >    exec("cp $_ $_.old") ;
> > > > >    } elsif (defined $pid)
> > > > >    {
> > > > >       print "Found child...($pid)\n" ;
> > > > >    } elsif ($! =~ /No more process/)
> > > > >    {
> > > > >       print "Fork returned no more processes\n";
> > > > >    } else
> > > > >    {
> > > > >       die "Fork error.\n";
> > > > >    }  # end fork
> > > > > }  # end while
> > > > > print "\n<<<<< End of exercise >>>>>\n";
>
> > > > ## On Linux, wait returns -1 when there are no living children to wait for.
> > > > 1 until -1==wait();
>
> > > > > exit;
>
> > > Thanks Xho - I've removed the signal handler, but it seems wait always
> > > returns -1 so - the loop is a nop? Where in the code should it go?
>
> > I'll pipe in here since the 'quick 'n dirty' solution
> > was mangled and diss'ed.
>
> > The safest action is an asynchronous wait with a
> > tight loop in the handler  (perldoc perlipc):
>
> >    use POSIX ":sys_wait_h";
> >    $SIG{CHLD} = \&REAPER;
>
> >    # now do something that forks...
> >    ...
>
> >    sub REAPER { 1 while waitpid(-1, WNOHANG) > 0; }
>
> > --
> > Charles DeRykus
>
> Hi Charles - still doesn't wait - in fact the REAPER subroutine never
> even gets called - I'm beginning to get back the 'flaw' concept...
>

What's wrong with Xho's code earlier in the thread?

$ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ; \
  my $x; do {$x=wait; print $x} until $x==-1'

Re: Wait for background processes to complete

am 17.01.2008 16:36:10 von pgodfrin

On Jan 17, 9:24 am, nolo contendere wrote:
> On Jan 17, 10:20 am, pgodfrin wrote:
>
>
>
> > On Jan 16, 6:52 pm, "comp.llang.perl.moderated" wrote:
> > > On Jan 16, 2:29 pm, pgodfrin wrote:
>
> > > > On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
>
> > > > > pgodfrin wrote:
> > > > > > Hi Ben,
> > > > > > Well - it seems I have made some progress.But - I still need some
> > > > > > advice...
>
> > > > > > Here's my code:
>
> > > > > > #!/usr/bin/perl
> > > > > > use POSIX ":sys_wait_h";
> > > > > > $SIG{CHLD} = sub { wait }; # an 'installed' signal handler
>
> > > > > You probably don't need a signal handler. And if you want it,
> > > > > you should aware that that signal can be fired even when there is
> > > > > no child ready to be waited for, leading to blocking in the wait.
> > > > > So you would probably want to do a nonblocking waitpid instead.
>
> > > > > > $f=0 ;
> > > > > > while ()
> > > > > > {
> > > > > > $f += 1;
> > > > > > if ($pid = fork)
> > > > > > {
> > > > > > print "Fork $f pid: $pid\n" ;
> > > > > > print "Copying $_ ($pid)\n";
> > > > > > # exec() NEVER returns...
> > > > > > exec("cp $_ $_.old") ;
> > > > > > } elsif (defined $pid)
> > > > > > {
> > > > > > print "Found child...($pid)\n" ;
> > > > > > } elsif ($! =~ /No more process/)
> > > > > > {
> > > > > > print "Fork returned no more processes\n";
> > > > > > } else
> > > > > > {
> > > > > > die "Fork error.\n";
> > > > > > } # end fork
> > > > > > } # end while
> > > > > > print "\n<<<<< End of exercise >>>>>\n";
>
> > > > > ## On Linux, wait returns -1 when there are no living children to wait for.
> > > > > 1 until -1==wait();
>
> > > > > > exit;
>
> > > > Thanks Xho - I've removed the signal handler, but it seems wait always
> > > > returns -1 so - the loop is a nop? Where in the code should it go?
>
> > > I'll pipe in here since the 'quick 'n dirty' solution
> > > was mangled and diss'ed.
>
> > > The safest action is an asynchronous wait with a
> > > tight loop in the handler (perldoc perlipc):
>
> > > use POSIX ":sys_wait_h";
> > > $SIG{CHLD} = \&REAPER;
>
> > > # now do something that forks...
> > > ...
>
> > > sub REAPER { 1 while waitpid(-1, WNOHANG)) > 0; }
>
> > > --
> > > Charles DeRykus
>
> > Hi Charles - still doesn't wait - in fact the REAPER subroutine never
> > even gets called - I'm beginning to get back the 'flaw' concept...
>
> What's wrong with Xho's code earlier in the thread?
>
> $ perl -wle 'use strict; fork or exec "sleep " . $_*3 foreach 1..3 ;
> \
> my $x; do {$x=wait; print $x} until $x==-1'

You know what mon? Xho's program does indeed work in parallel AND
wait. Which means I really don't understand how to do this forking
( tee-hee) nonsense. I need to fix the flaw in MY architecture...
pg

Re: Wait for background processes to complete

am 17.01.2008 17:38:13 von pgodfrin

Gentle persons,

Xho was right from the start. Apparently all the samples from the
camel book and the documentation for fork(), wait() and waitpid() are at
best misleading - at worst poorly written.

To begin with, if following the camel book's sample:

if ($pid=fork) { # parent here
} elsif { #child here
} # ...

One can indeed fork - but - the wait loop simply doesn't wait because
it returns -1 upon the first iteration.

The rest of the discussions in perlipc are quite interesting, but
useless in solving my problem here (although I did learn about signal
handlers a little bit).

In sum total here is the answer. Many, many thanks to Xho who was
right from the very beginning. Shame on the Perl documentation for
unnecessary obfuscation.

@fl= (<*.txt>);
foreach (@fl)
{
print "Copying $_ \n";
fork or exec("cp $_ $_.old") ;
}
do {$x=wait;} until $x==-1 ;
#{1 until -1==wait;}
print "\n<<<<< End of exercise >>>>>\n";
exit;

Incidentally Xho's other suggestion {1 until -1==wait; } also works.
pg
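
As a small follow-up to the working version above, the same wait loop can also report each child's exit status through $? (a sketch, untested; the list-form exec is an illustrative substitution for the string form):

my @files = (<*.txt>);
for my $file (@files) {
    print "Copying $file\n";
    fork or exec "cp", $file, "$file.old";
}
my $pid;
while (($pid = wait) != -1) {
    printf "pid %d finished, exit status %d\n", $pid, $? >> 8;
}
print "\n<<<<< End of exercise >>>>>\n";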

Re: Wait for background processes to complete

am 17.01.2008 18:03:57 von pgodfrin

On Jan 17, 10:38 am, pgodfrin wrote:
> Gentle persons,
>
> Xho was right from the start. Apparently all the samples from the
> camel book, documentation for fork(), wait() and waitpid() are at the
> least misleading - at the worse poorly written.
>
> To begin with, if following the camel book's sample:
>
> if ($pid=fork) { # parent here
>
> } elsif { #child here
> } # ...
>
> One can indeed fork - but - the wait loop simply doesn't wait because
> it returns -1 upon the first iteration.
>
> The rest of the discussions in perliupc are quite interesting, but
> useless in solving my problem here (although I did learn about signal
> handlers a little bit).
>
> In sum total here is the answer. Many, many thanks to Xho who was
> right from the very beginning. Shame on the Perl documentation for
> unnecessary obfuscation.
>
> @fl= ();
> foreach (@fl)
> {
> print "Copying $_ \n";
> fork or exec("cp $_ $_.old") ;}
>
> do {$x=wait;} until $x==-1 ;
> #{1 until -1==wait;}
> print "\n<<<<< End of exercise >>>>>\n";
> exit;
>
> Incidentally Xho's other suggestion {1 until -1==wait; } also work.
> pg

and wait; all by itself also works now...

sheesh!

Re: Wait for background processes to complete

am 17.01.2008 18:14:36 von xhoster

pgodfrin wrote:
> Gentle persons,
>
> Xho was right from the start. Apparently all the samples from the
> camel book, documentation for fork(), wait() and waitpid() are at the
> least misleading - at the worse poorly written.
>
> To begin with, if following the camel book's sample:
>
> if ($pid=fork) { # parent here
> } elsif { #child here
> } # ...
>
> One can indeed fork - but - the wait loop simply doesn't wait because
> it returns -1 upon the first iteration.

I suspect you are misinterpreting something. (For one thing, there is
no "wait loop" in what you have shown!) The code above should be
equivalent to my "fork or exec" code as long as the child block of the "if"
has an exec or an exit, so that the child itself doesn't fall through
into the invisible wait loop.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Re: Wait for background processes to complete

am 17.01.2008 19:44:25 von pgodfrin

On Jan 17, 11:14 am, xhos...@gmail.com wrote:
> pgodfrin wrote:
> > Gentle persons,
>
> > Xho was right from the start. Apparently all the samples from the
> > camel book, documentation for fork(), wait() and waitpid() are at the
> > least misleading - at the worse poorly written.
>
> > To begin with, if following the camel book's sample:
>
> > if ($pid=fork) { # parent here
> > } elsif { #child here
> > } # ...
>
> > One can indeed fork - but - the wait loop simply doesn't wait because
> > it returns -1 upon the first iteration.
>
> I suspect you are misinterpreting something. (For one thing, there is
> no "wait loop" in what you have shown!) The code above should be
> equivalent to my "fork or exec" code as long as the child block of the "if"
> has an exec or a exit, so that the child itself it doesn't fall through
> into the invisible wait loop.
>
> Xho
>
> --
> --------------------http://NewsReader.Com/------------------ --
> The costs of publication of this article were defrayed in part by the
> payment of page charges. This article must therefore be hereby marked
> advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
> this fact.

Hi Xho,
OK - I didn't include the wait loop in the snippet simply for brevity.
I tried using the if logic and it just doesn't wait or fork properly.
Since I'm looping through filenames, this is inadequate. Unless proven
otherwise - I'll say the Camel book code is unclear and incorrect for
my purposes:

@fl= (<*.txt>);
foreach (@fl)
{
if($pid=fork)
{
print "Copying $_ to $_.old\n";
exec("cp $_ $_.old") ;
} elsif (defined $pid) { exit; }
} # end loop
wait;

print "\n<<<<< End of exercise >>>>>\n";
exit;

However, this works AND waits. But - this is very interesting - the
last file in the sample list is 3 or 4 times larger than the first few
files, which permits me to show that the wait does indeed wait for
child processes, but since the first fork replaces the parent process,
the wait command no longer has children to wait on and then continues
to the next statement, while the last cp command is still running.

@fl= (<*.txt>);
foreach (@fl)
{
print "Copying $_ to $_.old\n";
fork or exec("cp $_ $_.old") ;
} # end loop
wait;
print "\n<<<<< End of exercise >>>>>\n";
exit;

But - that's no good - which necessitates the loop you had suggested
from the start. Once again Xho - you da MAN!

@fl= (<*.txt>);
foreach (@fl)
{
print "Copying $_ to $_.old\n";
fork or exec("cp $_ $_.old") ;
} # end loop
do {$x=wait; print "$x\n"} until $x==-1 ;
print "\n<<<<< End of exercise >>>>>\n";
exit;

Re: Wait for background processes to complete

am 18.01.2008 00:05:10 von hjp-usenet2

On 2008-01-17 18:44, pgodfrin wrote:
> On Jan 17, 11:14 am, xhos...@gmail.com wrote:
>> pgodfrin wrote:
>> > Gentle persons,
>>
>> > Xho was right from the start. Apparently all the samples from the
>> > camel book, documentation for fork(), wait() and waitpid() are at the
>> > least misleading - at the worse poorly written.

You may have been misled, but please tell us what aspect of the
documentation misled you.


>> > To begin with, if following the camel book's sample:
>>
>> > if ($pid=fork) { # parent here
>> > } elsif { #child here
>> > } # ...
>>
>> > One can indeed fork - but - the wait loop simply doesn't wait because
>> > it returns -1 upon the first iteration.

Compare this with your script:

> @fl= ();
> foreach (@fl)
> {
> if($pid=fork)
> {
> print "Copying $_ to $_.old\n";
> exec("cp $_ $_.old") ;
> } elsif (defined $pid) { exit; }
> } # end loop
> wait;
>
> print "\n<<<<< End of exercise >>>>>\n";
> exit;
>
> However, this works AND waits.

No, it doesn't. It doesn't work and it doesn't wait.

> But - this is very interesting - the
> last file in the sample list is 3 or 4 times larger than the first few
> files, which permits me to show that the wait does indeed wait for
> child processes, but since the first fork replaces the parent process,
> the wait command no longer has children to wait on and then continues
> to the next statement, while the last cp command is still running.

> @fl= ();
> foreach (@fl)
> {
> if($pid=fork)
> {

Here we are in the parent process ($pid != 0).

> print "Copying $_ to $_.old\n";
> exec("cp $_ $_.old") ;

You exec the cp command. This means that the parent process will now
execute "cp" instead of your script - after doing this it will exit.

> } elsif (defined $pid) {

Here we are in the child process.

> exit;

You exit immediately, so the child process does absolutely nothing. Why
fork at all if you don't want your child process to do anything?

> }

In any case, neither process will reach the end of the loop. So it will
copy exactly one file and

> } # end loop

since it never gets here it won't wait.
> wait;

You need to exec your program in the *child* process:


@fl= (<*.txt>);
foreach (@fl)
{
    if ($pid = fork)
    {
        # parent - do nothing
    } elsif (defined $pid) {
        # child
        print "Copying $_ to $_.old\n";
        exec("cp", $_, "$_.old");

        # we won't get here if exec worked:
        die "exec cp failed: $!";
    }
} # end loop

# now wait for all the children:
do {$x=wait; print "$x\n"} until $x==-1 ;
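Put together as a complete script, with strict/warnings and a check for a
failed fork added (a sketch only - the file list and messages are
illustrative), that becomes:

#!/usr/bin/perl
use strict;
use warnings;

my @fl = <*.txt>;                     # illustrative file list
for my $file (@fl) {
    my $pid = fork();
    if (!defined $pid) {
        die "fork failed: $!";        # could not create a child at all
    } elsif ($pid == 0) {
        # child: replace ourselves with cp
        print "Copying $file to $file.old\n";
        exec("cp", $file, "$file.old") or die "exec cp failed: $!";
    }
    # parent: just go on and fork the next child
}

# parent: reap every child before finishing
my $x;
do { $x = wait; print "$x\n" } until $x == -1;
print "\n<<<<< End of exercise >>>>>\n";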

Re: Wait for background processes to complete

am 18.01.2008 00:12:04 von hjp-usenet2

On 2008-01-17 00:52, comp.llang.perl.moderated wrote:
> On Jan 16, 2:29 pm, pgodfrin wrote:
>> On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
>> > ## On Linux, wait returns -1 when there are no living children to wait for.
>> > 1 until -1==wait();
>>
>> > > exit;
>>
>
>> Thanks Xho - I've removed the signal handler, but it seems wait always
>> returns -1 so - the loop is a nop? Where in the code should it go?
>>
>
> I'll pipe in here since the 'quick 'n dirty' solution
> was mangled and diss'ed.
>
> The safest action is an asynchronous wait with a

Before 5.8.0 that was actually unsafe. But while it is now safe in perl
5.8.x (and 5.10.x), it still has the tiny flaw of absolutely *not* doing
what the OP wants. Xho's solution is safe (and was so in all perl
versions) and does what the OP wants. Well, almost - wait can return -1
if it is interrupted so one should check $! in addition to the return
value.
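A sketch of what that check could look like (the Errno import and the
EINTR/ECHILD distinction are assumptions about the intent, not code from
the thread):

use Errno qw(EINTR);

while (1) {
    my $pid = wait();
    if ($pid == -1) {
        next if $! == EINTR;   # interrupted by a signal: keep waiting
        last;                  # no children left (ECHILD): we are done
    }
    print "reaped child $pid\n";
}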

hp

Re: Wait for background processes to complete

am 18.01.2008 02:12:07 von xhoster

pgodfrin wrote:
>
> Hi Xho,
> OK - I didn't include the wait loop in the snippet simply for brevity.

A little too much brevity often leads to many cycles of posting, which is
anything but brief!

> I tried using the if logic and it just doesn't wait or fork properly.
> Since I'm looping though filenames, this is inadequate. Unless proven
> otherwise - I'll say the Camel book code is unclear and incorrect for
> my purposes:

Peter has already described what went wrong here--you reversed the
parent and child roles in your code (which I snipped). If there is a
specific part of the perldoc that misled you into doing that, let's figure
out what that was and have it fixed. (The Camel book, I don't know how
to get that fixed so am slightly less interested in it.)

....

> However, this works AND waits. But - this is very interesting - the
> last file in the sample list is 3 or 4 times larger than the first few
> files, which permits me to show that the wait does indeed wait for
> child processes, but since the first fork replaces the parent process,
> the wait command no longer has children to wait on and then continues
> to the next statement, while the last cp command is still running.

I don't know what you meant by "the first fork replaces the parent process"
but whatever you meant I suspect it is heading the wrong way. What the
code below does is wait for one child, presumably whichever child
finishes first. Then it prints and exits, while N-1 other children are
still running. Some shells' "wait" can wait for all children. Perl's
"wait", like C's "wait", does not. It waits for one child.

>
> @fl= ();
> foreach (@fl)
> {
> print "Copying $_ to $_.old\n";
> fork or exec("cp $_ $_.old") ;
> } # end loop
> wait;
> print "\n<<<<< End of exercise >>>>>\n";
> exit;
>
> But - that's no good - which necessitates the loop you had suggested
> from the start.

The reason the loop works is that instead of just calling wait once, it
calls "wait" over and over until the OS says "Hey bozo, there is nothing
left to wait for". Well, it actually doesn't say that, it just returns -1,
but "Hey bozo..." is what it means.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Re: Wait for background processes to complete

am 18.01.2008 02:34:31 von Charles DeRykus

On Jan 17, 3:12 pm, "Peter J. Holzer" wrote:
> On 2008-01-17 00:52, comp.llang.perl.moderated wrote:
>
>
>
> > On Jan 16, 2:29 pm, pgodfrin wrote:
> >> On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
> >> > ## On Linux, wait returns -1 when there are no living children to wait for.
> >> > 1 until -1==wait();
>
> >> > > exit;
>
> >> Thanks Xho - I've removed the signal handler, but it seems wait always
> >> returns -1 so - the loop is a nop? Where in the code should it go?
>
> > I'll pipe in here since the 'quick 'n dirty' solution
> > was mangled and diss'ed.
>
> > The safest action is an asynchronous wait with a
>
> Before 5.8.0 that was actually unsafe. But while it is now safe in perl

Huh.. just for clarity, here's what I wrote:

use POSIX ":sys_wait_h";
$SIG{CHLD} = \&REAPER;
# now do something that forks...
...
sub REAPER { 1 while waitpid(-1, WNOHANG) > 0; }

In fact, historically, there's a clear recommendation to tight-loop
exactly as shown because you may lose signals occurring in
near concurrency if you don't. Or are you
suggesting that an explicit wait on each of
the child processes would somehow be safer...

> 5.8.x (and 5.10.x), it still has the tiny flaw of absolutely *not* doing
> what the OP wants. Xho's solution is safe (and was so in all perl
> versions and does what the OP wants. Well, almost - wait can return -1
> if it is interrupted so one should check $! in addition to the return
> value.
>
You're right, an asynchronous wait would need to save child pids and
loop until they're reaped. I believe such a solution was already shown.

--
Charles DeRykus

Re: Wait for background processes to complete

am 18.01.2008 03:02:16 von grocery_stocker

On Jan 16, 9:37 pm, Ben Morrow wrote:
> Quoth grocery_stocker :
>
>
>
> > On Jan 13, 10:09 pm, Ben Morrow wrote:
>
>
> > > $SIG{CHLD} = sub {
> > > my ($pid, @died);
> > > push @died, $pid while $pid = waitpid -1, WNOHANG;
> > > delete @kids{@died};
> > > };
>
> > > if ($pid) {
> > > $kids{$pid} = 1;
> > > return;
> > > }
>
> > > sub finish {
> > > waitpid $_, 0 for keys %kids;
> > > %kids = ();
> > > }
>
> > I really don't grasp the significance of having $kids{$pid} equal 1.
> > Can some enlighten me o this?
>
> It's one of the standard idioms for using a hash as a set. Every time we
> create a child, we add an entry to the hash; every time one dies on its
> own, we delete its entry. Then at the end we can use keys %pids to
> retrieve the list of pids we still need to wait for. The only thing that
> matters about %kids are its keys: we never use the values, so they can
> be set to anything. I prefer using 1 since then the values are all true;
> you can get slightly better memory use with
>
> $kids{$pid} = ();
>
> which inserts the key but doesn't create a value for it at all, but then
> you have to test with exists, which I find annoying. Since in this case
> I don't test for existance of keys at all, this doesn't matter: using 1 is
> just a habit.
>
> Ben


Okay, why would you have to test for exists if

$kids{$pid} = ();

just creates something like undef?

Re: Wait for background processes to complete

am 18.01.2008 16:11:14 von hjp-usenet2

On 2008-01-18 01:34, comp.llang.perl.moderated wrote:
> On Jan 17, 3:12 pm, "Peter J. Holzer" wrote:
>> On 2008-01-17 00:52, comp.llang.perl.moderated wrote:
>>
>> > On Jan 16, 2:29 pm, pgodfrin wrote:
>> >> On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
>> >> > ## On Linux, wait returns -1 when there are no living children to wait for.
>> >> > 1 until -1==wait();
>>
>> >> > > exit;
>>
>> >> Thanks Xho - I've removed the signal handler, but it seems wait always
>> >> returns -1 so - the loop is a nop? Where in the code should it go?
>>
>> > I'll pipe in here since the 'quick 'n dirty' solution
>> > was mangled and diss'ed.
>>
>> > The safest action is an asynchronous wait with a
>>
>> Before 5.8.0 that was actually unsafe. But while it is now safe in perl
>
> Huh.. just for clarity, here's what I wrote:
>
> use POSIX ":sys_wait_h";
> $SIG{CHLD} = \&REAPER;
> # now do something that forks...
> ...
> sub REAPER { 1 while waitpid(-1, WNOHANG)) > 0; }
>
> In fact, historically, there's a clear recommendation to tight-loop
> exactly as shown because you may lose signals occurring in
> near concurrency if you don't.

The problem with this approach is that until 5.8.0 (when "safe signals"
were introduced), the REAPER function would be called as soon as the
signal arrived regardless of what the perl interpreter was doing at the
time - there was a very real risk that this would crash the perl
interpreter (in a real world application which forked about 50,000 to
100,000 times a day, the parent process would crash every few days with
perl 5.6.0). In C the POSIX standard explicitly states which functions
are safe inside a signal handler (and the rest has to be considered
unsafe), but for perl no such list existed (so everything has to be
considered unsafe). In perl 5.8.0, the "real" signal handler only notes
that a signal arrived, and perl will then invoke sub REAPER when it can
safely do this (which may be quite some time later).


> Or are you suggesting that an explicit wait on each of the child
> processes would somehow be safer...

Avoiding signal handlers was indeed "somehow safer" before 5.8.0,
because signal handlers were fundamentally unsafe (and arguably broken)
in perl.

But that was only a side note. My real argument was that this code has a
completely different purpose: It "reaps" children as soon as they exit
to avoid zombies. This is useful for long-running server processes which
don't really want to wait for their children - they just want to fire
them off and then forget about them. But the OP explicitly wants to
wait for his children so that is what he should do. There is no need for
any signal handlers - they just add complexity which isn't needed and
obscure the purpose of the code.


>> 5.8.x (and 5.10.x), it still has the tiny flaw of absolutely *not* doing
>> what the OP wants. Xho's solution is safe (and was so in all perl
>> versions and does what the OP wants. Well, almost - wait can return -1
>> if it is interrupted so one should check $! in addition to the return
>> value.
>>
> You're right, a asynchronous wait would need to
> save child pids and loop until they're reaped.

Yes. But why would you want to do that if it can be done a lot simpler
and more straightforward?

hp

Re: Wait for background processes to complete

am 18.01.2008 16:20:57 von hjp-usenet2

On 2008-01-18 02:02, grocery_stocker wrote:
> On Jan 16, 9:37 pm, Ben Morrow wrote:
>> The only thing that matters about %kids are its keys: we never use
>> the values, so they can be set to anything. I prefer using 1 since
>> then the values are all true; you can get slightly better memory use
>> with
>>
>> $kids{$pid} = ();
>>
>> which inserts the key but doesn't create a value for it at all, but
>> then you have to test with exists, which I find annoying. Since in
>> this case I don't test for existance of keys at all, this doesn't
>> matter: using 1 is just a habit.
>
> Okay, why would you have to test for exists if
>
> $kids{$pid} = ();
>
> just creates something like undef.

Consider:

#!/usr/bin/perl
use warnings;
use strict;

my %kids;

$kids{3} = ();
$kids{5} = ();

for (1 .. 9) {
print "$_\n" if is_kid($_);
}

sub is_kid {
return exists($kids{$_[0]});
}
__END__
3
5

Please find an implementation for sub is_kid which doesn't use exists.

hp

Re: Wait for background processes to complete

am 18.01.2008 18:30:22 von Charles DeRykus

On Jan 18, 7:11 am, "Peter J. Holzer" wrote:
> On 2008-01-18 01:34, comp.llang.perl.moderated wrote:
>
>
>
> > On Jan 17, 3:12 pm, "Peter J. Holzer" wrote:
> >> On 2008-01-17 00:52, comp.llang.perl.moderated wrote:
>
> >> > On Jan 16, 2:29 pm, pgodfrin wrote:
> >> >> On Jan 16, 4:17 pm, xhos...@gmail.com wrote:
> >> >> > ## On Linux, wait returns -1 when there are no living children to wait for.
> >> >> > 1 until -1==wait();
>
> >> >> > > exit;
>
> >> >> Thanks Xho - I've removed the signal handler, but it seems wait always
> >> >> returns -1 so - the loop is a nop? Where in the code should it go?
>
> >> > I'll pipe in here since the 'quick 'n dirty' solution
> >> > was mangled and diss'ed.
>
> >> > The safest action is an asynchronous wait with a
>
> >> Before 5.8.0 that was actually unsafe. But while it is now safe in perl
>
> > Huh.. just for clarity, here's what I wrote:
>
> > use POSIX ":sys_wait_h";
> > $SIG{CHLD} = \&REAPER;
> > # now do something that forks...
> > ...
> > sub REAPER { 1 while waitpid(-1, WNOHANG)) > 0; }
>
> > In fact, historically, there's a clear recommendation to tight-loop
> > exactly as shown because you may lose signals occurring in
> > near concurrency if you don't.
>
> The problem with this approach is that until 5.8.0 (when "safe signals"
> were introduced), the REAPER function would be called as soon as the
> signal arrived regardless of what the perl interpreter was doing at the
> time - there was a very real risk that this would crash the perl
> interpreter (in a real world application which forked about 50,000 to
> 100,000 times a day, the parent process would crash every few days with
> perl 5.6.0). In C the POSIX standard explicitely states which functions
> are safe inside a signal handler (and the rest has to be considered
> unsafe), but for perl no such list existed (so everything has to be
> considered unsafe). In perl 5.8.0, the "real" signal handler only notes
> that a signal arrived, and perl will than invoke sub REAPER when it can
> safely do this (which may be quite some time later).
>
> > Or are you suggesting that an explicit wait on each of the child
> > processes would somehow be safer...
>
> Avoiding signal handles was indeed "somehow safer" before 5.8.0,
> because signal handlers were fundamentally unsafe (and arguably broken)
> in perl.

Hm, I thought I recalled that, even with Perl's broken pre-5.8 signal
handling, the issue was most likely to surface with ops not on POSIX's
safe list. Something like 'waitpid', which is on POSIX's safe list, in a
simple loop was unlikely to cause a problem.


>
> But that was only a side note. My real argument was that this code has a
> completely different purpose: It "reaps" children as soon as they exit
> to avoid zombies. This is useful for long-running server processes which
> don't really want to wait for their children - they just want to fire
> them off and then forget about them. But the OP explicitely wants to
> wait for his children so that is what he should do. There is no need for
> any signal handlers - they just add complexity which isn't needed and
> obscure the purpose of the code.
>
> >> 5.8.x (and 5.10.x), it still has the tiny flaw of absolutely *not* doing
> >> what the OP wants. Xho's solution is safe (and was so in all perl
> >> versions and does what the OP wants. Well, almost - wait can return -1
> >> if it is interrupted so one should check $! in addition to the return
> >> value.
>
> > You're right, a asynchronous wait would need to
> > save child pids and loop until they're reaped.
>
> Yes. But why would you want to do that if it can be done a lot simpler
> and more straightforward?
>

Yes I agree that's probably true here but, as you mention, you'd have
to check for -1 because the process was reaped, stopped, or terminated
by some signal for instance. And, if you're concerned about zombies,
you really need to keep a list of pids to wait on. To me the solutions
are very close modulo the signal function setup. Even outside the daemon
setting, I like the signal solution, which handles the reaping as soon as
it occurs and bundles child cleanup neatly in a separate sub.

--
Charles DeRykus

Re: Wait for background processes to complete

am 18.01.2008 18:47:37 von pgodfrin

Gentle Persons,
This was a lot of fun. I would like to respond to the various
observations, especially the ones about the Perl Documentation being
misleading.

To begin with, I apologize for my last post (the long one) - upon re-
reading it my language was not clear. I should have said "the
following code works" as opposed to simply "this works AND waits". I
was referring to the code below that sentence. Sorry.

To restate the original task I wanted to solve:

To be able to execute commands in the background and wait for their
completion.

The documentation I am referring to is http://perldoc.perl.org/.

If you search on the concept of "background processes" this
documentation points you to the following in the Language reference >
perlipc > Background Process :

You can run a command in the background with:

system("cmd &");

The command's STDOUT and STDERR (and possibly STDIN, depending on your
shell) will be the same as the parent's. You won't need
to catch SIGCHLD because of the double-fork taking place (see below
for more details).

There is no further reference to a "double fork" (except in the
perlfaq8 and only in the context of zombies, which it says are not an
issue using system("cmd &") ). This is confusing.

The documentation for wait(), waitpid() and fork() do not explain that
the executing code should be placed in the "child" section of the if
construct. Some of the examples in perlipc show code occurring in both
the parent and the child section - so it is still not clear. If it
were, why was I insisting on trying to execute the "cp" command in the
parent section? So, thank you Peter for clearing that up. To be fair,
the only place where the "if" construct is indeed placed on paper (or
virtual paper) is the camel book - but it is also not clear where the
code should be placed. But clearly, that is clear now :).

Furthermore, while the perlipc section is quite verbose, nowhere is
the code snippet do {$x=wait; print "$x\n"} until $x==-1 or any
variation of that wait() call mentioned. There are references to a
while loop and the waitpid() function, but being in the context of a
signal handler and 'reaping' - it is not clear.

So, once again many thanks for the help. I would like to know, Peter,
are you in a position to amend the documentation? Also, the perlfaq8
does come closer to explaining, but it is simply not clear how the
fork process will emulate the Unix background operator (&). So how can
we make this better? How's about this:

In the perlipc, under background tasks. make the statement - "the call
system("cmd &") will run the cmd in the background, but will spawn a
shell to execute the command, dispatch the command, terminate the
shell and return to the calling program. The calling program will lose
the ability to wait for the process as is customary with the shell
script command 'wait'. In order to both execute one or more programs
in the background and have the calling program wait for the executions
to complete it will be necessary to use the fork() function."

Then in the fork section. Show a simple example like the ones we have
been working with AND show a simple approach to using the wait
function. Furthermore - add sample code to the wait() and fork()
functions that are simple and realistic, unlike the code sample in the
waitpid() function.

In closing, it is perhaps non-intuitive to me that a fork process
should have the child section actually executing the code, but I ask
you how one can intuit that from the sample in the camel book and the
samples in the http://perldoc.perl.org/. To really drive the point
home, Xho's code:

fork or exec("cp $_ $_.old") ;
do {$x=wait;} until $x==-1 ;

Is STILL not intuitive that the child is executing the code!

All in all, with the collective help of y'all I've been able to
successfully accomplish my goal, but man it was an uphill battle that
could have been very easily solved with better documentation. But -
what I can't figure out is why I had to embark on this journey in a
new posting - has no-one needed to do this before? Oh well...

so long and thanks for all the fish,
pg

Re: Wait for background processes to complete

am 18.01.2008 22:17:15 von hjp-usenet2

On 2008-01-18 17:47, pgodfrin wrote:
> This was a lot of fun. I would like to respond to the various
> observations, especially the ones about the Perl Documentation being
> misleading.
[...]
> To restate the original task I wanted to solve:
>
> To be able to execute commands in the background and wait for their
> completion.
>
> The documentation I am referring to is http://perldoc.perl.org/.
>
> If you search on the concept of "background processes" this
> documentation points you to the following in the Language reference >
> perlipc > Background Process :
>
> You can run a command in the background with:
>
> system("cmd &");
>
> The command's STDOUT and STDERR (and possibly STDIN, depending on your
> shell) will be the same as the parent's. You won't need
> to catch SIGCHLD because of the double-fork taking place (see below
> for more details).
>
> There is no further reference to a "double fork" (except in the
> perlfaq8 and only in the context of zombies, which it says are not an
> issue using system("cmd &") ). This is confusing.

I find it more confusing that this seems to be in the section with the
title "Using open() for IPC". I fail to see what one has to do with the
other.

There is a general problem with perl documentation: Perl evolved in the
Unix environment, and a lot of the documentation was written at a time
when the "newbie perl programmer" could be reasonably expected to have
already some programming experience on Unix (in C, most likely) and know
basic Unix concepts like processes, fork(), filehandles, etc. So in a
lot of places the documentation doesn't answer the question "how can I
do X on Unix?" but the question "I already know how to do X in C, now
tell me how I can do it in perl!". When you learn Perl without knowing
Unix first, this can be confusing, because the Perl documentation
generally explains only Perl, but not Unix.

I am not sure if that should be fixed at all: It's the perl
documentation and not the unix documentation after all, and perl isn't
unix specific, but has been ported to just about every OS.


> The documentation for wait(), waitpid() and fork() do not explain that
> the executing code should be placed in the "child" section of the if
> construct.

Of course not, that would be wrong. There can be code in both (if you
wanted one process to do nothing, why fork?). What the parent should do
and what the child should do depend on what you want them to do. It just
happened that for your particular problem the parent had nothing to do
between forking off children.

> Some of the examples in perlipc show code occurring in both
> the parent and the child section - so it is still not clear. If it
> were, why was I insisting on trying to execute the "cp" command in the
> parent section?

I don't know. What did you expect that would do?

Your problem was:

I want a process to gather a list of files. Then, for each file, it
should start another process which copies the file. These processes
should run in parallel. Finally, it should wait for all these processes
to terminate, and then terminate itself.

Even this description makes it rather implicit, that the original
process creates children and then the children do the copying. If you
add the restriction that a process can only wait for its children, any
other solution becomes extremely awkward.

Besides it is exactly the same what your shell script did: The shell
(the parent process) created child processes in a loop, each child
process executed one cp command, and finally the parent waited for all
the children.


> Furthermore, while the perlipc section is quite verbose, nowhere is
> the code snippet do {$x=wait; print "$x\n"} until $x==-1 or any
> variation of that wait() call mentioned.

Again, this is a solution to your specific problem. You said you wanted
to wait for all children, so Xho wrote a loop which would do that. This
is a rather rare requirement - I think I needed it maybe a handful of
times in 20+ years of programming outside of a child reaper function.


> There are references to a
> while loop and the waitpid() function, but being in the context of a
> signal handler and 'reaping' - it is not clear.

This is the one situation where this construct is frequently used. It
can happen that several children die before the parent can react to the
signal - in this case the reaper function will be called only once, but
it must wait for several children. This isn't obvious, so the docs
mention it.

However, in your case you *know* you started several children and you
*want* to wait for all of them, so it's obvious that you need a loop. This
hasn't anything to do with perl, it's just a requirement of your
problem.


> So, once again many thanks for the help. I would like to know, Peter,
> are you in a position to amend the documentation? Also, the perlfaq8
> does come closer to explaining, but it is simply not clear how the
> fork process will emulate the Unix background operator (&). So how can
> we make this better? How's about this:

fork doesn't "emulate &". fork is one of the functions used by the shell
to execute commands. Basically, the shell does this in a loop:

display the prompt
read the next command
fork
in the child:
execute the command, then exit
in the parent:
if the command did NOT end with &:
wait for the child to terminate

So the Shell always forks before executing commands and it always
executes them in a child process. But normally it waits for the child
process to terminate before displaying the next prompt. If you add the
"&" at the end of the line, it doesn't wait but displays the next prompt
immediately.

(ok, that's a bit too simplistic, but I don't want to go into the arcana
of shell internals here - that would be quite off-topic).
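The same idea as a (deliberately naive) Perl sketch - the prompt and the
command parsing are only illustrative:

use strict;
use warnings;

$| = 1;                                        # flush the prompt right away

while (1) {
    print '$ ';
    defined(my $cmd = <STDIN>) or last;        # EOF ends the loop
    chomp $cmd;
    next unless length $cmd;
    my $background = ($cmd =~ s/\s*&\s*$//);   # strip a trailing '&'
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec $cmd or die "exec failed: $!";    # child runs the command
    }
    # foreground command: wait for the child; background command: don't
    waitpid($pid, 0) unless $background;
    # (background children are never reaped here - a real shell would
    # catch SIGCHLD or poll with waitpid(-1, WNOHANG))
}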



> In the perlipc, under background tasks. make the statement - "the call
> system("cmd &") will run the cmd in the background, but will spawn a
> shell to execute the command, dispatch the command, terminate the
> shell and return to the calling program. The calling program will lose
> the ability to wait for the process as is customary with the shell
> script command 'wait'. In order to both execute one or more programs
> in the background and have the calling program wait for the executions
> to complete it will be necessary to use the fork() function."

I think that would confuse just about anybody who doesn't have exactly
your problem for exactly the same reason you were confused by the
"double fork". It's very specific and should be clear to anyone who
knows what system does and what a unix shell does when you use "&" - I
suspect that sentence about the double fork was added after a discussion
like this one. But I agree that fork should definitely be mentioned here.
system("cmd &") almost never what you would want to do.


> Then in the fork section. Show a simple example like the ones we have
> been working with AND show a simple approach to using the wait
> function.
> Furthermore - add sample code to the wait() and fork()
> functions that are simple and realistic,

fork and wait are very simple building blocks which can be combined in a
lot of different ways. Any sample code here can cover only basic
constructs like

my $pid = fork();
if (!defined $pid) {
    die "fork failed: $!";
} elsif ($pid == 0) {
    # child does something here
    # ...
    exit;
} else {
    # parent does something here
    # ...
    # and then waits for child:
    waitpid($pid, 0);
}

More complex examples should go into perlipc (which needs serious
cleanup, IMHO).

> unlike the code same in the waitpid() function.

That code is simple and realistic.


> In closing, it is perhaps non-intuitive to me that a fork process
> should have the child section actually executing the code,

If you don't want the child process to do anything, why would you create
it in the first place?

> but I ask you how one can intuit that from the sample in the camel
> book and the samples in the http://perldoc.perl.org/. To really drive
> the point home, Xho's code:
>
> fork or exec("cp $_ $_.old") ;
> do {$x=wait;} until $x==-1 ;
>
> Is STILL not intuitive that the child is executing the code!

True. This code isn't meant to be intuitive. It's meant to be short.
I wouldn't write that in production code, much less in an answer to a
newbie question.
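Spelled out, the same two lines read like this (the comments are
annotation added here, not part of the original one-liner):

# In the parent, fork() returns the child's pid (a true value), so the
# 'or' short-circuits and the parent never reaches exec; it just loops on.
# In the child, fork() returns 0 (false), so the child runs exec and is
# replaced by the cp command. (If fork fails it returns undef, which is
# also false - one reason the explicit if/elsif form is safer.)
fork or exec("cp $_ $_.old");

# Only the parent survives to this point; it keeps reaping children
# until wait() reports that none are left (-1).
do { $x = wait; } until $x == -1;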

hp

Re: Wait for background processes to complete

am 18.01.2008 22:33:46 von hjp-usenet2

On 2008-01-18 17:30, comp.llang.perl.moderated wrote:
> On Jan 18, 7:11 am, "Peter J. Holzer" wrote:
>> On 2008-01-18 01:34, comp.llang.perl.moderated wrote:
>>
>>
>>
>> > On Jan 17, 3:12 pm, "Peter J. Holzer" wrote:
>> >> On 2008-01-17 00:52, comp.llang.perl.moderated wrote:
>>
>> >> > The safest action is an asynchronous wait with a
>>
>> >> Before 5.8.0 that was actually unsafe. But while it is now safe in
>> >> perl
>>
>> > Huh.. just for clarity, here's what I wrote:
>>
>> > use POSIX ":sys_wait_h";
>> > $SIG{CHLD} = \&REAPER;
>> > # now do something that forks...
>> > ...
>> > sub REAPER { 1 while waitpid(-1, WNOHANG)) > 0; }
>>
>> > In fact, historically, there's a clear recommendation to tight-loop
>> > exactly as shown because you may lose signals occurring in
>> > near concurrency if you don't.
>>
>> The problem with this approach is that until 5.8.0 (when "safe signals"
>> were introduced), the REAPER function would be called as soon as the
>> signal arrived regardless of what the perl interpreter was doing at the
>> time - there was a very real risk that this would crash the perl
>> interpreter (in a real world application which forked about 50,000 to
>> 100,000 times a day, the parent process would crash every few days with
>> perl 5.6.0). In C the POSIX standard explicitely states which functions
>> are safe inside a signal handler (and the rest has to be considered
>> unsafe), but for perl no such list existed (so everything has to be
>> considered unsafe). In perl 5.8.0, the "real" signal handler only notes
>> that a signal arrived, and perl will than invoke sub REAPER when it can
>> safely do this (which may be quite some time later).
>>
>> > Or are you suggesting that an explicit wait on each of the child
>> > processes would somehow be safer...
>>
>> Avoiding signal handles was indeed "somehow safer" before 5.8.0,
>> because signal handlers were fundamentally unsafe (and arguably broken)
>> in perl.
>
> Hm, I thought I recalled that, even with
> Perl's broken pre-5.8 signal handling, the
> issue was most likely to surface with op's
> not on POSIX's safe list.

One function not on POSIX's safe list is malloc. And since just about
anything in perl is dynamically allocated ...

> Something like 'waitpid', which is on POSIX's safe list, in a simple
> loop was unlikely to cause a problem.

waitpid probably wasn't a problem. The loop might have been. Simply
calling the reaper function (which needs to allocate a call frame) might
have been. In the program I mentioned, a reaper function deleted an
entry from a hash - this definitely was a problem.


>> > You're right, a asynchronous wait would need to
>> > save child pids and loop until they're reaped.
>>
>> Yes. But why would you want to do that if it can be done a lot simpler
>> and more straightforward?
>>
>
> Yes I agree that's probably true here but, as you mention, you'd have
> to check for -1 because the process was reaped, stopped, or terminated
> by some signal for instance. And, if you're concerned about zombies,
> you really need to keep a list of pids to wait on.

Zombies are not a concern in this case. When you post a followup to a
question of an obvious perl beginner, please try to provide a solution
to his problem. A solution to a completely different problem will just
confuse him, especially, if you don't say that it is a solution for a
completely different problem.


> To me the solutions are very close modulo the signal function setup.

The signal function setup is the difference, yes. For the OP's problem a
signal handler is not only not needed, but does nothing at all to solve
the problem.

hp

Re: Wait for background processes to complete

am 18.01.2008 23:19:02 von Charles DeRykus

On Jan 18, 1:33 pm, "Peter J. Holzer" wrote:
> On 2008-01-18 17:30, comp.llang.perl.moderated wrote:
>
>
>
> > On Jan 18, 7:11 am, "Peter J. Holzer" wrote:
> >> On 2008-01-18 01:34, comp.llang.perl.moderated wrote:
>
> >> > On Jan 17, 3:12 pm, "Peter J. Holzer" wrote:
> >> >> On 2008-01-17 00:52, comp.llang.perl.moderated wrote:
>
> >> >> > The safest action is an asynchronous wait with a
>
> >> >> Before 5.8.0 that was actually unsafe. But while it is now safe in
> >> >> perl
>
> >> > Huh.. just for clarity, here's what I wrote:
>
> >> > use POSIX ":sys_wait_h";
> >> > $SIG{CHLD} = \&REAPER;
> >> > # now do something that forks...
> >> > ...
> >> > sub REAPER { 1 while waitpid(-1, WNOHANG)) > 0; }
>
> >> > In fact, historically, there's a clear recommendation to tight-loop
> >> > exactly as shown because you may lose signals occurring in
> >> > near concurrency if you don't.
>
> >> The problem with this approach is that until 5.8.0 (when "safe signals"
> >> were introduced), the REAPER function would be called as soon as the
> >> signal arrived regardless of what the perl interpreter was doing at the
> >> time - there was a very real risk that this would crash the perl
> >> interpreter (in a real world application which forked about 50,000 to
> >> 100,000 times a day, the parent process would crash every few days with
> >> perl 5.6.0). In C the POSIX standard explicitely states which functions
> >> are safe inside a signal handler (and the rest has to be considered
> >> unsafe), but for perl no such list existed (so everything has to be
> >> considered unsafe). In perl 5.8.0, the "real" signal handler only notes
> >> that a signal arrived, and perl will than invoke sub REAPER when it can
> >> safely do this (which may be quite some time later).
>
> >> > Or are you suggesting that an explicit wait on each of the child
> >> > processes would somehow be safer...
>
> >> Avoiding signal handles was indeed "somehow safer" before 5.8.0,
> >> because signal handlers were fundamentally unsafe (and arguably broken)
> >> in perl.
>
> > Hm, I thought I recalled that, even with
> > Perl's broken pre-5.8 signal handling, the
> > issue was most likely to surface with op's
> > not on POSIX's safe list.
>
> One function not on POSIX's safe list is malloc. And since just about
> anything in perl is dynamically allocated ...
>
> > Something like 'waitpid', which is on POSIX's safe list, in a simple
> > loop was unlikely to cause a problem.
>
> waitpid probably wasn't a problem. The loop might have been. Simply
> calling the reaper function (which needs to allocate a call frame) might
> have been. In the program I mentioned, a reaper function deleted an
> entry from a hash - this definitely was a problem.
>
> >> > You're right, a asynchronous wait would need to
> >> > save child pids and loop until they're reaped.
>
> >> Yes. But why would you want to do that if it can be done a lot simpler
> >> and more straightforward?
>
> > Yes I agree that's probably true here but, as you mention, you'd have
> > to check for -1 because the process was reaped, stopped, or terminated
> > by some signal for instance. And, if you're concerned about zombies,
> > you really need to keep a list of pids to wait on.
>
> Zombies are not a concern in this case. When you post a followup to a
> question of an obvious perl beginner, please try to provide a solution
> to his problem. A solution to a completely different problem will just
> confuse him, especially, if you don't say that it is a solution for a
> completely different problem.

I believe the OP was given a couple of solutions.
Sometimes discussions do segue off in different
directions but the info can be useful at times.
It was to me. And I disagree -- for at least
one of the solutions -- zombies could occur
and need to be dealt with.

>
> > To me the solutions are very close modulo the signal function setup.
>
> The signal function setup is the difference, yes. For the OP's problem a
> signal handler is not only not needed, but does nothing at all to solve
> the problem.
>

Again I disagree. A viable alternative solution could make use of a
SIGCHLD handler. But I think
the thread has now outlived its usefulness.

--
Charles DeRykus

Re: Wait for background processes to complete

am 19.01.2008 12:12:09 von hjp-usenet2

On 2008-01-18 22:19, comp.llang.perl.moderated wrote:
> On Jan 18, 1:33 pm, "Peter J. Holzer" wrote:
>> Zombies are not a concern in this case. When you post a followup to a
>> question of an obvious perl beginner, please try to provide a solution
>> to his problem. A solution to a completely different problem will just
>> confuse him, especially, if you don't say that it is a solution for a
>> completely different problem.
>
> I believe the OP was given a couple of solutions.
> Sometimes discussions do segue off in different
> directions but the info can be useful at times.

True, but there should be some continuity from one posting to the next.
The code in the posting you replied to had a serious problem, but it
didn't produce zombies (Well, one - for a short time). Butting in with a
zombie-avoidance method was a complete non-sequitur. It would have been
ok if you had followed up to an article with code which did actually
produce zombies, or if you had started your article with some indication
that you are not proposing a solution to the problem at hand (something
like "This probably isn't the problem here, but in general ..." works
well).

> It was to me. And I disagree -- for at least
> one of the solutions -- zombies could occur
> and need to be dealt with.

I don't recall seeing any solution which did produce zombies and was
otherwise correct (i.e. met the OP's specification). I can think of
at least two ways to achieve this (both of them wait for all children
but not necessarily in the order they terminate), but I don't agree that
in these cases "zombies need to be dealt with". In the worst case all
zombies will be collected immediately after the last child process
terminated. If the system can deal with $n processes all being active at
the same time, it can certainly deal with $n zombies.


>> > To me the solutions are very close modulo the signal function setup.
>>
>> The signal function setup is the difference, yes. For the OP's problem a
>> signal handler is not only not needed, but does nothing at all to solve
>> the problem.
>>
>
> Again I disagree. A viable alternative solution could make use of a
> SIGCHLD handler.

It could. It's just completely useless.

Assume that the loop which forks off the children registers them in
%kids.

Then the SIGCHLD handler could do something like this:

sub REAPER {
    for (;;) {
        $kid = waitpid(-1, WNOHANG);
        last if $kid <= 0;
        delete $kids{$kid};
    }
}

So, at the end of the program we just need to loop until %kids is empty:

while (keys %kids) {
}

But that's busy-waiting - it will consume 100 % CPU time. We could sleep
inside the loop, but how long? Until the next child terminates. Well, we
already have a function which does sleep until a child terminates - it's
called wait. So the loop turns into:

while (keys %kids) {
    my $kid = wait();
    delete $kids{$kid};
}

So now we have a loop at the end which waits for all children, and the
REAPER function has no useful function anymore. So we can delete it.
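In full, the fork loop plus that final loop might look like this (a
sketch; the file list is illustrative):

use strict;
use warnings;

my @files = <*.txt>;                  # illustrative file list
my %kids;

for my $file (@files) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec 'cp', $file, "$file.old" or die "exec cp failed: $!";
    }
    $kids{$pid} = 1;                  # parent: remember this child
}

while (keys %kids) {
    my $kid = wait();
    last if $kid == -1;               # defensive: nothing left to reap
    delete $kids{$kid};               # one child fewer outstanding
}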

hp

Re: Wait for background processes to complete

am 20.01.2008 00:24:59 von Charles DeRykus

On Jan 19, 3:12 am, "Peter J. Holzer" wrote:
> On 2008-01-18 22:19, comp.llang.perl.moderated wrote:
>
> ...
> > Again I disagree. A viable alternative solution could make use of a
> > SIGCHLD handler.
>
> It could. It's just completely useless.
>
> Assume that the loop which forks off the children registers them in
> %kids.
>
> Then the SIGCHLD handler could do something like this:
>
> sub REAPER {
> for(;;) {
> $kid = waitpid(-1, WNOHANG);
> last if $kid <= 0;
> delete $kids{$kid};
> }
>
> }
>
> So, at the end of the program we just need to loop until %kids is empty:
>
> while (keys %kids) {
>
> }
>
> But that's busy-waiting - it will consume 100 % CPU time. We could sleep
> inside the loop, but how long? Until the next child terminates. Well, we
> already have a function which does sleep until a child terminates - it's
> called wait. So the loop turns into:
>
> while (keys %kids) {
> my $kid = wait();
> delete $kids{$kid};
>
> }
>
> So now we have a loop at the end which waits for all children, and the
> REAPER function has no useful function anymore. So we can delete it.
>

That's true. The SIGCHLD handler has no apparent advantage here unless
you have some critical reason to want immediate signal delivery. I'm not
sure what likely scenarios in a non-daemon setting might benefit -- maybe
someone decides to bump a reaped-process count in the handler in order to
sleep during the fork loop if some threshold is exceeded... Dunno.

At any rate though, it's always seemed cleaner to me to reap child
processes as soon as possible... particularly at the bargain-basement
price of a POSIX declaration and a 1- or 2-line handler. Not a big deal
either way but I think it's worth mentioning as an alternative.

--
Charles DeRykus

Re: Wait for background processes to complete

am 20.01.2008 00:37:02 von pgodfrin

On Jan 18, 3:17 pm, "Peter J. Holzer" wrote:
> On 2008-01-18 17:47, pgodfrin wrote:
>
>
>
> > This was a lot of fun. I would like to respond to the various
> > observations, especially the ones about the Perl Documentation being
> > misleading.
> [...]
> > To restate the original task I wanted to solve:
>
> > To be able to execute commands in the background and wait for their
> > completion.
>
> > The documentation I am referring to ishttp://perldoc.perl.org/.
>
> > If you search on the concept of "background processes" this
> > documentation points you to the following in the Language reference >
> > perlipc > Background Process :
> >
> > You can run a command in the background with:
>
> > system("cmd &");
>
> > The command's STDOUT and STDERR (and possibly STDIN, depending on your
> > shell) will be the same as the parent's. You won't need
> > to catch SIGCHLD because of the double-fork taking place (see below
> > for more details).
> >
> > There is no further reference to a "double fork" (except in the
> > perlfaq8 and only in the context of zombies, which it says are not an
> > issue using system("cmd &") ). This is confusing.
>
> I find it more confusing that this seems to be in the section with the
> title "Using open() for IPC". I fail to see what one has to do with the
> other.
>
> There is a general problem with perl documentation: Perl evolved in the
> Unix environment, and a lot of the documentation was written at a time
> when the "newbie perl programmer" could be reasonably expected to have
> already some programming experience on Unix (in C, most likely) and know
> basic Unix concepts like processes, fork(), filehandles, etc. So in a
> lot of places the documentation doesn't answer the question "how can I
> do X on Unix?" but the question "I already know how to do X in C, now
> tell me how I can do it in perl!". When you lern Perl without knowing
> Unix first, this can be confusing, because the Perl documentation
> generelly explains only Perl, but not Unix.
>
> I am not sure if that should be fixed at all: It's the perl
> documentation and not the unix documentation after all, and perl isn't
> unix specific, but has been ported to just about every OS.
>
> > The documentation for wait(), waitpid() and fork() do not explain that
> > the executing code should be placed in the "child" section of the if
> > construct.
>
> Of course not, that would be wrong. There can be code in both (if you
> wanted one process to do nothing, why fork?). What the parent should do
> and what the child should do depend on what you want them to do. It just
> happened that for your particular problem the parent had nothing to do
> between forking off children.
>
> > Some of the examples in perlipc show code occurring in both
> > the parent and the child section - so it is still not clear. If it
> > were, why was I insisting on trying to execute the "cp" command in the
> > parent section?
>
> I don't know. What did you expect that would do?
>
> Your problem was:
>
> I want a process to gather a list of files. Then, for each file, it
> should start another process which copies the file. These processes
> should run in parallel. Finally, it should wait for all these processes
> to terminate, and then terminate itself.
>
> Even this description makes it rather implicit, that the original
> process creates children and then the children do the copying. If you
> add the restriction that a process can only wait for its children, any
> other solution becomes extremely awkward.
>
> Besides it is exactly the same what your shell script did: The shell
> (the parent process) created child processes in a loop, each child
> process executed one cp command, and finally the parent waited for all
> the children.
>
> > Furthermore, while the perlipc section is quite verbose, nowhere is
> > the code snippet do {$x=wait; print "$x\n"} until $x==-1 or any
> > variation of that wait() call mentioned.
>
> Again, this is a solution to your specific problem. You said you wanted
> to wait for all children, so Xho wrote a loop which would do that. This
> is a rather rare requirement - I think I needed it maybe a handful of
> times in 20+ years of programming outside of a child reaper function.
>
> > There are references to a
> > while loop and the waitpid() function, but being in the context of a
> > signal handler and 'reaping' - it is not clear.
>
> This is the one situation where this construct is frequently used. It
> can happen that several children die before the parent can react to the
> signal - in this case the reaper function will be called only once, but
> it must wait for several children. This isn't obvious, so the docs
> mention it.
>
> However, in your case you *know* you started several children and you
> *want* to wait for all of them, so it's obvious that you need a loop. This
> hasn't anything to do with perl, it's just a requirement of your
> problem.
>
> > So, once again many thanks for the help. I would like to know, Peter,
> > are you in a position to amend the documentation? Also, the perlfaq8
> > does come closer to explaining, but it is simply not clear how the
> > fork process will emulate the Unix background operator (&). So how can
> > we make this better? How's about this:
>
> fork doesn't "emulate &". fork is one of the functions used by the shell
> to execute commands. Basically, the shell does this in a loop:
>
> display the prompt
> read the next command
> fork
> in the child:
> execute the command, then exit
> in the parent:
> if the command did NOT end with &:
> wait for the child to terminate
>
> So the Shell always forks before executing commands and it always
> executes them in a child process. But normally it waits for the child
> process to terminate before displaying the next prompt. If you add the
> "&" at the end of the line, it doesn't wait but displays the next prompt
> immediately.
>
> (ok, that's a bit too simplistic, but I don't want to go into the arcana
> of shell internals here - that would be quite off-topic).
>
> > In the perlipc, under background tasks. make the statement - "the call
> > system("cmd &") will run the cmd in the background, but will spawn a
> > shell to execute the command, dispatch the command, terminate the
> > shell and return to the calling program. The calling program will lose
> > the ability to wait for the process as is customary with the shell
> > script command 'wait'. In order to both execute one or more programs
> > in the background and have the calling program wait for the executions
> > to complete it will be necessary to use the fork() function."
>
> I think that would confuse just about anybody who doesn't have exactly
> your problem for exactly the same reason you were confused by the
> "double fork". It's very specific and should be clear to anyone who
> knows what system does and what a unix shell does when you use "&" - I
> suspect that sentence about the double fork was added after a discussion
> like this one. But I agree that fork should definitely be mentioned here.
> system("cmd &") almost never what you would want to do.
>
> > Then in the fork section. Show a simple example like the ones we have
> > been working with AND show a simple approach to using the wait
> > function.
> > Furthermore - add sample code to the wait() and fork()
> > functions that are simple and realistic,
>
> fork and wait are very simple building blocks which can be combined in a
> lot of different ways. Any sample code here can cover only basic
> constructs like
>
> my $pid = fork();
> if (!defined $pid) {
> die "fork failed: $!";
> } elsif ($pid == 0) {
> # child does something here
> # ...
> exit;
> } else {
> # parent does something here
> # ...
> # and then waits for child:
> waitpid($pid, 0);
> }
>
> More complex examples should go into perlipc (which needs serious
> cleanup, IMHO).
>
> > unlike the code same in the waitpid() function.
>
> That code is simple and realistic.
>
> > In closing, it is perhaps non-intuitive to me that a fork process
> > should have the child section actually executing the code,
>
> If you don't want the child process to do anything, why would you create
> it in the first place?
>
> > but I ask you how one can intuit that from the sample in the camel
> > book and the samples in thehttp://perldoc.perl.org/. To really drive
> > the point home, Xho's code:
>
> > fork or exec("cp $_ $_.old") ;
> > do {$x=wait;} until $x==-1 ;
>
> > Is STILL not intuitive that the child is executing the code!
>
> True. This code isn't meant to be intuitive. It's meant to be short.
> I wouldn't write that in production code, much less in an answer to a
> newbie question.
>
> hp

HI Peter,
Thanks - you've been quite fair. The only point I would argue is how
many times you needed to wait for background tasks. I guess the
salient point is - I'm not a systems programmer - so I use Perl like
shell scripting and because I think it's stupid to do if-fi and case-
esac pairs.

'nuff said. Good point about the Perl docs being about Perl and not
Unix...

cheers...
pg

p.s. what's an OP ?

Re: Wait for background processes to complete

am 20.01.2008 00:55:47 von Martijn Lievaart

On Sat, 19 Jan 2008 15:37:02 -0800, pgodfrin wrote:

> p.s. what's an OP ?

Original Poster

HTH,
M4

Re: Wait for background processes to complete

am 20.01.2008 01:08:54 von Tad J McClellan

pgodfrin wrote:

> p.s. what's an OP ?


http://catb.org/jargon/html/O/thread-OP.html


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"

Re: Wait for background processes to complete

am 20.01.2008 01:12:48 von hjp-usenet2

On 2008-01-19 23:37, pgodfrin wrote:
> On Jan 18, 3:17 pm, "Peter J. Holzer" wrote:
>> On 2008-01-18 17:47, pgodfrin wrote:
>>
>>
>>
>> > This was a lot of fun. I would like to respond to the various
>> > observations, especially the ones about the Perl Documentation being
>> > misleading.
>> [...]
[76 lines snipped]
>> > Furthermore, while the perlipc section is quite verbose, nowhere is
>> > the code snippet do {$x=wait; print "$x\n"} until $x==-1 or any
>> > variation of that wait() call mentioned.
>>
>> Again, this is a solution to your specific problem. You said you wanted
>> to wait for all children, so Xho wrote a loop which would do that. This
>> is a rather rare requirement - I think I needed it maybe a handful of
>> times in 20+ years of programming outside of a child reaper function.
>>
[109 lines snipped - please quote only the relevant parts of the
articles you reply to]

> Thanks - you've been quite fair. The only point I would argue is how
> many times you needed to wait for background tasks.

I need to wait for background tasks quite often. But usually either I
need to wait for only one of them or some of them, but rarely all of
them. But I don't see how it matters. If you know how to wait for one
process, and you know how to write a loop, it is trivial to wait for all
processes.
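
For example, a minimal sketch (assuming the parent remembered each
child's pid in @pids when it forked them):

    # waiting for one child:
    waitpid($pids[0], 0);

    # waiting for all of them is the same call in a loop:
    waitpid($_, 0) for @pids;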

> I guess the salient point is - I'm not a systems programmer - so I use
> Perl like shell scripting and because I think it's stupid to do if-fi
> and case- esac pairs.

That's ok. Perl just gives you a lot more flexibility than the shell,
at the price of higher complexity. For example, it would be quite easy
to change your script to do a configurable number of copies in parallel
- this is rather hard to do in the shell.
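
For instance, a rough sketch with Parallel::ForkManager from CPAN
(assuming the module is installed):

    use strict;
    use warnings;
    use Parallel::ForkManager;

    my $pm = Parallel::ForkManager->new(4);  # at most 4 copies at a time

    for my $file (glob "*.txt") {
        $pm->start and next;               # parent: start the next file
        system("cp", $file, "$file.old");  # child: do one copy
        $pm->finish;                       # child exits here
    }
    $pm->wait_all_children;                # block until every copy is done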

> p.s. what's an OP ?

"original poster": The person who started a thread, i.e. you in this case.

hp

Re: Wait for background processes to complete

am 20.01.2008 02:50:44 von xhoster

"Peter J. Holzer" wrote:
>
> Assume that the loop which forks off the children registers them in
> %kids.
>
> Then the SIGCHLD handler could do something like this:
>
> sub REAPER {
> for(;;) {
> $kid = waitpid(-1, WNOHANG);
> last if $kid <= 0;
> delete $kids{$kid};
> }
> }
>
> So, at the end of the program we just need to loop until %kids is empty:
>
> while (keys %kids) {
> }
>
> But that's busy-waiting - it will consume 100 % CPU time. We could sleep
> inside the loop, but how long? Until the next child terminates. Well, we
> already have a function which does sleep until a child terminates - it's
> called wait. So the loop turns into:
>
> while (keys %kids) {
> my $kid = wait();
> delete $kids{$kid};
> }
>
> So now we have a loop at the end which waits for all children, and the
> REAPER function has no useful function anymore. So we can delete it.

It might be useful in the highly unlikely event that you want to be
spawning enough jobs that you will run out of processes if you don't reap
any until all are spawned. Reaping early ones that exit even as the later
ones are still being spawned could help that (But I still wouldn't use a
sig handler, I'd just put a "waitpid(-1, WNOHANG);" inside the spawning
loop). But anyway, since the spawning is unthrottled, you are just racing
disaster anyway--you have no guarantee that enough will finish in time to
prevent you from running out of processes. So something throttling, like
ForkManager, would be the way to go.
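
Something like this, for instance - just a sketch of the idea, not
production code:

    use strict;
    use warnings;
    use POSIX ":sys_wait_h";   # for WNOHANG

    my %kids;
    for my $file (glob "*.txt") {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            exec("cp", $file, "$file.old") or die "exec failed: $!";
        }
        $kids{$pid} = 1;

        # reap any children that already finished, so the process
        # table doesn't fill up while we are still spawning
        while (1) {
            my $done = waitpid(-1, WNOHANG);
            last if $done <= 0;
            delete $kids{$done};
        }
    }

    # wait for whatever is still running
    while (keys %kids) {
        my $done = wait();
        last if $done == -1;
        delete $kids{$done};
    }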


Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Re: Wait for background processes to complete

am 20.01.2008 03:03:33 von xhoster

"Peter J. Holzer" wrote:

> Xho's solution is safe (and was so in all perl
> versions) and does what the OP wants. Well, almost - wait can return -1
> if it is interrupted so one should check $! in addition to the return
> value.

Are you sure that that is the case? I figured it would return undef on
error. Upon experimenting (with both >5.8 and <5.8) I found that (on my
system) wait returns neither -1 nor undef due to signals, because it
never returns due to signals. The Perl wait behind the scenes just keeps
recalling the system-level wait either until something is reaped, or until
the program dies, or until a die/eval pair causes it to jump out of the
execution path without actually returning.
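
The experiment is easy to repeat with something along these lines (a
sketch; what it prints may of course differ between platforms and
perl versions):

    use strict;
    use warnings;

    $SIG{ALRM} = sub { print "got SIGALRM\n" };

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) { sleep 5; exit }  # child outlives the alarm

    alarm 2;                # signal arrives while wait() is blocked
    my $got = wait();
    print "wait() returned $got\n";
    # on the system described above, that's the child's pid,
    # not -1 or undef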

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.

Re: Wait for background processes to complete

am 20.01.2008 05:47:14 von pgodfrin

On Jan 19, 6:12 pm, "Peter J. Holzer" wrote:
> On 2008-01-19 23:37, pgodfrin wrote:
>
> > On Jan 18, 3:17 pm, "Peter J. Holzer" wrote:
> >> On 2008-01-18 17:47, pgodfrin wrote:
>
> >> > This was a lot of fun. I would like to respond to the various
> >> > observations, especially the ones about the Perl Documentation being
> >> > misleading.
> >> [...]
> [76 lines snipped]
> >> > Furthermore, while the perlipc section is quite verbose, nowhere is
> >> > the code snippet do {$x=wait; print "$x\n"} until $x==-1 or any
> >> > variation of that wait() call mentioned.
>
> >> Again, this is a solution to your specific problem. You said you wanted
> >> to wait for all children, so Xho wrote a loop which would do that. This
> >> is a rather rare requirement - I think I needed it maybe a handful of
> >> times in 20+ years of programming outside of a child reaper function.
>
> [109 lines snipped - please quote only the relevant parts of the
> articles you reply to]
>
> > Thanks - you've been quite fair. The only point I would argue is how
> > many times you needed to wait for background tasks.
>
> I need to wait for background tasks quite often. But usually either I
> need to wait for only one of them or some of them, but rarely all of
> them. But I don't see how it matters. If you know how to wait for one
> process, and you know how to write a loop, it is trivial to wait for all
> processes.
>
> > I guess the salient point is - I'm not a systems programmer - so I use
> > Perl like shell scripting and because I think it's stupid to do if-fi
> > and case- esac pairs.
>
> That's ok. Perl just gives you a lot more flexibility than the shell,
> at the price of higher complexity. For example, it would be quite easy
> to change your script to do a configurable number of copies in parallel
> - this is rather hard to do in the shell.
>
> > p.s. what's an OP ?
>
> "original poster": The person who started a thread, i.e. you in this case.
>
> hp

cool - that is in fact my next goal - to have an option that throttles
how many concurrent tasks...

regards from the OP :)
pg

Re: Wait for background processes to complete

am 20.01.2008 12:35:40 von hjp-usenet2

On 2008-01-20 02:03, xhoster@gmail.com wrote:
> "Peter J. Holzer" wrote:
>> Xho's solution is safe (and was so in all perl
>> versions) and does what the OP wants. Well, almost - wait can return -1
>> if it is interrupted so one should check $! in addition to the return
>> value.
>
> Are you sure that that is the case?

No. It's more a "it does happen in C and it isn't documented so better
add an extra check just in case" kind of reaction.

> I figured it would return undef on
> error. Upon experimenting (with both >5.8 and <5.8) I found that (on my
> system) wait returns neither -1 nor undef due to signals, because it
> never returns due to signals. The Perl wait behind the scenes just keeps
> recalling the system-level wait either until something is reaped, or until
> the program dies, or until a die/eval pair causes it to jump out of the
> execution path without actually returning.

Good to know. If this is generally the case (not just on your system) it
would be good if this information could be added to perlfunc.

hp

Re: Wait for background processes to complete

am 21.01.2008 18:39:34 von dougie.stevenson

On Jan 13, 9:08 pm, pgodfrin wrote:
> Greetings,
> Well - I've spent a bunch of time trying to figure this out - to no
> avail.
>
> Here's what I want to do - run several commands in the background and
> have the perl program wait for the commands to complete. Fork doesn't
> do it, nor does wait nor waitpid.
>
> Any thoughts?
>
> Here's a sample program which starts the processes:
>
> while (<*.txt>)
> {
> print "Copying $_ \n";
> system("cp $_ $_.old &") ;
> }
> print "End of excercise\n";
> exit;
>
> I mean if this were a shell program - this would work:
>
> for x in `ls *.txt`
> do
> print "Copying $_ \n"
> cp $_ $_.old &
> done
> wait
>
> thanks,
> pg

I know you're probably looking for something very simple. Amazing how
simple things get really complex.

Take a look at http://poe.perl.org/ - specifically the Job Server in
the Cookbook section.
POE Rocks... It is an outstanding framework for doing stuff like what
you are looking to do.

Dougie!!!