PHP CLI & Forking children
on 29.09.2007 12:12:19 by qwertycat
I'm new to multi-process programming, should one avoid forking
children from children of a parent?
I'd like to spawn 10 children from the parent and each of those
children spawns another 5 children which process chunks of data (200
rows) with heavy usage of CPU and regexp
Re: PHP CLI & Forking children
on 29.09.2007 16:49:53 by Andy Hassall
On Sat, 29 Sep 2007 03:12:19 -0700, qwertycat@googlemail.com wrote:
>I'm new to multi-process programming, should one avoid forking
>children from children of a parent?
>
>I'd like to spawn 10 children from the parent and each of those
>children spawns another 5 children which process chunks of data (200
>rows) with heavy usage of CPU and regexp
So you're spawning 500 processes? Do you have a very large number of CPUs to
run them on? Otherwise only a few will actually be running at any time, and
you'll be losing useful throughput to overhead, surely.
--
Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool
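To illustrate the point about only a few processes actually running at once, here is a minimal sketch of a single-level pool that caps the number of live children instead of forking everything up front. It assumes the CLI with the pcntl extension on a Unix-like system; processChunk(), runPool() and the cap of 4 are hypothetical names and values, not anything from the original poster's script.

```php
<?php
// Sketch only: needs the CLI SAPI with the pcntl extension on a Unix-like system.
// processChunk() and the worker cap are hypothetical placeholders.

function processChunk(int $chunkId): void
{
    // Stand-in for the CPU-heavy regexp work on one chunk of rows.
    usleep(1000);
}

/**
 * Process every chunk in a child process, never running more than
 * $maxWorkers children at once. Returns the number of children reaped.
 */
function runPool(array $chunkIds, int $maxWorkers): int
{
    $running = 0;
    $reaped = 0;

    foreach ($chunkIds as $chunkId) {
        // At the cap: block until one child exits before forking another.
        if ($running >= $maxWorkers && pcntl_wait($status) > 0) {
            $running--;
            $reaped++;
        }

        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork() failed');
        }
        if ($pid === 0) {
            processChunk($chunkId);
            exit(0); // the child must not fall through into the parent's loop
        }
        $running++;
    }

    // Reap the remaining children so none are left as zombies.
    while ($running > 0 && pcntl_wait($status) > 0) {
        $running--;
        $reaped++;
    }

    return $reaped;
}

$reaped = runPool(range(1, 50), 4); // 50 chunks, at most 4 children at a time
```

Setting the cap near the number of CPU cores is a common starting point for CPU-bound work, but the right value depends on the machine and the workload.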
Re: PHP CLI & Forking children
on 29.09.2007 17:00:31 by qwertycat
On Sep 29, 3:49 pm, Andy Hassall wrote:
> So you're spawning 500 processes? Do you have a very large number of CPUs to
> run them on? Otherwise only a few will actually be running at any time, and
> you'll be losing useful throughput to overhead, surely.
>
Maybe the example I gave was bad :) How about a PHP script which launches
4 children, with each child forking another 5 children (20 processes)?
Would this cause development headaches or possible extra bugs?
Re: PHP CLI & Forking children
on 29.09.2007 20:51:29 by Jerry Stuckle
qwertycat@googlemail.com wrote:
> On Sep 29, 3:49 pm, Andy Hassall wrote:
>> So you're spawning 500 processes? Do you have a very large number of CPUs to
>> run them on? Otherwise only a few will actually be running at any time, and
>> you'll be losing useful throughput to overhead, surely.
>>
>
> Maybe the example I gave was bad :) How about a PHP script which launches
> 4 children, with each child forking another 5 children (20 processes)?
>
> Would this cause development headaches or possible extra bugs?
>
Just wondering - why do you need to fork processes, anyway? There's a
lot of overhead in doing it, and if they're all CPU bound anyway you
aren't going to gain anything (unless you have a potload of CPUs).
Forking is good if you have different processes using different
resources. But when they have to contend for the same resource,
performance often goes down.
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Re: PHP CLI & Forking children
on 29.09.2007 21:32:24 by qwertycat
On Sep 29, 7:51 pm, Jerry Stuckle wrote:
> Just wondering - why do you need to fork processes, anyway? There's a
> lot of overhead in doing it, and if they're all CPU bound anyway you
> aren't going to gain anything (unless you have a potload of CPUs).
>
> Forking is good if you have different processes using different
> resources. But when they have to contend for the same resource,
> performance often goes down.
Instead of writing a PHP script that downloads 2 million headers from
a newsgroup in a single connection (which will cause PHP to crash
anyway as it'll reach 500MB+ memory usage), I thought it would be
better to launch 4 processes to download it in chunks of 50,000
headers - with 4 connections to the same NNTP server.
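The split described above can be sketched as plain arithmetic before any forking is involved. This is only an illustration: chunkRanges() and assignToWorkers() are hypothetical helpers, the article numbers 1 to 2,000,000 are an assumed range, and scalar type declarations assume PHP 7 or later.

```php
<?php
/**
 * Split an inclusive article-number range into fixed-size chunks.
 * Each chunk is returned as [firstArticle, lastArticle].
 */
function chunkRanges(int $first, int $last, int $chunkSize): array
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunkSize must be >= 1');
    }
    $ranges = [];
    for ($start = $first; $start <= $last; $start += $chunkSize) {
        $ranges[] = [$start, min($start + $chunkSize - 1, $last)];
    }
    return $ranges;
}

/**
 * Deal the chunks out round-robin, one list of chunks per worker.
 */
function assignToWorkers(array $ranges, int $workers): array
{
    $plan = array_fill(0, $workers, []);
    foreach ($ranges as $i => $range) {
        $plan[$i % $workers][] = $range;
    }
    return $plan;
}

$ranges = chunkRanges(1, 2000000, 50000); // 40 chunks of 50,000 headers
$plan   = assignToWorkers($ranges, 4);    // 10 chunks per connection
```

Each worker would then request its own ranges over its own NNTP connection, so no single process ever holds more than one chunk at a time.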
Re: PHP CLI & Forking children
on 29.09.2007 21:35:57 by qwertycat
On Sep 29, 8:32 pm, qwerty...@googlemail.com wrote:
> Instead of writing a PHP script that downloads 2 million headers from
> a newsgroup in a single connection (which will cause PHP to crash
> anyway as it'll reach 500MB+ memory usage), I thought it would be
> better to launch 4 processes to download it in chunks of 50,000
> headers - with 4 connections to the same NNTP server.
I admit I should be using Perl or C for these tasks, but I know PHP
and I'm used to using its functions.
Re: PHP CLI & Forking children
on 29.09.2007 21:50:52 by Jerry Stuckle
qwertycat@googlemail.com wrote:
> On Sep 29, 7:51 pm, Jerry Stuckle wrote:
>> Just wondering - why do you need to fork processes, anyway? There's a
>> lot of overhead in doing it, and if they're all CPU bound anyway you
>> aren't going to gain anything (unless you have a potload of CPUs).
>>
>> Forking is good if you have different processes using different
>> resources. But when they have to contend for the same resource,
>> performance often goes down.
>
> Instead of writing a PHP script that downloads 2 million headers from
> a newsgroup in a single connection (which will cause PHP to crash
> anyway as it'll reach 500MB+ memory usage), I thought it would be
> better to launch 4 processes to download it in chunks of 50,000
> headers - with 4 connections to the same NNTP server.
>
Which means you'll be downloading 500MB+ anyway - just in different
processes.
Or you could get some headers and cache them to disk, processing them later.
But which newsgroup has 2M+ headers? Glad I don't have to read that
one! :-)
Re: PHP CLI & Forking children
on 29.09.2007 21:51:53 by Jerry Stuckle
qwertycat@googlemail.com wrote:
> On Sep 29, 8:32 pm, qwerty...@googlemail.com wrote:
>> Instead of writing a PHP script that downloads 2 million headers from
>> a newsgroup in a single connection (which will cause PHP to crash
>> anyway as it'll reach 500MB+ memory usage), I thought it would be
>> better to launch 4 processes to download it in chunks of 50,000
>> headers - with 4 connections to the same NNTP server.
>
> I admit I should be using Perl or C for these tasks, but I know PHP
> and I'm used to using its functions.
>
Nothing wrong with using PHP for this. It will be slower than a
compiled language like C, but most of your time will be spent waiting on
I/O anyway. So it shouldn't be that much slower.
Re: PHP CLI & Forking children
on 29.09.2007 22:04:03 by Andy Hassall
On Sat, 29 Sep 2007 12:32:24 -0700, qwertycat@googlemail.com wrote:
>On Sep 29, 7:51 pm, Jerry Stuckle wrote:
>> Just wondering - why do you need to fork processes, anyway? There's a
>> lot of overhead in doing it, and if they're all CPU bound anyway you
>> aren't going to gain anything (unless you have a potload of CPUs).
>>
>> Forking is good if you have different processes using different
>> resources. But when they have to contend for the same resource,
>> performance often goes down.
>
>Instead of writing a PHP script that downloads 2 million headers from
>a newsgroup in a single connection (which will cause PHP to crash
>anyway as it'll reach 500MB+ memory usage),
Well, presumably you're doing something with this data, like saving it to a
file or database? In which case you stream it from the network into the
database, rather than read it *all* into memory, and only *then* start saving
it?
>I thought it would be
>better to launch 4 processes to download it in chunks of 50,000
>headers - with 4 connections to the same NNTP server.
Yes, it may well be worth doing this to get better throughput (depending where
the bottleneck is), but I wouldn't have thought that the memory limit's the
issue, so long as you're streaming the data through.
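As a rough sketch of what streaming the data through might look like: read one line at a time from the connection and hand small batches to whatever stores them, so memory use is bounded by the batch size rather than by the 2 million headers. streamInBatches() and the $sink callback are hypothetical names; the sink stands in for a database insert or file append. The lone-dot terminator and leading-dot handling follow the NNTP multi-line response convention.

```php
<?php
/**
 * Read lines from the stream $in and pass them to $sink in batches of at
 * most $batchSize lines. Returns the total number of lines handled.
 */
function streamInBatches($in, callable $sink, int $batchSize = 1000): int
{
    $batch = [];
    $total = 0;

    while (($line = fgets($in)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '.') {
            break; // a lone dot ends an NNTP multi-line response
        }
        if ($line !== '' && $line[0] === '.') {
            $line = substr($line, 1); // undo dot-stuffing on the wire
        }

        $batch[] = $line;
        $total++;

        if (count($batch) >= $batchSize) {
            $sink($batch); // e.g. one multi-row INSERT (hypothetical)
            $batch = [];
        }
    }

    if ($batch !== []) {
        $sink($batch); // flush the final, partial batch
    }

    return $total;
}

// Usage with an in-memory stream standing in for the NNTP socket.
$in = fopen('php://memory', 'w+');
fwrite($in, "1\tSubject A\r\n2\tSubject B\r\n3\tSubject C\r\n.\r\n");
rewind($in);

$batches = [];
$count = streamInBatches($in, function (array $rows) use (&$batches) {
    $batches[] = $rows;
}, 2);
```

With a real connection, $in would be the socket returned by something like fsockopen() after the header request has been sent.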
I'm still not quite sure about the second level of forking you have in there
though; so there's 1 initial parent, 4 children reading from the server, but
then each has multiple children processing this data? Unless you have masses of
CPUs, you're unlikely to gain anything at that level; the 4 2nd level processes
may as well do the processing as they stream the data in from the network?
(As always, It Depends).
Back to the general question though, when you start forking, you've got child
process management to work out. One child process is relatively easy, more than
one means you have to do a bit more work to send (and receive) signals and
other IPC stuff (since you have to work out *which* child process you're
talking to), and work out what happens if either a child, or a parent process
terminates unexpectedly, or hangs. More than two processes and more than one
level of parent/child doesn't really get any more complicated as such, but
there are more processes to go wrong :-)
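A minimal sketch of the bookkeeping described above, assuming the pcntl extension under the CLI: the parent keeps a map from child PID to task name so it knows which child it is hearing from, and inspects the wait status to tell a normal exit from a crash. superviseChildren(), describeExit() and the task names are hypothetical. A hung child is not handled here; that would need a timeout, for example polling with WNOHANG.

```php
<?php
// Sketch only: needs the CLI SAPI with the pcntl extension on a Unix-like system.

/** Turn a wait status into a short human-readable description. */
function describeExit(int $status): string
{
    if (pcntl_wifexited($status)) {
        return 'exit code ' . pcntl_wexitstatus($status);
    }
    if (pcntl_wifsignaled($status)) {
        return 'killed by signal ' . pcntl_wtermsig($status);
    }
    return 'unknown';
}

/**
 * Fork one child per task and report how each one ended.
 * $tasks maps a task name to a callable returning an exit code.
 */
function superviseChildren(array $tasks): array
{
    $children = []; // pid => task name, so we know *which* child ended

    foreach ($tasks as $name => $work) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork() failed');
        }
        if ($pid === 0) {
            exit((int) $work()); // child: do the work, then leave
        }
        $children[$pid] = $name;
    }

    $results = [];
    while ($children !== []) {
        $pid = pcntl_waitpid(-1, $status); // block until any child ends
        if ($pid <= 0) {
            break; // no children left to wait for, or an error
        }
        if (isset($children[$pid])) {
            $results[$children[$pid]] = describeExit($status);
            unset($children[$pid]);
        }
    }

    return $results;
}

$results = superviseChildren([
    'reader-1' => function () { return 0; },
    'reader-2' => function () { return 3; },
]);
```

The same map works with more than one level of forking, since each parent only has to track its own direct children.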
Re: PHP CLI & Forking children
on 04.10.2007 02:00:58 by Manuel Lemos
Hello,
on 09/29/2007 07:12 AM qwertycat@googlemail.com said the following:
> I'm new to multi-process programming, should one avoid forking
> children from children of a parent?
>
> I'd like to spawn 10 children from the parent and each of those
> children spawns another 5 children which process chunks of data (200
> rows) with heavy usage of CPU and regexp
Here you may find several classes that can simplify that task for you:
http://www.phpclasses.org/php_fork
http://www.phpclasses.org/daemon
http://www.phpclasses.org/clsdaemonize
--
Regards,
Manuel Lemos
Metastorage - Data object relational mapping layer generator
http://www.metastorage.net/
PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/
Re: PHP CLI & Forking children
on 04.10.2007 17:20:25 by qwertycat
On Oct 4, 1:00 am, Manuel Lemos wrote:
> Here you may find several classes that can simplify that task for you:
>
> http://www.phpclasses.org/php_fork
>
> http://www.phpclasses.org/daemon
>
> http://www.phpclasses.org/clsdaemonize
Thanks Manuel for the good links.