Forking and database connections

on 21.04.2007 19:21:05 by Chris Verges

Hey all,

I'm writing a PHP script that queries a database for some records, splits
the records into N groups of equal size, and then forks N child processes,
each of which handles one of the groups. While the child processes are
running, I'd like the parent process to update the status of the job in the
database.

The problem concerns my database connection before and after the fork. After
reading the pcntl_fork() page in the PHP manual, I realize that the child
process inherits the file descriptor, and that if the child process closes
the connection, it is closed in the parent process as well. So for each child
process (because I have more than one), I reinitialize the database link. I
also reinitialize the database link for the parent process immediately after
the fork.

However, when a child process finishes, the database link that I
reinitialized in the parent process seems to disconnect as well. I thought a
fork copied the entire heap, and would therefore make two copies of the
object instances that remain separate for the life of the processes; changes
made to one copy of the heap wouldn't affect the other. That doesn't seem to
be the case, however.

So at this point, my workaround is to wait until all of the child processes
are finished, then re-initialize the database link, and give an updated
status message at the end rather than incrementally as child processes
finish.

Here's some proof-of-concept code that explains what I mean:


<?php

/* Include PEAR::DB */
require_once('DB.php');

# Database table definition
# -------------------------
# CREATE TABLE `logs` (
#     `message` VARCHAR(128) NOT NULL
# );

/* Create the initial database connection for the parent process */
$dsn = 'mysql://test:test@localhost/testdb';
$db = DB::connect($dsn);
if ( PEAR::isError($db) ) {
    die($db->getMessage() . "\n");
}

/* This will be the common SQL statement for all inserts */
$sql = "INSERT INTO `logs` (`message`) VALUES (?);";
$stmt = $db->prepare($sql);

/* Perform a DB update */
$data = array('Started parent process');
$db->execute($stmt, $data);

/* Create the child processes */
$childPids = array();
for ( $i = 0; $i < 5; $i++ ) {
    $pid = pcntl_fork();
    if ( $pid == -1 ) {
        die("\nUnable to fork!\n");
    } else if ( $pid ) {
        /* Parent process */
        echo "Child process $pid created\n";
        array_push($childPids, $pid);
    } else {
        /* Child process */
        $myPid = posix_getpid();

        /* Create a new database connection for the child process */
        $db = DB::connect($dsn);
        if ( PEAR::isError($db) ) {
            die("\nChild process $myPid: " . $db->getMessage() .
                "\n" . $db->getDebugInfo() . "\n");
        }

        $data = array("Child process $myPid");
        $stmt = $db->prepare($sql);
        $db->execute($stmt, $data);

        /* Add some latency for testing purposes */
        sleep(5);
        exit;
    }
}

/* Create a new database connection for the parent process */
$db = DB::connect($dsn);
if ( PEAR::isError($db) ) {
    die("\nParent process: " . $db->getMessage() . "\n" .
        $db->getDebugInfo() . "\n");
}

/* Wait for the children to finish */
foreach ( $childPids as $pid ) {
    $data = array("Parent process waiting on child process $pid");
    $db->execute($stmt, $data);
    pcntl_waitpid($pid, $status);
    $data = array("Child process $pid is finished");
    $db->execute($stmt, $data);
}

$data = array("Parent process is finished");
$db->execute($stmt, $data);

?>

The command-line output of this code:

$ php forking-proof-of-concept.php
Child process 27012 created
Child process 27013 created
Child process 27014 created
Child process 27015 created
Child process 27016 created

Child process 27016: DB Error: unknown error
[nativecode=2013 ** Lost connection to MySQL server during query] **
mysql://test:test@localhost/testdb

And finally the database entries after running the code:

mysql> select * from logs;
+------------------------+
| message                |
+------------------------+
| Started parent process |
| Child process 27012    |
| Child process 27013    |
| Child process 27014    |
| Child process 27015    |
+------------------------+
5 rows in set (0.00 sec)

Any help in understanding this is appreciated!

Thanks!
Chris


RE: Forking and database connections

on 23.04.2007 17:09:16 by Dwight Altman

http://php.he.net/manual/en/function.pcntl-fork.php says 'The reason for the
MySQL "Lost Connection during query"...', much like what you concluded,
although in that example they grab a new $db connection in the parent branch
of the first for loop ("} else if ( $pid ) {"). I bet a child closes the one
you create after the for loop while the parent is waiting in the foreach loop.

In your foreach, when you wait for each child, can you call
$db->execute($stmt, $data) AFTER you pcntl_waitpid($pid, $status)? That is,
just reverse the lines?

I think the child closes the connection at about the time of the wait (since
I suppose you are reusing the same connection), so it is already closed.
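
Something like this, perhaps (an untested sketch reusing your $db, $stmt, and
$childPids):

/* Untested sketch: wait first, then log, so each execute happens only
   after that child has already exited */
foreach ( $childPids as $pid ) {
    pcntl_waitpid($pid, $status);
    $data = array("Child process $pid is finished");
    $db->execute($stmt, $data);
}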


Regards,
Dwight
> -----Original Message-----
> From: Chris Verges [mailto:chrisv@cyberswitching.com]
> Sent: Saturday, April 21, 2007 12:21 PM
> To: php-db@lists.php.net
> Subject: [PHP-DB] Forking and database connections


RE: Forking and database connections

on 23.04.2007 17:12:54 by Dwight Altman

Actually I suppose you need to loop with foreach to wait on all children
before attempting any $db->execute, then after your foreach loop, get your
$db and execute.
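
Roughly like this (again just an untested sketch with the same variables as
your script):

/* Untested sketch: wait for every child first, then reconnect and log */
foreach ( $childPids as $pid ) {
    pcntl_waitpid($pid, $status);
}

$db = DB::connect($dsn);      /* fresh parent connection, made after all children exited */
$stmt = $db->prepare($sql);   /* re-prepare the statement on the new connection */
foreach ( $childPids as $pid ) {
    $db->execute($stmt, array("Child process $pid is finished"));
}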

Regards,
Dwight

> -----Original Message-----
> From: Dwight Altman [mailto:dwight@multicam.com]
> Sent: Monday, April 23, 2007 10:09 AM
> To: 'php-db@lists.php.net'
> Subject: RE: [PHP-DB] Forking and database connections


Re: Forking and database connections

on 23.04.2007 19:00:58 by Chris Verges

Hey Dwight,

Thanks for the replies! When the first child process exits, all of the DB
connections seem to close. That is, the $db->execute() statements in the
remaining child processes fail after the first child exits. It's almost as
though separate connections are not being made and they are all sharing the
same connection. This would make sense if persistent connection pooling were
enabled, but I assumed there would be a "garbage collection" mechanism in
place to avoid prematurely closing a persistent connection while multiple
resources are still using it.
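
One way I could probably check whether the links really share a single
server-side connection is to drop down to the raw mysql_* layer and compare
server thread IDs across the fork (just an idea, not tested):

/* Diagnostic idea: if parent and child print the same server thread ID,
   the two links are sharing one connection rather than opening two */
$parent = mysql_connect('localhost', 'test', 'test');
echo "parent thread id: " . mysql_thread_id($parent) . "\n";

$pid = pcntl_fork();
if ( $pid == 0 ) {
    /* Without the $new_link flag, mysql_connect() may hand back the inherited link */
    $child = mysql_connect('localhost', 'test', 'test');
    echo "child thread id: " . mysql_thread_id($child) . "\n";
    exit;
}
pcntl_waitpid($pid, $status);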

Again, thanks for the help! Any other ideas on what might be wrong, or
should I open a bug against it?

Thanks!
Chris


On 4/23/07 8:12 AM, "Dwight Altman" wrote:

> Actually I suppose you need to loop with foreach to wait on all children
> before attempting any $db->execute, then after your foreach loop, get your
> $db and execute.
>
> Regards,
> Dwight


RE: Forking and database connections

on 23.04.2007 19:28:01 by Dwight Altman

Good point. Anything (children or parent) sharing the connection will get it
closed on them. I was looking up persistent connections, but then I thought
you could just change the way you are using the pcntl_* functions.

I use mysql_* directly (where mysql_connect "Opens or reuses a connection to
a MySQL server."), so you may need to look into PEAR::DB for persistent
connections.

Where in your code does it say you are using persistent connections? I
haven't had the need to use them yet.

For instance mysql_pconnect (the persistent counterpart) says "Second, the
connection to the SQL server will not be closed when the execution of the
script ends."
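
For example (a minimal sketch using the same test credentials as the earlier
scripts):

/* Non-persistent: the link is torn down when the script ends or is closed */
$link = mysql_connect('localhost', 'test', 'test');

/* Persistent counterpart: the server-side connection is kept open for reuse
   after the script ends */
$plink = mysql_pconnect('localhost', 'test', 'test');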

If you are using the PEAR::DB persistent connection as documented, then I'd
open a bug report. Again, I haven't used them.



Regards,
Dwight
> -----Original Message-----
> From: Chris Verges [mailto:chrisv@cyberswitching.com]
> Sent: Monday, April 23, 2007 12:01 PM
> To: Dwight Altman; php-db@lists.php.net
> Subject: Re: [PHP-DB] Forking and database connections


RE: Forking and database connections

on 23.04.2007 19:43:53 by Dwight Altman

To use a PEAR::DB persistent connection, try
$db = DB::connect($dsn, TRUE);
or
$db = DB::connect($dsn, true);

Googled for "pear::db persistent connection" and got
http://vulcanonet.com/soft/?pack=pear_tut


Regards,
Dwight

> -----Original Message-----
> From: Dwight Altman [mailto:dwight@multicam.com]
>
> Where in your code does it say you are using persistent connections?


Re: Forking and database connections

on 23.04.2007 20:08:53 by Chris Verges

Hey Dwight,

After getting your first e-mail, I started adding the PEAR::DB persistent
connection code. Unfortunately, it yielded the same results that I was
getting before.

On a hunch, I created a second proof-of-concept script that uses the mysql_*
functions in the PHP base. For each of these connections, I set the $new_link
parameter of mysql_connect() to true. As it turns out, the primary $db link
that was originally created in the parent process *is* destroyed when the
first child process closes -- that makes sense, however. If I re-create that
link after spawning all the child processes, though, it works beautifully!

So it looks like this is a bug in PEAR::DB's handling of persistent
connections across forked processes. I'll go ahead and file it with the
PEAR::DB folks and include the two scripts in the bug report.

Again, thanks for your help! It was really beneficial to have a second set
of eyes and a sounding board for getting through this problem.

In case anyone wants the second proof-of-concept script:


<?php

# Database table definition
# -------------------------
# CREATE TABLE `logs` (
#     `message` VARCHAR(128) NOT NULL
# );

/* Create the initial database connection for the parent process */
$db = mysql_connect('localhost', 'test', 'test', true);
mysql_select_db('testdb', $db);

/* Perform a DB update */
mysql_query("INSERT INTO `logs` (`message`) VALUES ('Started parent process')", $db);

/* Create the child processes */
$childPids = array();
for ( $i = 0; $i < 5; $i++ ) {
    $pid = pcntl_fork();
    if ( $pid == -1 ) {
        die("\nUnable to fork!\n");
    } else if ( $pid ) {
        /* Parent process */
        echo "Child process $pid created\n";
        array_push($childPids, $pid);
    } else {
        /* Child process */
        $myPid = posix_getpid();

        /* Create a new database connection for the child process */
        $db2 = mysql_connect('localhost', 'test', 'test', true);
        mysql_select_db('testdb', $db2);

        mysql_query("INSERT INTO `logs` (`message`) VALUES ('Child process $myPid started')", $db2);

        /* Add some latency for testing purposes */
        sleep(5);

        mysql_query("INSERT INTO `logs` (`message`) VALUES ('Child process $myPid ended')", $db2);

        exit;
    }
}

/* Create a new database connection for the parent process */
$db3 = mysql_connect('localhost', 'test', 'test', true);
mysql_select_db('testdb', $db3);

/* Wait for the children to finish */
foreach ( $childPids as $pid ) {
    $msg = "Parent process waiting on child process $pid";
    echo $msg . "\n";
    mysql_query("INSERT INTO `logs` (`message`) VALUES ('$msg')", $db3);

    pcntl_waitpid($pid, $status);

    $msg = "Parent process detected child process $pid is finished";
    echo $msg . "\n";
    mysql_query("INSERT INTO `logs` (`message`) VALUES ('$msg')", $db3);
}

mysql_query("INSERT INTO `logs` (`message`) VALUES ('Parent process is finished')", $db3);

?>

Thanks!
Chris





On 4/23/07 10:43 AM, "Dwight Altman" wrote:

> To use a PEAR::DB persistent connection, try
> $db = DB::connect($dsn, TRUE);
> or
> $db = DB::connect($dsn, true);


Re: Forking and database connections

on 23.04.2007 20:28:52 by Chris Verges

For anyone interested in following the progress of this forking/PEAR::DB
issue, see http://pear.php.net/bugs/bug.php?id=10813.

Thanks,
Chris


RE: Forking and database connections

on 23.04.2007 21:19:10 by Dwight Altman

Glad to help.

I was actually interested in your post because of the question of forking
versus threading. I still haven't found anything except PHP-GTK, but
threading might help your database-related performance if it is ever added to
(or already exists in) PHP.

Regards,
Dwight
> -----Original Message-----
> From: Chris Verges [mailto:chrisv@cyberswitching.com]
>
> Again, thanks for your help! It was really beneficial to have a second
> set
> of eyes and a sounding board for getting through this problem.

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php