RE: Fatal error (SIGTERM) Garbage Collecting DBI handle [More Details]

on 31.01.2006 01:47:49 by lsosborn

We have a web-based Apache/Perl/DBI/MySQL application running on a Red
Hat Enterprise Linux 4 server that we are in the last stages of
converting to FastCGI. Our biggest remaining issue is a sporadic,
recurring problem: our server scripts receive a signal, usually SIGTERM
but occasionally SIGPIPE, that does NOT appear to originate from
mod_fastcgi. This results in no output to mod_fastcgi, an "incomplete
headers (0 bytes)" message in FastCGI's log file, and an "Internal
Server Error" reported by Apache to our users. This is unacceptable in
a production web application like ours. After much trial and error and
process of elimination, we have traced the problem with strace and
other tools and determined that our Perl script appears to be dying
from a SIGTERM after AutoLoader.pm fails to load DESTROY.al for
DBI::Destroy(). I have no definitive proof that one issue causes the
other, but they always appear to happen together, in sequence.

We were initially having problems with our scripts dying as a result of
trying to use stale database handles. We resolved this with the code
attached below, which is called at the beginning of each page request
via VerifyDbHandlesAreStillValid(). If talking to either of our
database handles causes an error, we reconnect. This is when the
references to the old database handles are severed, and I suspect it is
also when the garbage collection that is causing us such grief happens.
As I said before, we see this on only a small percentage of requests,
but that is far too frequent for us to consider deploying this
application until the issue is resolved.
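
The check-and-reconnect idea described above can be sketched in isolation as follows. This is only a minimal illustration, not the production code: verify_or_reconnect() and its three parameters are hypothetical names, the two code refs stand in for the real database check and reconnect routines, and the retry limit is an assumption that the original code does not have.

```perl
use strict;
use warnings;

# Hypothetical helper: run $check; if it dies, call $reconnect and try
# again, giving up once $max_attempts reconnects have been made.
# Returns the number of reconnects that were needed.
sub verify_or_reconnect {
    my ( $check, $reconnect, $max_attempts ) = @_;
    my $attempts = 0;
    while (1) {
        # eval yields 1 only if $check ran without dying
        my $ok = eval { $check->(); 1 };
        return $attempts if $ok;
        die "giving up after $attempts reconnection attempts: $@"
            if $attempts >= $max_attempts;
        $reconnect->();
        $attempts++;
    }
}

# Usage with stand-in code refs: the first check fails, the second succeeds.
my $healthy  = 0;
my $attempts = verify_or_reconnect(
    sub { die "stale handle\n" unless $healthy },
    sub { $healthy = 1 },
    3,
);
print "verified after $attempts reconnection attempt(s)\n";
```

Passing the check and the reconnect step in as code refs keeps the retry logic independent of DBI, so it can be exercised without a database.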






#-------------------------------
# Create database connection string, login and password variables
#-------------------------------
my $dbname          = 'dsi';
my @slave_hosts     = ('db-2.our-domain.com');
my $strConn_master  = "DBI:mysql:database=$dbname;host=db-1.our-domain.com";
my @astrConn_slaves = map { "DBI:mysql:database=$dbname;host=$_" } @slave_hosts;

#-------------------------------
# Open the connection
#-------------------------------
my @aDbiInitParamsMaster = (
    $strConn_master,
    $username,
    $password,
    {
        PrintError => $bDebug,
        RaiseError => $bDebug
    }
);

# One array ref of DBI->connect() arguments per slave host
my @aDbiInitParamsSlaves = map {
    [
        $_,
        $username,
        $password,
        {
            PrintError => $bDebug,
            RaiseError => $bDebug
        }
    ]
} @astrConn_slaves;

my $dbh_master;
my $dbh_slave;

# LSO: Borrowing O'Reilly code snippet
# Function source: http://www.unix.org.ua/orelly/perl/cookbook/ch04_18.htm
##################################################
# fisher_yates_shuffle( \@array ) : generate a random permutation
# of @array in place
sub fisher_yates_shuffle {
    my $array = shift;
    my $i;
    for ( $i = @$array ; --$i ; ) {
        my $j = int rand( $i + 1 );
        next if $i == $j;
        @$array[ $i, $j ] = @$array[ $j, $i ];
    }
}
##################################################
# LSO: END OF CODE SNIPPET

# Point $dbh at the slave for reporting, otherwise at the master
sub SelectPrimaryDbHandle {
    if ($Common::ReportingMode) {
        $dbh = $dbh_slave;
    } else {
        $dbh = $dbh_master;
    }
}

sub InitDatabaseConnection {
    # Drop any references to the old handles before reconnecting
    $dbh        = undef;
    $dbh_slave  = undef;
    $dbh_master = undef;

    eval { $dbh_master = DBI->connect(@aDbiInitParamsMaster); };
    unless ( ref $dbh_master ) {
        warn "MASTER MySQL SERVER: Appears to be down " . Dumper($dbh_master);
        confess "$@ $DBI::errstr \n";
    }

    # Try the slaves in random order until one accepts a connection
    my @aDbiInitParamsSlavesCopy = @aDbiInitParamsSlaves;
    fisher_yates_shuffle( \@aDbiInitParamsSlavesCopy );
    until ( ref $dbh_slave ) {
        if ( scalar(@aDbiInitParamsSlavesCopy) ) {
            eval {
                $dbh_slave =
                    DBI->connect( @{ shift @aDbiInitParamsSlavesCopy } );
            };
            unless ( ref $dbh_slave ) {
                warn "SLAVE MySQL SERVER: Appears to be down "
                    . Dumper($dbh_slave);
            }
        } else {
            warn "ALERT: Unable to connect to ANY slave database server. "
                . "Falling back to master.";
            eval { $dbh_slave = DBI->connect(@aDbiInitParamsMaster); };
            unless ( ref $dbh_slave ) {
                warn "MASTER: " . Dumper($dbh_slave);
                confess "$@ $DBI::errstr \n";
            }
        }
    }
    SelectPrimaryDbHandle();
}

sub VerifyDbHandlesAreStillValid {
    my $bConnectionVerified = 0;
    my $bConnectionAttempts = 0;
    while ( !$bConnectionVerified ) {
        eval {

            # We don't care about the return value,
            # just if they cause a fatal error
            DLookUp( "practices", "COUNT(id)", "1", $dbh_slave,
                "DONT_CATCH" );
            DLookUp( "practices", "COUNT(id)", "1", $dbh_master,
                "DONT_CATCH" );

            # Don't have to check $dbh, it's just a copy
        };
        if ($@) {
            warn "Caught a fatal error querying the database. "
                . "Trying to reconnect.";

            # Disconnect if we can... we don't care if we fail, but one
            # failure shouldn't prevent the second attempt
            eval { $dbh_slave->disconnect(); };
            eval { $dbh_master->disconnect(); };
            InitDatabaseConnection();
            Time::HiRes::usleep(100);
            $bConnectionAttempts++;
        } else {
            $bConnectionVerified = 1;
        }
    }
    if ( $bConnectionAttempts > 0 ) {
        warn( "Database connection verified after $bConnectionAttempts "
            . "reconnection attempts" );
    }
}

InitDatabaseConnection();