out of memory

on 28.12.2006 22:48:28 by loufridkis


I am running out of memory when I try to select about 100K rows. Here is
some code:

my $cel_sth = $dbh_from->prepare($sqlstring);
$cel_sth->execute();
my @rowvalues;
my $rv;
my $elementresult_id;
my $element_id;
my $quantification_id;
my $signal;

$rv = $cel_sth->bind_columns(\($elementresult_id, $element_id,
    $quantification_id, $signal));

while ($cel_sth->fetch()) {
    @rowvalues = ($elementresult_id, $element_id, $quantification_id, $signal);
    my $printstring = "insert into $table_name values(";
    foreach my $column (@rowvalues) {
        if (defined $column) {
            if ($column =~ /\D/) {
                $column = "'" . $column . "'"
                    unless ($column =~ /\d*\.\d+/);
            }
            $printstring .= "$column,";
        } else {
            $printstring .= "NULL,";
        }
    }
    $printstring =~ s/,$/);/;
    print "$printstring\n";
}

I guess the problem is that it tries to hold all the rows in memory. Is there a
way to get just one or a few rows at a time?
--
Lou Fridkis
Human Genetics
57920


Re: out of memory

on 29.12.2006 00:30:28 by ron

On Thu, 28 Dec 2006 13:48:28 -0800, louis fridkis wrote:

Hi Louis

> while($cel_sth->fetch()){

What makes you think this is not returning one row at a time?

I assume the real problem is in the code you've suppressed.

> $printstring =~ s/,$/);/; print "$printstring\n"; }

Using s// to replace a char is shooting a mouse with an elephant gun. Try:

substr($x, length($x) - 1, 1) = ');';
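
For instance, a self-contained demo (plain Perl, no DBI needed; lvalue substr
happily swaps a one-character span for a longer replacement):

```perl
my $x = "insert into t values(1,2,3,";
# Replace the trailing comma with ");" via an lvalue substr assignment
substr($x, length($x) - 1, 1) = ');';
print "$x\n";   # insert into t values(1,2,3);
```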

--
Cheers
Ron Savage, ron@savage.net.au on 29/12/2006
http://savage.net.au/index.html
Let the record show: Microsoft is not an Australian company

Re: out of memory

on 29.12.2006 04:08:48 by altblue


Ron Savage wrote:
> On Thu, 28 Dec 2006 13:48:28 -0800, louis fridkis wrote:
>> $printstring =~ s/,$/);/; print "$printstring\n"; }
> Using s// to replace a char is shooting a mouse with an elephant gun. Try:
> substr($x, length($x) - 1, 1) = ');';

You forgot to notice Louis' ugly (and error-prone) way of determining
the right quoting (hint: $dbh->quote accepts two parameters), or the
horrid "optimization" of using "bind_columns" (defeated afterwards by
copying everything into "@rowvalues" inside the loop).

Ugly, too ugly code for the new year's eve ;-)

Louis, as Ron already hinted, I'd skip debugging this code and start off
with a clean new snippet, free of all these feeble "optimizations" (I
suppose "optimizing" is what you wanted).

Here is something for starters (take care, I'm just baking it, no
testing done):

# your prepare and execute
my $cel_sth = $dbh_from->prepare($sqlstring);
$cel_sth->execute();

### get/set info necessary to the INSERT generator
# get column types
# ! sth attributes should help, read your DBD's docs for availability
# ! (else you'll have to fall back to using table_info / column_info,
# ! yet another soft spot for some DBDs)
my $types = $cel_sth->{TYPE};

# prepare INSERT generator (previous issues apply here too)
my $insert_proto
    = q{INSERT INTO }
    . $dbh_from->quote_identifier($table_name)
    . q{ (}
    . join(
        q{, },
        map { $dbh_from->quote_identifier($_) } @{ $cel_sth->{NAME} }
    )
    . q{) VALUES (};

# loop through rows and print generated INSERT statements
while ( my $row = $cel_sth->fetchrow_arrayref ) {
    my $col_idx = 0;
    print
        $insert_proto,
        join(
            q{, },
            map { $dbh_from->quote( $_, $types->[ $col_idx++ ] ) } @{$row}
        ),
        ");\n";
}

Yes, lotta DBI quoting... but remember that quoting may fail too (OK,
rarely, but it does - it's documented) ;-)
The only (apparently) tricky part is getting the "data_type" parameter
used to properly quote the generated statements, but hopefully you use
some DBD that supports at least table_info / column_info (if sth
attributes aren't available). I wrote this snippet with DBD::mysql and
DBD::Pg in mind.
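
For completeness, the two-argument quote() looks roughly like this (assuming a
connected $dbh; the SQL_* constants come from DBI's :sql_types export tag):

```perl
use DBI qw(:sql_types);

# String values get escaped and wrapped in quotes by quote()...
my $quoted = $dbh->quote("O'Reilly");       # a safely quoted string literal
# ...while passing a numeric data_type tells quote() to leave the literal bare
my $bare   = $dbh->quote(42, SQL_INTEGER);
```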

cheers
--
Marius Feraru

Re: out of memory

on 29.12.2006 05:24:53 by ron

On Fri, 29 Dec 2006 05:08:48 +0200, Marius Feraru wrote:

Hi Marius

> You forgot to notice Louis' ugly way (and error prone) of

No I didn't - I was just too polite to hammer the OP despite the low quality of
the code, so I chose a gentle hint or two.
--
Cheers
Ron Savage, ron@savage.net.au on 29/12/2006
http://savage.net.au/index.html
Let the record show: Microsoft is not an Australian company

Re: out of memory

on 29.12.2006 06:08:25 by altblue


Ron Savage wrote:
> On Fri, 29 Dec 2006 05:08:48 +0200, Marius Feraru wrote:
>> You forgot to notice Louis' ugly way (and error prone) of
> No I didn't - I was just too polite to hammer OP despite the low
> quality of the code, so I chose a gentle hint or two.

Oops, does this mean I was rude to point out some more hints? My
apologies then ;-)

happy coding everyone :)
--
Marius Feraru

Re: out of memory

on 29.12.2006 17:43:10 by kev.spencer

On 12/28/06, Marius Feraru wrote:
>
> Ron Savage wrote:
> > On Fri, 29 Dec 2006 05:08:48 +0200, Marius Feraru wrote:
> >> You forgot to notice Louis' ugly way (and error prone) of
> > No I didn't - I was just too polite to hammer OP despite the low
> > quality of the code, so I chose a gentle hint or two.
>
> Oops, does this mean I was rude to point out some more hints? My
> apologies then ;-)

Point out hints yes. But tactfully ;-)

--
Kevin

Re: out of memory

on 29.12.2006 17:46:03 by loufridkis


Thanks for your replies. No, I'm not insulted. This is a quick-and-dirty script
to generate some test data. But I do appreciate the comments, since I am eager
to improve.

There is no suppressed code. I showed you the whole program except for the
header. Here is the full program (except where I substituted foo, bar, and jar):

#!/usr/bin/perl
use strict;
use DBI;

my $dbh_from = DBI->connect('dbi:Pg:dbname=jar;host=foo');

my ($table_name, $sqlstring) = @ARGV;

$dbh_from->do('SET search_path TO cel, part_elementresult, public');

my $cel_sth = $dbh_from->prepare($sqlstring);
$cel_sth->execute();
#
# added & replaced:
#while(my (@rowvalues) = $cel_sth->fetchrow_array()){
#
my @rowvalues;
my $rv;
my $elementresult_id;
my $element_id;
my $quantification_id;
my $signal;

$rv = $cel_sth->bind_columns(\($elementresult_id, $element_id,
    $quantification_id, $signal));

while ($cel_sth->fetch()) {
    @rowvalues = ($elementresult_id, $element_id, $quantification_id, $signal);
#
# end
# added & replaced
#
    my $printstring = "insert into $table_name values(";
    foreach my $column (@rowvalues) {
        if (defined $column) {
            if ($column =~ /\D/) {
                $column = "'" . $column . "'"
                    unless ($column =~ /\d*\.\d+/);
            }
            $printstring .= "$column,";
        } else {
            $printstring .= "NULL,";
        }
    }
    $printstring =~ s/,$/);/;
    print "$printstring\n";
    #my $dbh_to = DBI->connect('dbi:Pg:dbname=TestIO;host=bar');
    #$dbh_to->do($printstring);
}

As you can see from the comments, I originally used fetchrow_array(). I changed
it in the hope that fetch() would work better. The reason I think it tries to
hold all the rows in memory is that it works when the query in $sqlstring
returns a small number of rows, but fails with the "out of memory" error when
it returns a large number of rows.
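
If that's the case, one workaround I could try is paging through the results in
batches. A rough, untested sketch (it assumes my query can be wrapped in a
subquery and given a stable ordering):

```perl
# Untested sketch: fetch in LIMIT/OFFSET batches so only $batch rows
# are held in client memory at a time.
my $batch  = 1000;
my $offset = 0;
while (1) {
    my $sth = $dbh_from->prepare(
        "SELECT * FROM ($sqlstring) AS sub ORDER BY 1 LIMIT $batch OFFSET $offset");
    $sth->execute();
    my $count = 0;
    while (my @rowvalues = $sth->fetchrow_array()) {
        $count++;
        # ... build and print the INSERT statement as before ...
    }
    last if $count < $batch;   # short batch means we've reached the end
    $offset += $batch;
}
```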

On 12/28/06, Marius Feraru wrote:
>
>
> Ron Savage wrote:
> > On Fri, 29 Dec 2006 05:08:48 +0200, Marius Feraru wrote:
> >> You forgot to notice Louis' ugly way (and error prone) of
> > No I didn't - I was just too polite to hammer OP despite the low
> > quality of the code, so I chose a gentle hint or two.
>
> Oops, does this mean I was rude to point out some more hints? My
> apologies then ;-)
>
> happy coding everyone :)
> --
> Marius Feraru



--
Lou Fridkis
Human Genetics
57920


Re: out of memory

on 29.12.2006 22:53:18 by ron

On Fri, 29 Dec 2006 08:46:03 -0800, louis fridkis wrote:

Hi Louis

> Thanks for your reply's. No, I'm not insulted. This is a quick-and-
> dirty to generate some test data. But, I do appreciate the comments
> since I am eager to improve.

Glad you're taking it the right way!

> $rv = $cel_sth->bind_columns(\($elementresult_id, $element_id,
> $quantification_id, $signal));

Why can't you use @rowvalues here?

> while($cel_sth->fetch()){
> @rowvalues = ($elementresult_id, $element_id, $quantification_id,
> $signal); # # end # added & replaced #
> my $printstring= "insert into $table_name values("; foreach my
> $column (@rowvalues){ if (defined $column){ if($column =~

If you don't change the call to bind_columns(), why can't you use:

    for my $column ($elementresult_id, $element_id, $quantification_id, $signal)

here? That saves copying the info into @rowvalues.

> fetchrow_array(). I changed in the hopes that fetch() would work
> better. The reason I think it tries to hold all the rows in memory
> is because it works when the query in $sqlstring returns a small
> number of rows, but fails with the "out of memory" error when it
> returns a large number of rows.

That makes me think the problem really is your lack of RAM.
--
Cheers
Ron Savage, ron@savage.net.au on 30/12/2006
http://savage.net.au/index.html
Let the record show: Microsoft is not an Australian company

Re: out of memory

on 29.12.2006 23:28:32 by loufridkis


Ron,

Thanks for the suggestions. But wouldn't it just be better to switch back
to using fetchrow_array()? Is there any advantage to using bind_columns in
this case?

On 12/29/06, Ron Savage wrote:
>
> On Fri, 29 Dec 2006 08:46:03 -0800, louis fridkis wrote:
>
> Hi Louis
>
> > Thanks for your reply's. No, I'm not insulted. This is a quick-and-
> > dirty to generate some test data. But, I do appreciate the comments
> > since I am eager to improve.
>
> Glad you're taking it the right way!
>
> > $rv = $cel_sth->bind_columns(\($elementresult_id, $element_id,
> > $quantification_id, $signal));
>
> Why can't you use @rowvalues here?
>
> > while($cel_sth->fetch()){
> > @rowvalues = ($elementresult_id, $element_id, $quantification_id,
> > $signal); # # end # added & replaced #
> > my $printstring= "insert into $table_name values("; foreach my
> > $column (@rowvalues){ if (defined $column){ if($column =~
>
> If you don't change the call to bind_columns(), why can't you use:
> for my $column ($elementresult_id, $element_id,
> $quantification_id, $signal)
> here? That saves copying the info into @rowvalues.
>
> > fetchrow_array(). I changed in the hopes that fetch() would work
> > better. The reason I think it tries to hold all the rows in memory
> > is because it works when the query in $sqlstring returns a small
> > number of rows, but fails with the "out of memory" error when it
> > returns a large number of rows.
>
> That makes me think the problem really is you lack of RAM.
> --
> Cheers
> Ron Savage, ron@savage.net.au on 30/12/2006
> http://savage.net.au/index.html
> Let the record show: Microsoft is not an Australian company
>
>


--
Lou Fridkis
Human Genetics
57920


Re: out of memory

on 30.12.2006 03:28:30 by ron

On Fri, 29 Dec 2006 14:28:32 -0800, louis fridkis wrote:

Hi Louis

> Thanks for the suggestions. But, wouldn't it just be better to
> switch back to using fetchrow_array()? Is there any advantage to
> using bind_columns in this case?

Sometimes 'better' refers to personal preference rather than a clear-cut matter
of coding convention or style. I'd say suit yourself, unless there is a clear
advantage to doing things in one particular way.

This is from my tiny list of DBI best practices:

Use prepare_cached(), and then:

$sth->execute;
my %row;
$sth->bind_columns( \( @row{ @{ $sth->{NAME_lc} } } ) );
while ($sth->fetch)
{
    print "$row{region}: $row{sales}\n";
}

Actually, I'd like to see a best-practices doc alongside the DBI doc, unless
there is already such a doc (and I'm sure I'll get told if there is).

--
Cheers
Ron Savage, ron@savage.net.au on 30/12/2006
http://savage.net.au/index.html
Let the record show: Microsoft is not an Australian company

Re: out of memory

on 30.12.2006 14:03:09 by Tim.Bunce

Assuming you're using MySQL, take a look at the mysql_use_result attribute.

Tim.
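
For the archives, a minimal sketch of how that attribute is typically set
(untested here; DBD::mysql-specific):

```perl
# mysql_use_result makes DBD::mysql stream rows from the server instead
# of buffering the entire result set on the client.
my $sth = $dbh->prepare($sqlstring, { mysql_use_result => 1 });
$sth->execute();
while (my @row = $sth->fetchrow_array()) {
    # process one row at a time; client memory stays bounded
}
```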

On Thu, Dec 28, 2006 at 01:48:28PM -0800, louis fridkis wrote:
> I am running out of memory when I try to select about 100K rows. Here is
> some code:
>
> my $cel_sth = $dbh_from->prepare($sqlstring);
> $cel_sth->execute();
> my @rowvalues;
> my $rv;
> my $elementresult_id;
> my $element_id;
> my $quantification_id;
> my $signal;
>
> $rv = $cel_sth->bind_columns(\($elementresult_id, $element_id,
> $quantification_id, $signal));
>
> while($cel_sth->fetch()){
> @rowvalues = ($elementresult_id, $element_id, $quantification_id,
> $signal);
> my $printstring= "insert into $table_name values(";
> foreach my $column (@rowvalues){
> if (defined $column){
> if($column =~ /\D/){
> $column = "'".$column."'"
> unless ($column =~ /\d*\.\d+/);
> }
> $printstring.= "$column,";
> } else {
> $printstring.= "NULL,";
> }
> }
> $printstring =~ s/,$/);/;
> print "$printstring\n";
> }
>
> I guess the problem is it tries to hold all the rows in memory. Is there a
> way to just get 1 or a few rows at a time?
> --
> Lou Fridkis
> Human Genetics
> 57920

Re: out of memory

on 02.01.2007 17:22:46 by loufridkis


Ron, thanks again.

Tim, I am using psql.
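
Since DBD::Pg pulls the whole result set into client memory by the time
execute() returns, the usual fix on Postgres seems to be a server-side cursor.
A rough, untested sketch (the cursor name and batch size are arbitrary):

```perl
# Untested sketch for PostgreSQL: declare a cursor and FETCH in batches.
$dbh_from->begin_work;   # cursors must live inside a transaction
$dbh_from->do("DECLARE csr NO SCROLL CURSOR FOR $sqlstring");
my $fetch_sth = $dbh_from->prepare('FETCH 1000 FROM csr');
while (1) {
    $fetch_sth->execute();
    last if $fetch_sth->rows == 0;   # no rows left in the cursor
    while (my @rowvalues = $fetch_sth->fetchrow_array()) {
        # build and print the INSERT statement as before
    }
}
$dbh_from->do('CLOSE csr');
$dbh_from->commit;
```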

On 12/30/06, Tim Bunce wrote:
>
> Assuming you're using mysql, take a look at the mysql_use_result
> attribute.
>
> Tim.
>
> On Thu, Dec 28, 2006 at 01:48:28PM -0800, louis fridkis wrote:
> > I am running out of memory when I try to select about 100K rows. Here is
> > some code:
> >
> > [snip]
>



--
Lou Fridkis
Human Genetics
57920
