prepare_cached - just how much quicker should I expect?
on 25.07.2006 19:35:59 by Martin.Evans
I know this is one of those how long is a piece of string questions but
I cannot see any difference using prepare_cached with a remote mysql or
oracle database. I have code which does inserts/updates/selects - around
20-30 different pieces of SQL, and most of them are run between 100
and 1000 times per file I process. When I process around 10 files it
takes around 4 minutes, and this drops by about 5s when using prepare_cached
(but I thought this was small enough to be within the bounds of variance
when running the same script multiple times anyway).
Most of the uses are like this:
my $sql = q/select column from table where column2 = ?/;
my $val = $h->selectrow_array($sql, undef, $a_value);
where the select returns one row. I changed this to:
$s = $h->prepare_cached($sql);
$s->execute($a_value);
$val = $s->fetchrow_array;
$s->finish;
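(As an aside, DBI's selectrow_array will also accept an already-prepared
statement handle in place of the SQL string, so the cached version can be
written more compactly - a sketch using the same $h, $sql and $a_value as
above:)

  my $s = $h->prepare_cached($sql);
  my $val = $h->selectrow_array($s, undef, $a_value);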
There was one specific change to prepare_cached which actually doubles the
time when run against mysql compared with using prepare:
select LAST_INSERT_ID();
Am I doing something wrong, or are my expectations wrong? I had thought
using prepare_cached would shave a lot more than 5s off the 240s.
Has anyone got a concrete example of where prepare_cached is
significantly faster, or am I flogging a dead horse?
Martin
--
Martin J. Evans
Easysoft Limited
http://www.easysoft.com
Re: prepare_cached - just how much quicker should I expect?
on 25.07.2006 23:50:51 by Tim.Bunce
Use DBI::Profile to see where the time is being spent.
Tim.
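For reference, a minimal way to turn profiling on (a sketch; the DSN,
credentials and script name below are placeholders):

  # from the shell: profile the whole run and dump a report on exit
  DBI_PROFILE=2 perl myscript.pl

  # or from inside the script, for one handle only
  use DBI;
  my $dbh = DBI->connect("dbi:mysql:test", "user", "pass", { RaiseError => 1 });
  $dbh->{Profile} = 2;    # same value as DBI_PROFILE=2 above
  # ... run the workload; the profile is reported when the handle is destroyed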
Re: prepare_cached - just how much quicker should I expect?
on 26.07.2006 04:18:55 by mark
I know nothing about MySQL, but I benchmarked prepare vs prepare_cached
in Oracle many moons ago, and saw no great difference, which is actually
what I expected. In theory, the main savings is avoiding the repeated
cost of the database parsing and prepping the statement, but Oracle
does its own internal statement caching, so it's mostly moot. There
might be corner cases where prepare_cached buys you something in
Oracle, but I think by and large it doesn't. For example, it does
save a network round trip presumably, so perhaps over a slow link
or something like that...
Mark
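For anyone wanting to measure this against their own schema, a rough sketch
of such a comparison (the DSN, credentials, table and column names below are
placeholders, not taken from the posts above):

  use strict;
  use warnings;
  use DBI;
  use Benchmark qw(cmpthese);

  my $dbh = DBI->connect("dbi:mysql:test", "user", "pass", { RaiseError => 1 });
  my $sql = "select col1 from some_table where col2 = ?";

  cmpthese(-5, {    # run each variant for at least 5 CPU seconds
      prepare => sub {
          my $sth = $dbh->prepare($sql);         # new handle every iteration
          $sth->execute(42);
          my ($val) = $sth->fetchrow_array;
          $sth->finish;
      },
      prepare_cached => sub {
          my $sth = $dbh->prepare_cached($sql);  # handle reused from the cache
          $sth->execute(42);
          my ($val) = $sth->fetchrow_array;
          $sth->finish;
      },
  });

Any difference between the two shows up (or doesn't) in the resulting rates,
which is essentially the comparison described above.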
Re: prepare_cached - just how much quicker should I expect?
on 26.07.2006 10:51:02 by Martin.Evans
On 25-Jul-2006 Tim Bunce wrote:
> Use DBI::Profile to see where the time is being spent.
I am looking into this.
My first attempt at setting DBI_PROFILE=2 and running my script returned with:
DBI::Profile on_destroy failed: Undefined subroutine &DBI::Profile::time_in_dbi
called at /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/DBI/Profile.pm
line 641 during global destruction.
Looking at lines 640-642:

  $prologue .= sprintf "%fs ", $time_in_dbi;
  my $perl_time = ($DBI::PERL_ENDING) ? time_in_dbi() - $^T : $t2-$t1;
  $prologue .= sprintf "%.2f%% ", $time_in_dbi/$perl_time*100 if $perl_time;
I cannot find a time_in_dbi sub, only a scalar called time_in_dbi. To make it
work I had to change the above to use $time_in_dbi and not time_in_dbi(). Don't
know if this was correct but thought I ought to mention it.
Martin
--
Martin J. Evans
Easysoft Ltd, UK
http://www.easysoft.com
Re: prepare_cached - just how much quicker should I expect?
on 26.07.2006 11:43:34 by Martin.Evans
On 25-Jul-2006 Tim Bunce wrote:
> Use DBI::Profile to see where the time is being spent.
I got this working and could not see anything taking longer. However, on
further investigation of my logs I have found the reason why doing a
prepare_cached on "select LAST_INSERT_ID()" to mysql is a BAD thing to do.
It always returns the first id ever inserted. i.e.
insert into table1 values (val)
select LAST_INSERT_ID() returns 1
insert into table1 values (val)
select LAST_INSERT_ID() returns 1 (instead of 2)
etc
I see there is a warning about how you can get into trouble with prepare_cached
in the DBI pod. Perhaps it would be worth noting that there is also some SQL
you don't want to prepare_cached as it won't work properly - like this example.
Martin
--
Martin J. Evans
Easysoft Ltd, UK
http://www.easysoft.com
Re: prepare_cached - just how much quicker should I expect?
on 26.07.2006 11:51:08 by Martin.Evans
On 26-Jul-2006 mark wrote:
> I know nothing about MySQL, but I benchmarked prepare vs prepare_cached
> in Oracle many moons ago, and saw no great difference, which is actually
> what I expected. In theory, the main savings is avoiding the repeated
> cost of the database parsing and prepping the statement, but Oracle
> does its own internal statement caching, so it's mostly moot. There
> might be corner cases where prepare_cached buys you something in
> Oracle, but I think by and large it doesn't. For example, it does
> save a network round trip presumably, so perhaps over a slow link
> or something like that...
>
> Mark
Mark,
Thanks for that. It is nice to at least know that other people have had the
same experience. When running my script to Oracle (over a fast oracle
connection) and doing 74 unique inserts/updates/selects and 26187 calls to
prepare_cached I get no discernible speed up using prepare_cached, execute,
fetch* compared with select*. As you say, with other databases and slower
networks it may make a bigger difference.
Martin
--
Martin J. Evans
Easysoft Ltd, UK
http://www.easysoft.com
Re: prepare_cached - just how much quicker should I expect?
on 27.07.2006 22:23:35 by Tim.Bunce
On Wed, Jul 26, 2006 at 09:51:02AM +0100, Martin J. Evans wrote:
>
> On 25-Jul-2006 Tim Bunce wrote:
> > Use DBI::Profile to see where the time is being spent.
>
> I am looking into this.
>
> My first attempt at setting DBI_PROFILE=2 and running my script returned with:
>
> DBI::Profile on_destroy failed: Undefined subroutine &DBI::Profile::time_in_dbi
> called at /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/DBI/Profile.pm
> line 641 during global destruction.
>
> Looking at line 640-642:
>
> $prologue .= sprintf "%fs ", $time_in_dbi;
> my $perl_time = ($DBI::PERL_ENDING) ? time_in_dbi() - $^T : $t2-$t1;
> $prologue .= sprintf "%.2f%% ", $time_in_dbi/$perl_time*100 if $perl_time;
>
> I cannot find a time_in_dbi sub, only a scalar called time_in_dbi. To make it
> work I had to change the above to use $time_in_dbi and not time_in_dbi(). Don't
> know if this was correct but thought I ought to mention it.
Yeap. It's fixed in the next version which I hope to release this week.
Tim.
Re: prepare_cached - just how much quicker should I expect?
on 27.07.2006 22:25:51 by Tim.Bunce
On Wed, Jul 26, 2006 at 10:43:34AM +0100, Martin J. Evans wrote:
>
> On 25-Jul-2006 Tim Bunce wrote:
> > Use DBI::Profile to see where the time is being spent.
>
> I got this working and could not see anything taking longer. However, on
> further investigation of my logs I have found the reason why doing a
> prepare_cached on "select LAST_INSERT_ID()" to mysql is a BAD thing to do.
> It always returns the first id ever inserted. i.e.
>
> insert into table1 values (val)
> select LAST_INSERT_ID() returns 1
> insert into table1 values (val)
> select LAST_INSERT_ID() returns 1 (instead of 2)
> etc
>
> I see there is a warning about how you can get into trouble with prepare_cached
> in the DBI pod. Perhaps it would be worth noting that there is also some SQL
> you don't want to prepare_cached as it won't work properly - like this example.
You're making a guess about the cause of the problem. You might be
right, but if so it's a bug in DBD::mysql.
But for mysql you'll find it much faster to just use the $dbh->{mysql_insertid}
attribute after each insert and avoid the select entirely.
Tim.
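Something along these lines (a sketch, using the mje auto_increment test
table that appears later in the thread):

  # after each insert into a table with an AUTO_INCREMENT column
  $dbh->do(q/insert into mje (b) values (?)/, undef, 'a');
  my $id = $dbh->{mysql_insertid};   # no extra statement, no extra round trip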
Re: prepare_cached - just how much quicker should I expect?
on 28.07.2006 16:46:20 by Martin.Evans
On 27-Jul-2006 Tim Bunce wrote:
>
> You're making a guess about the cause of the problem. You might be
> right, but if so it's a bug in DBD::mysql.
I can reproduce here so I'll try and simplify it.
To be honest, I am starting to get a bit frustrated so perhaps I'm losing some
vision - apologies. I am writing code that needs to work with around 3-6 DBDs
and seem to be hitting a lot of bugs and incompatibilities (almost on a daily
basis). DBI->last_insert_id is one I struggled to get working in all of them so
resorted to other means (I never got last_insert_id method working
in Oracle and had to resort to a trigger and a global package variable,
similarly with DB2 where I use identity_val_local()).
Today's problem is:
  use DBI;
  #my $h = DBI->connect("dbi:Oracle:XE", "xxx", "yyy");
  $h = DBI->connect("dbi:mysql:xxx", "xxx", "yyy");
  for (my $n = 0; $n < 2; $n++) {
      $h->do(q/insert into mje (b) values('a')/);
      #$sth = $h->prepare_cached("select LastInsertID() from dual");
      $sth = $h->prepare_cached("select LAST_INSERT_ID()");
      $sth->execute;
      my $r = $sth->fetchall_arrayref;
  }
which produces the warning:

  prepare_cached(select LAST_INSERT_ID()) statement handle
  DBI::st=HASH(0x9ba8114) still Active at -e line 8

when run against mysql but works fine for Oracle (see commented-out lines). Yes,
you can fix it by putting a finish in (or adding the if_active arg to
prepare_cached), but why are they different? I'd guess it's because mysql is not
calling dbd_st_finish when all the rows are consumed.
Not really asking for any help here, just using it as an illustration.
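For completeness, the two fixes mentioned look like this (a sketch; the third
argument to prepare_cached is DBI's $if_active parameter, and if I read the
DBI docs right, 1 means "finish a still-Active cached handle without warning"):

  # explicit finish once the single row has been read
  my $sth = $h->prepare_cached("select LAST_INSERT_ID()");
  $sth->execute;
  my ($id) = $sth->fetchrow_array;
  $sth->finish;    # handle is no longer Active when fetched from the cache again

  # or let prepare_cached finish it quietly on reuse
  my $sth2 = $h->prepare_cached("select LAST_INSERT_ID()", undef, 1);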
> But for mysql you'll find it much faster to just use the
> $dbh->{mysql_insertid} attribute after each insert and avoid the select
> entirely.
Connecting to a remote mysql server over a 100M network I could not see any
difference in using mysql_insertid and select LAST_INSERT_ID(), I remember
looking into that:
create table mje (a int auto_increment primary key, b char(20))
table dropped and recreated between runs of:
  use DBI;
  my $h = DBI->connect("dbi:mysql:xxx", "xxx", "yyy");
  my $sql = q/insert into mje (b) values (?)/;
  my $sth = $h->prepare($sql);
  for (my $n = 0; $n < 2000; $n++) {
      $sth->execute('aaaa');
      if ($ARGV[0]) {
          my $sql = q/select LAST_INSERT_ID()/;
          $val = $h->selectrow_array($sql);
      } else {
          $val = $h->{'mysql_insertid'};
      }
  }

run with arg (i.e. doing select for id):

  run   time(s)
  ===   =======
   1       54
   2       54
   3       53

run without arg (i.e. mysql_insertid):

  run   time(s)
  ===   =======
   1       55
   2       53
   3       55

system/user times were less for the second one but only by a tiny amount.
Martin
--
Martin J. Evans
Easysoft Ltd, UK
http://www.easysoft.com
Re: prepare_cached - just how much quicker should I expect?
on 28.07.2006 19:24:26 by Tim.Bunce
On Fri, Jul 28, 2006 at 03:46:20PM +0100, Martin J. Evans wrote:
>
> To be honest, I am starting to get a bit frustrated so perhaps I'm losing some
> vision - apologies. I am writing code that needs to work with around 3-6 DBDs
> and seem to be hitting a lot of bugs and incompatibilities (almost on a daily
> basis). DBI->last_insert_id is one I struggled to get working in all of them so
> resorted to other means (I never got last_insert_id method working
> in Oracle and had to resort to a trigger and a global package variable,
> similarly with DB2 where I use identity_val_local()).
Feel free to start a separate thread about that.
> I'd guess it's because mysql is not calling dbd_st_finish when all the rows
> are consumed.
Sure looks like a DBD::mysql bug. fetchall_arrayref should always leave
the sth inactive.
> Not really asking for any help here, just using it as an illustration.
>
>
>
> > But for mysql you'll find it much faster to just use the
> > $dbh->{mysql_insertid} attribute after each insert and avoid the select
> > entirely.
>
> Connecting to a remote mysql server over a 100M network I could not see any
> difference in using mysql_insertid and select LAST_INSERT_ID(), I remember
> looking into that:
If "select LAST_INSERT_ID()" is broken so it always returns the
same value then perhaps that contributes to the similar performance.
(More generally, as systems and networks get faster then things that
are less efficient are less noticeably inefficient. There are fewer
situations in which the inefficiency is relevant.)
Tim.