RE: ANNOUNCE: DBD::Oracle 1.18

am 10.08.2006 17:04:29 von Philip.Garrett

John Scoles wrote:
> DBD::Oracle 1.18 has been released.
>
> With this release DBD::Oracle finally implements Oracle's native
> Array Interface. You will see very dramatic increase in speed. For
> example; the time for a 2 million plus insert query dropped from well
> over an hour to less than 10 minutes when using execute_array() and
> the new code.

This new feature is really great. Thanks, John and Kristian.

The announcement and docs seem to really emphasize execute_array() over
execute_for_fetch(), though, which puzzles me. I see a dramatic speed
increase using execute_for_fetch() instead of execute_array().

I first converted my code to accumulate columns into individual
"parallel arrays" to be passed into execute_array(). I was happy to see
a 56% speed increase using the bulk insert.

Now, my program was cpu-bound and a lot of the cpu time was being eaten
by execute_array (which was, to my surprise, simply converting the
parallel arrays back into the format I originally used!). I changed the
program to just accumulate rows as arrayrefs and then called
execute_for_fetch with a simple shift() sub. Voila, another 20% faster.

Is there a reason I would want to use execute_array() with
ArrayTupleFetch instead of execute_for_fetch()? The latter is more
direct, since execute_array() just calls it in the end anyway. Would I
be missing out on some potential future optimization?

Also, is there any reason to convert old code to accumulate into single-
column arrays to be passed to execute_array, instead of using the
subroutine reference to return tuples? I assumed that the columnar
format was due to some driver implementation detail but I got burned by
that. It's much easier and much faster to convert old code to use the
fetch sub, since all it requires is to convert $sth->execute(@params) to
push(@rows,\@params).

Philip

Re: ANNOUNCE: DBD::Oracle 1.18

am 10.08.2006 18:52:48 von Tim.Bunce

On Thu, Aug 10, 2006 at 11:04:29AM -0400, Garrett, Philip (MAN-Corporate) wrote:
>
> Is there a reason I would want to use execute_array() with
> ArrayTupleFetch instead of execute_for_fetch()?

Probably not. Just use what comes most naturally for your code, which is
typically execute_for_fetch().

> The latter is more direct, since execute_array() just calls it in the
> end anyway. Would I be missing out on some potential future optimization?

Doubtful, for DBD::Oracle.

> Also, is there any reason to convert old code to accumulate into single-
> column arrays to be passed to execute_array, instead of using the
> subroutine reference to return tuples? I assumed that the columnar
> format was due to some driver implementation detail but I got burned by
> that. It's much easier and much faster to convert old code to use the
> fetch sub, since all it requires is to convert $sth->execute(@params) to
> push(@rows,\@params).

Some database api's might only support array based batches - that was the
case with older versions of Oracle. For them, using execute_for_fetch()
might be slower because they'd have to rebuild batches of parallel arrays.

I forget now which way round ODBC is. No doubt someone will tell...

Tim.

Re: ANNOUNCE: DBD::Oracle 1.18

am 10.08.2006 19:53:15 von Martin.Evans

On Thu, 2006-08-10 at 17:52 +0100, Tim Bunce wrote:
> On Thu, Aug 10, 2006 at 11:04:29AM -0400, Garrett, Philip (MAN-Corporate) wrote:
> >
> > Is there a reason I would want to use execute_array() with
> > ArrayTupleFetch instead of execute_for_fetch()?
>
> Probably not. Just use what comes most naturally for your code, which is
> typically execute_for_fetch().
>
> > The latter is more direct, since execute_array() just calls it in the
> > end anyway. Would I be missing out on some potential future optimization?
>
> Doubtful, for DBD::Oracle.
>
> > Also, is there any reason to convert old code to accumulate into single-
> > column arrays to be passed to execute_array, instead of using the
> > subroutine reference to return tuples? I assumed that the columnar
> > format was due to some driver implementation detail but I got burned by
> > that. It's much easier and much faster to convert old code to use the
> > fetch sub, since all it requires is to convert $sth->execute(@params) to
> > push(@rows,\@params).
>
> Some database api's might only support array based batches - that was the
> case with older versions of Oracle. For them, using execute_for_fetch()
> might be slower because they'd have to rebuild batches of parallel arrays.
>
> I forget now which way round ODBC is. No doubt someone will tell...

ODBC does both row-wise and column-wise binding although the default is
column-wise binding. It makes little difference which you use as the
calls are the same, its just that in row-wise binding to get to the same
column in the next row you add SQL_ATTR_BIND_TYPE to the ptr passed to
SQLBindCol and in column-wise binding you add the length of the column.
What you choose to use usually depends on the format the data you want
to bind is in. i.e. if you have an array of structures, one element per
row you use row-wise binding and it you have an array per column you use
column-wise binding.

I saw a mention in a previous posting recently suggesting DBD::ODBC
handled execute_array - I wasn't aware of this and will check but I
didn't think it did.

Martin
--
Martin J. Evans
Easysoft Limited
http://www.easysoft.com