Performance still bad.

Performance still bad.

am 15.09.2007 17:53:07 von abaumhau

Attached is output from a utility called snapper, being run against the
Session that is basically at a complete standstill. Any ideas from this
report where the problem lies?

SQL> @snapper.sql out 1 30 202

-- Session Snapper v1.03 by Tanel Poder ( http://www.tanelpoder.com )

------------------------------------------------------------ ------------------------------------------------------------ --------------------
-- SID, SNAPSHOT START , SECONDS , TYPE,
STATISTIC , DELTA, D/SEC,
HDELTA, HD/SEC
------------------------------------------------------------ ------------------------------------------------------------ --------------------
DATA, 202, 20070915 05:42:23, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 1
DATA, 202, 20070915 05:42:26, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 4
DATA, 202, 20070915 05:42:29, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 7
DATA, 202, 20070915 05:42:32, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 10
DATA, 202, 20070915 05:42:35, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 13
DATA, 202, 20070915 05:42:36, 1, STAT, opened cursors
cumulative , 6, 6, 6 6
DATA, 202, 20070915 05:42:36, 1, STAT, user
calls , 18, 18,
18 18
DATA, 202, 20070915 05:42:36, 1, STAT, session logical
reads , 164, 164, 164 164
DATA, 202, 20070915 05:42:36, 1, STAT, enqueue
requests , 1, 1,
1 1
DATA, 202, 20070915 05:42:36, 1, STAT, enqueue
releases , 1, 1,
1 1
DATA, 202, 20070915 05:42:36, 1, STAT, db block
gets , 153, 153,
153 153
DATA, 202, 20070915 05:42:36, 1, STAT, db block gets from
cache , 153, 153, 153 153
DATA, 202, 20070915 05:42:36, 1, STAT, consistent
gets , 11, 11,
11 11
DATA, 202, 20070915 05:42:36, 1, STAT, consistent gets from
cache , 11, 11, 11 11
DATA, 202, 20070915 05:42:36, 1, STAT, consistent gets -
examination , 4, 4, 4 4
DATA, 202, 20070915 05:42:36, 1, STAT, db block
changes , 119, 119,
119 119
DATA, 202, 20070915 05:42:36, 1, STAT, free buffer
requested , 1, 1,
1 1
DATA, 202, 20070915 05:42:36, 1, STAT, commit
cleanouts , 3, 3,
3 3
DATA, 202, 20070915 05:42:36, 1, STAT, commit cleanouts
successfully completed , 3, 3, 3 3
DATA, 202, 20070915 05:42:36, 1, STAT, calls to
kcmgcs , 2, 2,
2 2
DATA, 202, 20070915 05:42:36, 1, STAT, calls to
kcmgas , 2, 2,
2 2
DATA, 202, 20070915 05:42:36, 1, STAT, calls to get snapshot
scn: kcmgss , 8, 8, 8 8
DATA, 202, 20070915 05:42:36, 1, STAT, redo
entries , 61, 61,
61 61
DATA, 202, 20070915 05:42:36, 1, STAT, redo
size , 36012, 36012,
36.01k 36.01k
DATA, 202, 20070915 05:42:36, 1, STAT, redo ordering
marks , 1, 1, 1 1
DATA, 202, 20070915 05:42:36, 1, STAT, undo change vector
size , 13720, 13720, 13.72k 13.72k
DATA, 202, 20070915 05:42:36, 1, STAT, immediate (CURRENT)
block cleanout appli, 3, 3, 3 3
DATA, 202, 20070915 05:42:36, 1, STAT, deferred (CURRENT)
block cleanout applic, 6, 6, 6 6
DATA, 202, 20070915 05:42:36, 1, STAT, commit txn count
during cleanout , 2, 2, 2 2
DATA, 202, 20070915 05:42:36, 1, STAT, active txn count
during cleanout , 2, 2, 2 2
DATA, 202, 20070915 05:42:36, 1, STAT, cleanout - number of
ktugct calls , 3, 3, 3 3
DATA, 202, 20070915 05:42:36, 1, STAT, leaf node
splits , 1, 1,
1 1
DATA, 202, 20070915 05:42:36, 1, STAT, heap block
compress , 1, 1,
1 1
DATA, 202, 20070915 05:42:36, 1, STAT, session cursor cache
hits , 6, 6, 6 6
DATA, 202, 20070915 05:42:36, 1, STAT, parse count
(total) , 6, 6,
6 6
DATA, 202, 20070915 05:42:36, 1, STAT, execute
count , 12, 12,
12 12
DATA, 202, 20070915 05:42:36, 1, STAT, bytes sent via
SQL*Net to client , 10438540, 10438540, 10.44M 10.44M
DATA, 202, 20070915 05:42:36, 1, STAT, bytes received via
SQL*Net from client , 20823, 20823, 20.82k 20.82k
DATA, 202, 20070915 05:42:36, 1, STAT, SQL*Net roundtrips
to/from client , 26, 26, 26 26
DATA, 202, 20070915 05:42:36, 1, TIME, repeated bind elapsed
time , 30, 30, 30us 30us
DATA, 202, 20070915 05:42:36, 1, TIME, parse time
elapsed , 87, 87,
87us 87us
DATA, 202, 20070915 05:42:36, 1, TIME, DB
CPU , 2086, 2086,
2.09ms 2.09ms
DATA, 202, 20070915 05:42:36, 1, TIME, sql execute elapsed
time , 1794, 1794, 1.79ms 1.79ms
DATA, 202, 20070915 05:42:36, 1, TIME, DB
time , 2086, 2086,
2.09ms 2.09ms
DATA, 202, 20070915 05:42:36, 1, WAIT, virtual circuit
status , -707825, -707825, -707.83ms -707.83ms
DATA, 202, 20070915 05:42:36, 1, WAIT, SQL*Net message to
client , 26, 26, 26us 26us
DATA, 202, 20070915 05:42:36, 1, WAIT, SQL*Net message from
client , 16, 16, 16us 16us
DATA, 202, 20070915 05:42:36, 1, WAIT, SQL*Net more data
from client , 29292216, 29292216, 29.29s 29.29s
-- End of snap 14
DATA, 202, 20070915 05:42:38, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 16
DATA, 202, 20070915 05:42:41, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 19
DATA, 202, 20070915 05:42:44, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 22
DATA, 202, 20070915 05:42:47, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 25
DATA, 202, 20070915 05:42:50, 1, WAIT, virtual circuit
status , 3000000, 3000000, 3s 3s
-- End of snap 28

PL/SQL procedure successfully completed.

Re: Oracle Performance still bad.

am 16.09.2007 17:05:38 von abaumhau

This is an Oracle database (10.2g). I apologize if this question isn't
DBI or DBD::Oracle related.

Basically, I have an instance of Oracle running a pair of tablespaces
(one for data, the other for indexes) with 101 million records (data dbf
is 53GB). On the same box (a quad core Xeon with 16GB of ram) I'm
running an empty database with the same schema.

Using DBD::Oracle, I open both databases (src, dst) and do the following:

Build query for the names of the tables in the src db

foreach table
query the column names in this table
push ROWID onto column list
build query string using column list (not using SELECT *)
prepare stm
bind columns to an array i'll call @row
build destination insert statement using placeholders from the
column list (minus ROWID)
prepare dst_sth
while src_sth->fetch()
modify @row values (@new_row)
bind new_row values as bind_params on dst_sth
insert @new_row with execute, returns ROWID (using RETURNING INTO
:new_id)
insert tablename, new_rowid, old_rowid (kind of like logging...)
commit every 1000 rows
rinse repeat for each row
rinse repeat for each table


I've done this type of processing on MySQL database before. This one is
on Oracle.

When running the code, I see a huge performance problem that pops up
what appears to be randomly on the execute for the insert. When running
DBI::Profile, and using Time::HiRes around the each of the steps, I've
isolated the delay at the execute statement.

When a delay occurs, it is always 29.99999 (basically 30) seconds for
the execute. Good executes occur in 0.003 seconds.

Here is a piece of DBI::Profile for 20000 rows in the loop:

execute' =>
'DBD::Oracle::st::execute' =>
130.783813s / 20000 = 0.006539s avg (first 0.004013s, min
0.000390s, max 29.996666s)

As you can see, the delay occurred a quite a number of times during this
run (I'm seeing about 1000 rows/second when it "works"). The delay is
always 30 seconds, so this just screams of a timer or timeout somewhere
in code. I've stripped the code to the bare bones (read a row, write a
row, drop all indexes on the dst_database), and I can duplicate the problem.

Using an Oracle procedure I found on the net (snapper), I ran it against
the SID that the above code was connected to. From looking at the
output, I believe all of the time is being consumed in "WAIT, SQL*Net
more data from client , 29292216, 29292216,
29.29s 29.29s ".

According to what I can Google, this implies that there's a "network
performance" problem. I'm connecting with TCP on localhost. I've made
the listener changes suggested in the DBD::Oracle docs. I'm running a
queue depth of 50, with 40 listeners pre-started.

When capturing data using tcpdump, I see the following around the 30
second delay:

DBD::Oracle sends a request data packet (t=0)
TCP ACK packet in 17 microseconds later (t=0.000017)
Delay of 29.995692 seconds before the first DATA RESPONSE packet arrives
from Oracle (port 1521)
DBD::Oracle request data (an insert into second table the ROWID tracking
table) 35.4 microseconds later
Oracle response 75 microseconds later

Subsequent packets are the same, but the inter-packet times are around
14 - 17 microseconds for the reading/writing.

Based on the packet trace, clearly Oracle isn't responding very quickly.

Martin Evans suggested that this might be a "Quantum Wait" timer in
Oracle. He also hinted that the user doing the inserts might be in a
Resource Group that was limiting the performance. After Googling and
looking at the Database Resource Manager, I found my user was in the
DEFAULT_RESOURCE_GROUP. I attempted to move my user to the SYS_GROUP,
thinking that I'd get better performance. I restarted the database, and
ran the test again. It sure looked like the problem disappeared. After
a while, though, the problem returned. The more data I process, it
seems the more the delay is introduced.

I don't think that I'm even running the DRM, because queries into the V$
tables that refer to DRM related things return zero rows.

I'm at a loss to determine what Oracle might be doing to kill
performance. A year ago, there was a post to this list that suggested
the same problem I'm seeing. Unfortunately, there wasn't follow up to
suggest the root cause and remedy.

Any suggestions would be greatly appreciated.



Andy Baumhauer