optimizer behavior in the case of highly updated tables
on 09.06.2010 04:58:34 by Mark Rostron
I am looking for some specific information regarding optimizer behavior.
We recently experienced a situation where a query that was previously using a btree lookup (efficient) SWITCHED to using a seqscan/hash lookup.
My questions would be:
- Under what circumstances is the optimizer likely to CHANGE behavior from using a btree index lookup to using a seq scan/hash lookup? (See the plan-inspection sketch below.)
- What are the critical decision factors that would feed into the optimizer making such a change?
- Is it possible to measure any metrics in a way that would enable a prediction of such a change?
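For reference, this is the kind of check we have been using to see which plan is chosen; the table and column names here are anonymized placeholders, not our real schema:

-- The plan output shows whether the lookup uses an Index Scan on the
-- btree or a Seq Scan (possibly feeding a hash); we run this before
-- and after maintenance to see when the plan flips.
EXPLAIN ANALYZE
SELECT *
FROM big_table
WHERE indexed_col = 'some-value';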
Platform
- 8.3.10 (64bit) on RHEL5.
- Linux xxxxx 2.6.18-164.10.1.el5xen #1 SMP Thu Jan 7 20:28:30 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
Application
The table in question is:
- 30m rows, variable length (contains varchar fields), row length avg about 120B
- Approx. 3m unique values in the index column
Activity on the table would be, per row: "insert, multiple updates, delete after 90 days".
We vacuum analyze this table once weekly.
No partitions are used.
Our experience which prompts this question was as follows:
- If the table is not "vacuum analyze'd" at least once/week, the query plans become unpredictable as to whether they will use btree or seqscan/hash lookup
- Until last week, "vacuum analyze" was sufficient
- Friday evening of last week, the query plan for selected queries against this index changed again, but "vacuum analyze" was insufficient
- Rebuilding the index on the primary key and on the column index was insufficient
- It was necessary to take a site outage and perform a "vacuum full analyze" on the table
- Following this, the query plan reverted to the more efficient btree lookup
Clearly, the garbage buildup resulting from transaction activity on the table is the villain here.
- Is it possible to calculate expected space usage given row count and average row size? (Rough sketch of what I mean below.)
- At what point might the ratio of "expected"/"actual" space usage be able to indicate the need to perform "full vacuum", or similar maintenance?
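To make the first question concrete, here is a rough sketch of the calculation I have in mind, using the 120B average row width quoted above and ignoring per-row header overhead (the table name is a placeholder):

-- Compare the heap size implied by the planner's row estimate and the
-- average row width against the actual on-disk size; a large ratio
-- suggests dead space that plain VACUUM has not reclaimed.
SELECT reltuples::bigint AS estimated_rows,
       pg_relation_size('big_table') AS actual_bytes,
       reltuples::bigint * 120 AS expected_data_bytes,
       round(pg_relation_size('big_table')::numeric
             / NULLIF(reltuples::bigint * 120, 0), 2) AS bloat_ratio
FROM pg_class
WHERE relname = 'big_table';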
Any observations/comments that anyone would care to make are welcome.
Thanks in advance for your time
Mr
Re: optimizer behavior in the case of highly updated tables
on 09.06.2010 08:04:04 by Greg Smith
Mark Rostron wrote:
>
> - It was necessary to take a site outage and perform a "vacuum full
> analyze" on the table
>
> - Following this, the query plan reverted to the more efficient btree
> lookup
>
> Clearly, the garbage buildup resulting from transaction activity on
> the table is the villain here.
>
> - Is it possible to calculate expected space usage given row count and
> average row size
>
> - At what point might the ratio of "expected"/"actual" space usage be
> able to indicate the need to perform "full vacuum", or similar maintenance
>
I think you're right to focus on this part, because with your usage
pattern--deleting all old data constantly--you have to get this under
control in order for the query planner to do the right thing here.
Start by reading http://wiki.postgresql.org/wiki/VACUUM_FULL
What you probably should have done in order to return to sane query
plans was to run CLUSTER instead of VACUUM FULL. The site would have
been down less time, and you actually made some future problems a bit
worse by bloating your indexes with the problematic FULL.
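On 8.3 that would be something along these lines (table and index
names invented for illustration):

-- CLUSTER rewrites the table in the order of the given index and
-- rebuilds all of its indexes from scratch, so heap and index bloat
-- both go away. It takes an exclusive lock, like VACUUM FULL, but
-- typically finishes faster and leaves the indexes compact.
CLUSTER big_table USING big_table_col_idx;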
As for measuring what's going wrong here, what you want to do is run
this around once a week during a slow period and save the output for
analysis:

VACUUM VERBOSE;

This will give you a bunch of statistics about space usage, including a
summary at the end that will tell you if there's a serious problem you
might already be running into (running out of space in the free space
map). Combine that with a look at pg_stat_user_tables and you should
have a decent initial statistical picture of what's going on with your
data.
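For example (substitute your table name), 8.3 tracks live and dead row
counts plus the last vacuum/analyze times:

-- Dead-row bookkeeping and maintenance history for one table; a
-- steadily growing n_dead_tup between vacuums is the early warning.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'big_table';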
There are two large-scale things you may need to consider if volume on
your site expands in the future:
1) Deleting the historical data in smaller chunks and doing an immediate
VACUUM afterwards. Perhaps as often as daily. This keeps the amount of
dead space VACUUM has to clean up as small as possible, at the expense
of higher site overhead. (See the first sketch below.)
2) Start partitioning your tables. This allows you to just DROP old
partitions rather than deleting rows from a single master table. It can
make this whole class of problem go away. (See the second sketch below.)
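For 1), a nightly job could look like this (column name and retention
window are made up; the point is purging in day-sized batches and
vacuuming immediately):

-- Purge just the rows that aged past 90 days since the last run,
-- then vacuum right away so the freed space is reusable before much
-- new dead space accumulates.
DELETE FROM big_table
WHERE created_at < now() - interval '90 days';
VACUUM ANALYZE big_table;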
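For 2), partitioning on 8.3 means inheritance plus CHECK constraints; a
minimal sketch with invented names (you would also need a rule or
trigger to route inserts, and constraint_exclusion = on so the planner
skips irrelevant children):

-- One child table per month; rows for June 2010 land here.
CREATE TABLE big_table_2010_06 (
    CHECK (created_at >= DATE '2010-06-01'
       AND created_at <  DATE '2010-07-01')
) INHERITS (big_table);

-- Retiring 90-day-old data becomes a metadata-only operation that
-- leaves no dead rows behind for VACUUM:
DROP TABLE big_table_2010_03;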
--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin