Autovacuum stopped running

Autovacuum stopped running

am 01.05.2010 00:21:41 von Benjamin Krajmalnik

This is a multi-part message in MIME format.

------_=_NextPart_001_01CAE8B3.AD5B9EC1
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

I had a very strange occurrence 2 days ago where autovacuum appears to
have stopped running.

I did not see any error messages in pg_log. The problem caused a file
which is usually around 50Mb but which gets updated extensively to grow
to 105GB, which in turn brought the server's performance down to
unusable levels.

Restarting Postgres got the autovacuum running again, but was just
wondering what could have cause this.

=20

PostgreSQL 8.4.3 amd64 on FreeBSD 7.2.

=20


------_=_NextPart_001_01CAE8B3.AD5B9EC1
Content-Type: text/html;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:x=3D"urn:schemas-microsoft-com:office:excel" =
xmlns:p=3D"urn:schemas-microsoft-com:office:powerpoint" =
xmlns:a=3D"urn:schemas-microsoft-com:office:access" =
xmlns:dt=3D"uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" =
xmlns:s=3D"uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882" =
xmlns:rs=3D"urn:schemas-microsoft-com:rowset" xmlns:z=3D"#RowsetSchema" =
xmlns:b=3D"urn:schemas-microsoft-com:office:publisher" =
xmlns:ss=3D"urn:schemas-microsoft-com:office:spreadsheet" =
xmlns:c=3D"urn:schemas-microsoft-com:office:component:spread sheet" =
xmlns:odc=3D"urn:schemas-microsoft-com:office:odc" =
xmlns:oa=3D"urn:schemas-microsoft-com:office:activation" =
xmlns:html=3D"http://www.w3.org/TR/REC-html40" =
xmlns:q=3D"http://schemas.xmlsoap.org/soap/envelope/" =
xmlns:rtc=3D"http://microsoft.com/officenet/conferencing" =
xmlns:D=3D"DAV:" xmlns:Repl=3D"http://schemas.microsoft.com/repl/" =
xmlns:mt=3D"http://schemas.microsoft.com/sharepoint/soap/mee tings/" =
xmlns:x2=3D"http://schemas.microsoft.com/office/excel/2003/x ml" =
xmlns:ppda=3D"http://www.passport.com/NameSpace.xsd" =
xmlns:ois=3D"http://schemas.microsoft.com/sharepoint/soap/oi s/" =
xmlns:dir=3D"http://schemas.microsoft.com/sharepoint/soap/di rectory/" =
xmlns:ds=3D"http://www.w3.org/2000/09/xmldsig#" =
xmlns:dsp=3D"http://schemas.microsoft.com/sharepoint/dsp" =
xmlns:udc=3D"http://schemas.microsoft.com/data/udc" =
xmlns:xsd=3D"http://www.w3.org/2001/XMLSchema" =
xmlns:sub=3D"http://schemas.microsoft.com/sharepoint/soap/20 02/1/alerts/"=
xmlns:ec=3D"http://www.w3.org/2001/04/xmlenc#" =
xmlns:sp=3D"http://schemas.microsoft.com/sharepoint/" =
xmlns:sps=3D"http://schemas.microsoft.com/sharepoint/soap/" =
xmlns:xsi=3D"http://www.w3.org/2001/XMLSchema-instance" =
xmlns:udcs=3D"http://schemas.microsoft.com/data/udc/soap" =
xmlns:udcxf=3D"http://schemas.microsoft.com/data/udc/xmlfile " =
xmlns:udcp2p=3D"http://schemas.microsoft.com/data/udc/partto part" =
xmlns:wf=3D"http://schemas.microsoft.com/sharepoint/soap/wor kflow/" =
xmlns:dsss=3D"http://schemas.microsoft.com/office/2006/digsi g-setup" =
xmlns:dssi=3D"http://schemas.microsoft.com/office/2006/digsi g" =
xmlns:mdssi=3D"http://schemas.openxmlformats.org/package/200 6/digital-sig=
nature" =
xmlns:mver=3D"http://schemas.openxmlformats.org/markup-compa tibility/2006=
" xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml" =
xmlns:mrels=3D"http://schemas.openxmlformats.org/package/200 6/relationshi=
ps" xmlns:spwp=3D"http://microsoft.com/sharepoint/webpartpages" =
xmlns:ex12t=3D"http://schemas.microsoft.com/exchange/service s/2006/types"=
=
xmlns:ex12m=3D"http://schemas.microsoft.com/exchange/service s/2006/messag=
es" =
xmlns:pptsl=3D"http://schemas.microsoft.com/sharepoint/soap/ SlideLibrary/=
" =
xmlns:spsl=3D"http://microsoft.com/webservices/SharePointPor talServer/Pub=
lishedLinksService" xmlns:Z=3D"urn:schemas-microsoft-com:" =
xmlns:st=3D"" xmlns=3D"http://www.w3.org/TR/REC-html40">


charset=3Dus-ascii">









I had a very strange occurrence 2 days ago where =
autovacuum
appears to have stopped running.



I did not see any error messages in pg_log.  =
The
problem caused a file which is usually around 50Mb but which gets =
updated
extensively to grow to 105GB, which in turn brought the server’s
performance down to unusable levels.



Restarting Postgres got the autovacuum running =
again, but
was just wondering what could have cause this.



 



PostgreSQL 8.4.3 amd64 on FreeBSD =
7.2.



 









------_=_NextPart_001_01CAE8B3.AD5B9EC1--

Re: Autovacuum stopped running

am 01.05.2010 16:26:33 von Kevin Grittner

"Benjamin Krajmalnik" wrote:
> I had a very strange occurrence 2 days ago where autovacuum
> appears to have stopped running.

What did you see which led you to that conclusion? Did ps show the
'autovacuum launcher process'?

> I did not see any error messages in pg_log. The problem caused a
> file which is usually around 50Mb but which gets updated
> extensively to grow to 105GB, which in turn brought the server's
> performance down to unusable levels.

Were there any long-running transactions showing in
pg_stat_activity? (That would include transactions showing '
in transaction'.)

-Kevin

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 01.05.2010 19:15:53 von Benjamin Krajmalnik

TO be truthful, I did not check that.
I arrived at the conclusion because although I have 10 autovacuum
threads, none of the tables had been autoanalyzed or autovacuumed in
hours (and due to the nature of the activity they do so every minute).
If I ever see this behavior now I know what to check for :) Thx.

> -----Original Message-----
> From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
> Sent: Saturday, May 01, 2010 8:27 AM
> To: Benjamin Krajmalnik; pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Autovacuum stopped running
>=20
> "Benjamin Krajmalnik" wrote:
> > I had a very strange occurrence 2 days ago where autovacuum
> > appears to have stopped running.
>=20
> What did you see which led you to that conclusion? Did ps show the
> 'autovacuum launcher process'?
>=20
> > I did not see any error messages in pg_log. The problem caused a
> > file which is usually around 50Mb but which gets updated
> > extensively to grow to 105GB, which in turn brought the server's
> > performance down to unusable levels.
>=20
> Were there any long-running transactions showing in
> pg_stat_activity? (That would include transactions showing '
> in transaction'.)
>=20
> -Kevin

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 02.05.2010 01:09:35 von Tom Lane

"Benjamin Krajmalnik" writes:
> TO be truthful, I did not check that.
> I arrived at the conclusion because although I have 10 autovacuum
> threads, none of the tables had been autoanalyzed or autovacuumed in
> hours (and due to the nature of the activity they do so every minute).
> If I ever see this behavior now I know what to check for :) Thx.

We've seen one or two previous reports that suggested that the autovac
launcher might have stopped launching workers, but it hasn't been
reproducible. Are you using any nonstandard parameters for autovacuum?

regards, tom lane

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 02.05.2010 01:21:49 von Benjamin Krajmalnik

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay =3D 0ms # 0-100 milliseconds
#vacuum_cost_page_hit =3D 1 # 0-10000 credits
#vacuum_cost_page_miss =3D 10 # 0-10000 credits
#vacuum_cost_page_dirty =3D 20 # 0-10000 credits
vacuum_cost_limit =3D 10000 # 1-10000 credits

#----------------------------------------------------------- ------------
-------
# AUTOVACUUM PARAMETERS
#----------------------------------------------------------- ------------
-------

#autovacuum =3D on # Enable autovacuum subprocess?
'on'=20
# requires track_counts to also
be on.
#log_autovacuum_min_duration =3D -1 # -1 disables, 0 logs all
actions and
# their durations, > 0 logs only
# actions running at least this
number
# of milliseconds.
autovacuum_max_workers =3D 10 # max number of autovacuum
subprocesses
#autovacuum_naptime =3D 1min # time between autovacuum runs
#autovacuum_vacuum_threshold =3D 50 # min number of row updates
before
# vacuum
#autovacuum_analyze_threshold =3D 50 # min number of row updates
before=20
# analyze
#autovacuum_vacuum_scale_factor =3D 0.2 # fraction of table size before
vacuum
#autovacuum_analyze_scale_factor =3D 0.1 # fraction of table size before
analyze
#autovacuum_freeze_max_age =3D 200000000 # maximum XID age before forced
vacuum
# (change requires restart)
#autovacuum_vacuum_cost_delay =3D 20ms # default vacuum cost delay for
# autovacuum, in milliseconds;
# -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit =3D -1 # default vacuum cost limit for
# autovacuum, -1 means use
# vacuum_cost_limit


> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Saturday, May 01, 2010 5:10 PM
> To: Benjamin Krajmalnik
> Cc: Kevin Grittner; pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Autovacuum stopped running
>=20
> "Benjamin Krajmalnik" writes:
> > TO be truthful, I did not check that.
> > I arrived at the conclusion because although I have 10 autovacuum
> > threads, none of the tables had been autoanalyzed or autovacuumed in
> > hours (and due to the nature of the activity they do so every
> minute).
> > If I ever see this behavior now I know what to check for :) Thx.
>=20
> We've seen one or two previous reports that suggested that the autovac
> launcher might have stopped launching workers, but it hasn't been
> reproducible. Are you using any nonstandard parameters for
autovacuum?
>=20
> regards, tom lane

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 05:17:04 von Benjamin Krajmalnik

OK.
This happened again on another server with these same settings.
I did not see any long running transaction, and the autovacuum launch
process was running.


> -----Original Message-----
> From: Benjamin Krajmalnik
> Sent: Saturday, May 01, 2010 5:22 PM
> To: 'Tom Lane'
> Cc: Kevin Grittner; pgsql-admin@postgresql.org
> Subject: RE: [ADMIN] Autovacuum stopped running
>=20
> # - Cost-Based Vacuum Delay -
>=20
> #vacuum_cost_delay =3D 0ms # 0-100 milliseconds
> #vacuum_cost_page_hit =3D 1 # 0-10000 credits
> #vacuum_cost_page_miss =3D 10 # 0-10000 credits
> #vacuum_cost_page_dirty =3D 20 # 0-10000 credits
> vacuum_cost_limit =3D 10000 # 1-10000 credits
>=20
>
#----------------------------------------------------------- -----------
> --------
> # AUTOVACUUM PARAMETERS
>
#----------------------------------------------------------- -----------
> --------
>=20
> #autovacuum =3D on # Enable autovacuum subprocess?
'on'
> # requires track_counts to also
be on.
> #log_autovacuum_min_duration =3D -1 # -1 disables, 0 logs all
actions
> and
> # their durations, > 0 logs only
> # actions running at least this
number
> # of milliseconds.
> autovacuum_max_workers =3D 10 # max number of autovacuum
> subprocesses
> #autovacuum_naptime =3D 1min # time between autovacuum runs
> #autovacuum_vacuum_threshold =3D 50 # min number of row updates
before
> # vacuum
> #autovacuum_analyze_threshold =3D 50 # min number of row updates
before
> # analyze
> #autovacuum_vacuum_scale_factor =3D 0.2 # fraction of table size
> before vacuum
> #autovacuum_analyze_scale_factor =3D 0.1 # fraction of table size
> before analyze
> #autovacuum_freeze_max_age =3D 200000000 # maximum XID age before
> forced vacuum
> # (change requires restart)
> #autovacuum_vacuum_cost_delay =3D 20ms # default vacuum cost delay
> for
> # autovacuum, in milliseconds;
> # -1 means use vacuum_cost_delay
> #autovacuum_vacuum_cost_limit =3D -1 # default vacuum cost limit for
> # autovacuum, -1 means use
> # vacuum_cost_limit
>=20
>=20
> > -----Original Message-----
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > Sent: Saturday, May 01, 2010 5:10 PM
> > To: Benjamin Krajmalnik
> > Cc: Kevin Grittner; pgsql-admin@postgresql.org
> > Subject: Re: [ADMIN] Autovacuum stopped running
> >
> > "Benjamin Krajmalnik" writes:
> > > TO be truthful, I did not check that.
> > > I arrived at the conclusion because although I have 10 autovacuum
> > > threads, none of the tables had been autoanalyzed or autovacuumed
> in
> > > hours (and due to the nature of the activity they do so every
> > minute).
> > > If I ever see this behavior now I know what to check for :) Thx.
> >
> > We've seen one or two previous reports that suggested that the
> autovac
> > launcher might have stopped launching workers, but it hasn't been
> > reproducible. Are you using any nonstandard parameters for
> autovacuum?
> >
> > regards, tom lane

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 15:47:38 von Alvaro Herrera

Benjamin Krajmalnik wrote:
> OK.
> This happened again on another server with these same settings.
> I did not see any long running transaction, and the autovacuum launch
> process was running.

Did you check the server log for FATAL or ERROR lines? Maybe it's dying
trying to vacuum some table and isn't ever able to get to the next ones.


--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 15:52:06 von Kevin Grittner

"Benjamin Krajmalnik" wrote:

> This happened again on another server with these same settings.
> I did not see any long running transaction, and the autovacuum
> launch process was running.

Like last time, all ten autovacuum workers were running? What does
CPU and disk activity look like in this state? The contents of
pg_locks might yield a clue. If you have all ten workers showing on
the task list, but they don't seem to be making much progress,
connecting to each with gdb (or equivalent) and getting a backtrace
might be useful.

-Kevin

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 15:58:31 von Tom Lane

"Benjamin Krajmalnik" writes:
> This happened again on another server with these same settings.
> I did not see any long running transaction, and the autovacuum launch
> process was running.

Hmm ... could you strace the launcher for a couple of minutes to see
whether it's doing anything?

Also, I've forgotten exactly which PG version you're running?

regards, tom lane

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 16:27:34 von Kevin Grittner

Tom Lane wrote:

> Also, I've forgotten exactly which PG version you're running?

The original post said:

PostgreSQL 8.4.3 amd64 on FreeBSD 7.2.

-Kevin

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 16:53:43 von Benjamin Krajmalnik

Tom,

1. I had to restart the postgres server, otherwise I would have been in
deep problems.
2. I just tried installing strace from the ports system - unfortunately
it appears that FreeBSD only has it for i38s, not amd64. Any
alternatives so that if this happens again I can take care of this?

>=20
> Hmm ... could you strace the launcher for a couple of minutes to see
> whether it's doing anything?
>=20
> Also, I've forgotten exactly which PG version you're running?
>=20
> regards, tom lane

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 17:05:18 von Benjamin Krajmalnik

I did not find any errors pertaining to the autovacuum task.
In both cases we had a data-centric issue which was causing excessive
errors on the database (we have a timestamp field and an external
utility which we use to collect product keys was sending the date as
"dd/mm/yyyy hh:mm:ss" for some entries. We modified our processing
script to catch this condition and rewrite it to "mm/dd/yyyy hh:mm:ss"
on the server script, and I will be modifying the agent to send it in as
"yyyymmdd hh:mm:ss" in our next agent release to bullet proof it.


> -----Original Message-----
> From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
> Sent: Monday, May 03, 2010 7:48 AM
> To: Benjamin Krajmalnik
> Cc: Tom Lane; pgsql-admin@postgresql.org
> Subject: Re: [ADMIN] Autovacuum stopped running
>=20
> Benjamin Krajmalnik wrote:
> > OK.
> > This happened again on another server with these same settings.
> > I did not see any long running transaction, and the autovacuum
launch
> > process was running.
>=20
> Did you check the server log for FATAL or ERROR lines? Maybe it's
> dying
> trying to vacuum some table and isn't ever able to get to the next
> ones.
>=20
>=20
> --
> Alvaro Herrera
> http://www.CommandPrompt.com/
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Autovacuum stopped running

am 03.05.2010 17:26:43 von Alvaro Herrera

Benjamin Krajmalnik wrote:
> I did not find any errors pertaining to the autovacuum task.

Hmm. I think it would be good to find out what is the launcher doing,
if anything. If you strace it, do you see it sending signals to
postmaster?

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin