PD_ALL_VISIBLE flag warning

PD_ALL_VISIBLE flag warning

am 01.04.2010 15:10:39 von Jonathan Foy

--00163646b966d8b97f04832c96ea
Content-Type: text/plain; charset=ISO-8859-1

Hello

I came in this morning and noticed this warning sitting in my inbox quite a
few times...

WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "table_2010q1"
page 471118
WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "table_2010q1"
page 471119
WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation "table_2010q1"
page 471120
.....

and I'm wondering how worked up I should be getting.

I'm trying to do due diligence on my side and am digging through archives,
but just in case it's a cause for greater alarm than I think it is I'm going
to throw this out there and see what those more experienced that I say.

Postgres version 8.4.1, redhat 5.3.

The table in question is about 6 million rows, a single partition in a
larger set. A relatively heavy load (mass insert followed by a mass update)
takes place each morning, during the time period where I was seeing this
error. Auto-vacuum apparently kicked off at this time as well (auto-analyze
is showing last-run at 4:27 AM, auto-vacuum at 8:48). The first warning
kicked off at 4:55.

My assumption here is that the load from the inserts/updates was interfering
with auto-vacuum in some way, and resulted in the above warnings. Is this a
reasonable assumption at all? I'm under the impression that auto-vacuum
will kill itself if it notices that it's hurting performance, but I wouldn't
think it should throw all of these warnings...does it indicate poor
auto-vacuum configuration? I admit that said configuration is an area that
I've been meaning to look into more (upgraded from 8.1 where we didn't use
auto-vacuum), but hasn't ended up on the top of the priorities list yet.

Let me know if I'm not providing enough info.

--00163646b966d8b97f04832c96ea
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hello

I came in this morning and noticed this warning sitting in my =
inbox quite a few times...

WARNING:=A0 PD_ALL_VISIB=
LE flag was incorrectly set in=20
relation "table_2010q1" page 471118

WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation=20
"
table_2010q1" pa=
ge 471119

WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation=20
"
table_2010q1" pa=
ge 471120
....

and I'm wondering how worked up I should be ge=
tting.

I'm trying to do due diligence on my side and am digging =
through archives, but just in case it's a cause for greater alarm than =
I think it is I'm going to throw this out there and see what those more=
experienced that I say.


Postgres version 8.4.1, redhat 5.3.

The table in question is abo=
ut 6 million rows, a single partition in a larger set.=A0 A relatively heav=
y load (mass insert followed by a mass update) takes place each morning, du=
ring the time period where I was seeing this error.=A0 Auto-vacuum apparent=
ly kicked off at this time as well (auto-analyze is showing last-run at 4:2=
7 AM, auto-vacuum at 8:48).=A0 The first warning kicked off at 4:55.


My assumption here is that the load from the inserts/updates was interf=
ering with auto-vacuum in some way, and resulted in the above warnings.=A0 =
Is this a reasonable assumption at all?=A0 I'm under the impression tha=
t auto-vacuum will kill itself if it notices that it's hurting performa=
nce, but I wouldn't think it should throw all of these warnings...does =
it indicate poor auto-vacuum configuration? I admit that said configuration=
is an area that I've been meaning to look into more (upgraded from 8.1=
where we didn't use auto-vacuum), but hasn't ended up on the top o=
f the priorities list yet.


Let me know if I'm not providing enough info.


--00163646b966d8b97f04832c96ea--

Re: PD_ALL_VISIBLE flag warning

am 08.04.2010 06:59:12 von Greg Stark

On Thu, Apr 1, 2010 at 2:10 PM, Jonathan Foy wrote:
> Hello
>
> I came in this morning and noticed this warning sitting in my inbox quite=
a
> few times...
>
> WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation "table_20=
10q1"
> page 471118
> WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation "table_20=
10q1"
> page 471119
> WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation "table_20=
10q1"
> page 471120
> ....
>
> and I'm wondering how worked up I should be getting.

This is a warning about a fairly serious data corruption problem. In
the worst case prior queries returned incorrect results -- which if
those results were used to calculate subsequent updates could have
caused your data to be inconsistent.

Specifically, having that flag set would cause rows to be returned
which should not have been returned, either because they had been
deleted, the transaction inserting them aborted, or they had been
updated (causing duplicate copies of the same row to be returned).

There was a bug which could cause this in 8.4.0 but it was fixed in
8.4.1. You say you're running 8.4.1 (you should be running 8.4.3
though none of the subsequent bug-fixes are as serious if the SSL
security vulnerability isn't relevant) so this is quite worrisome.
Nobody else has reported anything similar though.

Is it possible these pages have never been vacuumed since you updated
from 8.4.0 to 8.4.1? I'm not sure how likely that is. In 8.4 Postgres
tries to avoid vacuuming pages that don't need it so it's possible
this situation could live on for longer than in the past.

I think a regular "VACUUM" of every table should guarantee that any
remnants of this bug from 8.4.0 are cleaned up. If you do this and
possibly get some more warnings from it but never see it again
afterwards then I would assume they're leftovers from that old bug.

--=20
greg

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: PD_ALL_VISIBLE flag warning

am 08.04.2010 14:57:00 von Jonathan Foy

--0016e65b5b0aed95180483b93654
Content-Type: text/plain; charset=ISO-8859-1

Thanks for the reply...

We have an unhealthy habit of not updating very often around here, one that
I'm trying to fix. We run both 8.1 and 8.4, and we're testing the update to
8.1.20 this weekend....from 8.1.3. Across the board 8.4 update will take
much more testing due to the typing issue. Other 8.4 updates will take place
soon.

Anyway, my guilt about outdated versions aside...

We never actually ran 8.4.0. We went straight from 8.3.3 to 8.4.1. I did a
full dump/restore from the 8.3.3 database into the 8.4.1 database (using the
8.4 pg_dump), partitioning as I went. The table in question was not
partitioned in 8.3, the 8.4 dump/restore seemed as good a place to do the
partitioning as any. I don't specifically remember doing it, but I almost
always do a manual vacuum/analyze after a restore. Autovacuum is/was also
turned on. According to the stats, everything seems to have been at least
vacuumed on 2/15 (the date of the update I believe), so it is doubtful to me
that any of the tables, especially the currently active partition, would
have gone un-vacuumed as of 4/1.

The logs for that day have already rotated into less-easily available
storage, but I'm considering retrieving them to look for any
aborts/rollbacks.

Also, for what it's worth, we never delete rows from this table, we only
insert and update.

I have not seen the warning since.

Sound like trouble?


On Thu, Apr 8, 2010 at 12:59 AM, Greg Stark wrote:

> On Thu, Apr 1, 2010 at 2:10 PM, Jonathan Foy wrote:
> > Hello
> >
> > I came in this morning and noticed this warning sitting in my inbox quite
> a
> > few times...
> >
> > WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation
> "table_2010q1"
> > page 471118
> > WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation
> "table_2010q1"
> > page 471119
> > WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation
> "table_2010q1"
> > page 471120
> > ....
> >
> > and I'm wondering how worked up I should be getting.
>
> This is a warning about a fairly serious data corruption problem. In
> the worst case prior queries returned incorrect results -- which if
> those results were used to calculate subsequent updates could have
> caused your data to be inconsistent.
>
> Specifically, having that flag set would cause rows to be returned
> which should not have been returned, either because they had been
> deleted, the transaction inserting them aborted, or they had been
> updated (causing duplicate copies of the same row to be returned).
>
> There was a bug which could cause this in 8.4.0 but it was fixed in
> 8.4.1. You say you're running 8.4.1 (you should be running 8.4.3
> though none of the subsequent bug-fixes are as serious if the SSL
> security vulnerability isn't relevant) so this is quite worrisome.
> Nobody else has reported anything similar though.
>
> Is it possible these pages have never been vacuumed since you updated
> from 8.4.0 to 8.4.1? I'm not sure how likely that is. In 8.4 Postgres
> tries to avoid vacuuming pages that don't need it so it's possible
> this situation could live on for longer than in the past.
>
> I think a regular "VACUUM" of every table should guarantee that any
> remnants of this bug from 8.4.0 are cleaned up. If you do this and
> possibly get some more warnings from it but never see it again
> afterwards then I would assume they're leftovers from that old bug.
>
> --
> greg
>

--0016e65b5b0aed95180483b93654
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Thanks for the reply...

We have an unhealthy habit of not updating v=
ery often around here, one that I'm trying to fix. We run both 8.1 and =
8.4, and we're testing the update to 8.1.20 this weekend....from 8.1.3.=
=A0 Across the board 8.4 update will take much more testing due to the typi=
ng issue. Other 8.4 updates will take place soon.


Anyway, my guilt about outdated versions aside...

We never actua=
lly ran 8.4.0.=A0 We went straight from 8.3.3 to 8.4.1.=A0 I did a full dum=
p/restore from the 8.3.3 database into the 8.4.1 database (using the 8.4 pg=
_dump), partitioning as I went. The table in question was not partitioned i=
n 8.3, the 8.4 dump/restore seemed as good a place to do the partitioning a=
s any.=A0 I don't specifically remember doing it, but I almost always d=
o a manual vacuum/analyze after a restore.=A0 Autovacuum is/was also turned=
on.=A0 According to the stats, everything seems to have been at least vacu=
umed on 2/15 (the date of the update I believe), so it is doubtful to me th=
at any of the tables, especially the currently active partition, would have=
gone un-vacuumed as of 4/1.


The logs for that day have already rotated into less-easily available s=
torage, but I'm considering retrieving them to look for any aborts/roll=
backs.

Also, for what it's worth, we never delete rows from this=
table, we only insert and update.


I have not seen the warning since.

Sound like trouble?

r>

On Thu, Apr 8, 2010 at 12:59 AM, Greg Stark pan dir=3D"ltr"><&=
gt; wrote:

r-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
>On Thu, Apr 1, 2010 at 2:10 PM, Jonathan Foy < gmail.com">thefoy@gmail.com> wrote:


> Hello

>

> I came in this morning and noticed this warning sitting in my inbox qu=
ite a

> few times...

>

> WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation "=
table_2010q1"

> page 471118

> WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation "=
table_2010q1"

> page 471119

> WARNING:=A0 PD_ALL_VISIBLE flag was incorrectly set in relation "=
table_2010q1"

> page 471120

> ....

>

> and I'm wondering how worked up I should be getting.



This is a warning about a fairly serious data corruption problem. In<=
br>
the worst case prior queries returned incorrect results -- which if

those results were used to calculate subsequent updates could have

caused your data to be inconsistent.



Specifically, having that flag set would cause rows to be returned

which should not have been returned, either because they had been

deleted, the transaction inserting them aborted, or they had been

updated (causing duplicate copies of the same row to be returned).



There was a bug which could cause this in 8.4.0 but it was fixed in

8.4.1. =A0You say you're running 8.4.1 (you should be running 8.4.3

though none of the subsequent bug-fixes are as serious if the SSL

security vulnerability isn't relevant) so this is quite worrisome.

Nobody else has reported anything similar though.



Is it possible these pages have never been vacuumed since you updated

from 8.4.0 to 8.4.1? I'm not sure how likely that is. In 8.4 Postgres r>
tries to avoid vacuuming pages that don't need it so it's possible<=
br>
this situation could live on for longer than in the past.



I think a regular "VACUUM" of every table should guarantee that a=
ny

remnants of this bug from 8.4.0 are cleaned up. If you do this and

possibly get some more warnings from it but never see it again

afterwards then I would assume they're leftovers from that old bug.



--

greg




--0016e65b5b0aed95180483b93654--

Re: PD_ALL_VISIBLE flag warning

am 08.04.2010 16:16:41 von Greg Stark

On Thu, Apr 8, 2010 at 1:57 PM, Jonathan Foy wrote:
> Thanks for the reply...
>
> We have an unhealthy habit of not updating very often around here, one th=
at
> I'm trying to fix. We run both 8.1 and 8.4, and we're testing the update =
to
> 8.1.20 this weekend....from 8.1.3.=A0 Across the board 8.4 update will ta=
ke
> much more testing due to the typing issue. Other 8.4 updates will take pl=
ace
> soon.

If you're on 8.1 it's important to be on the latest 8.1 -- there are
at least 17 significant bugs that warranted bug-fix releases between
8.1.3 and 8.1.20 including security bugs and data corruption bugs.

> We never actually ran 8.4.0.=A0 We went straight from 8.3.3 to 8.4.1.=A0 =
I did a
> full dump/restore from the 8.3.3 database into the 8.4.1 database (using =
the
> 8.4 pg_dump), partitioning as I went.

Ugh, then it sounds like there's a real as-yet unidentified problem.

The other thing to check before panicking is, has this server ever run
with fsync off and crashed (either kernel panic or power failure)?

What does the storage system on this server look like? Are they IDE or
SCSI drives? Is there an LVM layer involved? etc.




--=20
greg

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: PD_ALL_VISIBLE flag warning

am 08.04.2010 17:03:14 von Jonathan Foy

--001636417c495aaffd0483bafa81
Content-Type: text/plain; charset=ISO-8859-1

On Thu, Apr 8, 2010 at 10:16 AM, Greg Stark wrote:

>
> If you're on 8.1 it's important to be on the latest 8.1 -- there are
> at least 17 significant bugs that warranted bug-fix releases between
> 8.1.3 and 8.1.20 including security bugs and data corruption bugs.
>

Understood. It's my current top priority.


>
> > We never actually ran 8.4.0. We went straight from 8.3.3 to 8.4.1. I
> did a
> > full dump/restore from the 8.3.3 database into the 8.4.1 database (using
> the
> > 8.4 pg_dump), partitioning as I went.
>
> Ugh, then it sounds like there's a real as-yet unidentified problem.
>
> The other thing to check before panicking is, has this server ever run
> with fsync off and crashed (either kernel panic or power failure)?
>

I have never turned off fsync. Also, the server is showing an uptime of 281
days, and the postgres instance has been up since 2/21.


>
> What does the storage system on this server look like? Are they IDE or
> SCSI drives? Is there an LVM layer involved? etc.
>
> --
> greg
>

Storage is on an NFS mounted Fiber Netapp volume.

Let me know if you need more...

--001636417c495aaffd0483bafa81
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable