Dangerous hint in the PostgreSQL manual

on 10.12.2007 16:26:12 by lst_hoe01

Hello

I have been trapped by the advice from the manual to use "sysctl -w
vm.overcommit_memory=2" when using Linux (see 16.4.3. Linux Memory
Overcommit). This value should only be used when PostgreSQL is the
only application running on the machine in question. It should be
checked against the values "CommitLimit" and "Committed_AS" in
/proc/meminfo on a longer running system. If "Committed_AS" reaches or
comes close to "CommitLimit", one should not set overcommit_memory=2 (see
http://www.redhat.com/archives/rhl-devel-list/2005-February/msg00738.html). I
think this should be included in the manual as a warning, because with
this setting the machine in question may run into trouble with "fork
failed" even if the standard system tools report a lot of free memory,
causing confusion for the admins.

Regards

Andreas



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Re: Dangerous hint in the PostgreSQL manual

on 10.12.2007 21:47:50 by Andrew Sullivan

On Mon, Dec 10, 2007 at 04:26:12PM +0100, Listaccount wrote:
> Hello
>
> I have been trapped by the advice from the manual to use "sysctl -w
> vm.overcommit_memory=2" when using Linux (see 16.4.3. Linux Memory
> Overcommit). This value should only be used when PostgreSQL is the

I think you need to read the documentation more carefully, because it
clearly suggests you (1) look at the kernel source and (2) consult a kernel
expert as part of your evaluation.

In any case,

> /proc/meminfo on a longer running system. If "Committed_AS" reaches or
> comes close to "CommitLimit" one should not set overcommit_memory=2 (see
> http://www.redhat.com/archives/rhl-devel-list/2005-February/msg00738.html).

my own reading of that message leads me to the opposite conclusion from yours.
You should _for sure_ set overcommit_memory=2 in that case. And this is
why:

> this setting the machine in question may run into trouble with "fork
> failed" even if the standard system tools report a lot of free memory
> causing confusion to the admins.

You _want_ the fork to fail when the kernel can't (over)commit the memory,
because otherwise the stupid genius kernel will come along and maybe blip
your postmaster on the head, causing it to die by surprise. Don't like
that? Use more memory. Or get an operating system that doesn't do stupid
things like promise more memory than it has.

Except, of course, those are getting rarer and rarer all the time.

Please note that memory overcommit is sort of like a high-risk mortgage: the
chances that the OS will recover enough memory in any given round start out
as high. Eventually, however, the [technical|financial] economy is such
that only high-risk commitments are available, and at that point, _someone_
isn't going to pay back enough [memory|money] to the thing demanding it. At
that point, it's anyone's guess what will happen next.

A


Re: Dangerous hint in the PostgreSQL manual

on 11.12.2007 09:23:38 by lst_hoe01

Quoting Andrew Sullivan:

> On Mon, Dec 10, 2007 at 04:26:12PM +0100, Listaccount wrote:
>> Hello
>>
>> I have been trapped by the advice from the manual to use "sysctl -w
>> vm.overcommit_memory=2" when using Linux (see 16.4.3. Linux Memory
>> Overcommit). This value should only be used when PostgreSQL is the
>
> I think you need to read the documentation more carefully, because it
> clearly suggests you (1) look at the kernel source and (2) consult a kernel
> expert as part of your evaluation.
>
> In any case,

Consulting the kernel source is a bit of overkill for setting up a database.

>> /proc/meminfo on a longer running system. If "Committed_AS" reaches or
>> comes close to "CommitLimit" one should not set overcommit_memory=2 (see
>> http://www.redhat.com/archives/rhl-devel-list/2005-February/msg00738.html).
>
> my own reading of that message leads me to the opposite conclusion from yours.
> You should _for sure_ set overcommit_memory=2 in that case. And this is
> why:

I don't want to start a discussion about what the right thing to do is;
both settings, the default "0" and "2", have advantages and drawbacks.
What I would like to see in the documentation is a simple hint on how to
check whether you will get into trouble with this setting, so one can prepare.
Something as simple as: see if "CommitLimit - Committed_AS" from /proc/meminfo
comes close to 0 after some uptime, and if so, don't use it.
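
The check described above can be sketched in Python. This is only a minimal sketch: the function names and the 10% threshold are assumptions made for illustration and do not come from the manual; only the field names CommitLimit and Committed_AS are real /proc/meminfo entries.

```python
def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integer values.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(fields):
    """Return CommitLimit - Committed_AS in kB.

    The result can be negative in overcommit modes 0 and 1, where
    CommitLimit is not enforced.
    """
    return fields["CommitLimit"] - fields["Committed_AS"]


def headroom_is_low(fields, min_fraction=0.10):
    """True if the headroom is below min_fraction of CommitLimit.

    min_fraction is a hypothetical threshold chosen for illustration only.
    """
    return commit_headroom_kb(fields) < min_fraction * fields["CommitLimit"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live values (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On a Linux machine, `headroom_is_low(read_meminfo())` could be run from cron over a period of normal uptime before switching to mode 2. A single good reading says little about unusual load peaks.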

>> this setting the machine in question may run into trouble with "fork
>> failed" even if the standard system tools report a lot of free memory
>> causing confusion to the admins.
>
> You _want_ the fork to fail when the kernel can't (over)commit the memory,
> because otherwise the stupid genius kernel will come along and maybe blip
> your postmaster on the head, causing it to die by surprise. Don't like
> that? Use more memory. Or get an operating system that doesn't do stupid
> things like promise more memory than it has.
>
> Except, of course, those are getting rarer and rarer all the time.
>
> Please note that memory overcommit is sort of like a high-risk mortgage: the
> chances that the OS will recover enough memory in any given round start out
> as high. Eventually, however, the [technical|financial] economy is such
> that only high-risk commitments are available, and at that point, _someone_
> isn't going to pay back enough [memory|money] to the thing demanding it. At
> that point, it's anyone's guess what will happen next.

As I said, the discussion about pros and cons has happened many times
(for example
http://developers.sun.com/solaris/articles/subprocess/subprocess.html). I would
only like to see a hint on how to check *before* you get into trouble.

Regards

Andreas



Re: Dangerous hint in the PostgreSQL manual

on 11.12.2007 11:26:05 by Florian Weimer

* Andrew Sullivan:

> You _want_ the fork to fail when the kernel can't (over)commit the
> memory, because otherwise the stupid genius kernel will come along
> and maybe blip your postmaster on the head, causing it to die by
> surprise.

The other side of the story is that with overcommit, the machine
continues to work flawlessly in some loads, when it would fail without
overcommit. It's also not clear that trading a segfault for malloc
returning a null pointer leads to more deterministic failures (because
the malloc failure does not necessarily occur in the memory hog).

My personal experience is that vm.overcommit_memory=2 (together with
tons of swap space) leads to more deterministic failure behavior, but
we don't use much software that aggressively allocates address space
without actually using it (Sun's JVM does in some cases, and SBCL is
particularly obnoxious in this regard).

-- 
Florian Weimer
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

Re: Dangerous hint in the PostgreSQL manual

on 11.12.2007 14:42:31 by Andrew Sullivan

On Tue, Dec 11, 2007 at 09:23:38AM +0100, Listaccount wrote:
> I don't want to start the discussion what is the right thing to do,

Then you shouldn't ask here. The manual was changed to say what it does
after considerable community discussion. In my view, the Linux kernel's
behaviour is completely unacceptable, and exactly the sort of amateur design
foolishness that people are complaining about when they say Linux is a toy.

> What I would like to see in the documentation is the easy hint to
> check if you get in trouble with this setting so one can prepare.

From the point of view of Postgres, "getting in trouble" means "postmaster
shot in head by surprise." If you feel otherwise, then you have to learn how
to tune your operating system correctly. The PostgreSQL manual is not a
place for general wisdom about how to tune various kernels. I think the
advice is correctly worded as it is.

> A simple "see if your "CommitLimit - Committed_AS" from /proc/meminfo
> comes close to 0 after some uptime and if so don't use it.

That's not good enough, because the case where you really get into trouble
might be an unusual case. It's in fact exactly the condition where your
machine is facing surprising loads where memory overcommit will bite you.
So following your advice will still lead people to be surprised when their
postmaster goes away because they were Slashdotted or something.

> only like to see a hint how to check *before* you get in trouble.

"Am I using Linux with overcommit?" would be one such check. The only
reliable one. (Also, "Am I using AIX?" just in case anyone thinks this is
some sort of anti-Linux bias I have. Malloc lying ranks with system sins
right up there with fsync returning before the bits are on the platter.)

A


Re: Dangerous hint in the PostgreSQL manual

on 11.12.2007 15:08:36 by lst_hoe01

Quoting Andrew Sullivan:

> On Tue, Dec 11, 2007 at 09:23:38AM +0100, Listaccount wrote:
>> I don't want to start the discussion what is the right thing to do,
>
> Then you shouldn't ask here. The manual was changed to say what it does
> after considerable community discussion. In my view, the Linux kernel's
> behaviour is completely unacceptable, and exactly the sort of amateur design
> foolishness that people are complaining about when they say Linux is a toy.

It was not a question but a suggestion for improving the documentation. It
is nice that you have a strong opinion on the technical details behind it,
but one should not include advice without pointing out possible
downsides.
If this is not possible, then the advice should be removed completely,
because it has nothing to do with PostgreSQL but with shortcomings of
the Linux (or some other) kernel.
I would not have been surprised if the OOM killer had gone around in
a low-memory situation, but I was surprised to see fork fail on a
system with 1GB of memory available.

That's all my own opinion and nothing more needs to be said. If the
maintainers of the documentation agree, they could take it out or improve
it; if not, I don't mind, because I have learned my lesson, but others may
suffer from the same hidden problem.

Regards

Andreas



Re: Dangerous hint in the PostgreSQL manual

on 11.12.2007 15:15:57 by Alvaro Herrera

Listaccount wrote:

> I would have not been surprised if the OOM-Killer would go around in case
> of short memory but i was surprised to see fork failed with a system having
> 1GB Memory available.

We've had *a lot* of problem reports due to the OOM killer.

-- 
Alvaro Herrera http://www.PlanetPostgreSQL.org/
"Aprender sin pensar es inútil; pensar sin aprender, peligroso" (Confucio)

Re: Dangerous hint in the PostgreSQL manual

on 11.12.2007 17:22:29 by Andrew Sullivan

On Tue, Dec 11, 2007 at 03:08:36PM +0100, Listaccount wrote:
> I would have not been surprised if the OOM-Killer would go around in
> case of short memory but i was surprised to see fork failed with a
> system having 1GB Memory available.

You don't understand: the system _did not_ have 1G of memory available. It
was all committed to applications that had asked for it. Just because they
asked for it even though they were never going to use it doesn't mean that
it isn't gone. It's used, as far as the kernel is concerned. The
overcommit trick some OSes have implemented is a filthy hack to get around
poor memory allocation discipline in applications.
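
The arithmetic behind this can be sketched as follows. This is a simplified model based on the kernel's overcommit accounting documentation, not the kernel's actual code: in strict mode the limit is swap plus a percentage of RAM (vm.overcommit_ratio, default 50), and an allocation or fork() is refused once the committed total would exceed it. The function names and the example figures are assumptions for illustration and are not taken from the machine discussed in this thread.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, hugetlb_kb=0):
    """Approximate CommitLimit under vm.overcommit_memory=2.

    CommitLimit = (RAM - huge pages) * overcommit_ratio / 100 + swap
    """
    return (ram_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_kb


def would_refuse(committed_as_kb, request_kb, limit_kb):
    """Simplified strict-mode test: refuse if the new total exceeds the limit.

    The real kernel also keeps small reserves, which are ignored here.
    """
    return committed_as_kb + request_kb > limit_kb


# Hypothetical machine: 2 GB RAM, 1 GB swap, default ratio of 50.
limit = commit_limit_kb(2 * 1024 * 1024, 1024 * 1024)

# With about 1.9 GB already committed, a fork() needing 200 MB is refused,
# no matter how much of the committed memory is actually being used.
refused = would_refuse(2_000_000, 200_000, limit)
```

Under these assumed numbers the limit equals 2 GB, so tools that report physical memory can still show plenty of it as free at the moment fork() fails.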

The point of the PostgreSQL documentation is to tell you how best to run
Postgres, safely and reliably. The only safe and reliable way to run on
Linux is not to use overcommit. Turning it off ensures that the system
can't run out of memory in this way.

What I _would_ support in the docs is the following addition in 17.4.3,
where this is discussed:

. . .it will lower the chances significantly and will therefore
lead to more robust system behavior. It may also cause fork() to fail
when the machine appears to have available memory. This is done by
selecting. . .

Or something like that. This would warn potential users that they really do
need to read their kernel docs.

A

Re: Dangerous hint in the PostgreSQL manual

on 12.12.2007 17:19:24 by lst_hoe01

Quoting Andrew Sullivan:

> On Tue, Dec 11, 2007 at 03:08:36PM +0100, Listaccount wrote:
>> I would have not been surprised if the OOM-Killer would go around in
>> case of short memory but i was surprised to see fork failed with a
>> system having 1GB Memory available.
>
> You don't understand: the system _did not_ have 1G of memory available. It
> was all committed to applications that had asked for it. Just because they
> asked for it even though they were never going to use it doesn't mean that
> it isn't gone. It's used, as far as the kernel is concerned. The
> overcommit trick some OSes have implemented is a filthy hack to get around
> poor memory allocation discipline in applications.


I certainly understand the problem. The key is how you define
"available". But I agree with you that overcommit obscures careless
application design.


> The point of the PostgreSQL documentation is to tell you how best to run
> Postgres, safely and reliably. The only safe and reliable way to run on
> Linux is not to use overcommit. Turning it off ensures that the system
> can't run out of memory in this way.

Yes, but the documentation should at least warn when a setting
*could* lead to trouble you would not otherwise have.

> What I _would_ support in the docs is the following addition in 17.4.3,
> where this is discussed:
>
> . . .it will lower the chances significantly and will therefore
> lead to more robust system behavior. It may also cause fork() to fail
> when the machine appears to have available memory. This is done by
> selecting. . .
>
> Or something like that. This would warn potential users that they really do
> need to read their kernel docs.

On this one we can agree. Maybe we should mention the root-cause.

"It may also cause fork() to fail when the machine appears to have
available memory because of other applications doing careless memory
allocation"

Would be nice to save others from learning about this the hard way.

Regards

Andreas



Re: Dangerous hint in the PostgreSQL manual

on 12.12.2007 17:42:56 by Andrew Sullivan

Dear docs mavens:

Please see below for a possible adjustment to the docs. Is it agreeable?
If so, I'll see about putting together a patch.

On Wed, Dec 12, 2007 at 05:19:24PM +0100, Listaccount wrote:
> >What I _would_ support in the docs is the following addition in 17.4.3,
> >where this is discussed:
> >
> > . . .it will lower the chances significantly and will therefore
> > lead to more robust system behavior. It may also cause fork() to fail
> > when the machine appears to have available memory. This is done by
> > selecting. . .
> >
> >Or something like that. This would warn potential users that they really
> >do
> >need to read their kernel docs.
>
> On this one we can agree. Maybe we should mention the root-cause.
>
> "It may also cause fork() to fail when the machine appears to have
> available memory because of other applications doing careless memory
> allocation"
>
> Would be nice to save others from learning about this the hard way.
>
> Regards
>
> Andreas

Re: Dangerous hint in the PostgreSQL manual

on 12.12.2007 18:01:52 by lst_hoe01

Quoting Andrew Sullivan:

> Dear docs mavens:
>
> Please see below for a possible adjustment to the docs. Is it agreeable?
> If so, I'll see about putting together a patch.

Thanks!

BTW: How can one find out which application is making unused allocations?
Which value in the "ps" output should one watch for?

Regards

Andreas





Re: Dangerous hint in the PostgreSQL manual

on 13.12.2007 22:49:47 by Andrew Sullivan

On Wed, Dec 12, 2007 at 06:01:52PM +0100, Listaccount wrote:
> BTW : How can one find out the application doing unused allocations?
> What value of "ps" output to watch for?

As far as I know, the only way to learn that is to use a debugger. If the
OS knew this, it'd be able to shoot the misbehaving process instead of
whatever it guesses on.
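
Short of a debugger, a rough heuristic is to compare each process's virtual size with its resident size, for example from `ps -eo pid,vsz,rss,comm`. The sketch below parses such output; the function names are hypothetical, and the gap is only a hint. VSZ also counts shared libraries, file mappings, and shared memory (PostgreSQL's own shared buffers included), so a large gap does not prove careless allocation.

```python
def parse_ps(text):
    """Parse the output of `ps -eo pid,vsz,rss,comm` (sizes in KiB)."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        # Skip the header line and anything that does not look like a process row.
        if len(parts) < 4 or not parts[0].isdigit():
            continue
        rows.append({
            "pid": int(parts[0]),
            "vsz_kb": int(parts[1]),
            "rss_kb": int(parts[2]),
            "command": parts[3].strip(),
        })
    return rows


def largest_untouched(rows, top=5):
    """Return the processes with the largest VSZ - RSS gap, biggest first."""
    return sorted(rows, key=lambda r: r["vsz_kb"] - r["rss_kb"], reverse=True)[:top]
```

The ps output can be captured with `subprocess.run(["ps", "-eo", "pid,vsz,rss,comm"], capture_output=True, text=True).stdout` and passed to `parse_ps`.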

A

Re: Dangerous hint in the PostgreSQL manual

on 16.12.2007 12:24:48 by Bruce Momjian

Listaccount wrote:
> Yes, but the documentation should at least warn if some setting
> *could* lead to trouble you would not have otherwise.
>
> > What I _would_ support in the docs is the following addition in 17.4.3,
> > where this is discussed:
> >
> > . . .it will lower the chances significantly and will therefore
> > lead to more robust system behavior. It may also cause fork() to fail
> > when the machine appears to have available memory. This is done by
> > selecting. . .
> >
> > Or something like that. This would warn potential users that they really do
> > need to read their kernel docs.
>
> On this one we can agree. Maybe we should mention the root-cause.
>
> "It may also cause fork() to fail when the machine appears to have
> available memory because of other applications doing careless memory
> allocation"
>
> Would be nice to save others from learning about this the hard way.

Good, text added in parentheses:

On Linux 2.6 and later, an additional measure is to modify the
kernel's behavior so that it will not overcommit memory.
Although this setting will not prevent the OOM killer from
invoking altogether, it will lower the chances significantly and
will therefore lead to more robust system behavior. (It might also
cause fork() to fail when the machine appears to have available memory
because of other applications with careless memory allocation.) This
is done by selecting strict overcommit mode via

--
Bruce Momjian http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
