postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not write block503414 of temporary file: No space

postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not write block503414 of temporary file: No space

am 27.07.2010 00:01:44 von Irene Barg

System info: Lancelot 1874-T dual Intel Xeon(Nehalem-DP) SATAII 24GB
RAM, CentOS release 5.4 (Final), postgresql-server-8.4.4-1PGDG

hi,

We installed a fresh version of postgresql-server-8.4.4-1 using RPMs
from PGDG. We are seeing the following errors, if the system trys to do
'anything' sort of update, insert, etc.:

> 2010-07-26 11:06:59 MST LOG: database system was shut down at 2010-07-26 10:51:41 MST
> 2010-07-26 11:06:59 MST LOG: autovacuum launcher started
> 2010-07-26 11:06:59 MST LOG: database system is ready to accept connections
> 2010-07-26 11:08:18 MST science_admin metadataFATAL: password authentication failed for user "science_admin"
> 2010-07-26 11:09:33 MST postgres postgresFATAL: password authentication failed for user "postgres"
> 2010-07-26 11:09:55 MST arcsoft arcsoftFATAL: password authentication failed for user "arcsoft"
> 2010-07-26 11:10:00 MST arcsoft arcsoftFATAL: database "arcsoft" does not exist
> 2010-07-26 11:10:43 MST arcsoft postgresLOG: statement: ALTER USER postgres WITH PASSWORD 'nsa_core';
> 2010-07-26 11:11:06 MST postgres postgresLOG: statement: ALTER USER operations WITH PASSWORD 'Mosaic_DHS';
> 2010-07-26 11:11:15 MST postgres postgresLOG: statement: ALTER USER pipeline WITH PASSWORD 'Mosaic_DHS';
> 2010-07-26 11:11:23 MST postgres postgresLOG: statement: ALTER USER science_admin WITH PASSWORD 'nsa_devel';
> 2010-07-26 11:11:31 MST postgres postgresLOG: statement: ALTER USER security_admin WITH PASSWORD 'nsa_devel';
> 2010-07-26 11:11:48 MST postgres postgresLOG: statement: ALTER USER voi WITH PASSWORD 'nsa_devel';
> 2010-07-26 11:15:24 MST system_admin metadataFATAL: password authentication failed for user "system_admin"
> 2010-07-26 11:19:36 MST postgres postgresLOG: statement: ALTER USER system_admin WITH PASSWORD 'nsa_core';
> 2010-07-26 11:21:03 MST postgres postgresLOG: statement: ALTER USER system_admin WITH PASSWORD 'nsa_devel';
> 2010-07-26 11:21:19 MST system_admin system_adminFATAL: database "system_admin" does not exist
> 2010-07-26 11:33:33 MST system_admin metadataERROR: could not write block 503414 of temporary file: No space le
> ft on device
> 2010-07-26 11:33:33 MST system_admin metadataHINT: Perhaps out of disk space?

The postgresql is installed on /var/lib/pgsql, but it has plenty of space::

> -bash-3.2$ df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/sda3 16G 13G 1.7G 89% /
> /dev/sda8 635G 979M 601G 1% /data0
> /dev/sda7 2.0G 241M 1.7G 13% /home
> /dev/sda5 7.8G 1.5G 5.9G 20% /u1
> /dev/sda2 16G 184M 15G 2% /var/log
> /dev/sda1 487M 22M 440M 5% /boot
> tmpfs 12G 0 12G 0% /dev/shm
> /dev/md0 2.7T 202M 2.6T 1% /data1


The pg_stat_tmp is hardly growing::

> -bash-3.2$ date
> Mon Jul 26 14:37:38 MST 2010
> -bash-3.2$ pwd
> /var/lib/pgsql/data
> -bash-3.2$ du -ks pg_stat_tmp/*
> 56 pg_stat_tmp/pgstat.stat
> -bash-3.2$ du -ks pg_stat_tmp/*
> 56 pg_stat_tmp/pgstat.stat

In summary:

1) installed new version of postgresql-server-8.4.4-1.
2) restored a binary dump of a database from another system running same
version. The restore succeeded without error.
3) started using the system

We saw this same symptom in an CentOS5.2 machine, but we thought it had
to do with the version of CentOS, so we repeated the steps to a system
with more resources and running CentOS5.4.

We have postgresql-server-8.2.14-1PGDG.rhel5 on other CentOS5.3 systems
and we do not see this error.

I read this post:

> http://archives.postgresql.org/pgsql-performance/2010-04/msg 00192.php

I understand that autovacuum is enabled by default, I have not modified
them:

> -bash-3.2$ grep vacuum postgresql.conf
> #vacuum_cost_delay = 0ms # 0-100 milliseconds
> #vacuum_cost_page_hit = 1 # 0-10000 credits
> #vacuum_cost_page_miss = 10 # 0-10000 credits
> #vacuum_cost_page_dirty = 20 # 0-10000 credits
> #vacuum_cost_limit = 200 # 1-10000 credits
> #autovacuum = on # Enable autovacuum subprocess? 'on'
> #log_autovacuum_min_duration = -1 # -1 disables, 0 logs all actions and
> #autovacuum_max_workers = 3 # max number of autovacuum subprocesses
> #autovacuum_naptime = 1min # time between autovacuum runs
> #autovacuum_vacuum_threshold = 50 # min number of row updates before
> # vacuum
> #autovacuum_analyze_threshold = 50 # min number of row updates before
> #autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
> #autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
> #autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
> #autovacuum_vacuum_cost_delay = 20ms # default vacuum cost delay for
> # autovacuum, in milliseconds;
> # -1 means use vacuum_cost_delay
> #autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
> # autovacuum, -1 means use
> # vacuum_cost_limit
> #vacuum_freeze_min_age = 50000000
> #vacuum_freeze_table_age = 150000000


Your comments/suggestions are appreciated.
--irene
------------------------------------------------------------ ---------
Irene Barg Email: ibarg@noao.edu
NOAO/AURA Inc. http://www.noao.edu/noao/staff/ibarg
950 N. Cherry Ave. Voice: 520-318-8273
Tucson, AZ 85726 USA FAX: 520-318-8360
------------------------------------------------------------ ---------

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not

am 27.07.2010 01:09:38 von Alex Hunsaker

On Mon, Jul 26, 2010 at 16:01, Irene Barg wrote:
>> 2010-07-26 11:33:33 MST system_admin metadataERROR:  could not writ=
e block
>> 503414 of temporary file: No space le
>> ft on device
>> 2010-07-26 11:33:33 MST system_admin metadataHINT:  Perhaps out of =
disk
>> space?

>> -bash-3.2$ df -h
>> Filesystem            Size  Used Avai=
l Use% Mounted on
>> /dev/sda3              16G   13G=
 1.7G  89% /
>> /dev/sda8             635G  979M =C2=
=A0601G   1% /data0
>> /dev/sda7             2.0G  241M =C2=
=A01.7G  13% /home
>> /dev/sda5             7.8G  1.5G =C2=
=A05.9G  20% /u1
>> /dev/sda2              16G  184M=
  15G   2% /var/log
>> /dev/sda1             487M   22M =C2=
=A0440M   5% /boot
>> tmpfs                  12G =
    0   12G   0% /dev/shm
>> /dev/md0              2.7T  202M=
 2.6T   1% /data1
>
>
> The pg_stat_tmp is hardly growing::
>
>> -bash-3.2$ date
>> Mon Jul 26 14:37:38 MST 2010
>> -bash-3.2$ pwd
>> /var/lib/pgsql/data

Ok so it looks like this is on your / (sda3) partition that only has
1.7G free. Its entirely possible that you have some query that needs
to write more than that in tmp files.

>> -bash-3.2$ du -ks pg_stat_tmp/*

No surprise that is not growing, thats not for temp files, its for the
stats collector.

I would try turning on log_temp_files, that might help pin point the
problem query.

--=20
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not write block 503414 of temporary file: No sp

am 27.07.2010 03:33:52 von Tom Lane

Alex Hunsaker writes:
> On Mon, Jul 26, 2010 at 16:01, Irene Barg wrote:
> 2010-07-26 11:33:33 MST system_admin metadataERROR:  could not write block
> 503414 of temporary file: No space le
> ft on device
> 2010-07-26 11:33:33 MST system_admin metadataHINT:  Perhaps out of disk
> space?

> Ok so it looks like this is on your / (sda3) partition that only has
> 1.7G free. Its entirely possible that you have some query that needs
> to write more than that in tmp files.

"Block 503414" is a red flag here --- that means that this temp file
alone required 503414 * 8K = nearly 4Gb. I suspect /var/lib/pgsql has
been symlinked someplace else, as the root device has clearly not got
anything close to that much free space. It's quite possible that the
query involved multiple temp files, though, so maybe one of the other
partitions is getting filled.

regards, tom lane

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not write block 503414 of temporary file: No sp

am 27.07.2010 23:58:46 von Irene Barg

Alex,

It's funny how one cannot see what is right in front of them. I did NOT
notice that 'var' was mounted as '/var/log', so postgres was getting
installed in root '/'. So I moved pgsql to /data1, and created a symlink
in /var/lib/pgsql to /data1/pgsql.

Now I have a pgsql_tmp file that has grown to 1.9Tb!

> -bash-3.2$ du -ks *
> 5556 1
> 5364 11563
> 5436 11564
> 5529348 28523
> 1962946784 pgsql_tmp

I'm afraid I don't understand how turning on 'log_temp_files' will help?
Right now it is set to '-1' which should mean it's disabled, yet it is
creating 1GB files and NOT cleaning them up:

> -bash-3.2$ du -hs pgsql_tmp/*
> 15M pgsql_tmp/pgsql_tmp22192.3070
> 1.1G pgsql_tmp/pgsql_tmp22192.4916
> 168M pgsql_tmp/pgsql_tmp22192.4918
> 1.1G pgsql_tmp/pgsql_tmp22192.4919
.....

Thanks you.
--irene

Alex Hunsaker wrote:
> On Mon, Jul 26, 2010 at 16:01, Irene Barg wrote:
>>> 2010-07-26 11:33:33 MST system_admin metadataERROR: could not write block
>>> 503414 of temporary file: No space le
>>> ft on device
>>> 2010-07-26 11:33:33 MST system_admin metadataHINT: Perhaps out of disk
>>> space?
>
>>> -bash-3.2$ df -h
>>> Filesystem Size Used Avail Use% Mounted on
>>> /dev/sda3 16G 13G 1.7G 89% /
>>> /dev/sda8 635G 979M 601G 1% /data0
>>> /dev/sda7 2.0G 241M 1.7G 13% /home
>>> /dev/sda5 7.8G 1.5G 5.9G 20% /u1
>>> /dev/sda2 16G 184M 15G 2% /var/log
>>> /dev/sda1 487M 22M 440M 5% /boot
>>> tmpfs 12G 0 12G 0% /dev/shm
>>> /dev/md0 2.7T 202M 2.6T 1% /data1
>>
>> The pg_stat_tmp is hardly growing::
>>
>>> -bash-3.2$ date
>>> Mon Jul 26 14:37:38 MST 2010
>>> -bash-3.2$ pwd
>>> /var/lib/pgsql/data
>
> Ok so it looks like this is on your / (sda3) partition that only has
> 1.7G free. Its entirely possible that you have some query that needs
> to write more than that in tmp files.
>
>>> -bash-3.2$ du -ks pg_stat_tmp/*
>
> No surprise that is not growing, thats not for temp files, its for the
> stats collector.
>
> I would try turning on log_temp_files, that might help pin point the
> problem query.
>

--
------------------------------------------------------------ ---------
Irene Barg Email: ibarg@noao.edu
NOAO/AURA Inc. http://www.noao.edu/noao/staff/ibarg
950 N. Cherry Ave. Voice: 520-318-8273
Tucson, AZ 85726 USA FAX: 520-318-8360
------------------------------------------------------------ ---------

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not

am 28.07.2010 00:35:58 von Alex Hunsaker

On Tue, Jul 27, 2010 at 15:58, Irene Barg wrote:
> I'm afraid I don't understand how turning on 'log_temp_files' will help?

Well, postgres is probably doing exactly what its told. You just need
to figure out what that is and why. ( and if it is some kind of bug,
we wont be able to do anything about it unless you give us a test case
:-) )

The idea behind log_temp_files is that you should be able to see whats
causing the creation of all those temp files (and so potentially 'fix'
the query/problem). The main problem with log_temp_files is it only
logs when the command is done executing. Of course if it really is
creating tons of files it should take long enough that you should be
able to select * from pg_stat_activity; and pinpoint the problem query
that way. If all else fails you might have better luck with turning
on log_statements and then lurking through the logs when you start to
get lots of tmp files.

> Right now it is set to '-1' which should mean it's disabled, yet it is
> creating 1GB files and NOT cleaning them up:

It wont clean them up until its done using them. If a backend
crashes, it looks like it might orphan some tmp files, at least until
postgres is restarted.

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not write block 503414 of temporary file: No sp

am 28.07.2010 23:43:31 von Irene Barg

Alex,

Thanks for the explanation. That helps.

After looking at some of the pgsql_tmp files, It was clear we needed to
increase work_mem. So I switched back to the pgtune suggestions with one
change:

> -bash-3.2$ diff postgresql.conf postgresql.conf.local | grep "< "
> < log_temp_files = 0 # log temporary files equal or larger
> < default_statistics_target = 50 # pgtune wizard 2010-07-26
> < maintenance_work_mem = 1GB # pgtune wizard 2010-07-26
> < constraint_exclusion = on # pgtune wizard 2010-07-26
> < checkpoint_completion_target = 0.9 # pgtune wizard 2010-07-26
> < effective_cache_size = 16GB # pgtune wizard 2010-07-26
> < work_mem = 144MB # pgtune wizard 2010-07-26
> < wal_buffers = 8MB # pgtune wizard 2010-07-26
> < checkpoint_segments = 16 # pgtune wizard 2010-07-26
> < #shared_buffers = 5632MB # pgtune wizard 2010-07-26
> < shared_buffers = 2048MB
> < max_connections = 80 # pgtune wizard 2010-07-26

The postmaster wouldn't start with shared_buffers = 5632MB, so we
reduced it to 2GB. The system's shared memory is:

> [dppops@pipen18 ~]$ sudo /sbin/sysctl -a | grep kernel | grep shmmax
> kernel.shmmax = 4182863872

We have 24GB RAM total.

The system is behaving normal now, it does create intermittant
'pgsql_tmp' files, but removes them after the SQL statement completes.
We will monitor the logs and try to make informed tweaks to the
postgresql settings as needed.

Thanks to you and Tom Lane for your responses. This group is one of the
best!

--irene barg

Alex Hunsaker wrote:
> On Tue, Jul 27, 2010 at 15:58, Irene Barg wrote:
>> I'm afraid I don't understand how turning on 'log_temp_files' will help?
>
> Well, postgres is probably doing exactly what its told. You just need
> to figure out what that is and why. ( and if it is some kind of bug,
> we wont be able to do anything about it unless you give us a test case
> :-) )
>
> The idea behind log_temp_files is that you should be able to see whats
> causing the creation of all those temp files (and so potentially 'fix'
> the query/problem). The main problem with log_temp_files is it only
> logs when the command is done executing. Of course if it really is
> creating tons of files it should take long enough that you should be
> able to select * from pg_stat_activity; and pinpoint the problem query
> that way. If all else fails you might have better luck with turning
> on log_statements and then lurking through the logs when you start to
> get lots of tmp files.
>
>> Right now it is set to '-1' which should mean it's disabled, yet it is
>> creating 1GB files and NOT cleaning them up:
>
> It wont clean them up until its done using them. If a backend
> crashes, it looks like it might orphan some tmp files, at least until
> postgres is restarted.

--
------------------------------------------------------------ ---------
Irene Barg Email: ibarg@noao.edu
NOAO/AURA Inc. http://www.noao.edu/noao/staff/ibarg
950 N. Cherry Ave. Voice: 520-318-8273
Tucson, AZ 85726 USA FAX: 520-318-8360
------------------------------------------------------------ ---------

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: postgresql-server-8.4.4-1PGDG.el5 - ERROR: could not

am 28.07.2010 23:53:43 von Alex Hunsaker

On Wed, Jul 28, 2010 at 15:43, Irene Barg wrote:
> Alex,
> The postmaster wouldn't start with shared_buffers = 5632MB, so we reduced it
> to 2GB. The system's shared memory is:
>
>> [dppops@pipen18 ~]$ sudo /sbin/sysctl -a | grep kernel | grep shmmax
>> kernel.shmmax = 4182863872

You can bump that up so it works with the larger setting (assuming you
are running a 64bit OS):
http://developer.postgresql.org/pgdocs/postgres/kernel-resou rces.html

> The system is behaving normal now, it does create intermittant 'pgsql_tmp'
> files, but removes them after the SQL statement completes. We will monitor
> the logs and try to make informed tweaks to the postgresql settings as
> needed.

If its doing a sort or join a btree index on the appropriate column(s) may help.

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin