Restore problem

Restore problem

am 16.07.2010 00:05:07 von Guy Wauters

--Apple-Mail-1-276596982
Content-Type: text/plain;
charset=US-ASCII;
format=flowed;
delsp=yes
Content-Transfer-Encoding: 7bit

Hi all,

Today, I tried to recover a postgresql cluster (pitr). I think it went
fine (although I'm not sure), but I got some warnings afterwards...

2010-07-15 23:01:48 CEST [22907]: [2-1] user=,db= LOG: starting
archive recovery
2010-07-15 23:01:48 CEST [22907]: [3-1] user=,db= LOG:
restore_command = '/opt/postgres/data/dov/data/pitr_ssh.sh %p %f'
2010-07-15 23:01:48 CEST [22907]: [4-1] user=,db= LOG:
recovery_target_time = '2010-07-15 14:00:00+02'
/usr/bin/nsr/recover -d /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog -a /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000001.history
6469:recover: /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000001.history not in index
67847:recover: Failed to recover files: Nothing to recover

/usr/bin/nsr/recover -d /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog -a /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000001000000000000000C.00000020.backup
Recovering 1 file from /zones/postgres-on-charlotte/root/opt/postgres/
data/dov/data/pg_xlog/ into /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog
Requesting 1 file(s), this may take a while...
../00000001000000000000000C.00000020.backup
Received 1 file(s) from NSR server `backup-master1'
Recover completion time: Thu Jul 15 23:01:59 2010
2010-07-15 23:01:59 CEST [22907]: [5-1] user=,db= LOG: restored log
file "00000001000000000000000C.00000020.backup" from archive
/usr/bin/nsr/recover -d /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog -a /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000001000000000000000C
Recovering 1 file from /zones/postgres-on-charlotte/root/opt/postgres/
data/dov/data/pg_xlog/ into /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog
Requesting 1 file(s), this may take a while...
Total estimated disk space needed for recover is 5 KB.
../00000001000000000000000C
Received 1 file(s) from NSR server `backup-master1'
Recover completion time: Thu Jul 15 23:02:02 2010
2010-07-15 23:02:02 CEST [22907]: [6-1] user=,db= LOG: restored log
file "00000001000000000000000C" from archive
2010-07-15 23:02:02 CEST [22907]: [7-1] user=,db= LOG: automatic
recovery in progress
2010-07-15 23:02:02 CEST [22907]: [8-1] user=,db= LOG: redo starts at
0/C000068, consistency will be reached at 0/C000088
2010-07-15 23:02:02 CEST [22907]: [9-1] user=,db= LOG: consistent
recovery state reached
/usr/bin/nsr/recover -d /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog -a /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000001000000000000000D
Recovering 1 file from /zones/postgres-on-charlotte/root/opt/postgres/
data/dov/data/pg_xlog/ into /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog
Requesting 1 file(s), this may take a while...
Total estimated disk space needed for recover is 7169 KB.
../00000001000000000000000D
Received 1 file(s) from NSR server `backup-master1'
Recover completion time: Thu Jul 15 23:02:10 2010
2010-07-15 23:02:10 CEST [22907]: [10-1] user=,db= LOG: restored log
file "00000001000000000000000D" from archive
2010-07-15 23:02:12 CEST [22907]: [11-1] user=,db= LOG: recovery
stopping before commit of transaction 1003, time 2010-07-15
14:46:23.65437+02
2010-07-15 23:02:12 CEST [22907]: [12-1] user=,db= LOG: redo done at
0/D0010F0
/usr/bin/nsr/recover -d /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog -a /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000002.history
6469:recover: /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000002.history not in index
67847:recover: Failed to recover files: Nothing to recover

2010-07-15 23:02:13 CEST [22907]: [13-1] user=,db= LOG: selected new
timeline ID: 2
/usr/bin/nsr/recover -d /zones/postgres-on-charlotte/fs/dov/data/
pg_xlog -a /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000001.history
6469:recover: /zones/postgres-on-charlotte/root/opt/postgres/data/dov/
data/pg_xlog/00000001.history not in index
67847:recover: Failed to recover files: Nothing to recover

2010-07-15 23:02:15 CEST [22907]: [14-1] user=,db= LOG: archive
recovery complete
2010-07-15 23:02:15 CEST [25436]: [1-1] user=,db= LOG: checkpoint
starting: end-of-recovery immediate wait
2010-07-15 23:02:15 CEST [25436]: [2-1] user=,db= LOG: checkpoint
complete: wrote 6 buffers (0.1%); 0 transaction log file(s) added, 0
removed, 0 recycled; write=0.000 s, sync=0.009 s, total=0.180 s
2010-07-15 23:02:15 CEST [27805]: [1-1] user=,db= LOG: autovacuum
launcher started
2010-07-15 23:02:15 CEST [22886]: [5-1] user=,db= LOG: database
system is ready to accept connections


I think everything went fine till here. If not, please tell me what I
did wrong.


From here on, the log file is filled with messages like this:

3817:save: Using backup-master1 as server
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/
00000001000000000000000E: No such file or directory
2010-07-15 23:02:16: BACKUP WAL: ERROR (1) - /usr/bin/nsr/save -b
PGLOGS /zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /
pg_xlog/00000001000000000000000E
2010-07-15 23:02:16 CEST [27810]: [1-1] user=,db= LOG: archive
command failed with exit code 1
2010-07-15 23:02:16 CEST [27810]: [2-1] user=,db= DETAIL: The failed
archive command was: /opt/postgres/dba/scripts/backup_WAL-135.sh
pg_xlog/00000001000000000000000E
3817:save: Using backup-master1 as server
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/
00000001000000000000000E: No such file or directory
2010-07-15 23:02:18: BACKUP WAL: ERROR (1) - /usr/bin/nsr/save -b
PGLOGS /zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /
pg_xlog/00000001000000000000000E
2010-07-15 23:02:18 CEST [27810]: [3-1] user=,db= LOG: archive
command failed with exit code 1
2010-07-15 23:02:18 CEST [27810]: [4-1] user=,db= DETAIL: The failed
archive command was: /opt/postgres/dba/scripts/backup_WAL-135.sh
pg_xlog/00000001000000000000000E
3817:save: Using backup-master1 as server
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/
00000001000000000000000E: No such file or directory
2010-07-15 23:02:20: BACKUP WAL: ERROR (1) - /usr/bin/nsr/save -b
PGLOGS /zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /
pg_xlog/00000001000000000000000E
2010-07-15 23:02:20 CEST [27810]: [5-1] user=,db= LOG: archive
command failed with exit code 1
2010-07-15 23:02:20 CEST [27810]: [6-1] user=,db= DETAIL: The failed
archive command was: /opt/postgres/dba/scripts/backup_WAL-135.sh
pg_xlog/00000001000000000000000E
2010-07-15 23:02:20 CEST [27810]: [7-1] user=,db= WARNING:
transaction log file "00000001000000000000000E" could not be archived:
too many failures
3817:save: Using backup-master1 as server

Saving WAL file 00000001000000000000000E gives an error. The problem
is that the WAL file does not exist in pg_xlog. Does someone know why?
The only WAL file I could find in pg_xlog is 00000002000000000000000D
(timeline id is 2). The next thing I tried was looking for
00000001000000000000000E on our backup server. It seems that it didn't
exist (or has never been backupped).
Finally I created an empty 00000001000000000000000E in pg_xlog and
postgresql saved the fake WAL file, so I get rid off the warnings.

Did I do something wrong during the backup? Or with the recover?


Kind regards,

Guy





--Apple-Mail-1-276596982
Content-Type: text/html;
charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

-webkit-line-break: after-white-space; ">Hi =
all,


Today, I tried to recover a postgresql cluster =
(pitr). I think it went fine (although I'm not sure), but I got some =
warnings afterwards...

class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:01:48 =
CEST [22907]: [2-1] user=3D,db=3D LOG:  starting archive =
recovery
face=3D"'Courier New'">2010-07-15 23:01:48 CEST [22907]: [3-1] user=3D,db=3D=
LOG:  restore_command =3D '/opt/postgres/data/dov/data/pitr_ssh.sh =
%p %f'
New'">2010-07-15 23:01:48 CEST [22907]: [4-1] user=3D,db=3D LOG: =
 recovery_target_time =3D '2010-07-15 =
14:00:00+02'
face=3D"'Courier New'">/usr/bin/nsr/recover -d =
/zones/postgres-on-charlotte/fs/dov/data/pg_xlog -a =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001.history
face=3D"'Courier New'">6469:recover: =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001.history not in index
face=3D"'Courier New'">67847:recover: Failed to recover files: Nothing =
to recover
face=3D"'Courier New'">
class=3D"Apple-style-span" face=3D"'Courier New'">/usr/bin/nsr/recover =
-d /zones/postgres-on-charlotte/fs/dov/data/pg_xlog -a =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001000000000000000C.00000020.backup
class=3D"Apple-style-span" face=3D"'Courier New'">Recovering 1 file from =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/ =
into =
/zones/postgres-on-charlotte/fs/dov/data/pg_xlog
class=3D"Apple-style-span" face=3D"'Courier New'">Requesting 1 file(s), =
this may take a while...
face=3D"'Courier =
New'">./00000001000000000000000C.00000020.backup
class=3D"Apple-style-span" face=3D"'Courier New'">Received 1 file(s) =
from NSR server `backup-master1'
class=3D"Apple-style-span" face=3D"'Courier New'">Recover completion =
time: Thu Jul 15 23:01:59 2010
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:01:59 =
CEST [22907]: [5-1] user=3D,db=3D LOG:  restored log file =
"00000001000000000000000C.00000020.backup" from =
archive
New'">/usr/bin/nsr/recover -d =
/zones/postgres-on-charlotte/fs/dov/data/pg_xlog -a =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001000000000000000C
face=3D"'Courier New'">Recovering 1 file from =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/ =
into =
/zones/postgres-on-charlotte/fs/dov/data/pg_xlog
class=3D"Apple-style-span" face=3D"'Courier New'">Requesting 1 file(s), =
this may take a while...
face=3D"'Courier New'">Total estimated disk space needed for recover is =
5 KB.
New'">./00000001000000000000000C
class=3D"Apple-style-span" face=3D"'Courier New'">Received 1 file(s) =
from NSR server `backup-master1'
class=3D"Apple-style-span" face=3D"'Courier New'">Recover completion =
time: Thu Jul 15 23:02:02 2010
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:02 =
CEST [22907]: [6-1] user=3D,db=3D LOG:  restored log file =
"00000001000000000000000C" from archive
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:02 =
CEST [22907]: [7-1] user=3D,db=3D LOG:  automatic recovery in =
progress
face=3D"'Courier New'">2010-07-15 23:02:02 CEST [22907]: [8-1] user=3D,db=3D=
LOG:  redo starts at 0/C000068, consistency will be reached at =
0/C000088
face=3D"'Courier New'">2010-07-15 23:02:02 CEST [22907]: [9-1] user=3D,db=3D=
LOG:  consistent recovery state reached
class=3D"Apple-style-span" face=3D"'Courier New'">/usr/bin/nsr/recover =
-d /zones/postgres-on-charlotte/fs/dov/data/pg_xlog -a =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001000000000000000D
face=3D"'Courier New'">Recovering 1 file from =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/ =
into =
/zones/postgres-on-charlotte/fs/dov/data/pg_xlog
class=3D"Apple-style-span" face=3D"'Courier New'">Requesting 1 file(s), =
this may take a while...
face=3D"'Courier New'">Total estimated disk space needed for recover is =
7169 KB.
face=3D"'Courier New'">./00000001000000000000000D
class=3D"Apple-style-span" face=3D"'Courier New'">Received 1 file(s) =
from NSR server `backup-master1'
class=3D"Apple-style-span" face=3D"'Courier New'">Recover completion =
time: Thu Jul 15 23:02:10 2010
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:10 =
CEST [22907]: [10-1] user=3D,db=3D LOG:  restored log file =
"00000001000000000000000D" from archive
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:12 =
CEST [22907]: [11-1] user=3D,db=3D LOG:  recovery stopping before =
commit of transaction 1003, time 2010-07-15 =
14:46:23.65437+02
face=3D"'Courier New'">2010-07-15 23:02:12 CEST [22907]: [12-1] =
user=3D,db=3D LOG:  redo done at 0/D0010F0
class=3D"Apple-style-span" face=3D"'Courier New'">/usr/bin/nsr/recover =
-d /zones/postgres-on-charlotte/fs/dov/data/pg_xlog -a =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
002.history
face=3D"'Courier New'">6469:recover: =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
002.history not in index
face=3D"'Courier New'">67847:recover: Failed to recover files: Nothing =
to recover
face=3D"'Courier New'">
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:13 =
CEST [22907]: [13-1] user=3D,db=3D LOG:  selected new timeline ID: =
2
New'">/usr/bin/nsr/recover -d =
/zones/postgres-on-charlotte/fs/dov/data/pg_xlog -a =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001.history
face=3D"'Courier New'">6469:recover: =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001.history not in index
face=3D"'Courier New'">67847:recover: Failed to recover files: Nothing =
to recover
face=3D"'Courier New'">
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:15 =
CEST [22907]: [14-1] user=3D,db=3D LOG:  archive recovery =
complete
face=3D"'Courier New'">2010-07-15 23:02:15 CEST [25436]: [1-1] user=3D,db=3D=
LOG:  checkpoint starting: end-of-recovery immediate =
wait
New'">2010-07-15 23:02:15 CEST [25436]: [2-1] user=3D,db=3D LOG: =
 checkpoint complete: wrote 6 buffers (0.1%); 0 transaction log =
file(s) added, 0 removed, 0 recycled; write=3D0.000 s, sync=3D0.009 s, =
total=3D0.180 s
face=3D"'Courier New'">2010-07-15 23:02:15 CEST [27805]: [1-1] user=3D,db=3D=
LOG:  autovacuum launcher started
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:15 =
CEST [22886]: [5-1] user=3D,db=3D LOG:  database system is ready to =
accept =
connections


I think =
everything went fine till here. If not, please tell me what I did =
wrong.


=46rom here on, the log =
file is filled with messages like =
this:

face=3D"'Courier New'">3817:save: Using backup-master1 as =
server
New'">/zones/postgres-on-charlotte/root/opt/postgres/data/do v/data/pg_xlog=
/00000001000000000000000E: No such file or =
directory
face=3D"'Courier New'">2010-07-15 23:02:16:    BACKUP WAL: =
ERROR (1) - /usr/bin/nsr/save -b PGLOGS =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001000000000000000E
face=3D"'Courier New'">2010-07-15 23:02:16 CEST [27810]: [1-1] user=3D,db=3D=
LOG:  archive command failed with exit code =
1
New'">2010-07-15 23:02:16 CEST [27810]: [2-1] user=3D,db=3D DETAIL: =
 The failed archive command was: =
/opt/postgres/dba/scripts/backup_WAL-135.sh =
pg_xlog/00000001000000000000000E
class=3D"Apple-style-span" face=3D"'Courier New'">3817:save: Using =
backup-master1 as server
face=3D"'Courier =
New'">/zones/postgres-on-charlotte/root/opt/postgres/data/do v/data/pg_xlog=
/00000001000000000000000E: No such file or =
directory
face=3D"'Courier New'">2010-07-15 23:02:18:    BACKUP WAL: =
ERROR (1) - /usr/bin/nsr/save -b PGLOGS =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001000000000000000E
face=3D"'Courier New'">2010-07-15 23:02:18 CEST [27810]: [3-1] user=3D,db=3D=
LOG:  archive command failed with exit code =
1
New'">2010-07-15 23:02:18 CEST [27810]: [4-1] user=3D,db=3D DETAIL: =
 The failed archive command was: =
/opt/postgres/dba/scripts/backup_WAL-135.sh =
pg_xlog/00000001000000000000000E
class=3D"Apple-style-span" face=3D"'Courier New'">3817:save: Using =
backup-master1 as server
face=3D"'Courier =
New'">/zones/postgres-on-charlotte/root/opt/postgres/data/do v/data/pg_xlog=
/00000001000000000000000E: No such file or =
directory
face=3D"'Courier New'">2010-07-15 23:02:20:    BACKUP WAL: =
ERROR (1) - /usr/bin/nsr/save -b PGLOGS =
/zones/postgres-on-charlotte/root/opt/postgres/data/dov/data /pg_xlog/00000=
001000000000000000E
face=3D"'Courier New'">2010-07-15 23:02:20 CEST [27810]: [5-1] user=3D,db=3D=
LOG:  archive command failed with exit code =
1
New'">2010-07-15 23:02:20 CEST [27810]: [6-1] user=3D,db=3D DETAIL: =
 The failed archive command was: =
/opt/postgres/dba/scripts/backup_WAL-135.sh =
pg_xlog/00000001000000000000000E
class=3D"Apple-style-span" face=3D"'Courier New'">2010-07-15 23:02:20 =
CEST [27810]: [7-1] user=3D,db=3D WARNING:  transaction log file =
"00000001000000000000000E" could not be archived: too many =
failures
face=3D"'Courier New'">3817:save: Using backup-master1 as =
server

Saving WAL =
file 00000001000000000000000E gives an error. The problem is =
that the WAL file does not exist in pg_xlog. Does someone know =
why?
The only WAL file I could find in pg_xlog =
is 00000002000000000000000D (timeline id is 2). The next thing I =
tried was  looking for 00000001000000000000000E on our backup =
server. It seems that it didn't exist (or has never been =
backupped).
Finally I created an =
empty 00000001000000000000000E in pg_xlog and postgresql saved the =
fake WAL file, so I get rid off the =
warnings.

Did I do something wrong during the =
backup? Or with the =
recover?


Kind =
regards,

style=3D"word-wrap: break-word; -webkit-nbsp-mode: space; =
-webkit-line-break: after-white-space; font-size: 12px; =
">
Guy



class=3D"Apple-interchange-newline">

=

--Apple-Mail-1-276596982--