Configuration of Postgres Warm Standby Database
This document describes how to configure a warm standby database using Postgres Plus from Enterprisedb. The instructions are nearly identical for Postgres.
Install Postgres on Two Servers
The major versions of Postgres must be the same on both servers. You could get away with minor versions being slightly different.
Enable Archiving On Primary Server
In the postgres.conf file, set the following parameters:
archive_command = 'cp -v %p /archivelogs/%f'
archive_timeout = 300
Archive timeout will force WAL to archive the log files every 5 minutes.
Test Archiving On Primary
edb-psql codmigration
select pg_switch_xlog();
cd /opt/PostgresPlus/8.3AS/data/archivelogs
ls -l
If you don’t see any log files, then it is not working.
Restart Postgres Primary Server To Pickup Archiving Changes
service edb_8.3 restart
or reload the conf file
select pg_reload_conf
();
Backup Primary Server
SELECT pg_start_backup('standby_backup');
tar cvf backup.tar /opt/PostgresPlus/8.3AS/data
gzip backup.tar
SELECT pg_stop_backup();
Move Backup To Standby Server
scp backup.tar.gz jeffreyh@remoteserver:/home/jeffreyh/Desktop
mv backup.tar.gz /opt/PostgresPlus/8.3AS/
Restore Primary Backup On Standby Server
cd /opt/PostgresPlus/8.3AS/data
gzip –d backup.tar.gz
tar xvf backup.tar
Ship Archive Log Files To Standby Server
We’ll use rsync to ship the archive log files to the remote standby server. This allows us to move the log files without NFS mounting a file system which is the normal way. We create a cronjob that runs every 30 seconds.
ship_archivelogs.sh
rsync -avz --delete 10.1.8.33:/opt/PostgresPlus/8.3AS/data/archivelogs/ /opt/PostgresPlus/8.3AS/data/archivelogs/ > /dev/null
-a = recursion & preserve everything (file permissions, etc.)
-v = verbose
-z = compress files
--delete = delete files at the destination that are not on source side
The reason rsync is used in the archive_command is that rsync features an 'atomic copy' - that is, the in-progress destination file is created as a temp file, and then renamed when the copy is complete. In the situation above, where segments are archived straight to the directory that the slave reads from, 'cp' can cause an error whereby
the slave attempts to process a partially-copied WAL segment. If this happens, postgres will emit an error like:
PANIC: archive file "000000010000000000000031" has wrong size:
1810432 instead of 16777216
LOG: startup process (PID 11356) was terminated by signal 6
LOG: aborting startup due to startup process failure
Configure Recovery On Standby Server
Create recovery.conf file in /opt/PostgresPlus/8.3AS/data directory.
restore_command = '/opt/PostgresPlus/8.3AS/dbserver/bin/pg_standby
-l -d -s 2 -t /tmp/pgsql.trigger /opt/PostgresPlus/8.3AS/data/archivelogs
%f %p %r 2>>standby.log'
Basically pg_standby blocks until it either finds the next log file requested by the server or sees a file named /tmp/pgsql.trigger. Then recovery is done and your standby server pops up and is open for business.
Testing If Standby Receiving Log Files
tail –f /opt/PostgresPlus/8.3AS/data/standby.log
Testing If Standby Applying Log Files
tail –f /opt/PostgresPlus/8.3AS/data/standby.log
Activating The Standby Database
Controlled Failover To Standby On Primary
in psql command prompt:
checkpoint
select pg_switch_xlog();
On Standby
As the postgres user (enterprisedb) touch the trigger file:
touch /tmp/pgsql.trigger
This should cause Postgres to exit recovery mode and open up in normal mode. You should see the file recovery.conf in /opt/PostgresPlus/8.3AS/data directory get renamed to recovery.done
Also the /tmp/pgsql.trigger file will be removed.
Failback To Primary
Unfortunately, you cannot reverse roles and put the new primary back to the standby mode if you actually fix the old primary server. You have to recreate the new standby from scratch.
Good luck.