1. postgres 主备环境搭建
1.1 HA 实现方式
热备份、冷备份
冷备份:物理备份,也就是文件复制。主机写完一个WAL日志文件后,才复制到standby服务器上去应用。
热备份:流复制,也就是同步复制或异步复制。主备机实时同步。
1.2 操作步骤
主备机安装在相同的机器上
(1)主备机PGDATA路径
primary PGDATA路径:
[wln@localhost data]$ echo $PGDATA
/home/wln/postgres9.3/datastandby PGDATA路径:
[wln@localhost standby]$ echo $PGDATA
/home/wln/standby
(2)修改primary 的postgresql.conf 文件
wal_level = hot_standby //热备模式
max_wal_senders = 2 //primary 最多允许的并发standby数据库
wal_keep_segments = 32 //in logfile segments, 16MB each; 0 disables。该xlog日志会循环覆盖,请将该值设置适当大些(需考虑磁盘空间、数据量增长情况)logging_collector = on
(3)修改primary的pg_hba.conf
host replication repluser 127.0.0.1/32 trust
数据库名必须填“replication”, 这是一个为standby连接使用了一个虚拟的数据库名称。用户repluser是给standby连接使用的在主库上建的一个超级用户。
127.0.0.1 是standby数据库的IP地址。
(4)创建一个数据库超级用户repluser,用来给standby备机连接主机使用。
[wln@localhost data]$ psql -d postgres
psql (9.3beta2)
Type "help" for help.postgres=# create user repluser superuser password 'password';
CREATE ROLE
(5)重启primary数据库
pg_ctl restart
(6)对主数据库做一个基础备份
用select pg_start_backup();命令把数据库切换到备份状态
[wln@localhost data]$ psql -d postgres
psql (9.3beta2)
Type "help" for help.postgres=# select pg_start_backup('/home/wln/standby');
pg_start_backup
-----------------
0/6000024
(1 row)(不要退出)
(7)新开1个session2,将主机数据目录copy到备机
[wln@localhost ~]$ cp -r postgres9.3/data/* standby/
[wln@localhost ~]$ cd standby/[wln@localhost standby]$ ll
总计 104
-rw------- 1 wln wln 205 07-06 04:13 backup_label
drwx------ 5 wln wln 4096 07-06 04:13 base
drwx------ 2 wln wln 4096 07-06 04:13 global
drwx------ 2 wln wln 4096 07-06 04:13 pg_clog
-rw------- 1 wln wln 4461 07-06 04:13 pg_hba.conf
-rw------- 1 wln wln 1636 07-06 04:13 pg_ident.conf
drwx------ 2 wln wln 4096 07-06 04:13 pg_log
drwx------ 4 wln wln 4096 07-06 04:13 pg_multixact
drwx------ 2 wln wln 4096 07-06 04:13 pg_notify
drwx------ 2 wln wln 4096 07-06 04:13 pg_serial
drwx------ 2 wln wln 4096 07-06 04:13 pg_snapshots
drwx------ 2 wln wln 4096 07-06 04:13 pg_stat
drwx------ 2 wln wln 4096 07-06 04:13 pg_stat_tmp
drwx------ 2 wln wln 4096 07-06 04:13 pg_subtrans
drwx------ 2 wln wln 4096 07-06 04:13 pg_tblspc
drwx------ 2 wln wln 4096 07-06 04:13 pg_twophase
-rw------- 1 wln wln 4 07-06 04:13 PG_VERSION
drwx------ 3 wln wln 4096 07-06 04:13 pg_xlog
-rw------- 1 wln wln 20312 07-06 04:13 postgresql.conf
-rw------- 1 wln wln 43 07-06 04:13 postmaster.opts
-rw------- 1 wln wln 83 07-06 04:13 postmaster.pid(8)copy完成,结束主机备份状态
postgres=# select pg_stop_backup();
NOTICE: WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
pg_stop_backup
----------------
0/60000DC
(1 row)(9)在session2上操作
修改备机postgresql.conf文件
port=5433
hot_standby=on
将share/recovery.conf.sample 复制到standby目录下,并修改如下:
standby_mode = 'on'
primary_conninfo = 'host=127.0.0.1 port=5432 user=repluser password=password'
trigger_file = '/home/wln/standby/trigger_activestb'
删除standby目录下的postmaster.pid文件,执行
[wln@localhost standby]$export PGDATA=/home/wln/standby
[wln@localhost standby]$ echo $PGDATA
/home/wln/standby(10)启动备机
在session2操作
pg_ctl start
检查有没有正常启动pg_ctl status
若正常启动,pg_log下日志:
LOG: database system was shut down at 2014-07-06 03:55:40 CST
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
LOG: database system was interrupted; last known up at 2014-07-06 04:13:09 CST
LOG: entering standby modeLOG: redo starts at 0/6000024
LOG: record with zero length at 0/60000B8
LOG: started streaming WAL from primary at 0/6000000 on timeline 1
LOG: consistent recovery state reached at 0/60000DC(11)主机上执行一些操作,查看备机有没有同步过来
主机上:
postgres=# \d
No relations found.
postgres=# create table t1(id int);
CREATE TABLE
postgres=# insert into t1 select generate_series(1,10);
INSERT 0 10postgres=# select pg_current_xlog_location();
pg_current_xlog_location
--------------------------
0/7034470
(1 row)postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]----+------------------------------
pid | 10183
usesysid | 24715
usename | repl
application_name | walreceiver
client_addr | 127.0.0.1
client_hostname |
client_port | 41560
backend_start | 2014-07-06 04:15:45.185796+08
state | streaming
sent_location | 0/7034470
write_location | 0/7034470
flush_location | 0/7034470
replay_location | 0/7034470
sync_priority | 0
sync_state | async
备机上:
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+-------
public | t1 | table | wln
(1 row)postgres=# select * from t1;
id
----
1
2
3
4
5
6
7
8
9
10
(10 rows)postgres=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
0/7034470
(1 row)postgres=# select * from pg_stat_replication ;
(No rows)
postgres=# create table t2(id int);
ERROR: cannot execute CREATE TABLE in a read-only transaction(12)一些参数设置
synchronous_standby_names = '*' # standby servers that provide sync rep
# comma-separated list of application_name
# from standby(s); '*' = all(默认为‘’)
synchronous_commit = on # synchronization level;
# off, local, remote_write, or on
synchronous_commit is set to on or remote_write
synchronous_standby_names. The first named standby will be used as the synchronous standby. Standbys listed after this will take over the role of synchronous standby if the first one should fail.
3. 主备切换(方式1)
主备不会自动切换,需借助第三方软件,如heartbeat等,请自行搜索。
(1)如何查看是primary还是standby
方法1:查看进程
主机 sender
[wln@localhost data]$ ps -ef | grep wal
wln 10421 10416 0 07:52 ? 00:00:00 postgres: wal writer process
wln 10691 10416 0 08:04 ? 00:00:00 postgres: wal sender process repluser 127.0.0.1(59226) streaming 0/6011BAC
wln 10702 8927 0 08:05 pts/2 00:00:00 grep wal
备机 receiver
[wln@localhost data]$ ps -ef | grep wal
wln 10421 10416 0 07:52 ? 00:00:00 postgres: wal writer process
wln 10690 10685 3 08:04 ? 00:00:03 postgres: wal receiver process streaming 0/6011BAC
wln 10702 8927 0 08:05 pts/2 00:00:00 grep wal方法2:
主机:
[wln@localhost data]$ pg_controldata | grep Database
Database system identifier: 6032728846531166188
Database cluster state: in productionDatabase block size: 8192 备机:
[wln@localhost standby]$ pg_controldata | grep Database
Database system identifier: 6032728846531166188
Database cluster state: in archive recoveryDatabase block size: 8192(2)备机切换为主机
将主机停掉
[wln@localhost data]$ pg_ctl stop
waiting for server to shut down.... done
server stopped在备机PGDATA对应目录下生成recovery.conf文件中指定的文件trigger_file = '/home/wln/standby/trigger_activestb' (此时备机正常运行状态)
touch /home/wln/standby/trigger_activestb
会看到 recovery.conf 文件变为 recovery.done
查看原来备机日志:
LOG: trigger file found: /home/wln/standby/trigger_activestb
LOG: redo done at 0/6011C40
LOG: last completed transaction was at log time 2014-07-06 08:05:40.152147+08
LOG: selected new timeline ID: 2
LOG: archive recovery complete
LOG: database system is ready to accept connections
LOG: autovacuum launcher started登录原来的备机
[wln@localhost standby]$ psql -d postgres -p 5433
psql (9.3beta2)
Type "help" for help.postgres=# show transaction_read_only;
transaction_read_only
-----------------------
off
(1 row)可以看出现在原备机具有读写功能。即主备切换成功
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+------+-------+-------
public | t1 | table | wln
(1 row)postgres=# create table t2(id int);
CREATE TABLE
4、主备切换(方式2)
备机中recovery.conf中设置为:
standby_mode = on
recovery_target_timeline = 'latest'
primary_conninfo = 'host=localhost port=5433 user=repluser password=password'
#trigger_file = '/home/wln/standby/trigger_activestb' #这个注释掉
备机升为主机方式为:pg_ctl promote -D standbyPath
会看到recovery.conf过几秒后变为recovery.done 表示升级为主机成功。
其他的请参考方式1中内容。
5. 简单脚本
#master
mDATA=/home/$USER/pg94/data
port=5432
wal_level=hot_standby
max_wal_senders=2
wal_keep_segments=5
logging_collector=on
log_line_prefix='%d %p %t %c %i %e '
#standby
sDATA=/home/$USER/pg94/standby
mkdir $sDATA
chmod 700 $sDATA
sport=5433
shot_standby=on
pg_ctl stop -m i
pg_ctl stop -D $sDATA -m i
rm -rf $sDATA/*
#step
sed -i /port\ =/c\port\ =\ $port $mDATA/postgresql.conf
sed -i /wal_level\ =/c\wal_level\ =\ $wal_level $mDATA/postgresql.conf
sed -i /max_wal_senders\ =/c\max_wal_senders\ =\ $max_wal_senders $mDATA/postgresql.conf
sed -i /wal_keep_segments\ =/c\wal_keep_segments\ =\ $wal_keep_segments $mDATA/postgresql.conf
sed -i /logging_collector\ =/c\logging_collector\ =\ $logging_collector $mDATA/postgresql.conf
sed -i /log_line_prefix\ =/c\log_line_prefix\ =\ "'$log_line_prefix'" $mDATA/postgresql.conf
sed -i 's/#host/host/' $mDATA/pg_hba.conf
#restart master
pg_ctl restart
psql -c "alter user $USER password '123456'"
psql -c "select pg_start_backup('label');"
#copy data
cp -r $mDATA/* $sDATA
psql -c "select pg_stop_backup();"
#modify standby port
sed -i /port\ =/c\port\ =\ $sport $sDATA/postgresql.conf
sed -i /hot_standby\ =/c\hot_standby\ =\ $shot_standby $sDATA/postgresql.conf
rm -rf $sDATA/postmaster.pid
#recovery.conf
cp $PGHOME/share/recovery.conf.sample $sDATA/recovery.conf
sed -i /standby_mode/c\standby_mode=on $sDATA/recovery.conf
sed -i /primary_conninfo\ =/c\primary_conninfo="'host=127.0.0.1 port=5432 user=$USER password=123456'" $sDATA/recovery.conf
sed -i /trigger_file/c\trigger_file="'$sDATA/trigger'" $sDATA/recovery.conf
pg_ctl start -D $sDATA