作者:Bruno Queirós
介绍
PostgreSQL 被认为是世界上最先进的开源关系数据库之一,它已经进行了 30 年的积极开发,并且在可靠性和性能方面赢得了很高的声誉。
对于生产环境,您可能希望部署一种解决方案,即使该服务器关闭或数据库只是停止响应,也可以保证对数据库的访问,这是复制,高可用性和负载平衡之类的功能到位的时候。
Postgresql 允许辅助服务器(备用服务器
)从主服务器(主服务器
)复制数据,以便它成为主服务器(复制
)的忠实副本,这些数据库服务器可以 共同工作,以允许第二台服务器在主服务器发生故障时接管(高可用性
),甚至允许多台计算机提供相同的数据(负载平衡
)。
实施
我想部署一个简单的解决方案,该解决方案是免费的,并且如果我的主服务器出现问题,可以处理故障转移,所以我选择了复制以将数据维护在两个不同的数据库上,KeepAlived 可以执行运行状况检查,故障转移并还可以管理 虚拟 IP,因此我只使用相同的 IP,而与数据库的位置无关。
Postgresql Failover cenario - Before and after
要求
- 具有 postgresql 和 keepalived 的服务器-主服务器(172.17.0.2)
- 具有 postgresql 和 keepalived 的服务器-备用(172.17.0.3)
- 在主服务器上设置复制
- 创建运行状况检查脚本,该脚本检查数据库是否正常,如果不正常,则将备用数据库提升为主数据库并迁移虚拟 IP
- 虚拟 IP(172.17.0.1)
- 将发出数据库请求的 IP(172.17.0.4)
主备安装配置
安装 PostgreSQL - 主
echo "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
apt-get update
apt-get install postgresql-12 -y
echo "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
apt-get update
apt-get install postgresql-12 -y
配置复制 - 主
$ vi /etc/postgresql/12/main/postgresql.conf
listen_addresses = '*'
wal_level = replica
wal_log_hints = on
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
max_wal_senders = 10
wal_keep_segments = 10
$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/
$ vi /etc/postgresql/12/main/pg_hba.conf
host replication all 172.17.0.3/24 trust
$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/
$ vi /etc/postgresql/12/main/pg_hba.conf
host replication all 172.17.0.3/24 trust
host clusterdb postgres 172.17.0.4/24 trust
$ vi /etc/postgresql/12/main/postgresql.conf
listen_addresses = '*'
wal_level = replica
wal_log_hints = on
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
max_wal_senders = 10
wal_keep_segments = 10
$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/
$ vi /etc/postgresql/12/main/pg_hba.conf
host replication all 172.17.0.3/24 trust
$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/
$ vi /etc/postgresql/12/main/pg_hba.conf
host replication all 172.17.0.3/24 trust
host clusterdb postgres 172.17.0.4/24 trust
创建 clusterdb db 数据库并生成一些数据,以便我们可以查看复制是否有效
$ systemctl start postgresql
$ su - postgres
$ createdb clusterdb
$ psql -d clusterdb -U postgres -c "CREATE TABLE test_table(x integer)"
$ psql -d clusterdb -U postgres -c "INSERT INTO test_table(x) SELECT y FROM generate_series(1, 100) a(y)"
$ psql -d clusterdb -U postgres -c "SELECT count(\*) from test_table"
$ systemctl start postgresql
$ su - postgres
$ createdb clusterdb
$ psql -d clusterdb -U postgres -c "CREATE TABLE test_table(x integer)"
$ psql -d clusterdb -U postgres -c "INSERT INTO test_table(x) SELECT y FROM generate_series(1, 100) a(y)"
$ psql -d clusterdb -U postgres -c "SELECT count(\*) from test_table"
安装 Postgresql - 备
echo "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
apt-get update
apt-get install postgresql-12 -y
echo "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -
apt-get update
apt-get install postgresql-12 -y
配置复制 - 备
$ systemctl stop postgresql
$ mv /var/lib/postgresql/12 /root
$ su - postgres
$ pg_basebackup -h IP_MASTER -U postgres -D /var/lib/postgresql/12/main -P -Xs -R
$ vi /etc/postgresql/12/main/postgresql.conf
listen_addresses = '*'
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
restore_command = 'cp /var/lib/postgresql/12/main/archivedir/%f %p'
archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/12/main/archivedir %r'
max_wal_senders = 10
wal_keep_segments = 10
hot_standby = on
$ vi /etc/postgresql/12/main/pg_hba.conf
host replication all 172.17.0.2/24 trust
host clusterdb postgres 172.17.0.4/24 trust
$ systemctl stop postgresql
$ mv /var/lib/postgresql/12 /root
$ su - postgres
$ pg_basebackup -h IP_MASTER -U postgres -D /var/lib/postgresql/12/main -P -Xs -R
$ vi /etc/postgresql/12/main/postgresql.conf
listen_addresses = '*'
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
restore_command = 'cp /var/lib/postgresql/12/main/archivedir/%f %p'
archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/12/main/archivedir %r'
max_wal_senders = 10
wal_keep_segments = 10
hot_standby = on
$ vi /etc/postgresql/12/main/pg_hba.conf
host replication all 172.17.0.2/24 trust
host clusterdb postgres 172.17.0.4/24 trust
验证 复制
$ psql -d clusterdb -h 172.17.0.2 -U postgres -c "SELECT count(*) from test_table"
# You should get a count of 100
$ psql -d clusterdb -h 172.17.0.2 -U postgres -c "SELECT count(*) from test_table"
# You should get a count of 100
故障转移
从用户 postgres(主和备)交换 SSH 密钥
$ su - postgres
$ ssh-keygen -t rsa
$ vi .ssh/authorized_keys
# paste id_rsa.pub of postgres user on each server
$ su - postgres
$ ssh-keygen -t rsa
$ vi .ssh/authorized_keys
# paste id_rsa.pub of postgres user on each server
测试用户 postgres 从主服务器到从服务器的 SSH 访问(无需输入密码)
安装 KeepAlived - 主
$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts
$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts
安装故障转移脚本 - 主
$ sudo vi /etc/keepalived/scripts/check_postgres
#!/bin/bash
# Run me on the Master
master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"
(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1
$ sudo chmod +x /etc/keepalived/scripts/check_postgres
$ sudo vi /etc/keepalived/scripts/check_postgres
#!/bin/bash
# Run me on the Master
master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"
(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1
$ sudo chmod +x /etc/keepalived/scripts/check_postgres
安装 KeepAlived - 备
$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts
$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts
安装故障转移脚本 - 备
$ sudo vi /etc/keepalived/scripts/check_postgres
#!/bin/bash
# Run me on the Slave
master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"
(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1
$ sudo chmod +x /etc/keepalived/scripts/check_postgres
$ sudo vi /etc/keepalived/scripts/check_postgres
#!/bin/bash
# Run me on the Slave
master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"
(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1
$ sudo chmod +x /etc/keepalived/scripts/check_postgres
发生故障转移时,我们必须构建一个新的备用服务器,然后使用该新 IP 更新 slave_IP。
配置 KeepAlived - 主
$ sudo vi /etc/keepalived/keepalived.conf
vrrp_script chk_pg_port {
script "/bin/bash /etc/keepalived/scripts/check_postgres"
interval 2
weight 2
}
vrrp_instance VI_1 {
interface enp0s8
state MASTER
virtual_router_id 51
priority 101
authentication {
auth_type PASS
auth_pass Add-Your-Password-Here
}
track_script {
chk_pg_port
}
virtual_ipaddress {
172.17.0.1/24 dev enp0s8
}
}
$ sudo vi /etc/keepalived/keepalived.conf
vrrp_script chk_pg_port {
script "/bin/bash /etc/keepalived/scripts/check_postgres"
interval 2
weight 2
}
vrrp_instance VI_1 {
interface enp0s8
state MASTER
virtual_router_id 51
priority 101
authentication {
auth_type PASS
auth_pass Add-Your-Password-Here
}
track_script {
chk_pg_port
}
virtual_ipaddress {
172.17.0.1/24 dev enp0s8
}
}
enp0s8 - 替换成你的 nic
配置 KeepAlived - 备
$ sudo vi /etc/keepalived/keepalived.conf
vrrp_script chk_pg_port {
script "/bin/bash /etc/keepalived/scripts/check_postgres"
interval 2
weight 2
}
vrrp_instance VI_1 {
interface enp0s8
state MASTER
virtual_router_id 51
priority 100
authentication {
auth_type PASS
auth_pass Add-Your-Password-Here
}
track_script {
chk_pg_port
}
virtual_ipaddress {
172.17.0.1/24 dev enp0s8
}
}
$ sudo vi /etc/keepalived/keepalived.conf
vrrp_script chk_pg_port {
script "/bin/bash /etc/keepalived/scripts/check_postgres"
interval 2
weight 2
}
vrrp_instance VI_1 {
interface enp0s8
state MASTER
virtual_router_id 51
priority 100
authentication {
auth_type PASS
auth_pass Add-Your-Password-Here
}
track_script {
chk_pg_port
}
virtual_ipaddress {
172.17.0.1/24 dev enp0s8
}
}
运行 KeepAlived - 主和备
$ sudo systemctl start keepalived
$ sudo systemctl start keepalived
检查错误
$ sudo journalctl -u keepalived -f
$ sudo journalctl -u keepalived -f
现在可以使用 VIP 检查复制
$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"
You should get a count of 100
$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"
You should get a count of 100
测试故障转移
$ ssh 172.17.0.1
$ sudo shutdown -h now
$ ssh 172.17.0.1
$ sudo shutdown -h now
几秒钟后查询应该成功
$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"
$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"
这是用于 Postgresql 数据库故障转移问题的简单且廉价的解决方案,有付费和免费的应用程序也可以解决此问题。还可以实现一些改进:
- 使用 Postgresql 创建自定义 AMI
- 使用配置管理工具(salt、ansible、puppet 等)将其配置为 Postgresql 备服务器并安装 KeepAlive
- 当发生故障转移时,运行状况脚本也可以用于部署新服务器。
参考资料
[1]
Postgresql replication with automatic failover: https://www.linkedin.com/pulse/postgresql-replication-automatic-failover-bruno-queir%C3%B3s/
喜欢我的文章,烦请【关注】【转发】 ,??!