作者:Bruno Queirós

介绍

PostgreSQL 被认为是世界上最先进的开源关系数据库之一,它已经进行了 30 年的积极开发,并且在可靠性和性能方面赢得了很高的声誉。

对于生产环境,您可能希望部署一种解决方案,即使该服务器关闭或数据库只是停止响应,也可以保证对数据库的访问,这是复制,高可用性和负载平衡之类的功能到位的时候。

Postgresql 允许辅助服务器(备用服务器)从主服务器(主服务器)复制数据,以便它成为主服务器(复制)的忠实副本,这些数据库服务器可以 共同工作,以允许第二台服务器在主服务器发生故障时接管(高可用性),甚至允许多台计算机提供相同的数据(负载平衡)。

实施

我想部署一个简单的解决方案,该解决方案是免费的,并且如果我的主服务器出现问题,可以处理故障转移,所以我选择了复制以将数据维护在两个不同的数据库上,KeepAlived 可以执行运行状况检查,故障转移并还可以管理 虚拟 IP,因此我只使用相同的 IP,而与数据库的位置无关。

postgres 高可用 方案 ppt pgsql 高可用_hive


Postgresql Failover cenario - Before and after

要求

  • 具有 postgresql 和 keepalived 的服务器-主服务器(172.17.0.2)
  • 具有 postgresql 和 keepalived 的服务器-备用(172.17.0.3)
  • 在主服务器上设置复制
  • 创建运行状况检查脚本,该脚本检查数据库是否正常,如果不正常,则将备用数据库提升为主数据库并迁移虚拟 IP
  • 虚拟 IP(172.17.0.1)
  • 将发出数据库请求的 IP(172.17.0.4)

主备安装配置

安装 PostgreSQL - 主

echo  "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -

apt-get update
apt-get install postgresql-12 -y


echo  "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -

apt-get update
apt-get install postgresql-12 -y

配置复制 - 主

$ vi /etc/postgresql/12/main/postgresql.conf

listen_addresses = '*'
wal_level = replica
wal_log_hints = on
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
max_wal_senders = 10
wal_keep_segments = 10

$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/

$ vi /etc/postgresql/12/main/pg_hba.conf
host    replication     all             172.17.0.3/24     trust
$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/
 
 
$ vi /etc/postgresql/12/main/pg_hba.conf
 
host    replication     all             172.17.0.3/24     trust
host    clusterdb       postgres        172.17.0.4/24     trust

$ vi /etc/postgresql/12/main/postgresql.conf

listen_addresses = '*'
wal_level = replica
wal_log_hints = on
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
max_wal_senders = 10
wal_keep_segments = 10

$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/

$ vi /etc/postgresql/12/main/pg_hba.conf
host    replication     all             172.17.0.3/24     trust
$ mkdir /var/lib/postgresql/12/main/archivedir/
$ chown -R postgres:postgres /var/lib/postgresql/12/main/archivedir/
 
 
$ vi /etc/postgresql/12/main/pg_hba.conf
 
host    replication     all             172.17.0.3/24     trust
host    clusterdb       postgres        172.17.0.4/24     trust

创建 clusterdb db 数据库并生成一些数据,以便我们可以查看复制是否有效

$ systemctl start postgresql
$ su - postgres
$ createdb clusterdb
$ psql -d clusterdb -U postgres -c "CREATE TABLE test_table(x integer)"
$ psql -d clusterdb -U postgres -c "INSERT INTO test_table(x) SELECT y FROM generate_series(1, 100) a(y)"

$ psql -d clusterdb -U postgres -c "SELECT count(\*) from test_table"

$ systemctl start postgresql
$ su - postgres
$ createdb clusterdb
$ psql -d clusterdb -U postgres -c "CREATE TABLE test_table(x integer)"
$ psql -d clusterdb -U postgres -c "INSERT INTO test_table(x) SELECT y FROM generate_series(1, 100) a(y)"

$ psql -d clusterdb -U postgres -c "SELECT count(\*) from test_table"

安装 Postgresql - 备

echo  "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -

apt-get update
apt-get install postgresql-12 -y


echo  "deb http://apt.postgresql.org/pub/repos/apt/ bionic-pgdg main" > /etc/apt/sources.list.d/pgdg.list

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -

apt-get update
apt-get install postgresql-12 -y

配置复制 - 备

$ systemctl stop postgresql
$ mv /var/lib/postgresql/12 /root
$ su - postgres
$ pg_basebackup -h IP_MASTER -U postgres -D /var/lib/postgresql/12/main -P -Xs -R


$ vi /etc/postgresql/12/main/postgresql.conf

listen_addresses = '*'
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
restore_command = 'cp /var/lib/postgresql/12/main/archivedir/%f %p'
archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/12/main/archivedir %r'
max_wal_senders = 10
wal_keep_segments = 10
hot_standby = on

$ vi /etc/postgresql/12/main/pg_hba.conf

host    replication   all          172.17.0.2/24    trust
host    clusterdb     postgres     172.17.0.4/24    trust


$ systemctl stop postgresql
$ mv /var/lib/postgresql/12 /root
$ su - postgres
$ pg_basebackup -h IP_MASTER -U postgres -D /var/lib/postgresql/12/main -P -Xs -R


$ vi /etc/postgresql/12/main/postgresql.conf

listen_addresses = '*'
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /var/lib/postgresql/12/main/archivedir/%f && cp %p /var/lib/postgresql/12/main/archivedir/%f'
restore_command = 'cp /var/lib/postgresql/12/main/archivedir/%f %p'
archive_cleanup_command = 'pg_archivecleanup /var/lib/postgresql/12/main/archivedir %r'
max_wal_senders = 10
wal_keep_segments = 10
hot_standby = on

$ vi /etc/postgresql/12/main/pg_hba.conf

host    replication   all          172.17.0.2/24    trust
host    clusterdb     postgres     172.17.0.4/24    trust

验证 复制

$ psql -d clusterdb -h 172.17.0.2 -U postgres -c "SELECT count(*) from test_table"

# You should get a count of 100

$ psql -d clusterdb -h 172.17.0.2 -U postgres -c "SELECT count(*) from test_table"

# You should get a count of 100

故障转移

从用户 postgres(主和备)交换 SSH 密钥

$ su - postgres
$ ssh-keygen -t rsa
$ vi .ssh/authorized_keys

# paste id_rsa.pub of postgres user on each server

$ su - postgres
$ ssh-keygen -t rsa
$ vi .ssh/authorized_keys

# paste id_rsa.pub of postgres user on each server

测试用户 postgres 从主服务器到从服务器的 SSH 访问(无需输入密码)

安装 KeepAlived - 主

$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts

$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts

安装故障转移脚本 - 主

$ sudo vi /etc/keepalived/scripts/check_postgres


#!/bin/bash
# Run me on the Master

master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"

(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1


$ sudo chmod +x /etc/keepalived/scripts/check_postgres

$ sudo vi /etc/keepalived/scripts/check_postgres


#!/bin/bash
# Run me on the Master

master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"

(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1


$ sudo chmod +x /etc/keepalived/scripts/check_postgres

安装 KeepAlived - 备

$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts

$ sudo apt-get install keepalived -y
$ sudo mkdir /etc/keepalived/scripts

安装故障转移脚本 - 备

$ sudo vi /etc/keepalived/scripts/check_postgres


#!/bin/bash
# Run me on the Slave

master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"

(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1


$ sudo chmod +x /etc/keepalived/scripts/check_postgres

$ sudo vi /etc/keepalived/scripts/check_postgres


#!/bin/bash
# Run me on the Slave

master_ip="172.17.0.2"
slave_ip="172.17.0.3"
pg_ctl="/usr/lib/postgresql/12/bin/pg_ctl"
pg_home="/var/lib/postgresql/12/main/"

(echo >/dev/tcp/"$master_ip"/5432) &>/dev/null && echo "All is OK"; exit 0 || ssh postgres@"$slave_ip" "$pg_ctl -D $pg_home promote"; exit 1


$ sudo chmod +x /etc/keepalived/scripts/check_postgres

发生故障转移时,我们必须构建一个新的备用服务器,然后使用该新 IP 更新 slave_IP。

配置 KeepAlived - 主

$ sudo vi /etc/keepalived/keepalived.conf


vrrp_script chk_pg_port {
        script "/bin/bash /etc/keepalived/scripts/check_postgres"
        interval 2
        weight 2
}
vrrp_instance VI_1 {
        interface enp0s8
        state MASTER
        virtual_router_id 51
        priority 101
        authentication {
            auth_type PASS
            auth_pass Add-Your-Password-Here
        }
        track_script {
            chk_pg_port
        }
        virtual_ipaddress {
                172.17.0.1/24 dev enp0s8
        }
}

$ sudo vi /etc/keepalived/keepalived.conf


vrrp_script chk_pg_port {
        script "/bin/bash /etc/keepalived/scripts/check_postgres"
        interval 2
        weight 2
}
vrrp_instance VI_1 {
        interface enp0s8
        state MASTER
        virtual_router_id 51
        priority 101
        authentication {
            auth_type PASS
            auth_pass Add-Your-Password-Here
        }
        track_script {
            chk_pg_port
        }
        virtual_ipaddress {
                172.17.0.1/24 dev enp0s8
        }
}

enp0s8 - 替换成你的 nic

配置 KeepAlived - 备

$ sudo vi /etc/keepalived/keepalived.conf


vrrp_script chk_pg_port {
        script "/bin/bash /etc/keepalived/scripts/check_postgres"
        interval 2
        weight 2
}
vrrp_instance VI_1 {
        interface enp0s8
        state MASTER
        virtual_router_id 51
        priority 100
        authentication {
            auth_type PASS
            auth_pass Add-Your-Password-Here
        }
        track_script {
            chk_pg_port
        }
        virtual_ipaddress {
                172.17.0.1/24 dev enp0s8
        }
}

$ sudo vi /etc/keepalived/keepalived.conf


vrrp_script chk_pg_port {
        script "/bin/bash /etc/keepalived/scripts/check_postgres"
        interval 2
        weight 2
}
vrrp_instance VI_1 {
        interface enp0s8
        state MASTER
        virtual_router_id 51
        priority 100
        authentication {
            auth_type PASS
            auth_pass Add-Your-Password-Here
        }
        track_script {
            chk_pg_port
        }
        virtual_ipaddress {
                172.17.0.1/24 dev enp0s8
        }
}

运行 KeepAlived - 主和备

$ sudo systemctl start keepalived

$ sudo systemctl start keepalived

检查错误

$ sudo journalctl -u keepalived -f

$ sudo journalctl -u keepalived -f

现在可以使用 VIP 检查复制

$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"

You should get a count of 100

$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"

You should get a count of 100

测试故障转移

$ ssh 172.17.0.1
$ sudo shutdown -h now

$ ssh 172.17.0.1
$ sudo shutdown -h now

几秒钟后查询应该成功

$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"

$ psql -d clusterdb -h 172.17.0.1 -U postgres -c "SELECT count(\*) from test_table"

这是用于 Postgresql 数据库故障转移问题的简单且廉价的解决方案,有付费和免费的应用程序也可以解决此问题。还可以实现一些改进:

  • 使用 Postgresql 创建自定义 AMI
  • 使用配置管理工具(salt、ansible、puppet 等)将其配置为 Postgresql 备服务器并安装 KeepAlive
  • 当发生故障转移时,运行状况脚本也可以用于部署新服务器。

参考资料


[1]

Postgresql replication with automatic failover: https://www.linkedin.com/pulse/postgresql-replication-automatic-failover-bruno-queir%C3%B3s/

喜欢我的文章,烦请【关注】【转发】 ,??!