在 ProxySQL 的默认配置下,当所有从库出现故障时,查询请求不会自动切换到主库,这会导致业务中断。


以下是一个示例说明:

假设你的架构是一主一从,主库监听在 6666 端口,从库监听在 6667 端口。

Admin> SELECT hostgroup_id, hostname, port, status, weight FROM mysql_servers;
+--------------+-----------------+------+--------+--------+
| hostgroup_id | hostname        | port | status | weight |
+--------------+-----------------+------+--------+--------+
| 0            | 192.168.198.239 | 6666 | ONLINE | 1      |
| 1            | 192.168.198.239 | 6667 | ONLINE | 1      |
+--------------+-----------------+------+--------+--------+
2 rows in set (0.00 sec)


现在,如果你关闭从库 6667 端口,你会发现 select 查询请求挂起,业务中断,并且请求没有切换到主库:

Admin> SELECT hostgroup_id, hostname, port, status, weight FROM runtime_mysql_servers;
+--------------+-----------------+------+---------+--------+
| hostgroup_id | hostname        | port | status  | weight |
+--------------+-----------------+------+---------+--------+
| 0            | 192.168.198.239 | 6666 | ONLINE  | 1      |
| 1            | 192.168.198.239 | 6667 | SHUNNED | 1      |
+--------------+-----------------+------+---------+--------+
2 rows in set (0.01 sec)


/var/lib/proxysql/proxysql.log 日志中,你可能会看到如下错误信息:

2024-08-23 09:28:21 MyHGC.cpp:228:get_random_MySrvC(): [ERROR] Hostgroup 1 has no servers available! Checking servers shunned for more than 1 second

2024-08-23 09:28:21 mysql_connection.cpp:1203:handler(): [ERROR] Failed to mysql_real_connect() on 1:192.168.198.239:6667 , FD (Conn:0 , MyDS:34) , 2002: Can't connect to server on '192.168.198.239' (111).

2024-08-23 09:28:21 MySQL_Monitor.cpp:3158:monitor_ping(): [ERROR] Server 192.168.198.239:6667 missed 3 heartbeats, shunning it and killing all the connections. Disabling other checks until the node comes back online.


解决方案

为了解决这个问题,你可以调整 mysql_servers 表的数据,将主库也加入到从库的 reader_hostgroup 组中。

如下是具体的操作步骤:

  1. 通过查询 mysql_replication_hostgroups 表,确认主库与从库的 hostgroup_id组ID:
  • 主库 (6666 端口) 的 hostgroup_id 为 0。
  • 从库 (6667 端口) 的 hostgroup_id 为 1。
Admin> SELECT * FROM mysql_replication_hostgroups;
+------------------+------------------+------------+------------------------------+
| writer_hostgroup | reader_hostgroup | check_type | comment                      |
+------------------+------------------+------------+------------------------------+
| 0                | 1                | read_only  | Read Write Split Host Groups |
+------------------+------------------+------------+------------------------------+
1 row in set (0.00 sec)


  1. 将主库添加到从库的reader_hostgroup组里:
Admin> update mysql_servers set weight=10 where hostgroup_id=1 and port=6667;
Admin> INSERT INTO mysql_servers(hostgroup_id, hostname, port) VALUES (1, '192.168.198.239', 6666);
Admin> LOAD MYSQL SERVERS TO RUNTIME;
Admin> SAVE MYSQL SERVERS TO DISK;
Admin> select hostgroup_id,hostname,port,status from mysql_servers;
+--------------+-----------------+------+--------+
| hostgroup_id | hostname        | port | status |
+--------------+-----------------+------+--------+
| 0            | 192.168.198.239 | 6666 | ONLINE |
| 1            | 192.168.198.239 | 6667 | ONLINE |
| 1            | 192.168.198.239 | 6666 | ONLINE |
+--------------+-----------------+------+--------+
3 rows in set (0.00 sec)

经过上述配置,当从库出现故障时,查询请求将自动切换到主库,从而避免业务中断。