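Pure Python implementation of the MySQL replication protocol, built on top of PyMySQL. It lets you receive events such as insert, update, and delete, together with their data and the raw SQL queries.
python-mysql-replication is a Python implementation of the MySQL replication protocol. We can use it to parse the binlog, pick up the insert, update, delete, and other events recorded there, and build business features on top of them, for example invalidating a cache when data changes, or listening for DML events and notifying downstream services so they can react.
I. Use cases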
1.MySQL to NoSQL database replication
2.MySQL to search engine replication
3.Invalidate cache when something changes in the database
4.Audit
5.Real time analytics
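To make the cache invalidation use case concrete, here is a minimal sketch. It assumes row events expose `schema`, `table`, and `rows`, and that each row is a dict with `values` (insert/delete) or `before_values`/`after_values` (update). The cache key format and the `pk_column` parameter are made up for illustration, and the fake event below only stands in for a real one.

```python
from types import SimpleNamespace


def keys_to_invalidate(event, pk_column="id"):
    """Collect the cache keys touched by one row event.

    The "schema.table:pk" key format is hypothetical; use whatever
    your cache actually uses.
    """
    keys = []
    for row in event.rows:
        # Look at every row image present so that an update which
        # changes the primary key invalidates both old and new keys.
        for image_name in ("values", "before_values", "after_values"):
            image = row.get(image_name)
            if image and pk_column in image:
                key = f"{event.schema}.{event.table}:{image[pk_column]}"
                if key not in keys:
                    keys.append(key)
    return keys


def invalidate(cache, event, pk_column="id"):
    """Drop the affected keys from a dict-like cache."""
    for key in keys_to_invalidate(event, pk_column):
        cache.pop(key, None)


# A stand-in for an update event, just to exercise the helpers.
fake_update = SimpleNamespace(
    schema="shop",
    table="users",
    rows=[{"before_values": {"id": 7, "name": "a"},
           "after_values": {"id": 7, "name": "b"}}],
)
cache = {"shop.users:7": "cached row", "shop.users:8": "other row"}
invalidate(cache, fake_update)
```

In a real deployment the dict would be replaced by a Redis or memcached client, and the events would come from the binlog stream rather than a hand-built object.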
II. Main modules
1.BinLogStreamReader
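The entry-point class of python-mysql-replication. It accepts the MySQL connection settings and the information the slave needs for synchronization, and it implements __iter__, slave registration, and packet reading.
To use the tool, instantiate a BinLogStreamReader() object, stream. BinLogStreamReader registers itself with the master in the slave role via ReportSlave, so that it can receive the binlog events the MySQL master sends out.
File: binlogstream.py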
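As a rough illustration of the description above, the following sketch opens a stream and prints every event. The host, account, and `server_id` values are placeholders, and the import is done lazily inside the function so that the settings helper can be used without the library installed.

```python
def make_reader_kwargs(host, port, user, password, server_id):
    """Build keyword arguments for BinLogStreamReader.

    server_id must be unique among all servers and replicas that
    talk to the same master; every value passed in is a placeholder.
    """
    return {
        "connection_settings": {
            "host": host,
            "port": port,
            "user": user,
            "passwd": password,
        },
        "server_id": server_id,
        "blocking": True,       # keep waiting for new events
        "resume_stream": True,  # start from the current position
    }


def dump_events(reader_kwargs):
    """Connect as a replica and print each event until interrupted."""
    # Imported here so this module loads even without the package.
    from pymysqlreplication import BinLogStreamReader

    stream = BinLogStreamReader(**reader_kwargs)
    try:
        for event in stream:  # the reader implements __iter__
            event.dump()
    finally:
        stream.close()


kwargs = make_reader_kwargs("127.0.0.1", 3306, "repl", "secret", 100)
# dump_events(kwargs)  # needs a reachable MySQL server with binlog on
```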
2.BinLogPacketWrapper
The BinLogPacketWrapper class serializes and deserializes MySQL network packets.
File: packet.py
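3.Event classes
The implementation classes for the individual events (update, insert, delete, rollback, heartbeat, and so on) all inherit from BinLogEvent. Note that SELECT statements are not written to the binlog, so there is no event for them. BinLogPacketWrapper maps each event it receives to the matching event class.
File: event.py (the row-level insert, update, and delete events are in row_event.py)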
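The mapping from event type to handler class can be pictured as a lookup table keyed by the type code in the event header. The sketch below is a simplified stand-in, not the library's implementation: the `*Sketch` classes are hypothetical, while the numeric codes are the standard binlog event type codes.

```python
class BaseEventSketch:
    """Hypothetical base class; the real one is BinLogEvent."""

    def __init__(self, payload):
        self.payload = payload


class QuerySketch(BaseEventSketch):
    pass


class RotateSketch(BaseEventSketch):
    pass


class HeartbeatSketch(BaseEventSketch):
    pass


class WriteRowsSketch(BaseEventSketch):
    pass


class UpdateRowsSketch(BaseEventSketch):
    pass


class DeleteRowsSketch(BaseEventSketch):
    pass


# Binlog event type code -> class that knows how to parse the payload.
EVENT_MAP = {
    2: QuerySketch,        # QUERY_EVENT
    4: RotateSketch,       # ROTATE_EVENT
    27: HeartbeatSketch,   # HEARTBEAT_LOG_EVENT
    30: WriteRowsSketch,   # WRITE_ROWS_EVENT (v2)
    31: UpdateRowsSketch,  # UPDATE_ROWS_EVENT (v2)
    32: DeleteRowsSketch,  # DELETE_ROWS_EVENT (v2)
}


def decode(event_type, payload):
    """Return an event object, or None for types with no handler."""
    event_class = EVENT_MAP.get(event_type)
    if event_class is None:
        return None
    return event_class(payload)
```

Keeping the mapping in a dict means supporting a new event type is one extra class plus one extra entry, with no change to the dispatch code.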
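4.Underlying files it depends on
connections.py in pymysql: the Connection class, which handles connecting and reading and writing MySQL packets (the packet format itself is implemented in protocol.py).
protocol.py in pymysql: the MysqlPacket class, which defines the packet format and how packets are read and written.
III. Existing projects built on it (examples)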
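For readers who have not seen the wire format, every MySQL packet starts with a 4-byte header: a 3-byte little-endian payload length followed by a 1-byte sequence number. The following standard-library sketch frames and unframes a packet. The function names are invented for this example, and payloads of 16 MB or more, which the protocol splits across several packets, are deliberately not handled.

```python
MAX_PAYLOAD = 0xFFFFFF  # payloads this large must be split


def frame_packet(payload, sequence_id):
    """Prefix a payload with the 4-byte MySQL packet header."""
    if len(payload) >= MAX_PAYLOAD:
        raise ValueError("payload too large for a single packet")
    header = len(payload).to_bytes(3, "little") + bytes([sequence_id & 0xFF])
    return header + payload


def read_packet(buffer):
    """Split one packet off the front of a byte buffer.

    Returns (sequence_id, payload, remaining_bytes).
    """
    if len(buffer) < 4:
        raise ValueError("incomplete header")
    length = int.from_bytes(buffer[:3], "little")
    sequence_id = buffer[3]
    payload = buffer[4:4 + length]
    if len(payload) != length:
        raise ValueError("incomplete payload")
    return sequence_id, payload, buffer[4 + length:]


# 0x0e is the COM_PING command byte, used here as a one-byte payload.
wire = frame_packet(b"\x0e", 0)
parsed = read_packet(wire + b"next packet bytes")
```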
Project | Remark | GitHub |
binlog2sql | a popular binlog parser that converts raw binlog to SQL and can also generate flashback SQL from raw binlog | |
mymongo | MySQL to mongo replication | |
MySQLStreamer | MySQLStreamer is a database change data capture and publish system | |
BitSwanPump | A real-time stream processor | |
Aventri MySQL Monitor | MySQL Monitor is an application which continuously watches a MySQL database for data updates and publishes information about those changes to a message queue on RabbitMQ. | |
elasticsearch-river-mysql | MySQL River Plugin for ElasticSearch | |
pg_chameleon | Migration and replica from MySQL to PostgreSQL | |
python-mysql-eventprocessor | Daemon interface for handling MySQL binary log events. | |
Python MySQL Replication Blinker | This package reads events from the MySQL binlog and sends them to blinker's signals. | https://github.com/tarzanjw/python-mysql-replication-blinker |
IV. Notes
1.Privileges:
You can use the replication account directly or a different account, but the account must have the SELECT, REPLICATION SLAVE, and REPLICATION CLIENT privileges.
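One way to verify the account before starting is to inspect the output of `SHOW GRANTS`. The sketch below parses grant lines that have already been fetched and reports which of the three privileges are missing. It only looks at global (`ON *.*`) grants, which is an assumption of this example: SELECT granted on individual schemas would be reported as missing here.

```python
import re

REQUIRED_PRIVILEGES = {"SELECT", "REPLICATION SLAVE", "REPLICATION CLIENT"}

# Matches lines such as:
#   GRANT SELECT, REPLICATION SLAVE ON *.* TO `repl`@`%`
_GLOBAL_GRANT = re.compile(r"GRANT (.+?) ON \*\.\* TO ", re.IGNORECASE)


def missing_privileges(grant_lines):
    """Return the required privileges not covered by global grants."""
    granted = set()
    for line in grant_lines:
        match = _GLOBAL_GRANT.match(line)
        if not match:
            continue  # schema-level or table-level grant, ignored here
        privileges = {p.strip().upper() for p in match.group(1).split(",")}
        if "ALL PRIVILEGES" in privileges:
            return set()
        granted |= privileges
    return REQUIRED_PRIVILEGES - granted


example = ["GRANT SELECT, REPLICATION SLAVE ON *.* TO `repl`@`%`"]
still_needed = missing_privileges(example)
```

If something is missing, an administrator would typically fix it with a statement of the form `GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO ...` for that account.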
2.Binlog-related database parameters
log_bin=ON, binlog_format=ROW, binlog_row_image=FULL
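These settings can be checked from Python before opening a stream. In the sketch below, `check_binlog_variables` is a pure function over a dict of variables, and `fetch_variables` shows one possible way to obtain that dict with PyMySQL. Both names are invented for this example, and `fetch_variables` assumes the default tuple cursor.

```python
REQUIRED_VARIABLES = {
    "log_bin": "ON",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
}


def check_binlog_variables(variables):
    """Return a list of human-readable problems, empty if all is well."""
    problems = []
    for name, expected in REQUIRED_VARIABLES.items():
        actual = str(variables.get(name, "")).upper()
        if actual != expected:
            problems.append(
                f"{name} is {actual or 'unset'}, expected {expected}"
            )
    return problems


def fetch_variables(connection_settings):
    """Read the three variables from a live server (not run here)."""
    import pymysql  # imported lazily; only needed for a live check

    connection = pymysql.connect(**connection_settings)
    try:
        with connection.cursor() as cursor:
            cursor.execute(
                "SHOW GLOBAL VARIABLES WHERE Variable_name IN "
                "('log_bin', 'binlog_format', 'binlog_row_image')"
            )
            # Default cursor yields (name, value) tuples.
            return {name: value for name, value in cursor.fetchall()}
    finally:
        connection.close()


problems = check_binlog_variables(
    {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}
)
```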
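3.Table-level synchronization
Besides parsing the binlog, python-mysql-replication can also be used for a full-plus-incremental data migration.
For example, when you only need to migrate a few large tables rather than a whole database: dump them first, then keep them in sync incrementally in real time.
V. Further reading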
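The dump-then-follow approach can be sketched as below. It assumes the dump recorded the binlog file name and position at which it was taken (for example via mysqldump's master-data option), and it applies changes to an in-memory dict as a stand-in for the real target. `incremental_kwargs`, `apply_row`, `follow`, and the `pk` parameter are names invented for this example.

```python
def incremental_kwargs(connection_settings, server_id, schema, tables,
                       log_file, log_pos):
    """Reader settings that start right where the dump ended."""
    return {
        "connection_settings": connection_settings,
        "server_id": server_id,
        "only_schemas": [schema],
        "only_tables": list(tables),
        "log_file": log_file,
        "log_pos": log_pos,
        "resume_stream": True,
        "blocking": True,
    }


def apply_row(replica, kind, row, pk="id"):
    """Apply one row change to a dict keyed by primary key."""
    if kind == "insert":
        replica[row["values"][pk]] = dict(row["values"])
    elif kind == "update":
        replica.pop(row["before_values"][pk], None)
        replica[row["after_values"][pk]] = dict(row["after_values"])
    elif kind == "delete":
        replica.pop(row["values"][pk], None)
    else:
        raise ValueError(f"unknown change kind: {kind}")


def follow(kwargs, replica):
    """Stream row events for the chosen tables into the replica."""
    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.row_event import (
        DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent,
    )

    kinds = {WriteRowsEvent: "insert", UpdateRowsEvent: "update",
             DeleteRowsEvent: "delete"}
    stream = BinLogStreamReader(only_events=list(kinds), **kwargs)
    try:
        for event in stream:
            for row in event.rows:
                apply_row(replica, kinds[type(event)], row)
    finally:
        stream.close()
```

Starting from the recorded position is what ties the two phases together: every change made after the dump is replayed, so nothing is lost or skipped between the full copy and the live stream.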
1. https://github.com/julien-duponchelle/python-mysql-replication
2. A detailed look at a Python-based MySQL replication tool
https://www.h5w3.com/229701.html
3. Using mysql-replication in Python to listen to the MySQL binlog and sync data in real time