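A while ago the transaction log on one of our database servers started growing rapidly. Checking with the script below showed that the database's log_reuse_wait_desc was stuck at REPLICATION, which means that, in transactional replication, transactions relevant to a publication have not yet been delivered to the distribution database, so the log cannot be truncated. As it happened, a colleague had configured some AWS DMS jobs the day before.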
SELECT name, log_reuse_wait_desc FROM sys.databases;
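On an instance with many databases, it can help to narrow this check to the affected databases and look at how full each log actually is. The following is only a sketch of one way to do that; the filter value and the extra recovery_model_desc column are additions for illustration and were not part of the original check.

```sql
-- List only the databases whose log truncation is currently blocked by replication.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE log_reuse_wait_desc = N'REPLICATION';

-- Report log file size and percentage of log space used for every database.
DBCC SQLPERF(LOGSPACE);
```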
Tracking down the job in question showed that its step 2 had failed: the LogReader subsystem failed to load. The error message is shown below:
Message
Unable to start execution of step 2 (reason: The LogReader subsystem failed to load [see the SQLAGENT.OUT file for details]; The job has been suspended). The step failed.
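If it is not obvious which job is involved, the msdb job tables can be queried for steps that run under the replication subsystems and for recent step failures. This is a minimal sketch that assumes the default msdb schema; the TOP (20) limit is an arbitrary choice for illustration.

```sql
-- Job steps that run under the replication subsystems named in the error.
SELECT j.name AS job_name, s.step_id, s.step_name, s.subsystem
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobsteps AS s ON s.job_id = j.job_id
WHERE s.subsystem IN (N'LogReader', N'QueueReader');

-- Most recent failed step executions with their messages.
-- run_status = 0 means failed; step_id = 0 is the overall job outcome row, so skip it.
SELECT TOP (20) j.name AS job_name, h.step_id, h.run_date, h.run_time, h.message
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
WHERE h.run_status = 0 AND h.step_id > 0
ORDER BY h.instance_id DESC;
```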
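Checking the SQL Server Agent log further showed that the step could not run "because the LogReader subsystem failed to load", and that the same error was logged for the QueueReader subsystem: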
Date 2018/11/15 14:54:41
Log SQL Server Agent (Archive #1 - 2018/11/20 9:11:00)
Message
[LOG] Step 2 of job 'xxxx' (0xE00DFF76D02DAD47920124DD907A412D) cannot be run because the LogReader subsystem failed to load. The job has been suspended
Date 2018/11/15 14:54:44
Log SQL Server Agent (Archive #1 - 2018/11/20 9:11:00)
Message
[LOG] Step 2 of job 'xxxx' (0x1BC045267CAE2F4A8C3E283921F40641) cannot be run because the QueueReader subsystem failed to load. The job has been suspended
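Checking with the script below showed that the DLL for the QueueReader subsystem does exist. As it happened, when the colleague configured AWS DMS the day before, the Replication components were added to SQL Server, but the SQL Server Agent service was not restarted afterwards, so the running Agent had presumably never loaded the newly installed subsystems. After restarting the SQL Server Agent service, the problem was solved!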
select * from msdb.dbo.syssubsystems
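A narrower version of this check, plus a quick confirmation after the restart, might look like the sketch below. It assumes SQL Server 2008 R2 SP1 or later (for sys.dm_server_services) and that the Agent service name begins with "SQL Server Agent"; neither assumption comes from the original incident.

```sql
-- Show only the two subsystems from the error, with their DLL and agent executable paths.
SELECT subsystem, subsystem_dll, agent_exe
FROM msdb.dbo.syssubsystems
WHERE subsystem IN (N'LogReader', N'QueueReader');

-- After the restart: confirm the Agent service is running and when it last started.
SELECT servicename, status_desc, last_startup_time
FROM sys.dm_server_services
WHERE servicename LIKE N'SQL Server Agent%';
```

Once the Log Reader job runs again, re-running the log_reuse_wait_desc query from the beginning should eventually show a value other than REPLICATION for the affected database.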
References:
https://www.sqlservercentral.com/Forums/783200/Replication-subsystems-failed-to-load