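A while ago the transaction log on one of our database servers started growing rapidly. Checking with the script below showed that the database's log_reuse_wait_desc was stuck at REPLICATION, which means that, under transactional replication, transactions relevant to the publications had still not been delivered to the distribution database. As it happened, a colleague had configured some AWS DMS jobs the day before.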

 

SELECT name,  log_reuse_wait_desc FROM sys.databases;
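To narrow that output down, a minimal sketch along these lines lists only the databases whose log truncation is held up by replication and then shows how full each log is. It assumes you have permission to read sys.databases and to run DBCC SQLPERF. The extra column recovery_model_desc is included only for context and is not part of the original check.

```sql
-- Databases whose log truncation is currently held up by replication.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE log_reuse_wait_desc = N'REPLICATION';

-- Log file size and percentage used for every database on the instance.
DBCC SQLPERF(LOGSPACE);
```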

 

Tracking down the specific job showed that its second step had failed: the LogReader could not be started. The job history reported the following:

 

Message

Unable to start execution of step 2 (reason: The LogReader subsystem failed to load [see the SQLAGENT.OUT file for details]; The job has been suspended).  The step failed.
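The same message can also be pulled from the job history tables in msdb instead of the job history viewer. The query below is a sketch, assuming the failure text contains "subsystem failed to load" as in the message above. The TOP (50) limit and the LIKE filter are illustrative choices, not something from the original investigation.

```sql
-- Recent failed job steps whose message mentions a subsystem load failure.
SELECT TOP (50)
       j.name AS job_name,
       h.step_id,
       h.step_name,
       h.run_date,   -- integer in YYYYMMDD form
       h.run_time,   -- integer in HHMMSS form
       h.message
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs AS j
  ON j.job_id = h.job_id
WHERE h.run_status = 0   -- 0 = failed
  AND h.message LIKE N'%subsystem failed to load%'
ORDER BY h.instance_id DESC;
```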


 

 

Checking the SQL Server Agent log further showed the cause: the LogReader and QueueReader subsystems had failed to load.

 

 


 

 

 

Date       2018/11/15 14:54:41
Log        SQL Server Agent (Archive #1 - 2018/11/20 9:11:00)

Message
[LOG] Step 2 of job 'xxxx' (0xE00DFF76D02DAD47920124DD907A412D) cannot be run because the LogReader subsystem failed to load.  The job has been suspended



Date       2018/11/15 14:54:44
Log        SQL Server Agent (Archive #1 - 2018/11/20 9:11:00)

Message
[LOG] Step 2 of job 'xxxx' (0x1BC045267CAE2F4A8C3E283921F40641) cannot be run because the QueueReader subsystem failed to load.  The job has been suspended
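Entries like these can also be searched from a query window rather than the log viewer. The sketch below uses xp_readerrorlog, which is an undocumented procedure, so its arguments as described in the comments are an assumption and its behavior may differ between SQL Server versions. The log number 1 is chosen only because the entries above came from Archive #1.

```sql
-- Arguments (as commonly used; the procedure is undocumented):
--   1st: log file number (0 = current, 1 = Archive #1, ...)
--   2nd: log type (1 = SQL Server error log, 2 = SQL Server Agent log)
--   3rd: search string
EXEC master.dbo.xp_readerrorlog 1, 2, N'subsystem failed to load';
```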

 

 

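Checking with the script below showed that the DLL for the QueueReader subsystem was present. As it happened, when the colleague configured AWS DMS the day before, the Replication components had been added to SQL Server, but the SQL Server Agent service was not restarted afterwards. Restarting the SQL Server Agent service solved the problem.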

 

 

select * from msdb.dbo.syssubsystems
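To look only at the two subsystems named in the error messages, a narrower version of that query might look like the sketch below. It assumes the subsystem names are stored as 'LogReader' and 'QueueReader', matching the log text. The subsystem_dll column shows the path that SQL Server Agent tries to load.

```sql
-- Registered DLL and agent executable for the replication subsystems
-- mentioned in the SQL Server Agent log.
SELECT subsystem, subsystem_dll, agent_exe, max_worker_threads
FROM msdb.dbo.syssubsystems
WHERE subsystem IN (N'LogReader', N'QueueReader');
```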

 

 

 

References:

 

https://www.sqlservercentral.com/Forums/783200/Replication-subsystems-failed-to-load