Author: Wenhui

In the previous article we covered Sentinel's distinctive initialization steps. This article covers Sentinel's main time event function.

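Sentinel uses the same event handling mechanism as the Redis server, which distinguishes file events from time events. File events use I/O multiplexing to handle network I/O on the server side, such as client connections, reads, and writes. Time events are handled by periodically calling timer functions from the main loop to perform scheduled work such as server maintenance, periodic updates, and deletions. The Redis server's main timer function is serverCron, defined in server.c; by default it is called every 100 ms. Inside that function we find the following code: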


int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
    int j;
    UNUSED(eventLoop);
    UNUSED(id);
    UNUSED(clientData);

    ...........

    /* Run the Sentinel timer if we are in sentinel mode. */
    if (server.sentinel_mode) sentinelTimer();

    ...........
}

When the server runs in sentinel mode, serverCron calls sentinelTimer to run Sentinel's main logic. sentinelTimer is defined in sentinel.c as follows:

void sentinelTimer(void) {  
    sentinelCheckTiltCondition();  
    sentinelHandleDictOfRedisInstances(sentinel.masters);  
    sentinelRunPendingScripts();  
    sentinelCollectTerminatedScripts();  
    sentinelKillTimedoutScripts();  
  
    /* We continuously change the frequency of the Redis "timer interrupt" 
     * in order to desynchronize every Sentinel from every other. 
     * This non-determinism avoids that Sentinels started at the same time 
     * exactly continue to stay synchronized asking to be voted at the 
     * same time again and again (resulting in nobody likely winning the 
     * election because of split brain voting). */  
    server.hz = CONFIG_DEFAULT_HZ + rand() % CONFIG_DEFAULT_HZ;  
}

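The sentinelTimer function performs the following operations:

  1. Checks whether Sentinel should be in Tilt mode (Tilt mode is covered in a later chapter).

  2. Checks Sentinel's connections to the master and slave instances it monitors and to the other Sentinel instances, updates their current state, and automatically performs a failover when a master goes down.

  3. Checks the state of the callback scripts and acts accordingly.

  4. Updates the server frequency (how often serverCron is called) by adding a random factor, so that Sentinels monitoring the same master do not keep requesting votes at the same moment during leader election, which could leave no candidate with a majority of the votes.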
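To make the effect of the random factor concrete, here is a minimal sketch of the frequency calculation. It assumes CONFIG_DEFAULT_HZ is 10, which matches the 100 ms default mentioned earlier; jittered_hz and cron_period_ms are hypothetical helper names and are not part of Redis.

```c
#include <stdlib.h>

/* Assumption: the default frequency is 10, i.e. one call every 100 ms. */
#define CONFIG_DEFAULT_HZ 10

/* Hypothetical helper: compute a new frequency with the same expression
 * used at the end of sentinelTimer. The result lies in [10, 19]. */
int jittered_hz(void) {
    return CONFIG_DEFAULT_HZ + rand() % CONFIG_DEFAULT_HZ;
}

/* Hypothetical helper: milliseconds between two timer calls at a given
 * frequency (integer division, so 19 Hz gives 52 ms). */
int cron_period_ms(int hz) {
    return 1000 / hz;
}
```

Under these assumptions the interval between two serverCron calls varies between roughly 52 ms and 100 ms, which is enough to push Sentinels started at the same instant out of step.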

The sentinelHandleDictOfRedisInstances function is defined as follows:


/* Perform scheduled operations for all the instances in the dictionary. 
 * Recursively call the function against dictionaries of slaves. */  
void sentinelHandleDictOfRedisInstances(dict *instances) {  
    dictIterator *di;  
    dictEntry *de;  
    sentinelRedisInstance *switch_to_promoted = NULL;  
  
    /* There are a number of things we need to perform against every master. */  
    di = dictGetIterator(instances);  
    while((de = dictNext(di)) != NULL) {  
        sentinelRedisInstance *ri = dictGetVal(de);  
  
        sentinelHandleRedisInstance(ri);  
        if (ri->flags & SRI_MASTER) {  
            sentinelHandleDictOfRedisInstances(ri->slaves);  
            sentinelHandleDictOfRedisInstances(ri->sentinels);  
            if (ri->failover_state == SENTINEL_FAILOVER_STATE_UPDATE_CONFIG) {  
                switch_to_promoted = ri;  
            }  
        }  
    }  
    if (switch_to_promoted)  
        sentinelFailoverSwitchToPromotedSlave(switch_to_promoted);  
    dictReleaseIterator(di);  
}

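The sentinelHandleDictOfRedisInstances function mainly does the following:

  1. Calls sentinelHandleRedisInstance to handle Sentinel's connection to each instance, its state updates, and the failover work.

  2. If the instance being processed is a master, recursively calls sentinelHandleDictOfRedisInstances on its slaves and on the other Sentinels that monitor the same master.

  3. Once a failover has succeeded (the failover state is SENTINEL_FAILOVER_STATE_UPDATE_CONFIG), switches the master over to the slave that was promoted.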
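The recursive traversal described above can be sketched with plain arrays in place of Redis dictionaries. This is only an illustration: the instance struct, handle_instance, and handle_instances are hypothetical stand-ins, and the flag value is illustrative.

```c
#include <stddef.h>

#define SRI_MASTER (1 << 0)   /* illustrative flag value */

/* Simplified stand-in for sentinelRedisInstance; arrays replace dicts. */
typedef struct instance {
    int flags;
    int handled;                 /* incremented each time we visit it */
    struct instance **slaves;
    size_t num_slaves;
    struct instance **sentinels;
    size_t num_sentinels;
} instance;

/* Stand-in for sentinelHandleRedisInstance: just record the visit. */
static void handle_instance(instance *ri) {
    ri->handled++;
}

/* Visit every instance in the list. For a master, also recurse into its
 * slaves and into the other Sentinels monitoring it.
 * Returns the total number of instances visited. */
size_t handle_instances(instance **list, size_t n) {
    size_t visited = 0;
    for (size_t i = 0; i < n; i++) {
        instance *ri = list[i];
        handle_instance(ri);
        visited++;
        if (ri->flags & SRI_MASTER) {
            visited += handle_instances(ri->slaves, ri->num_slaves);
            visited += handle_instances(ri->sentinels, ri->num_sentinels);
        }
    }
    return visited;
}
```

Because slaves and Sentinels do not carry the master flag, the recursion is at most one level deep: one pass over the masters reaches every monitored instance exactly once.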

sentinelHandleRedisInstance is defined as follows:


/* Perform scheduled operations for the specified Redis instance. */  
void sentinelHandleRedisInstance(sentinelRedisInstance *ri) {  
    /* ========== MONITORING HALF ============ */  
    /* Every kind of instance */  
    sentinelReconnectInstance(ri);  
    sentinelSendPeriodicCommands(ri);  
  
    /* ============== ACTING HALF ============= */  
    /* We don't proceed with the acting half if we are in TILT mode. 
     * TILT happens when we find something odd with the time, like a 
     * sudden change in the clock. */  
    if (sentinel.tilt) {  
        if (mstime()-sentinel.tilt_start_time < SENTINEL_TILT_PERIOD) return;  
        sentinel.tilt = 0;  
        sentinelEvent(LL_WARNING,"-tilt",NULL,"#tilt mode exited");  
    }  
  
    /* Every kind of instance */  
    sentinelCheckSubjectivelyDown(ri);  
  
    /* Masters and slaves */  
    if (ri->flags & (SRI_MASTER|SRI_SLAVE)) {  
        /* Nothing so far. */  
    }  
  
    /* Only masters */  
    if (ri->flags & SRI_MASTER) {  
        sentinelCheckObjectivelyDown(ri);  
        if (sentinelStartFailoverIfNeeded(ri))  
            sentinelAskMasterStateToOtherSentinels(ri,SENTINEL_ASK_FORCED);  
        sentinelFailoverStateMachine(ri);  
        sentinelAskMasterStateToOtherSentinels(ri,SENTINEL_NO_FLAGS);  
    }  
}

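This function works in two parts:

  1. The monitoring half checks Sentinel's connections to the other instances (masters, slaves, and other Sentinels). If a connection has not been set up or has been dropped, Sentinel tries to reconnect, and it periodically sends the appropriate commands. Note that Sentinel keeps two connections to every master and slave, a command connection and a Pub/Sub connection, but only a command connection to the other Sentinels that monitor the same master. These details are covered separately in the networking chapter.

  2. The acting half examines masters, slaves, and other Sentinels to determine whether each is subjectively down. For a master it also checks whether it is objectively down and performs the failover when required.

Note that the second part is skipped while Sentinel is in Tilt mode. Let us now look at how the second part is implemented.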
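The Tilt gate in front of the acting half can be isolated as a small predicate. The sketch below assumes SENTINEL_TILT_PERIOD is 30 seconds; sentinel_state and acting_half_allowed are hypothetical names introduced for illustration, and the current time is passed in so the logic can be exercised without a real clock.

```c
typedef long long mstime_t;

/* Assumption: the Tilt period is 30 seconds, expressed in milliseconds. */
#define SENTINEL_TILT_PERIOD 30000

/* Hypothetical subset of the global sentinel state. */
typedef struct {
    int tilt;                  /* 1 while in Tilt mode */
    mstime_t tilt_start_time;  /* when Tilt mode was entered */
} sentinel_state;

/* Returns 1 if the acting half may run at time `now`, 0 otherwise.
 * As in sentinelHandleRedisInstance, Tilt mode is left as a side effect
 * once the Tilt period has fully elapsed. */
int acting_half_allowed(sentinel_state *s, mstime_t now) {
    if (s->tilt) {
        if (now - s->tilt_start_time < SENTINEL_TILT_PERIOD) return 0;
        s->tilt = 0;
    }
    return 1;
}
```

The monitoring half is unaffected by this gate: reconnection and periodic commands keep running during Tilt mode, and only the down checks and the failover logic are suspended.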

sentinelCheckSubjectivelyDown checks whether a given Redis instance (master, slave, or another Sentinel) is subjectively down. The code is as follows:


/* Is this instance down from our point of view? */  
void sentinelCheckSubjectivelyDown(sentinelRedisInstance *ri) {  
    mstime_t elapsed = 0;  
  
    if (ri->link->act_ping_time)  
        elapsed = mstime() - ri->link->act_ping_time;  
    else if (ri->link->disconnected)  
        elapsed = mstime() - ri->link->last_avail_time;  
  
    .......  
  
    /* Update the SDOWN flag. We believe the instance is SDOWN if: 
     * 
     * 1) It is not replying. 
     * 2) We believe it is a master, it reports to be a slave for enough time 
     *    to meet the down_after_period, plus enough time to get two times 
     *    INFO report from the instance. */  
    if (elapsed > ri->down_after_period ||  
        (ri->flags & SRI_MASTER &&  
         ri->role_reported == SRI_SLAVE &&  
         mstime() - ri->role_reported_time >  
          (ri->down_after_period+SENTINEL_INFO_PERIOD*2)))  
    {  
        /* Is subjectively down */  
        if ((ri->flags & SRI_S_DOWN) == 0) {  
            sentinelEvent(LL_WARNING,"+sdown",ri,"%@");  
            ri->s_down_since_time = mstime();  
            ri->flags |= SRI_S_DOWN;  
        }  
    } else {  
        /* Is subjectively up */  
        if (ri->flags & SRI_S_DOWN) {  
            sentinelEvent(LL_WARNING,"-sdown",ri,"%@");  
            ri->flags &= ~(SRI_S_DOWN|SRI_SCRIPT_KILL_SENT);  
        }  
    }  
}

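An instance is considered subjectively down when either of the following conditions holds:

  1. No reply to a Ping has been received within the down_after_milliseconds period configured for the instance.

  2. Sentinel believes the instance is a master, but the instance reports itself as a slave, and the time since that role was reported exceeds the configured down_after_milliseconds plus twice the INFO command interval.

If either condition holds, Sentinel sets the instance's S_DOWN flag and regards it as subjectively down.

Subjectively down means that this Sentinel alone considers the instance down; at this point it has not yet asked the other Sentinels monitoring the instance about its status.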
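The two conditions can be written as a standalone predicate, which makes the boundary cases easy to check. This sketch assumes SENTINEL_INFO_PERIOD is 10 seconds; instance_view and is_subjectively_down are hypothetical names, the flag values are illustrative, and the elapsed time and current time are passed in as arguments.

```c
typedef long long mstime_t;

#define SRI_MASTER (1 << 0)          /* illustrative flag values */
#define SRI_SLAVE  (1 << 1)
#define SENTINEL_INFO_PERIOD 10000   /* assumption: 10 s between INFO calls */

/* Hypothetical subset of the per-instance state used by the check. */
typedef struct {
    int flags;                    /* role we believe the instance has */
    int role_reported;            /* role the instance last reported */
    mstime_t role_reported_time;  /* when that role was reported */
    mstime_t down_after_period;   /* configured down-after threshold */
} instance_view;

/* Returns 1 if the instance should be flagged SDOWN.
 * `elapsed` is the time the instance has gone without answering. */
int is_subjectively_down(const instance_view *ri, mstime_t elapsed,
                         mstime_t now) {
    /* Condition 1: no reply for longer than the threshold. */
    if (elapsed > ri->down_after_period) return 1;
    /* Condition 2: believed master keeps reporting itself as a slave. */
    if ((ri->flags & SRI_MASTER) &&
        ri->role_reported == SRI_SLAVE &&
        now - ri->role_reported_time >
            ri->down_after_period + SENTINEL_INFO_PERIOD * 2)
        return 1;
    return 0;
}
```

Both comparisons are strict, so an instance that has been silent for exactly the configured period is not yet flagged.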

sentinelCheckObjectivelyDown checks whether an instance is objectively down; this check is performed only for masters. It is defined as follows:


/* Is this instance down according to the configured quorum? 
 * 
 * Note that ODOWN is a weak quorum, it only means that enough Sentinels 
 * reported in a given time range that the instance was not reachable. 
 * However messages can be delayed so there are no strong guarantees about 
 * N instances agreeing at the same time about the down state. */  
void sentinelCheckObjectivelyDown(sentinelRedisInstance *master) {  
    dictIterator *di;  
    dictEntry *de;  
    unsigned int quorum = 0, odown = 0;  
  
    if (master->flags & SRI_S_DOWN) {  
        /* Is down for enough sentinels? */  
        quorum = 1; /* the current sentinel. */  
        /* Count all the other sentinels. */  
        di = dictGetIterator(master->sentinels);  
        while((de = dictNext(di)) != NULL) {  
            sentinelRedisInstance *ri = dictGetVal(de);  
  
            if (ri->flags & SRI_MASTER_DOWN) quorum++;  
        }  
        dictReleaseIterator(di);  
        if (quorum >= master->quorum) odown = 1;  
    }  
  
    /* Set the flag accordingly to the outcome. */  
    if (odown) {  
        if ((master->flags & SRI_O_DOWN) == 0) {  
            sentinelEvent(LL_WARNING,"+odown",master,"%@ #quorum %d/%d",  
                quorum, master->quorum);  
            master->flags |= SRI_O_DOWN;  
            master->o_down_since_time = mstime();  
        }  
    } else {  
        if (master->flags & SRI_O_DOWN) {  
            sentinelEvent(LL_WARNING,"-odown",master,"%@");  
            master->flags &= ~SRI_O_DOWN;  
        }  
    }  
}

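Provided this Sentinel itself already considers the master subjectively down, the function iterates over the other Sentinels monitoring the master and checks whether each one's SRI_MASTER_DOWN flag is set. A set flag means that Sentinel also considers the master down. The function counts these votes together with its own, and if the total is greater than or equal to the quorum configured for the master, Sentinel sets the master's SRI_O_DOWN flag and regards it as objectively down.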
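The vote counting can be sketched on its own, with an array of flag words standing in for the dictionary of Sentinels. count_down_votes and is_objectively_down are hypothetical helpers, and the flag value is illustrative.

```c
#include <stddef.h>

#define SRI_MASTER_DOWN (1 << 5)   /* illustrative flag value */

/* Count the Sentinels that consider the master down. The count starts
 * at 1 because the current Sentinel votes for itself. */
unsigned int count_down_votes(const int *sentinel_flags, size_t n) {
    unsigned int votes = 1;
    for (size_t i = 0; i < n; i++) {
        if (sentinel_flags[i] & SRI_MASTER_DOWN) votes++;
    }
    return votes;
}

/* Returns 1 if the master is objectively down: we must see it as
 * subjectively down ourselves, and the votes must reach the quorum. */
int is_objectively_down(int master_is_sdown, const int *sentinel_flags,
                        size_t n, unsigned int quorum) {
    if (!master_is_sdown) return 0;
    return count_down_votes(sentinel_flags, n) >= quorum;
}
```

For example, with two of three peers reporting the master down, the total is three votes, which satisfies a quorum of 3 but not a quorum of 4.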

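sentinelStartFailoverIfNeeded first checks that the master is objectively down (the SRI_O_DOWN flag is set), that no failover is already in progress, and that no failover attempt was started within the last two failover timeout periods configured for the master. If all checks pass, Sentinel sets the SRI_FAILOVER_IN_PROGRESS flag, sets the failover state to SENTINEL_FAILOVER_STATE_WAIT_START, and begins the failover. The details are covered in the failover chapter.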


int sentinelStartFailoverIfNeeded(sentinelRedisInstance *master) {  
    /* We can't failover if the master is not in O_DOWN state. */  
    if (!(master->flags & SRI_O_DOWN)) return 0;  
  
    /* Failover already in progress? */  
    if (master->flags & SRI_FAILOVER_IN_PROGRESS) return 0;  
  
    /* Last failover attempt started too little time ago? */  
    if (mstime() - master->failover_start_time <  
        master->failover_timeout*2)  
    {  
        if (master->failover_delay_logged != master->failover_start_time) {  
            time_t clock = (master->failover_start_time +  
                            master->failover_timeout*2) / 1000;  
            char ctimebuf[26];  
  
            ctime_r(&clock,ctimebuf);  
            ctimebuf[24] = '\0'; /* Remove newline. */  
            master->failover_delay_logged = master->failover_start_time;  
            serverLog(LL_WARNING,  
                "Next failover delay: I will not start a failover before %s",  
                ctimebuf);  
        }  
        return 0;  
    }  
  
    sentinelStartFailover(master);  
    return 1;  
}
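Stripped of logging, the three checks in sentinelStartFailoverIfNeeded reduce to the predicate below. master_view and may_start_failover are hypothetical names, the flag values are illustrative, and the current time is passed in as an argument.

```c
typedef long long mstime_t;

#define SRI_O_DOWN               (1 << 4)   /* illustrative flag values */
#define SRI_FAILOVER_IN_PROGRESS (1 << 6)

/* Hypothetical subset of the master state used by the checks. */
typedef struct {
    int flags;
    mstime_t failover_start_time;  /* start of the last failover attempt */
    mstime_t failover_timeout;     /* configured failover timeout */
} master_view;

/* Returns 1 if a new failover may be started at time `now`. */
int may_start_failover(const master_view *m, mstime_t now) {
    /* The master must be objectively down. */
    if (!(m->flags & SRI_O_DOWN)) return 0;
    /* No failover may already be running. */
    if (m->flags & SRI_FAILOVER_IN_PROGRESS) return 0;
    /* The last attempt must be at least two timeouts in the past. */
    if (now - m->failover_start_time < m->failover_timeout * 2) return 0;
    return 1;
}
```

With a sample timeout of 180000 ms and a previous attempt at time 0, a new failover is refused at 359999 ms and allowed from 360000 ms onward.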

References: https://github.com/antirez/redis

https://redis.io/topics/sentinel

Redis 设计与实现 (Redis Design and Implementation), 2nd edition, by 黄健宏 (Huang Jianhong)