Note: Spark version 2.1.1, deploy mode: Standalone, which requires the Master and Worker daemons to be started.

I. Script analysis

start-all.sh directly invokes start-slaves.sh.

start-slaves.sh in turn launches org.apache.spark.deploy.worker.Worker (through start-slave.sh on each worker host).

 

 

II. Source walkthrough

org.apache.spark.deploy.worker.Worker

1. Execution enters the main method of the Worker companion object; its main job is to create the RPC environment.

def main(argStrings: Array[String]) {
    Thread.setDefaultUncaughtExceptionHandler(new SparkUncaughtExceptionHandler(
      exitOnUncaughtException = false))
    Utils.initDaemon(log)
    val conf = new SparkConf
    val args = new WorkerArguments(argStrings, conf)
    val rpcEnv = startRpcEnvAndEndpoint(args.host, args.port, args.webUiPort, args.cores,
  ......
    val externalShuffleServiceEnabled = conf.getBoolean("spark.shuffle.service.enabled", false)
    val sparkWorkerInstances = scala.sys.env.getOrElse("SPARK_WORKER_INSTANCES", "1").toInt
    require(externalShuffleServiceEnabled == false || sparkWorkerInstances <= 1,
      "Starting multiple workers on one host is failed because we may launch no more than one " +
        "external shuffle service on each host, please set spark.shuffle.service.enabled to " +
        "false or set SPARK_WORKER_INSTANCES to 1 to resolve the conflict.")
    rpcEnv.awaitTermination()
  }

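As with the Master, the Worker first initializes an RpcEnv. The RpcEnv and endpoint initialization flow is covered in the previous post and is not repeated here.

2. Once the Worker has been instantiated, its onStart() method runs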

override def onStart() {
    assert(!registered)
    logInfo("Starting Spark worker %s:%d with %d cores, %s RAM".format(
      host, port, cores, Utils.megabytesToString(memory)))
    logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
    logInfo("Spark home: " + sparkHome)
    // Create the worker's working directory
    createWorkDir()
    shuffleService.startIfEnabled()
    // Create the worker web UI and bind it
    webUi = new WorkerWebUI(this, workDir, webUiPort)
    webUi.bind()
    workerWebUiUrl = s"http://$publicAddress:${webUi.boundPort}"
    // Key step: register with the master
    registerWithMaster()
    // Register the worker's metrics source with the metrics system
    metricsSystem.registerSource(workerSource)
    // Start the metrics system
    metricsSystem.start()
    // Attach the worker metrics servlet handler to the web ui after the metrics system is started.
    metricsSystem.getServletHandlers.foreach(webUi.attachHandler)
  }

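III. Summary

This summary covers only the work done once the Worker has been instantiated, after the RpcEnv is initialized.

1. Create the worker's working directory

2. Start the external shuffle service (if enabled)

shuffleService.startIfEnabled()==>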
 /** Start the external shuffle service */
  def start() {
    require(server == null, "Shuffle server already started")
    val authEnabled = securityManager.isAuthenticationEnabled()
    logInfo(s"Starting shuffle service on port $port (auth enabled = $authEnabled)")
    val bootstraps: Seq[TransportServerBootstrap] =
      if (authEnabled) {
        Seq(new AuthServerBootstrap(transportConf, securityManager))
      } else {
        Nil
      }
    server = transportContext.createServer(port, bootstraps.asJava)

    shuffleServiceSource.registerMetricSet(server.getAllMetrics)
    shuffleServiceSource.registerMetricSet(blockHandler.getAllMetrics)
    masterMetricsSystem.registerSource(shuffleServiceSource)
    masterMetricsSystem.start()
  }

3. Bind the worker web UI address

webUi = new WorkerWebUI(this, workDir, webUiPort)
webUi.bind()

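4. registerWithMaster() registers the worker with the master (the Master matches on the message type and replies)

tryRegisterAllMasters submits the registration work to a dedicated registration thread pool. Because registering with a master is a blocking operation, the pool must be able to run as many threads at once as there are master RPC addresses, so that every master can be contacted concurrently.

It then calls sendRegisterMessageToMaster, which is how the worker communicates with the master and sends it the registration message: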

Worker
    registerWithMaster()==>
    tryRegisterAllMasters()==>
    sendRegisterMessageToMaster(masterEndpoint)==>
    private def sendRegisterMessageToMaster(masterEndpoint: RpcEndpointRef): Unit = {
    masterEndpoint.send(RegisterWorker(
      workerId, // ID of the current Worker
      host, // host of this node
      port, // port
      self, // RpcEndpointRef of the current Worker
      cores, // number of CPU cores
      memory, // amount of memory
      workerWebUiUrl, // Worker web UI address
      masterEndpoint.address))
  }
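
The sizing rule for the registration pool can be illustrated with a minimal sketch. It is not Spark's implementation: `RegistrationPoolSketch`, `registerWith` and the 50 ms sleep are hypothetical stand-ins for a blocking registration call, and the sketch assumes a non-empty list of master addresses.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object RegistrationPoolSketch {
  // Hypothetical stand-in for a blocking "connect and register" call.
  def registerWith(masterAddress: String, registered: AtomicInteger): Unit = {
    Thread.sleep(50) // simulate a blocking network round trip
    registered.incrementAndGet()
  }

  // Submits one registration task per master. The pool has one thread per
  // master address, so no task has to wait behind another blocked task.
  // Assumes masterAddresses is non-empty (a fixed pool needs at least 1 thread).
  def registerAll(masterAddresses: Seq[String]): Int = {
    val registered = new AtomicInteger(0)
    val pool = Executors.newFixedThreadPool(masterAddresses.size)
    try {
      masterAddresses.foreach { address =>
        pool.execute(new Runnable {
          override def run(): Unit = registerWith(address, registered)
        })
      }
    } finally {
      pool.shutdown() // accept no new tasks; already submitted ones still run
    }
    pool.awaitTermination(5, TimeUnit.SECONDS)
    registered.get()
  }
}
```

With a smaller pool, one unreachable master could delay registration with the others, which is the situation the sizing rule avoids.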

5. The Master matches on the message type and replies with a message telling the Worker whether registration succeeded or failed

override def receive: PartialFunction[Any, Unit] = {
......
 case RegisterWorker(
      id, workerHost, workerPort, workerRef, cores, memory, workerWebUiUrl, masterAddress) =>
      logInfo("Registering worker %s:%d with %d cores, %s RAM".format(
        workerHost, workerPort, cores, Utils.megabytesToString(memory)))
      if (state == RecoveryState.STANDBY) {
        workerRef.send(MasterInStandby)
      } else if (idToWorker.contains(id)) {
        workerRef.send(RegisterWorkerFailed("Duplicate worker ID"))
      } else {
        val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory,
          workerRef, workerWebUiUrl)
        if (registerWorker(worker)) {
          persistenceEngine.addWorker(worker)
          workerRef.send(RegisteredWorker(self, masterWebUiUrl, masterAddress))
          schedule()
        } else {
          val workerAddress = worker.endpoint.address
          logWarning("Worker registration failed. Attempted to re-register worker at same " +
            "address: " + workerAddress)
          workerRef.send(RegisterWorkerFailed("Attempted to re-register worker at same address: "
            + workerAddress))
        }
      }
.......
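
The three-way decision in this handler can be reduced to a small model. The sketch below is a simplification under stated assumptions: `MiniMaster` and `handleRegister` are hypothetical names, the reply types only mirror the names of Spark's messages, and details of the real `registerWorker` (such as how workers already marked DEAD at the same address are treated) are left out.

```scala
import scala.collection.mutable

object RegisterDecisionSketch {
  sealed trait Reply
  case object MasterInStandby extends Reply
  final case class RegisterWorkerFailed(message: String) extends Reply
  case object RegisteredWorker extends Reply

  // Hypothetical, simplified master: tracks only a standby flag,
  // the known worker IDs and the known worker addresses.
  final class MiniMaster(var standby: Boolean) {
    private val idToWorker = mutable.Map.empty[String, String]
    private val addresses = mutable.Set.empty[String]

    def handleRegister(id: String, address: String): Reply =
      if (standby) {
        MasterInStandby // a standby master does not accept registrations
      } else if (idToWorker.contains(id)) {
        RegisterWorkerFailed("Duplicate worker ID")
      } else if (!addresses.add(address)) {
        // add returns false when the address was already present
        RegisterWorkerFailed("Attempted to re-register worker at same address: " + address)
      } else {
        idToWorker(id) = address
        RegisteredWorker
      }
  }
}
```

The order of the checks matters: the standby check comes first, so a standby master never touches its bookkeeping.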

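6. The Worker handles the Master's reply according to its message type. If registration succeeded, the Worker starts scheduling on the forwordMessageScheduler timer and begins sending heartbeats to the Master at a fixed interval; alongside the heartbeats it reports the latest state of its executors to the Master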
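
A fixed-rate heartbeat of this kind can be sketched with a single-threaded scheduled executor. This is only an illustration: `HeartbeatSketch`, `runHeartbeats`, and both timing parameters are hypothetical, and the callback stands in for sending a heartbeat message to the master.

```scala
import java.util.concurrent.{Executors, TimeUnit}

object HeartbeatSketch {
  // Calls `sendHeartbeat` every `intervalMs` milliseconds for roughly
  // `durationMs` milliseconds, then stops the scheduler.
  def runHeartbeats(intervalMs: Long, durationMs: Long)(sendHeartbeat: () => Unit): Unit = {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    scheduler.scheduleAtFixedRate(new Runnable {
      override def run(): Unit = sendHeartbeat()
    }, 0, intervalMs, TimeUnit.MILLISECONDS)
    Thread.sleep(durationMs) // keep the caller alive while heartbeats are sent
    scheduler.shutdownNow()
  }
}
```

A single scheduler thread is enough here because each tick only hands off a small message; the interval should be well below the master's timeout so that one delayed heartbeat does not mark the worker as lost.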

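7. When a heartbeat arrives, the Master records its arrival time as that worker's last heartbeat; in addition, the Master periodically checks the state of its workers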

case Heartbeat(workerId, worker) =>
   idToWorker.get(workerId) match {
     case Some(workerInfo) =>
          workerInfo.lastHeartbeat = System.currentTimeMillis()
     case None =>
          if (workers.map(_.id).contains(workerId)) {
            logWarning(s"Got heartbeat from unregistered worker $workerId." +
              " Asking it to re-register.")
            worker.send(ReconnectWorker(masterUrl))
          } else {
            logWarning(s"Got heartbeat from unregistered worker $workerId." +
              " This worker was never registered, so ignoring the heartbeat.")
          }
}
........
case CheckForWorkerTimeOut =>
      timeOutDeadWorkers()

8. timeOutDeadWorkers checks the state of the workers and removes any worker that has lost contact

// Check for and remove workers that have timed out
  private def timeOutDeadWorkers() {
    // Copy the workers into an array so we don't modify the hashset while iterating through it
    // Current time
    val currentTime = System.currentTimeMillis()
    // Select the workers whose last heartbeat is older than the timeout (60 seconds by default)
    // and copy them into an array, so the HashSet is not modified during iteration
    val toRemove = workers.filter(_.lastHeartbeat < currentTime - WORKER_TIMEOUT_MS).toArray
    for (worker <- toRemove) {
      // The worker is not yet marked DEAD: log a warning and remove it
      if (worker.state != WorkerState.DEAD) {
        logWarning("Removing %s because we got no heartbeat in %d seconds".format(
          worker.id, WORKER_TIMEOUT_MS / 1000))
        removeWorker(worker)
      } else {
        // The worker is already DEAD: drop it from the set once its last heartbeat
        // is older than (REAPER_ITERATIONS + 1) timeout periods
        if (worker.lastHeartbeat < currentTime - ((REAPER_ITERATIONS + 1) * WORKER_TIMEOUT_MS)) {
          workers -= worker // we've seen this DEAD worker in the UI, etc. for long enough; cull it
        }
      }
    }
  }
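
The two thresholds in this method are easier to see as a pure function. The sketch below is an assumption-laden model, not the Master's code: `ReaperSketch`, `WorkerRecord` and `classify` are hypothetical names, and the function only reports which workers would be marked DEAD and which would be culled from the set.

```scala
object ReaperSketch {
  final case class WorkerRecord(id: String, lastHeartbeat: Long, dead: Boolean)

  // Returns (IDs to mark DEAD, IDs to cull from the worker set entirely).
  def classify(
      workers: Seq[WorkerRecord],
      now: Long,
      timeoutMs: Long,
      reaperIterations: Int): (Seq[String], Seq[String]) = {
    // First threshold: no heartbeat within one timeout period.
    val timedOut = workers.filter(_.lastHeartbeat < now - timeoutMs)
    val toMarkDead = timedOut.filterNot(_.dead).map(_.id)
    // Second threshold: already DEAD and silent for (reaperIterations + 1) periods.
    val toCull = timedOut
      .filter(w => w.dead && w.lastHeartbeat < now - (reaperIterations + 1) * timeoutMs)
      .map(_.id)
    (toMarkDead, toCull)
  }
}
```

As a worked example with a 60 000 ms timeout and a hypothetical `reaperIterations` of 15, a DEAD worker is culled only once its last heartbeat is more than 16 × 60 000 = 960 000 ms old.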

 

9. Register with the metricsSystem and start the metrics system

 

At this point the worker has finished starting up.