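Contents

Spark Internals: Resource Scheduling and Task Scheduling

Source analysis of Worker registration

Worker registration flow:

On deduplication

Conclusions

Driver

 Application

Questions: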


Spark Internals: Resource Scheduling and Task Scheduling

Resource scheduling in the Spark Master involves three collections:

Collection        Type (declaration)
workers           val workers = new HashSet[WorkerInfo]
waitingDrivers    private val waitingDrivers = new ArrayBuffer[DriverInfo]
waitingApps       val waitingApps = new ArrayBuffer[ApplicationInfo]

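Source analysis of Worker registration

When a Worker registers with the Master, the request arrives in the Master's overridden receiveAndReply method. This is one of the Master's core methods: it receives a message and sends a reply back to the sender.

Worker registration flow:

  1. Match the incoming message against the RegisterWorker type.
  2. If it matches, check the Master's state. A Master in STANDBY replies with MasterInStandby and does nothing else.
  3. Check whether idToWorker already contains this worker ID.
  4. Create a new WorkerInfo object that wraps the registration information.
  5. Call registerWorker(), which checks for duplicates again, this time by address. If there is no duplicate: workers += worker.
  6. On success, call schedule(); this method is analyzed later.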
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    // 1. Pattern match: is this a worker registration request?
    case RegisterWorker(
        id, workerHost, workerPort, workerRef, cores, memory, workerUiPort, publicAddress) => {
      logInfo("Registering worker %s:%d with %d cores, %s RAM".format(
        workerHost, workerPort, cores, Utils.megabytesToString(memory)))
      // 2. A standby Master does not accept registrations
      if (state == RecoveryState.STANDBY) {
        context.reply(MasterInStandby)
      } else if (idToWorker.contains(id)) {
        // 3. Reject a worker ID that is already registered
        context.reply(RegisterWorkerFailed("Duplicate worker ID"))
      } else {
        // 4. Wrap the worker's information in a WorkerInfo object
        val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory,
          workerRef, workerUiPort, publicAddress)
        // 5. Call registerWorker() to register the worker
        if (registerWorker(worker)) {
          persistenceEngine.addWorker(worker)
          context.reply(RegisteredWorker(self, masterWebUiUrl))
          // 6. Registration succeeded, so call schedule()
          schedule()
        } else {
          val workerAddress = worker.endpoint.address
          logWarning("Worker registration failed. Attempted to re-register worker at same " +
            "address: " + workerAddress)
          context.reply(RegisterWorkerFailed("Attempted to re-register worker at same address: "
            + workerAddress))
        }
      }
    }
    // ... other message cases omitted from this excerpt
  }

 

private def registerWorker(worker: WorkerInfo): Boolean = {
    // There may be one or more refs to dead workers on this same node (w/ different ID's),
    // remove them.
    workers.filter { w =>
      (w.host == worker.host && w.port == worker.port) && (w.state == WorkerState.DEAD)
    }.foreach { w =>
      workers -= w
    }

    val workerAddress = worker.endpoint.address
    if (addressToWorker.contains(workerAddress)) {
      val oldWorker = addressToWorker(workerAddress)
      if (oldWorker.state == WorkerState.UNKNOWN) {
        // A worker registering from UNKNOWN implies that the worker was restarted during recovery.
        // The old worker must thus be dead, so we will remove it and accept the new worker.
        removeWorker(oldWorker)
      } else {
        logInfo("Attempted to re-register worker at same address: " + workerAddress)
        return false
      }
    }

    workers += worker
    idToWorker(worker.id) = worker
    addressToWorker(workerAddress) = worker
    true
  }
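
The two-stage check above (first by worker ID in receiveAndReply, then by address in registerWorker) can be sketched in isolation. This is a minimal toy model, not Spark's code: ToyMaster, the simplified message and reply types, the reply callback, and the scheduleCalls counter are all hypothetical names introduced for illustration, and the DEAD/UNKNOWN state handling is left out.

```scala
import scala.collection.mutable

object RegistrationFlowSketch {
  // Hypothetical, simplified message and reply types (not Spark's own classes).
  sealed trait Message
  case class RegisterWorker(id: String, host: String, port: Int) extends Message
  case object Heartbeat extends Message

  sealed trait Reply
  case object MasterInStandby extends Reply
  case class RegisterWorkerFailed(reason: String) extends Reply
  case object RegisteredWorker extends Reply

  class ToyMaster(var standby: Boolean) {
    private val idToWorker = mutable.HashMap[String, RegisterWorker]()
    private val addressToWorker = mutable.HashMap[(String, Int), RegisterWorker]()
    var scheduleCalls = 0 // counts how often scheduling was triggered

    private def schedule(): Unit = scheduleCalls += 1

    // Same shape as receiveAndReply: a partial function over messages that
    // answers through the supplied reply callback.
    def receiveAndReply(reply: Reply => Unit): PartialFunction[Message, Unit] = {
      case msg @ RegisterWorker(id, host, port) =>
        if (standby) {
          reply(MasterInStandby)
        } else if (idToWorker.contains(id)) {
          reply(RegisterWorkerFailed("Duplicate worker ID"))
        } else if (addressToWorker.contains((host, port))) {
          reply(RegisterWorkerFailed("Attempted to re-register worker at same address"))
        } else {
          idToWorker(id) = msg
          addressToWorker((host, port)) = msg
          reply(RegisteredWorker)
          schedule() // only a successful registration triggers scheduling
        }
    }
  }
}
```

Because the handler is a PartialFunction, a message it has no case for (Heartbeat in this sketch) is simply not defined at that handler, which is why the real method can list its supported messages as a series of case clauses.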

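On deduplication

Each worker node's information is wrapped in a WorkerInfo object and stored in the workers collection. The collection is a HashSet, which prevents the same WorkerInfo object from appearing in it twice.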

val workers = new HashSet[WorkerInfo]

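However, the Spark source shows that every registering worker is wrapped in a newly created object, so two objects with the same reference can never end up in this HashSet in the first place.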


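The actual deduplication is done by hand in the source; see the if (addressToWorker.contains(workerAddress)) check in registerWorker. If a worker with the same address is already registered and its state is UNKNOWN, the old entry is removed (a worker registering from UNKNOWN was restarted during recovery). Otherwise the Master logs an attempt to re-register a worker at the same address and rejects it. The motivation for choosing a HashSet here is still not entirely clear to me, but it is certainly the safer choice.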
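
The difference between what the HashSet guarantees and what the address check guarantees can be seen in a small sketch. It assumes the element type is a plain class without its own equals/hashCode, so the set compares references; the Worker class, the (host, port) tuple used as an address, and the register function are hypothetical simplifications, and the UNKNOWN-state branch is omitted.

```scala
import scala.collection.mutable

object WorkerDedupSketch {
  // Hypothetical stand-in for WorkerInfo: a plain class, so equality is by reference.
  class Worker(val id: String, val host: String, val port: Int)

  val workers = new mutable.HashSet[Worker]
  val addressToWorker = new mutable.HashMap[(String, Int), Worker]

  // Returns true if the worker was accepted, false if its address is already taken.
  def register(w: Worker): Boolean = {
    val address = (w.host, w.port)
    if (addressToWorker.contains(address)) {
      false
    } else {
      workers += w
      addressToWorker(address) = w
      true
    }
  }

  def main(args: Array[String]): Unit = {
    val a = new Worker("worker-1", "node1", 7078)
    val b = new Worker("worker-2", "node1", 7078) // same address, but a new object
    val raw = mutable.HashSet(a, b)
    println(raw.size)    // 2: the set alone keeps both objects
    println(register(a)) // true
    println(register(b)) // false: rejected by the address lookup
  }
}
```

In this sketch the set alone would happily hold two entries for the same host and port; it is the lookup in addressToWorker that rejects the second registration.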


Below is the WorkerInfo data structure, which wraps the worker's location, port, and resource information:

private[spark] class WorkerInfo(
    val id: String,
    val host: String,
    val port: Int,
    val cores: Int,
    val memory: Int,
    val endpoint: RpcEndpointRef,
    val webUiPort: Int,
    val publicAddress: String)
  extends Serializable {
    ...
}

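Conclusions

From the source analysis above we can draw the following conclusions:

  1. Whenever a new worker registers with the master successfully, schedule() is called.
  2. To keep duplicates out of the workers collection, a HashSet was chosen as the data structure, and explicit checks (by worker ID and by address) are made before a new element is inserted.

Source analysis of Driver registration

Driver registration is less strict than worker registration: the only requirement is that the master is in the ALIVE state. waitingDrivers is an ArrayBuffer. As with workers, driver registration also calls schedule().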

case RequestSubmitDriver(description) => {
      if (state != RecoveryState.ALIVE) {
        val msg = s"${Utils.BACKUP_STANDALONE_MASTER_PREFIX}: $state. " +
          "Can only accept driver submissions in ALIVE state."
        context.reply(SubmitDriverResponse(self, false, None, msg))
      } else {
        logInfo("Driver submitted " + description.command.mainClass)
        val driver = createDriver(description)
        persistenceEngine.addDriver(driver)
        waitingDrivers += driver
        drivers.add(driver)
        schedule()

        // TODO: It might be good to instead have the submission client poll the master to determine
        //       the current status of the driver. For now it's simply "fire and forget".

        context.reply(SubmitDriverResponse(self, true, Some(driver.id),
          s"Driver successfully submitted as ${driver.id}"))
      }
    }

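 Source analysis of Application registration

Application registration is handled in a different method: receive. Like waitingDrivers, waitingApps is an ArrayBuffer, and schedule() is also called once registration completes.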

case RegisterApplication(description, driver) => {
      // TODO Prevent repeated registrations from some driver
      if (state == RecoveryState.STANDBY) {
        // ignore, don't send response
      } else {
        logInfo("Registering app " + description.name)
        val app = createApplication(description, driver)
        registerApplication(app)
        logInfo("Registered app " + description.name + " with ID " + app.id)
        persistenceEngine.addApplication(app)
        driver.send(RegisteredApplication(app.id, self))
        schedule()
      }
    }

private def registerApplication(app: ApplicationInfo): Unit = {
    val appAddress = app.driver.address
    if (addressToApp.contains(appAddress)) {
      logInfo("Attempted to re-register application at same address: " + appAddress)
      return
    }

    applicationMetricsSystem.registerSource(app.appSource)
    apps += app
    idToApp(app.id) = app
    endpointToApp(app.driver) = app
    addressToApp(appAddress) = app
    waitingApps += app
  }

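Questions:

  1. When a user submits an application, which is registered first: the driver or the application?
  2. Why is schedule() called every time? What does this method do?

 The schedule method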

private def schedule(): Unit = {
    if (state != RecoveryState.ALIVE) {
      return
    }
    // Drivers take strict precedence over executors
    val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
    val numWorkersAlive = shuffledAliveWorkers.size
    var curPos = 0
    for (driver <- waitingDrivers.toList) { // iterate over a copy of waitingDrivers
      // We assign workers to each waiting driver in a round-robin fashion. For each driver, we
      // start from the last worker that was assigned a driver, and continue onwards until we have
      // explored all alive workers.
      var launched = false
      var numWorkersVisited = 0
      while (numWorkersVisited < numWorkersAlive && !launched) {
        val worker = shuffledAliveWorkers(curPos)
        numWorkersVisited += 1
        if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
          launchDriver(worker, driver)
          waitingDrivers -= driver
          launched = true
        }
        curPos = (curPos + 1) % numWorkersAlive
      }
    }
    startExecutorsOnWorkers()
  }
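
The driver-placement loop in schedule() can be tried out on its own with a simplified model. This is only a sketch under stated assumptions: Worker, Driver, and placeDrivers are hypothetical names, workers are visited in the order given (the real method shuffles the alive workers first), and subtracting memory and cores stands in for the resource bookkeeping that is assumed to happen when a driver is launched.

```scala
import scala.collection.mutable.ArrayBuffer

object DriverPlacementSketch {
  // Hypothetical, simplified stand-ins for WorkerInfo and DriverInfo.
  class Worker(val id: String, var memoryFree: Int, var coresFree: Int)
  case class Driver(id: String, mem: Int, cores: Int)

  // Places waiting drivers on workers round-robin. Returns (driverId, workerId)
  // pairs for the drivers that were placed and removes them from `waiting`.
  def placeDrivers(
      workers: IndexedSeq[Worker],
      waiting: ArrayBuffer[Driver]): List[(String, String)] = {
    val placed = ArrayBuffer[(String, String)]()
    var curPos = 0
    for (driver <- waiting.toList) { // iterate over a copy, since `waiting` is mutated
      var launched = false
      var visited = 0
      while (visited < workers.size && !launched) {
        val w = workers(curPos)
        visited += 1
        if (w.memoryFree >= driver.mem && w.coresFree >= driver.cores) {
          w.memoryFree -= driver.mem // assumption: models resources reserved at launch
          w.coresFree -= driver.cores
          waiting -= driver
          placed += ((driver.id, w.id))
          launched = true
        }
        // Carry the position over, so the next driver starts at the next worker.
        curPos = (curPos + 1) % workers.size
      }
    }
    placed.toList
  }
}
```

With two workers (w1: 1024 MB, 1 core; w2: 4096 MB, 4 cores) and three waiting drivers needing (512 MB, 1 core), (2048 MB, 2 cores), and (8192 MB, 1 core), the first driver lands on w1, the second on w2, and the third stays in the waiting list because no worker has enough free memory. A driver that cannot be placed remains queued until some later event calls schedule() again.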

To be continued...