Master主备切换
Master的主备切换分为两种:基于文件系统和基于zookeeper
- 基于文件系统的,spark提供目录保存spark Application和worker的注册信息,并将他们的恢复状态写入该目录,当spark的master节点宕掉的时候,重启master,就能获取application和worker的注册信息。需要手动进行切换
- 基于zookeeper的,用于生产模式。其基本原理是通过zookeeper来选举一个Master,其他的Master处于Standby状态。将Standalone集群连接到同一个ZooKeeper实例并启动多个Master,利用zookeeper提供的选举和状态保存功能,可以使一个Master被选举,而其他Master处于Standby状态。如果现任Master死去,另一个Master会通过选举产生,并恢复到旧的Master状态,然后恢复调度。
因为涉及到多个Master,需要在SparkContext指向一个Master列表,spark://host1:port1,host2:port2,host3:port3,应用程序会轮询列表
不能将Master定义在conf/spark-env.sh里了,而是直接在Application中定义。涉及的参数是 export SPARK_MASTER_IP=bigdata001,这项不配置或者为空。否则,无法启动多个master
流程:(注意completeRecovery方法)
- 在active Master宕掉之后,内部持久化(FileSystemPersistenceEngine和ZookeeperPersistenceEngine)引擎首先会读取持久化的storedApps、storedDrivers、storedWorkers
- 如果storedApps、storedWorkers有任何一个是有内容的,那么就将持久化的Application、Worker信息重新注册
- 将Application和Worker的状态都修改为UNKNOWN,然后向Application所对应的Driver和Worker发送StandBy Master的地址
- 如果Driver和Wroker是正常运转的情况下,接收到Master发送过来的地址后,就会相应到新的Master,在Master陆续接收到Driver和Worker发送过来的消息后,会使用completeRecovery()方法对没有发送响应消息的Driver和Worker进行处理,过滤掉他们的信息。
- 调用Master的schedule()方法,对正在调度的Driver和Application进行调度。在worker上启动driver,或者是为Applicaiton在worker上启动executor
看看beginRecover代码:
private def canCompleteRecovery =
workers.count(_.state == WorkerState.UNKNOWN) == 0 &&
apps.count(_.state == ApplicationState.UNKNOWN) == 0
private def beginRecovery(storedApps: Seq[ApplicationInfo], storedDrivers: Seq[DriverInfo],
storedWorkers: Seq[WorkerInfo]) {
for (app <- storedApps) {
logInfo("Trying to recover app: " + app.id)
try {
registerApplication(app)
app.state = ApplicationState.UNKNOWN
app.driver.send(MasterChanged(self, masterWebUiUrl))
} catch {
case e: Exception => logInfo("App " + app.id + " had exception on reconnect")
}
}
for (driver <- storedDrivers) {
// Here we just read in the list of drivers. Any drivers associated with now-lost workers
// will be re-launched when we detect that the worker is missing.
drivers += driver
}
for (worker <- storedWorkers) {
logInfo("Trying to recover worker: " + worker.id)
try {
registerWorker(worker)
worker.state = WorkerState.UNKNOWN
worker.endpoint.send(MasterChanged(self, masterWebUiUrl))
} catch {
case e: Exception => logInfo("Worker " + worker.id + " had exception on reconnect")
}
}
}
可以看到,形参三个,storedApps、storedDrivers、storedWorkers
对于storedApps:
for (app <- storedApps) {
logInfo("Trying to recover app: " + app.id)
try {
registerApplication(app) //直接注册
app.state = ApplicationState.UNKNOWN //设置状态为UNKONWN
app.driver.send(MasterChanged(self, masterWebUiUrl))//发送备用master信息
} catch {
case e: Exception => logInfo("App " + app.id + " had exception on reconnect")
}
}
对于storedDrivers:
for (driver <- storedDrivers) {
// Here we just read in the list of drivers. Any drivers associated with now-lost workers
// will be re-launched when we detect that the worker is missing.
drivers += driver
}
在这里,我们只是阅读了Drivers程序列表。当我们检测到Workers失踪时,任何与现在失踪工人相关的Workers程序都将重新启动。
对于storedWorkers:
for (worker <- storedWorkers) {
logInfo("Trying to recover worker: " + worker.id)
try {
registerWorker(worker)
worker.state = WorkerState.UNKNOWN
worker.endpoint.send(MasterChanged(self, masterWebUiUrl))
} catch {
case e: Exception => logInfo("Worker " + worker.id + " had exception on reconnect")
}
}
再看看completeRecovery代码
private def completeRecovery() {
// Ensure "only-once" recovery semantics using a short synchronization period.
if (state != RecoveryState.RECOVERING) { return }
state = RecoveryState.COMPLETING_RECOVERY
// Kill off any workers and apps that didn't respond to us.
workers.filter(_.state == WorkerState.UNKNOWN).foreach(removeWorker)
apps.filter(_.state == ApplicationState.UNKNOWN).foreach(finishApplication)
// Reschedule drivers which were not claimed by any workers
drivers.filter(_.worker.isEmpty).foreach { d =>
logWarning(s"Driver ${d.id} was not found after master recovery")
if (d.desc.supervise) {
logWarning(s"Re-launching ${d.id}")
relaunchDriver(d)
} else {
removeDriver(d.id, DriverState.ERROR, None)
logWarning(s"Did not re-launch ${d.id} because it was not supervised")
}
}
state = RecoveryState.ALIVE
schedule()
logInfo("Recovery complete - resuming operations!")
}
可以看到,进来就进行Driver和Worker的过滤
// Kill off any workers and apps that didn't respond to us.
//杀死所有没有回应我们的Workers和Apps。
workers.filter(_.state == WorkerState.UNKNOWN).foreach(removeWorker)
apps.filter(_.state == ApplicationState.UNKNOWN).foreach(finishApplication)
进来removeWorker看看
private def removeWorker(worker: WorkerInfo) {
logInfo("Removing worker " + worker.id + " on " + worker.host + ":" + worker.port)
worker.setState(WorkerState.DEAD) //设置状态为DEAD
idToWorker -= worker.id //从内存结构中移除
addressToWorker -= worker.endpoint.address
if (reverseProxy) {
webUi.removeProxyTargets(worker.id)
}
for (exec <- worker.executors.values) {
logInfo("Telling app of lost executor: " + exec.id)
exec.application.driver.send(ExecutorUpdated(
exec.id, ExecutorState.LOST, Some("worker lost"), None, workerLost = true))
exec.state = ExecutorState.LOST
exec.application.removeExecutor(exec) //移除driver上的executors
}
for (driver <- worker.drivers.values) {
if (driver.desc.supervise) { //driver属性
logInfo(s"Re-launching ${driver.id}")
relaunchDriver(driver) //对于Driver,driver.desc.supervise为true就先尝试重启
} else {
logInfo(s"Not re-launching ${driver.id} because it was not supervised")
removeDriver(driver.id, DriverState.ERROR, None)
}
}
persistenceEngine.removeWorker(worker)
}
完了最后会调用schedule()方法,后面再看。