Spark on YARN Cluster Submission Flow Analysis (Part 4)
Picking up where we left off: last time we ended with a CoarseGrainedExecutorBackend process being started on each of the allocated nodes. This post starts from that process's startup, i.e. the execution of its main method.
org.apache.spark.executor.CoarseGrainedExecutorBackend main()
```scala
def main(args: Array[String]) {
  var driverUrl: String = null
  var executorId: String = null
  var hostname: String = null
  var cores: Int = 0
  var appId: String = null
  var workerUrl: Option[String] = None
  val userClassPath = new mutable.ListBuffer[URL]()

  var argv = args.toList
  while (!argv.isEmpty) {
    argv match {
      case ("--driver-url") :: value :: tail =>
        driverUrl = value
        argv = tail
      case ("--executor-id") :: value :: tail =>
        executorId = value
        argv = tail
      case ("--hostname") :: value :: tail =>
        hostname = value
        argv = tail
      case ("--cores") :: value :: tail =>
        cores = value.toInt
        argv = tail
      case ("--app-id") :: value :: tail =>
        appId = value
        argv = tail
      case ("--worker-url") :: value :: tail =>
        // Worker url is used in spark standalone mode to enforce fate-sharing with worker
        workerUrl = Some(value)
        argv = tail
      case ("--user-class-path") :: value :: tail =>
        userClassPath += new URL(value)
        argv = tail
      case Nil =>
      case tail =>
        // scalastyle:off println
        System.err.println(s"Unrecognized options: ${tail.mkString(" ")}")
        // scalastyle:on println
        printUsageAndExit()
    }
  }

  if (driverUrl == null || executorId == null || hostname == null || cores <= 0 ||
    appId == null) {
    printUsageAndExit()
  }

  run(driverUrl, executorId, hostname, cores, appId, workerUrl, userClassPath)
  System.exit(0)
}
```
- 1. Same pattern as before: the real work happens in the run method, and everything above it is just command-line configuration parsing (a minimal sketch of the parsing idiom it uses follows below).
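The flag loop above is a classic Scala idiom: turn the args array into a List and pattern-match `flag :: value :: tail` until the list is empty. Here is a minimal, self-contained sketch of the same idiom; the `--name` and `--count` flags are made up purely for illustration and have nothing to do with Spark:

```scala
// Toy demo of the List-pattern-matching argument parser used in main() above.
// Not Spark code: the flags and defaults here are invented for illustration.
object ArgParseDemo {
  def main(args: Array[String]): Unit = {
    var name: String = null
    var count: Int = 0
    var argv = args.toList
    while (argv.nonEmpty) {
      argv match {
        case "--name" :: value :: tail =>
          name = value
          argv = tail          // consume the flag and its value, keep looping
        case "--count" :: value :: tail =>
          count = value.toInt
          argv = tail
        case other =>
          // Same behavior as printUsageAndExit(): reject anything unknown.
          System.err.println(s"Unrecognized options: ${other.mkString(" ")}")
          sys.exit(1)
      }
    }
    println(s"name=$name, count=$count")
  }
}
```

Run it with, say, `--name foo --count 3`; any unrecognized flag aborts, just as printUsageAndExit does above.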
run(driverUrl, executorId, hostname, cores, appId, workerUrl, userClassPath)
```scala
private def run(
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    appId: String,
    workerUrl: Option[String],
    userClassPath: Seq[URL]) {

  Utils.initDaemon(log)

  SparkHadoopUtil.get.runAsSparkUser { () =>
    // Debug code
    Utils.checkHost(hostname)

    // Bootstrap to fetch the driver's Spark properties.
    val executorConf = new SparkConf
    val port = executorConf.getInt("spark.executor.port", 0)
    val fetcher = RpcEnv.create(
      "driverPropsFetcher",
      hostname,
      port,
      executorConf,
      new SecurityManager(executorConf),
      clientMode = true)
    val driver = fetcher.setupEndpointRefByURI(driverUrl)
    val cfg = driver.askWithRetry[SparkAppConfig](RetrieveSparkAppConfig)
    val props = cfg.sparkProperties ++ Seq[(String, String)](("spark.app.id", appId))
    fetcher.shutdown()

    // Create SparkEnv using properties we fetched from the driver.
    val driverConf = new SparkConf()
    for ((key, value) <- props) {
      // this is required for SSL in standalone mode
      if (SparkConf.isExecutorStartupConf(key)) {
        driverConf.setIfMissing(key, value)
      } else {
        driverConf.set(key, value)
      }
    }
    if (driverConf.contains("spark.yarn.credentials.file")) {
      logInfo("Will periodically update credentials from: " +
        driverConf.get("spark.yarn.credentials.file"))
      SparkHadoopUtil.get.startCredentialUpdater(driverConf)
    }

    val env = SparkEnv.createExecutorEnv(
      driverConf, executorId, hostname, port, cores, cfg.ioEncryptionKey, isLocal = false)

    env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(
      env.rpcEnv, driverUrl, executorId, hostname, cores, userClassPath, env))
    workerUrl.foreach { url =>
      env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
    }
    env.rpcEnv.awaitTermination()
    SparkHadoopUtil.get.stopCredentialUpdater()
  }
}
```
- 1. This involves Spark's underlying communication framework, Netty. I plan to publish a short, standalone introduction to Netty separately.
- 2. Scanning the code for keywords, we spot the string "Executor" in the line `env.rpcEnv.setupEndpoint("Executor", ...)`. Keep in mind that so far all we know is that a CoarseGrainedExecutorBackend process was started; the real Executor is created right here, and this "Executor" is actually a Netty communication component.
- 3. In other words, the Executor we usually talk about is not a process but a communication object maintained inside the process (a toy illustration follows this list). Calling it a process is not entirely wrong either, since the Executor lives inside this process.
- 4. Since this is where the Executor gets created, let's step inside setupEndpoint.
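To make point 3 concrete, here is a tiny toy model of my own (not Spark's real RpcEndpoint API; the ToyEndpoint trait and the LaunchTask message are invented for illustration): an endpoint is just an object whose receive handler runs when the RPC layer delivers a message to it.

```scala
// Toy illustration, not Spark code: an "executor" modeled as a named message
// handler living inside the process, rather than as a process of its own.
object EndpointDemo extends App {
  trait ToyEndpoint {
    def receive: PartialFunction[Any, Unit]  // invoked when a message arrives
  }

  val executorEndpoint = new ToyEndpoint {
    def receive: PartialFunction[Any, Unit] = {
      case ("LaunchTask", taskId: Int) => println(s"would run task $taskId")
      case msg                         => println(s"unhandled message: $msg")
    }
  }

  // Simulate the RPC layer delivering a message to the endpoint.
  executorEndpoint.receive(("LaunchTask", 42))
}
```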
env.rpcEnv.setupEndpoint("Executor"…)
- 1. Stepping into setupEndpoint, we find it is an abstract method with exactly one implementation, in org.apache.spark.rpc.netty.NettyRpcEnv. At this point the communication framework Spark uses is plain to see.
- 2. In that class, the method body is a single line:
```scala
override def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
  dispatcher.registerRpcEndpoint(name, endpoint)
}
```
- 3. This registers the Executor communication endpoint with the Dispatcher of Spark's Netty-based RPC environment; it is through this dispatcher that the Executor then registers itself with the Driver. A toy sketch of the whole flow is below.
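Here is a self-contained toy sketch of that flow (my own simplification; the Dispatcher, Endpoint, DriverEndpoint and ExecutorEndpoint classes below are invented stand-ins, not Spark's actual code): registration stores the endpoint under a name and fires its onStart hook, and onStart is where an executor-side endpoint can register itself back with the driver.

```scala
// Toy stand-ins for Spark's RPC pieces, invented for illustration only.
import scala.collection.mutable

trait Endpoint {
  def onStart(dispatcher: Dispatcher): Unit = {}  // hook fired after registration
  def receive(msg: Any): Unit                     // message handler
}

class Dispatcher {
  private val endpoints = mutable.Map[String, Endpoint]()

  // Analogous to dispatcher.registerRpcEndpoint(name, endpoint) above:
  // remember the endpoint under its name, then let it start up.
  def registerRpcEndpoint(name: String, ep: Endpoint): Unit = {
    endpoints(name) = ep
    ep.onStart(this)
  }

  // Route a message to the endpoint registered under that name.
  def send(name: String, msg: Any): Unit = endpoints(name).receive(msg)
}

// Stand-in "driver" endpoint that accepts executor registrations.
class DriverEndpoint extends Endpoint {
  def receive(msg: Any): Unit = msg match {
    case ("RegisterExecutor", id: String) => println(s"driver: registered executor $id")
    case other                            => println(s"driver: unexpected message $other")
  }
}

// Stand-in executor backend: on startup it registers itself back with the driver.
class ExecutorEndpoint(executorId: String) extends Endpoint {
  override def onStart(dispatcher: Dispatcher): Unit =
    dispatcher.send("Driver", ("RegisterExecutor", executorId))
  def receive(msg: Any): Unit = println(s"executor $executorId: got $msg")
}

object DispatchDemo extends App {
  val dispatcher = new Dispatcher
  dispatcher.registerRpcEndpoint("Driver", new DriverEndpoint)
  dispatcher.registerRpcEndpoint("Executor", new ExecutorEndpoint("1"))
}
```

Running DispatchDemo prints `driver: registered executor 1` — the reverse registration in miniature.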
Summary of Step ⑦
- 1. This is the main code path that starts the Executor component, and it shows that the Executor is a communication object.
- 2. Following the thread one step further, we saw that the Executor registers itself back with the Driver (as sketched above).
Summary:
- This series has been a simple walkthrough of the Spark source. It took me about four days of reading to piece it all together, and even now I am not fully fluent with it, but studying the source is a huge help for understanding how a Spark job actually runs and for tuning it, so I will keep going even when it gets difficult.
- I will keep posting updates on my understanding of the source code. If my understanding falls short anywhere, or you see things differently, feel free to message me so we can improve together.