Table of Contents
- Preface
- I Diagram of the Spark application execution flow
- II Source-code analysis of the Spark application execution flow
- 1 Wrapping the submit parameters
- 2 Creating the object via reflection
- 3 Sending the command to YARN to start the AM
- 4 YARN starts the AM on some NM
- 5 The AM starts the Driver thread, which runs the user class's main method
- 6 The AM registers with the RM and requests resources
- 7 The RM returns available resources; the AM assigns them by locality level
- 8 The AM contacts the NMs holding those resources and starts the Executors
- 9 The Executor sends a message to the Driver to register itself (reverse registration)
- 10 The Driver replies that registration is complete
- 11 The Driver sends Tasks to the Executor for execution
- -- Job division and scheduling, and Task submission and execution
Preface:
After finishing a Spark course, most of us only roughly know how to use the framework; we still know next to nothing about how it actually runs. To really understand how the framework is implemented and executed under the hood, we need to trace through its source code.
It took a lot of effort to trace all of this source, so if you find it helpful, please like, favorite, and comment — no "maybe next time" 😜
I Diagram of the Spark application execution flow
Steps:
1 Spark submits a launch command to YARN: bin/java ApplicationMaster
2 YARN starts the ApplicationMaster on some NodeManager
3 The AM starts the Driver thread, which runs the user class's main method
4 The AM registers with the RM and requests resources
5 The RM returns available resources, and the AM assigns them
6 The AM contacts the NMs holding those resources and starts the Executors
6.1 After starting, the Executor registers its communication endpoint and receives a message
6.2 On receiving the message, it invokes the endpoint's onStart method
7 The Executor reverse-registers with the Driver
8 The Driver replies to the Executor that registration is complete
8.1 On receiving the reply, the Executor creates its computation object and waits for tasks
9 The Driver divides the job into tasks and sends them to the Executors for computation
II Source-code analysis of the Spark application execution flow
We usually deploy in yarn cluster mode, so that is the mode we will follow through the source code.
Back when we set up YARN mode, we ran a test using the example from the official documentation:
bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode cluster \
./examples/jars/spark-examples_2.11-2.1.1.jar 10
At the time we did not know why the command had to be written that way. Looking at the spark-submit script:
exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"
we can see it simply invokes the org.apache.spark.deploy.SparkSubmit class.
1 Wrapping the submit parameters
Let's look at the source of the SparkSubmit class:
def main(args: Array[String]): Unit = {
  val submit = new SparkSubmit()
  submit.doSubmit(args)
}
The main method creates a SparkSubmit object and calls its doSubmit method, passing along the arguments we wrote on the command line.
val appArgs = parseArguments(args) // parse the arguments
We find this call inside doSubmit; it clearly parses the arguments, so let's step in and see how the parsing is done:
protected def parseArguments(args: Array[String]): SparkSubmitArguments = {
  new SparkSubmitArguments(args)
}
Stepping into the SparkSubmitArguments class, we find many fields and methods used to wrap the parameters, plus a handle method that actually processes them:
protected final String CLASS = "--class";
protected final String CONF = "--conf";
protected final String DEPLOY_MODE = "--deploy-mode";
protected final String MASTER = "--master";
protected final String NAME = "--name";

override protected def handle(opt: String, value: String): Boolean = {
  opt match {
    case NAME =>
      name = value
    case MASTER =>
      master = value
    case CLASS =>
      mainClass = value
    case DEPLOY_MODE =>
      if (value != "client" && value != "cluster") {
        error("--deploy-mode must be either \"client\" or \"cluster\"")
      }
      deployMode = value
    // there are many more cases below; we only show a few familiar ones
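The handle method above is just a key/value dispatch over option strings. As a rough illustration of the same pattern, here is a toy sketch — ArgParser and its behavior are my own invention, not Spark's code:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of SparkSubmitArguments-style option handling.
public class ArgParser {
    private final Map<String, String> opts = new HashMap<>();

    // Store recognized options; reject invalid --deploy-mode values,
    // as the real handle() does. Returns false for unknown options.
    public boolean handle(String opt, String value) {
        switch (opt) {
            case "--class":
            case "--master":
            case "--name":
                opts.put(opt, value);
                return true;
            case "--deploy-mode":
                if (!value.equals("client") && !value.equals("cluster")) {
                    throw new IllegalArgumentException(
                        "--deploy-mode must be either \"client\" or \"cluster\"");
                }
                opts.put(opt, value);
                return true;
            default:
                return false;
        }
    }

    public String get(String opt) { return opts.get(opt); }

    public static void main(String[] args) {
        ArgParser p = new ArgParser();
        p.handle("--master", "yarn");
        p.handle("--deploy-mode", "cluster");
        System.out.println(p.get("--master") + " / " + p.get("--deploy-mode"));
    }
}
```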
Once parsing is done, we step back out to where parseArguments was called:
appArgs.action match {
  case SparkSubmitAction.SUBMIT => submit(appArgs, uninitLog)
  case SparkSubmitAction.KILL => kill(appArgs)
  case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
  case SparkSubmitAction.PRINT_VERSION => printVersion()
}
This is a pattern match that can only take these few values, but we don't yet know which one applies. Stepping into action:
var action: SparkSubmitAction = null
action = Option(action).getOrElse(SUBMIT)
So if action is null, it defaults to SUBMIT, which means this is the branch that runs:
case SparkSubmitAction.SUBMIT => submit(appArgs, uninitLog)
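As an aside, the Option(action).getOrElse(SUBMIT) idiom maps directly onto Java's Optional — a minimal sketch (SubmitAction and its enum are my own names, not Spark's):

```java
import java.util.Optional;

// Illustrative re-creation of the default-action logic in SparkSubmitArguments.
public class SubmitAction {
    public enum Action { SUBMIT, KILL, REQUEST_STATUS, PRINT_VERSION }

    // If no action was parsed from the command line, fall back to SUBMIT,
    // mirroring Option(action).getOrElse(SUBMIT) in the Scala source.
    public static Action resolve(Action parsed) {
        return Optional.ofNullable(parsed).orElse(Action.SUBMIT);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));        // SUBMIT
        System.out.println(resolve(Action.KILL)); // KILL
    }
}
```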
2 Creating the object via reflection
Stepping into the submit method (with code we don't care about for now removed):
private def submit(args: SparkSubmitArguments, uninitLog: Boolean): Unit = {
  def doRunMain(): Unit = {
    if (args.proxyUser != null) {
      val proxyUser = UserGroupInformation.createProxyUser(args.proxyUser,
        UserGroupInformation.getCurrentUser())
      // ... (proxy-user branch elided)
    } else {
      runMain(args, uninitLog)
    }
  }
Either way, this ends up calling the runMain method:
// prepare the submit environment
val (childArgs, childClasspath, sparkConf, childMainClass) = prepareSubmitEnvironment(args)
// get the class loader for the submission
val loader = getSubmitClassLoader(sparkConf)
// put the jars we need onto the classpath
for (jar <- childClasspath) {
  addJarToClasspath(jar, loader)
}
// load the class by name to get its Class object
mainClass = Utils.classForName(childMainClass)
val app: SparkApplication = if (classOf[SparkApplication].isAssignableFrom(mainClass)) {
  // create the object via reflection and cast it to SparkApplication
  mainClass.getConstructor().newInstance().asInstanceOf[SparkApplication]
} else {
  new JavaMainApplication(mainClass)
}
// start the app object
app.start(childArgs.toArray, sparkConf)
From the above we know an app object gets created, but not yet which concrete class it is. We do know that childMainClass comes from prepareSubmitEnvironment:
if (isYarnCluster) {
  childMainClass = YARN_CLUSTER_SUBMIT_CLASS
private[deploy] val YARN_CLUSTER_SUBMIT_CLASS =
  "org.apache.spark.deploy.yarn.YarnClusterApplication"
So now we know:
childMainClass = "org.apache.spark.deploy.yarn.YarnClusterApplication"
We can also see that if the deploy mode is yarn client rather than yarn cluster, the class created is the one we passed on the command line:
if (deployMode == CLIENT) {
  childMainClass = args.mainClass
So in cluster mode the call effectively becomes:
YarnClusterApplication.start
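The reflective instantiation in runMain can be reproduced in plain Java. Below, App and YarnApp are illustrative stand-ins for SparkApplication and YarnClusterApplication; the assignability check and the zero-arg-constructor call mirror the Scala source:

```java
// Sketch of how runMain instantiates the submit class:
// check assignability, then call the no-arg constructor reflectively.
public class ReflectLaunch {
    public interface App { String start(); } // stand-in for SparkApplication

    public static class YarnApp implements App {
        public String start() { return "started"; }
    }

    public static App create(Class<?> mainClass) {
        if (!App.class.isAssignableFrom(mainClass)) {
            throw new IllegalArgumentException(mainClass.getName() + " is not an App");
        }
        try {
            // like mainClass.getConstructor().newInstance().asInstanceOf[SparkApplication]
            return (App) mainClass.getConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // the real code first resolves the Class with Utils.classForName(childMainClass)
        System.out.println(create(YarnApp.class).start());
    }
}
```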
3 Sending the command to YARN to start the AM
override def start(args: Array[String], conf: SparkConf): Unit = {
  conf.remove("spark.jars")
  conf.remove("spark.files")
  new Client(new ClientArguments(args), conf).run()
}
new ClientArguments(args) // wrap the arguments
new Client // create the client
private val yarnClient = YarnClient.createYarnClient
public static YarnClient createYarnClient() {
  YarnClient client = new YarnClientImpl();
  return client;
}
protected void serviceStart() throws Exception {
  // create the RM client
  this.rmClient = (ApplicationClientProtocol) ClientRMProxy.createRMProxy(this.getConfig(), ApplicationClientProtocol.class);
Stepping into run():
// submit the application
this.appId = submitApplication()
Inside submitApplication:
// connect the launcher backend
launcherBackend.connect()
// initialize the yarnClient with the Hadoop configuration
yarnClient.init(hadoopConf)
// start the yarnClient
yarnClient.start()
// Get a new application from our RM
val newApp = yarnClient.createApplication()
// Set up the appropriate contexts to launch our AM
val containerContext = createContainerLaunchContext(newAppResponse)
val appContext = createApplicationSubmissionContext(newApp, containerContext)
// submit the application to YARN
yarnClient.submitApplication(appContext)
Let's see which information createContainerLaunchContext and createApplicationSubmissionContext wrap up:
createApplicationSubmissionContext() // just wraps up some configuration parameters
createContainerLaunchContext()
// Add Xmx for AM memory
// configure the JVM launch options
javaOpts += "-Xmx" + amMemory + "m"
// Command for the ApplicationMaster
// assemble the command for the ApplicationMaster
val commands = prefixEnv ++
  Seq(Environment.JAVA_HOME.$$() + "/bin/java", "-server") ++
  javaOpts ++ amArgs ++
  Seq(
    "1>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout",
    "2>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
// the amArgs arguments
val amArgs =
  Seq(amClass) ++ userClass ++ userJar ++ primaryPyFile ++ primaryRFile ++ userArgs ++
  Seq("--properties-file", buildPath(Environment.PWD.$$(), LOCALIZED_CONF_DIR, SPARK_CONF_FILE))
// the concrete type of amClass
val amClass =
  if (isClusterMode) {
    Utils.classForName("org.apache.spark.deploy.yarn.ApplicationMaster").getName
  } else {
    Utils.classForName("org.apache.spark.deploy.yarn.ExecutorLauncher").getName
  }
From this we can derive the command assembled for the ApplicationMaster:
[cluster] command: bin/java org.apache.spark.deploy.yarn.ApplicationMaster
[client] command: bin/java org.apache.spark.deploy.yarn.ExecutorLauncher
The command and configuration are wrapped into the amContainer object, which is then returned:
amContainer.setCommands(printableCommands.asJava)
amContainer
// submit the application to YARN
yarnClient.submitApplication(appContext)
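To make the command assembly concrete, here is a toy Java sketch of building such a token list. AmCommand and its parameters are my own names; the command is handed to YARN as a list of strings, and the <LOG_DIR> placeholder (ApplicationConstants.LOG_DIR_EXPANSION_VAR) is expanded by YARN later, not by us:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of createContainerLaunchContext's command assembly.
public class AmCommand {
    public static List<String> build(String javaHome, int amMemoryMb,
                                     boolean isClusterMode) {
        // cluster mode launches ApplicationMaster, client mode ExecutorLauncher
        String amClass = isClusterMode
            ? "org.apache.spark.deploy.yarn.ApplicationMaster"
            : "org.apache.spark.deploy.yarn.ExecutorLauncher";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaHome + "/bin/java");
        cmd.add("-server");
        cmd.add("-Xmx" + amMemoryMb + "m"); // AM heap, as in javaOpts
        cmd.add(amClass);
        cmd.add("1><LOG_DIR>/stdout");      // stdout redirect, expanded by YARN
        cmd.add("2><LOG_DIR>/stderr");      // stderr redirect
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", build("$JAVA_HOME", 1024, true)));
    }
}
```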
4 YARN starts the AM on some NM
5 The AM starts the Driver thread, which runs the user class's main method
Since the command we sent over is
bin/java org.apache.spark.deploy.yarn.ApplicationMaster
the main method of ApplicationMaster is what gets invoked. Stepping into it:
def main(args: Array[String]): Unit = {
  // wrap the arguments passed in
  val amArgs = new ApplicationMasterArguments(args)
  // create the SparkConf object
  val sparkConf = new SparkConf()
  // create the master object from the arguments and configuration
  master = new ApplicationMaster(amArgs, sparkConf, yarnConf)
  ugi.doAs(new PrivilegedExceptionAction[Unit]() {
    // execute master's run method
    override def run(): Unit = System.exit(master.run())
  })
Stepping into the run method:
// set system properties
System.setProperty(UI_PORT.key, "0")
System.setProperty("spark.master", "yarn")
System.setProperty(SUBMIT_DEPLOY_MODE.key, "cluster")
// call a different method depending on the deploy mode
if (isClusterMode) {
  runDriver()
} else {
  runExecutorLauncher()
}
Stepping into the runDriver method:
// start the user application, i.e. the class passed via --class
userClassThread = startUserApplication()
And into startUserApplication:
// reflectively look up the user class's main method
val mainMethod = userClassLoader.loadClass(args.userClass)
  .getMethod("main", classOf[Array[String]])
val userThread = new Thread {
  override def run(): Unit = {
    // (body omitted)
  }
}
userThread.setContextClassLoader(userClassLoader)
userThread.setName("Driver")
userThread.start()
This shows the AM creates a user thread to run our application and names it "Driver" — so the Driver we keep talking about is really just a thread.
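startUserApplication boils down to: look up the user class's main method reflectively, then run it on a thread named "Driver". A self-contained sketch — UserApp is a stand-in for the --class application, and for simplicity this sketch joins the thread immediately, whereas the real code joins it later in runDriver:

```java
import java.lang.reflect.Method;

public class DriverThread {
    // Stand-in for the user application submitted via --class.
    public static class UserApp {
        public static volatile String ranOn = null;
        public static void main(String[] args) {
            ranOn = Thread.currentThread().getName(); // record which thread ran us
        }
    }

    // Find main(String[]) reflectively and run it on a thread named "Driver".
    public static void runUserMain(Class<?> userClass, String[] args) {
        try {
            Method mainMethod = userClass.getMethod("main", String[].class);
            Thread userThread = new Thread(() -> {
                try {
                    mainMethod.invoke(null, (Object) args);
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException(e);
                }
            });
            userThread.setName("Driver"); // the "Driver" is just this named thread
            userThread.start();
            userThread.join();
        } catch (NoSuchMethodException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        runUserMain(UserApp.class, new String[0]);
        System.out.println(UserApp.ranOn); // the user main ran on the "Driver" thread
    }
}
```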
6 The AM registers with the RM and requests resources
Back in the runDriver method:
// wait for the user application to finish creating the SparkContext
val sc = ThreadUtils.awaitResult(sparkContextPromise.future,
  Duration(totalWaitTime, TimeUnit.MILLISECONDS))
if (sc != null) {
  // communication details
  val rpcEnv = sc.env.rpcEnv
  val userConf = sc.getConf
  val host = userConf.get(DRIVER_HOST_ADDRESS)
  val port = userConf.get(DRIVER_PORT)
  // register the AM
  registerAM(host, port, userConf, sc.ui.map(_.webUrl), appAttemptId)
  // create a reference to the driver endpoint
  val driverRef = rpcEnv.setupEndpointRef(
    RpcAddress(host, port),
    YarnSchedulerBackend.ENDPOINT_NAME)
  // create the resource allocator
  createAllocator(driverRef, userConf, rpcEnv, appAttemptId, distCacheConf)
Stepping into registerAM:
client.register(host, port, yarnConf, _sparkConf, uiAddress, historyAddress)
And into register:
def register(/* params elided */): Unit = {
  // create the AMRMClient
  amClient = AMRMClient.createAMRMClient()
  amClient.init(conf)
  // start the AMRMClient
  amClient.start()
  this.uiHistoryAddress = uiHistoryAddress
  synchronized {
    // the AM registers itself with the RM
    amClient.registerApplicationMaster(driverHost, driverPort, trackingUrl)
    registered = true
  }
}
7 The RM returns available resources; the AM assigns them by locality level
Back in runDriver again, stepping into the createAllocator method:
// create the allocator
allocator = client.createAllocator()
// allocate resources
allocator.allocateResources()
// ensure the Driver thread finishes before the code after this line runs
userClassThread.join()
Stepping into allocateResources:
val allocateResponse = amClient.allocate(progressIndicator)
// get the list of allocated containers
val allocatedContainers = allocateResponse.getAllocatedContainers()
// process the allocated containers
handleAllocatedContainers(allocatedContainers.asScala)
And into handleAllocatedContainers:
Here containers are matched against requests, which brings in the concept of preferred locations.
In big data there is a well-known principle: moving computation is cheaper than moving data — keep the computation as close to the data as possible. This gives several locality levels:
① Process-local: the computation and the data are in the same process
② Node-local: the computation and the data are on the same machine node
③ Rack-local: the computation and the data are on different nodes in the same rack
④ Any: no locality constraint
val remainingAfterHostMatches = new ArrayBuffer[Container]
for (allocatedContainer <- allocatedContainers) {
  matchContainerToRequest(allocatedContainer, allocatedContainer.getNodeId.getHost,
    containersToUse, remainingAfterHostMatches)
}
val remainingAfterRackMatches = new ArrayBuffer[Container]
if (remainingAfterHostMatches.nonEmpty) {
  var exception: Option[Throwable] = None
  val thread = new Thread("spark-rack-resolver") {
    override def run(): Unit = {
      try {
        for (allocatedContainer <- remainingAfterHostMatches) {
          val rack = resolver.resolve(allocatedContainer.getNodeId.getHost)
          matchContainerToRequest(allocatedContainer, rack, containersToUse,
            remainingAfterRackMatches)
        }
      } catch {
        case e: Throwable =>
          exception = Some(e)
      }
    }
  }
  thread.setDaemon(true)
  // a separate thread is started to do the rack matching
  thread.start()
// node-local: same node as the data
// rack-local: same rack as the data
// Assign remaining that are neither node-local nor rack-local
val remainingAfterOffRackMatches = new ArrayBuffer[Container]
for (allocatedContainer <- remainingAfterRackMatches) {
  matchContainerToRequest(allocatedContainer, ANY_HOST, containersToUse,
    remainingAfterOffRackMatches)
}
// run the usable containers
runAllocatedContainers(containersToUse)
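The three matching passes can be sketched as follows. This is a simplified model: the real code resolves racks via a RackResolver on a separate thread and matches against outstanding container requests, whereas here the requests and the host-to-rack mapping are plain collections I made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LocalityMatcher {
    // Minimal stand-in for a YARN Container: just an id and a host.
    public static class Container {
        final String id; final String host;
        public Container(String id, String host) { this.id = id; this.host = host; }
    }

    // Three passes, as in handleAllocatedContainers:
    // node-local first, then rack-local, then anything left.
    public static List<String> match(List<Container> allocated,
                                     Set<String> requestedHosts,
                                     Map<String, String> hostToRack,
                                     Set<String> requestedRacks) {
        List<String> assigned = new ArrayList<>();
        List<Container> afterHost = new ArrayList<>();
        for (Container c : allocated) {                 // pass 1: node-local
            if (requestedHosts.contains(c.host)) assigned.add(c.id + ":node-local");
            else afterHost.add(c);
        }
        List<Container> afterRack = new ArrayList<>();
        for (Container c : afterHost) {                 // pass 2: rack-local
            String rack = hostToRack.get(c.host);
            if (rack != null && requestedRacks.contains(rack)) assigned.add(c.id + ":rack-local");
            else afterRack.add(c);
        }
        for (Container c : afterRack) {                 // pass 3: any host
            assigned.add(c.id + ":any");
        }
        return assigned;
    }

    public static void main(String[] args) {
        List<Container> alloc = List.of(
            new Container("c1", "h1"),
            new Container("c2", "h2"),
            new Container("c3", "h3"));
        System.out.println(match(alloc, Set.of("h1"),
            Map.of("h2", "r1", "h3", "r9"), Set.of("r1")));
        // [c1:node-local, c2:rack-local, c3:any]
    }
}
```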
8 The AM contacts the NMs holding those resources and starts the Executors
Stepping into runAllocatedContainers:
launcherPool.execute(() => {
  try {
    // create a new ExecutorRunnable and run it
    new ExecutorRunnable(
      // ...
    ).run()
Stepping into run:
def run(): Unit = {
  logDebug("Starting Executor Container")
  // create the NMClient
  nmClient = NMClient.createNMClient()
  // initialize it
  nmClient.init(conf)
  // start the NMClient
  nmClient.start()
  // start the container
  startContainer()
}
Stepping into startContainer:
// prepare the command
val commands = prepareCommand()
And into prepareCommand:
val commands = prefixEnv ++
  Seq(Environment.JAVA_HOME.$$() + "/bin/java", "-server") ++
  javaOpts ++
  Seq("org.apache.spark.executor.YarnCoarseGrainedExecutorBackend",
    "--driver-url", masterAddress,
    "--executor-id", executorId,
    "--hostname", hostname,
    "--cores", executorCores.toString,
    "--app-id", appId,
    "--resourceProfileId", resourceProfileId.toString) ++
  userClassPath ++
  Seq(
    s"1>${ApplicationConstants.LOG_DIR_EXPANSION_VAR}/stdout",
    s"2>${ApplicationConstants.LOG_DIR_EXPANSION_VAR}/stderr")
// TODO: it would be nicer to just make sure there are no null commands here
commands.map(s => if (s == null) "null" else s).toList
Once again a command is being assembled; the final command is:
bin/java org.apache.spark.executor.YarnCoarseGrainedExecutorBackend
which means we are about to execute this class's main method.
Stepping into YarnCoarseGrainedExecutorBackend's main method:
def main(args: Array[String]): Unit = {
  val createFn: (RpcEnv, CoarseGrainedExecutorBackend.Arguments, SparkEnv, ResourceProfile) =>
    CoarseGrainedExecutorBackend = { case (rpcEnv, arguments, env, resourceProfile) =>
    new YarnCoarseGrainedExecutorBackend(rpcEnv, arguments.driverUrl, arguments.executorId,
      arguments.bindAddress, arguments.hostname, arguments.cores, arguments.userClassPath, env,
      arguments.resourcesFileOpt, resourceProfile)
  }
  // parse the arguments
  val backendArgs = CoarseGrainedExecutorBackend.parseArguments(args,
    this.getClass.getCanonicalName.stripSuffix("$"))
  // call run to execute
  CoarseGrainedExecutorBackend.run(backendArgs, createFn)
  System.exit(0)
}
So what really runs is the run method of CoarseGrainedExecutorBackend.
Stepping into run:
// connect to the driver
driver = fetcher.setupEndpointRefByURI(arguments.driverUrl)
// create the Executor's environment
val env = SparkEnv.createExecutorEnv(driverConf, arguments.executorId, arguments.bindAddress,
  arguments.hostname, arguments.cores, cfg.ioEncryptionKey, isLocal = false)
// set up the communication endpoint
env.rpcEnv.setupEndpoint("Executor",
  backendCreateFn(env.rpcEnv, arguments, env, cfg.resourceProfile))
arguments.workerUrl.foreach { url =>
  env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
}
Stepping into backendCreateFn, it simply creates a CoarseGrainedExecutorBackend object:
backendCreateFn: (RpcEnv, Arguments, SparkEnv, ResourceProfile) =>
  CoarseGrainedExecutorBackend
9 The Executor sends a message to the Driver to register itself (reverse registration)
After starting, the Executor registers its communication endpoint and receives a message; on receiving it, the endpoint's onStart method is invoked.
To see why, let's follow the code.
Stepping into setupEndpoint:
override def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
  dispatcher.registerRpcEndpoint(name, endpoint)
}
And into registerRpcEndpoint:
messageLoop = endpoint match {
  case e: IsolatedRpcEndpoint =>
    new DedicatedMessageLoop(name, e, this)
  case _ =>
    // note this line: the endpoint passed in gets registered here
    sharedLoop.register(name, endpoint)
    sharedLoop
}
endpoints.put(name, messageLoop)
Stepping into register:
// this is an inbox, very similar in principle to early versions of Akka
val inbox = new Inbox(name, endpoint)
And into the Inbox class:
// we can see that every endpoint's OnStart message is queued up front
inbox.synchronized {
  messages.add(OnStart)
}
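A minimal model of this Inbox behavior: the OnStart message is enqueued at construction time, so it is guaranteed to be processed before any ordinary message. MiniInbox below is my own toy class, not Spark's:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of the Dispatcher/Inbox pattern: registering an endpoint creates
// an Inbox that already contains OnStart, so onStart() runs before any
// regular message is delivered.
public class MiniInbox {
    public interface Endpoint { void onStart(); void receive(String msg); }

    private static final Object ON_START = new Object();
    private final Queue<Object> messages = new ArrayDeque<>();
    private final Endpoint endpoint;

    public MiniInbox(Endpoint endpoint) {
        this.endpoint = endpoint;
        messages.add(ON_START); // queued first, as in new Inbox(name, endpoint)
    }

    public void post(String msg) { messages.add(msg); }

    // Drain the queue, dispatching OnStart specially.
    public void process() {
        Object m;
        while ((m = messages.poll()) != null) {
            if (m == ON_START) endpoint.onStart();
            else endpoint.receive((String) m);
        }
    }

    public static void main(String[] args) {
        MiniInbox inbox = new MiniInbox(new Endpoint() {
            public void onStart() { System.out.println("onStart"); }
            public void receive(String msg) { System.out.println(msg); }
        });
        inbox.post("RegisteredExecutor");
        inbox.process(); // prints onStart, then RegisteredExecutor
    }
}
```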
Our endpoint here is CoarseGrainedExecutorBackend, so let's step into its onStart method:
driver = Some(ref)
// register the Executor with the driver
ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls,
  extractAttributes, _resources, resourceProfile.id))
10 The Driver replies that registration is complete
Since the two sides are now communicating and the Executor has sent a message to the Driver, the Driver should receive it and respond.
On the driver side, SparkContext is usually what we mean by "the Driver", so let's look at the SparkContext class:
// a backend is needed for communication
private var _schedulerBackend: SchedulerBackend = _
class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: RpcEnv)
  extends ExecutorAllocationClient with SchedulerBackend with Logging
Stepping into CoarseGrainedSchedulerBackend, we find a receiveAndReply method; in its RegisterExecutor case:
CoarseGrainedSchedulerBackend.this.synchronized {
  executorDataMap.put(executorId, data)
  if (currentExecutorIdCounter < executorId.toInt) {
    currentExecutorIdCounter = executorId.toInt
  }
  if (numPendingExecutors > 0) {
    numPendingExecutors -= 1
    logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
  }
}
listenerBus.post(
  SparkListenerExecutorAdded(System.currentTimeMillis(), executorId, data))
// Note: some tests expect the reply to come after we put the executor in the map
context.reply(true)
Version 3.0.0 differs from earlier releases here. Previously there was explicit code sending the reply message back:
executorRef.send(RegisteredExecutor)
That line is gone now; instead, some logic was added to CoarseGrainedExecutorBackend's onStart method:
rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
  // This is a very fast action so we can use "ThreadUtils.sameThread"
  driver = Some(ref)
  ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls,
    extractAttributes, _resources, resourceProfile.id))
}(ThreadUtils.sameThread).onComplete {
  // handle the outcome of registration here: the message the driver used to send
  // is now sent by the executor to itself in this callback
  case Success(_) =>
    self.send(RegisteredExecutor)
  case Failure(e) =>
    exitExecutor(1, s"Cannot register with driver: $driverUrl", e, notifyDriver = false)
}(ThreadUtils.sameThread)
So the success/failure handling now lives right in this method: what used to be a reply sent by the driver is now a message the executor sends to itself.
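The ask-then-send-to-self flow can be modeled with a CompletableFuture. RegisterFlow below is an illustrative sketch of my own: the registration reply is reduced to a Boolean, and the self-send / exit actions are reduced to strings appended to a list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Sketch of the 3.0.0 onStart logic: the registration ask yields a Boolean
// future, and the executor posts RegisteredExecutor to itself on success.
public class RegisterFlow {
    public static List<String> register(CompletableFuture<Boolean> askDriver) {
        List<String> selfMessages = new ArrayList<>();
        askDriver.whenComplete((ok, err) -> {
            if (err == null && Boolean.TRUE.equals(ok)) {
                selfMessages.add("RegisteredExecutor"); // like self.send(RegisteredExecutor)
            } else {
                selfMessages.add("exitExecutor");       // like exitExecutor(1, ...)
            }
        });
        return selfMessages;
    }

    public static void main(String[] args) {
        // an already-completed future runs the callback immediately
        System.out.println(register(CompletableFuture.completedFuture(true)));
    }
}
```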
Once this self-sent message goes out, CoarseGrainedExecutorBackend must receive it; stepping into the receive method:
case RegisteredExecutor =>
  logInfo("Successfully registered with driver")
  try {
    // create the Executor computation object
    executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false,
      resources = _resources)
    driver.get.send(LaunchedExecutor(executorId))
So the "Executor" we usually talk about is really just a field of the CoarseGrainedExecutorBackend class — an object that encapsulates the computation logic.
11 The Driver sends Tasks to the Executor for execution
Job division and task scheduling are detailed enough to deserve a chapter of their own later; for now, let's just get a rough idea of this step.
-- Job division and scheduling, and Task submission and execution
In the receive method of CoarseGrainedExecutorBackend:
// execute a task
case LaunchTask(data) =>
  if (executor == null) {
    exitExecutor(1, "Received LaunchTask command but executor was null")
  } else {
    val taskDesc = TaskDescription.decode(data.value)
    logInfo("Got assigned task " + taskDesc.taskId)
    taskResources(taskDesc.taskId) = taskDesc.resources
    // if the executor is not null, hand the Task to the computation object to run
    executor.launchTask(this, taskDesc)
  }
At this point we have a very clear picture of the overall flow of running a Spark application. When learning a framework, first understand how to use it, then how it works underneath, then why it is built that way, what its weaknesses are, and how it could be optimized — that is the surest path to steady progress. I hope we can keep learning and improving together.