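In this installment:
1. Design considerations for launching Receivers
2. A thorough walk through the Receiver launch source code
1. Design considerations for launching Receivers
Spark Streaming is an application that runs on top of Spark Core. It has to both receive data and process data, all inside a distributed cluster, so it should launch multiple Jobs that divide the work and coordinate with one another. A Receiver's task is to receive data, and it should be implemented as a Job in Spark Core.
The design of Receiver launching also has to solve the following problems:
1. Load imbalance, where several Receivers are started on one Executor while other Executors sit idle;
2. A Receiver startup failure causing the entire Spark Streaming application to fail.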
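To make the first problem concrete, here is a minimal sketch of an even placement. It is not Spark's actual scheduling policy; `ReceiverPlacementSketch`, `assignRoundRobin`, and the executor names are hypothetical and only illustrate the goal that no Executor hosts two Receivers while another hosts none.

```scala
object ReceiverPlacementSketch {
  // Hypothetical helper (not part of Spark): hand out receiver ids to
  // executors in round-robin order so the load is spread as evenly as possible.
  def assignRoundRobin(receiverIds: Seq[Int], executors: Seq[String]): Map[Int, String] = {
    require(executors.nonEmpty, "at least one executor is needed")
    receiverIds.zipWithIndex.map { case (id, i) =>
      // The i-th receiver goes to executor (i mod number of executors).
      id -> executors(i % executors.length)
    }.toMap
  }
}
```

With three receivers and three executors, each executor ends up with exactly one receiver; with four receivers and two executors, each executor gets two.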
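2. A thorough walk through the Receiver launch source code
A Spark Streaming application processes streaming data, so it has to be ready to receive data right from its startup phase.
When the application code defines its DStreams, it defines one or more InputDStreams. Each InputDStream has exactly one corresponding Receiver.
The main flow of the Receiver's full launch lifecycle is shown in the following diagram:
Receivers are launched inside ssc.start().
Let's dissect StreamingContext.start():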
/**
* Start the execution of the streams.
*
* @throws IllegalStateException if the StreamingContext is already stopped.
*/
def start(): Unit = synchronized {
state match {
case INITIALIZED =>
startSite.set(DStream.getCreationSite())
StreamingContext.ACTIVATION_LOCK.synchronized {
StreamingContext.assertNoOtherContextIsActive()
try {
validate()
// Start the streaming scheduler in a new thread, so that thread local properties
// like call sites and job groups can be reset without affecting those of the
// current thread.
ThreadUtils.runInNewThread("streaming-start") {
sparkContext.setCallSite(startSite.get)
sparkContext.clearJobGroup()
sparkContext.setLocalProperty(SparkContext.SPARK_JOB_INTERRUPT_ON_CANCEL, "false")
// Run in a child thread: partly for thread-local initialization, partly to avoid blocking the main thread.
scheduler.start()
}
state = StreamingContextState.ACTIVE
} catch {
case NonFatal(e) =>
logError("Error starting the context, marking it as stopped", e)
scheduler.stop(false)
state = StreamingContextState.STOPPED
throw e
}
StreamingContext.setActiveContext(this)
}
shutdownHookRef = ShutdownHookManager.addShutdownHook(
StreamingContext.SHUTDOWN_HOOK_PRIORITY)(stopOnShutdown)
// Registering Streaming Metrics at the start of the StreamingContext
assert(env.metricsSystem != null)
env.metricsSystem.registerSource(streamingSource)
uiTab.foreach(_.attach())
logInfo("StreamingContext started")
case ACTIVE =>
logWarning("StreamingContext has already been started")
case STOPPED =>
throw new IllegalStateException("StreamingContext has already been stopped")
}
}
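The state handling in start() can be isolated in a small model. The sketch below is only an illustration under stated assumptions: `MiniContext` and its three states are hypothetical stand-ins for StreamingContext and StreamingContextState, and the helper thread is joined before continuing, which is how this sketch models running the scheduler start-up in a new thread.

```scala
import scala.util.control.NonFatal

object ContextStateSketch {
  sealed trait State
  case object Initialized extends State
  case object Active extends State
  case object Stopped extends State

  // Hypothetical stand-in for StreamingContext; only state transitions are modeled.
  class MiniContext(startScheduler: () => Unit) {
    private var state: State = Initialized

    def currentState: State = synchronized { state }

    def start(): Unit = synchronized {
      state match {
        case Initialized =>
          try {
            // Run the scheduler start-up on its own thread and wait for it,
            // so thread-local settings made there do not leak into the caller.
            var failure: Option[Throwable] = None
            val t = new Thread("streaming-start") {
              override def run(): Unit =
                try startScheduler() catch { case NonFatal(e) => failure = Some(e) }
            }
            t.start()
            t.join()
            failure.foreach(e => throw e)
            state = Active
          } catch {
            case NonFatal(e) =>
              // A failed start marks the context as stopped for good.
              state = Stopped
              throw e
          }
        case Active => // already started: nothing to do
        case Stopped =>
          throw new IllegalStateException("context has already been stopped")
      }
    }
  }
}
```

A context whose scheduler starts cleanly ends up Active; one whose scheduler throws ends up Stopped, and any later start() call fails with IllegalStateException.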
JobScheduler.start in turn calls ReceiverTracker.start, and that is where the Receivers get launched.
JobScheduler.start:
def start(): Unit = synchronized {
if (eventLoop != null) return // scheduler has already been started
logDebug("Starting JobScheduler")
eventLoop = new EventLoop[JobSchedulerEvent]("JobScheduler") {
override protected def onReceive(event: JobSchedulerEvent): Unit = processEvent(event)
override protected def onError(e: Throwable): Unit = reportError("Error in job scheduler", e)
}
eventLoop.start()
// attach rate controllers of input streams to receive batch completion updates
for {
inputDStream <- ssc.graph.getInputStreams
rateController <- inputDStream.rateController
} ssc.addStreamingListener(rateController)
listenerBus.start(ssc.sparkContext)
receiverTracker = new ReceiverTracker(ssc)
inputInfoTracker = new InputInfoTracker(ssc)
// Start the receiverTracker.
receiverTracker.start()
jobGenerator.start()
logInfo("Started JobScheduler")
}
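ReceiverTracker.start sets up an RPC endpoint. Why? Because ReceiverTracker monitors every Receiver in the cluster, and each Receiver in turn reports its state, the data it has received, and its lifecycle information back to the ReceiverTrackerEndpoint.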
ReceiverTracker.start:
/** Start the endpoint and receiver execution thread. */
def start(): Unit = synchronized {
if (isTrackerStarted) {
throw new SparkException("ReceiverTracker already started")
}
// Receivers are launched only if there are receiver input streams.
if (!receiverInputStreams.isEmpty) {
endpoint = ssc.env.rpcEnv.setupEndpoint(
"ReceiverTracker", new ReceiverTrackerEndpoint(ssc.env.rpcEnv))
if (!skipReceiverLaunch) launchReceivers()
logInfo("ReceiverTracker started")
trackerState = Started
}
}
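The reporting relationship between Receivers and the tracker's endpoint can be sketched without any RPC machinery. The messages and `MiniTracker` below are hypothetical simplifications (the real messages carry more fields and travel over Spark's RPC layer); the sketch only shows the shape of an endpoint whose receive is a partial function over messages.

```scala
import scala.collection.mutable

object TrackerMessagesSketch {
  // Hypothetical messages, loosely modeled on what receivers report.
  sealed trait TrackerMessage
  final case class Register(streamId: Int, host: String) extends TrackerMessage
  final case class ReportError(streamId: Int, message: String) extends TrackerMessage
  final case class Deregister(streamId: Int) extends TrackerMessage

  class MiniTracker {
    private val active = mutable.Map.empty[Int, String]
    private val errors = mutable.ListBuffer.empty[String]

    // Same shape as an endpoint's receive: a partial function over messages.
    val receive: PartialFunction[Any, Unit] = {
      case Register(id, host)   => active(id) = host
      case ReportError(id, msg) => errors += s"stream $id: $msg"
      case Deregister(id)       => active -= id
    }

    def activeReceivers: Map[Int, String] = active.toMap
    def reportedErrors: List[String] = errors.toList
  }
}
```

Because receive is a partial function, messages it does not recognize are simply not defined for it, which a caller can check with isDefinedAt before dispatching.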
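Starting from the ReceiverInputDStreams (which live on the Driver), the concrete Receiver instances are obtained and then distributed to the Worker nodes. Each ReceiverInputDStream corresponds to exactly one Receiver.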
ReceiverTracker.launchReceivers:
/**
* Get the receivers from the ReceiverInputDStreams, distributes them to the
* worker nodes as a parallel collection, and runs them.
*/
private def launchReceivers(): Unit = {
val receivers = receiverInputStreams.map(nis => {
// Each input source (ReceiverInputDStream) corresponds to exactly one Receiver.
val rcvr = nis.getReceiver()
rcvr.setReceiverId(nis.id)
rcvr
})
runDummySparkJob()
logInfo("Starting " + receivers.length + " receivers")
// endpoint here is the ReceiverTrackerEndpoint constructed in ReceiverTracker.start above.
endpoint.send(StartAllReceivers(receivers))
}
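First, look at the runDummySparkJob() call.
runDummySparkJob() makes sure that all nodes are alive and registered, which keeps all the receivers from being scheduled onto a single node.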
ReceiverTracker.runDummySparkJob():
/**
* Run the dummy Spark job to ensure that all slaves have registered. This avoids all the
* receivers to be scheduled on the same node.
*
* TODO Should poll the executor number and wait for executors according to
* "spark.scheduler.minRegisteredResourcesRatio" and
* "spark.scheduler.maxRegisteredResourcesWaitingTime" rather than running a dummy job.
*/
private def runDummySparkJob(): Unit = {
if (!ssc.sparkContext.isLocal) {
ssc.sparkContext.makeRDD(1 to 50, 50).map(x => (x, 1)).reduceByKey(_ + _, 20).collect()
}
assert(getExecutors.nonEmpty)
}
Now go back to the getReceiver() call in ReceiverTracker.launchReceivers().
ReceiverInputDStream.getReceiver():
/**
* Gets the receiver object that will be sent to the worker nodes
* to receive data. This method needs to defined by any specific implementation
* of a ReceiverInputDStream.
*/
def getReceiver(): Receiver[T] // returns a Receiver object
ReceiverInputDStream.getReceiver() returns a Receiver object. The method is abstract, so the actual work is done by the subclasses of ReceiverInputDStream.
Accordingly, every subclass of ReceiverInputDStream must implement getReceiver(). It must also define its own Receiver subclass, because getReceiver() creates an instance of that subclass.
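The contract just described can be sketched with simplified stand-ins. None of the classes below are Spark's: `MiniReceiver`, `MiniInputStream`, `LineReceiver`, and `LineInputStream` are hypothetical names chosen for illustration, and `collectReceivers` only mirrors the mapping step in launchReceivers.

```scala
object GetReceiverSketch {
  // Simplified stand-in for Receiver (an assumption, not Spark's class).
  abstract class MiniReceiver[T] {
    private var id: Int = -1
    def setReceiverId(newId: Int): Unit = { id = newId }
    def streamId: Int = id
    def onStart(): Unit
  }

  // Simplified stand-in for ReceiverInputDStream.
  abstract class MiniInputStream[T](val id: Int) {
    // Each concrete input stream must say which receiver feeds it.
    def getReceiver(): MiniReceiver[T]
  }

  class LineReceiver(host: String, port: Int) extends MiniReceiver[String] {
    def onStart(): Unit = println(s"would connect to $host:$port")
  }

  class LineInputStream(streamIdx: Int, host: String, port: Int)
      extends MiniInputStream[String](streamIdx) {
    def getReceiver(): MiniReceiver[String] = new LineReceiver(host, port)
  }

  // One receiver per input stream, tagged with the stream's id.
  def collectReceivers(streams: Seq[MiniInputStream[_]]): Seq[MiniReceiver[_]] =
    streams.map { s =>
      val r = s.getReceiver()
      r.setReceiverId(s.id)
      r
    }
}
```

Passing two LineInputStream instances with ids 0 and 1 to collectReceivers yields two receivers whose streamId values are 0 and 1, one per stream.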
SocketInputDStream.getReceiver:
def getReceiver(): Receiver[T] = {
new SocketReceiver(host, port, bytesToObjects, storageLevel)
}
}
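The SocketInputDStream source also defines the corresponding Receiver subclass, SocketReceiver. The SocketReceiver class must in turn define the onStart method.
onStart starts a background thread that calls the receive method.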
private[streaming]
class SocketReceiver[T: ClassTag](
host: String,
port: Int,
bytesToObjects: InputStream => Iterator[T],
storageLevel: StorageLevel
) extends Receiver[T](storageLevel) with Logging {
Back in ReceiverTracker.launchReceivers(), look at the final statement, endpoint.send(StartAllReceivers(receivers)). It sends a StartAllReceivers message to the ReceiverTrackerEndpoint object, which handles the message in ReceiverTrackerEndpoint.receive.
ReceiverTracker.ReceiverTrackerEndpoint.receive:
/** RpcEndpoint to receive messages from the receivers. */
private class ReceiverTrackerEndpoint(override val rpcEnv: RpcEnv) extends ThreadSafeRpcEndpoint {
// TODO Remove this thread pool after https://github.com/apache/spark/issues/7385 is merged
private val submitJobThreadPool = ExecutionContext.fromExecutorService(
ThreadUtils.newDaemonCachedThreadPool("submit-job-thread-pool"))
private val walBatchingThreadPool = ExecutionContext.fromExecutorService(
ThreadUtils.newDaemonCachedThreadPool("wal-batching-thread-pool"))
@volatile private var active: Boolean = true
override def receive: PartialFunction[Any, Unit] = {
// Local messages
case StartAllReceivers(receivers) =>
// schedulingPolicy is the scheduling policy,
// receivers are the receivers to be started,
// and getExecutors returns the list of Executors in the cluster.
// scheduleReceivers decides which Executors each receiver may run on.
val scheduledLocations = schedulingPolicy.scheduleReceivers(receivers, getExecutors)
for (receiver <- receivers) {
// Look up, by the receiver's stream id, the Executors on which this Receiver may run.
val executors = scheduledLocations(receiver.streamId)
updateReceiverScheduledExecutors(receiver.streamId, executors)
receiverPreferredLocations(receiver.streamId) = receiver.preferredLocation
// At this point both the Receiver to start and the Executors it may run on are known.
// The loop passes one receiver at a time to startReceiver.
startReceiver(receiver, executors)
}
// Handles the RestartReceiver message, which restarts a Receiver.
case RestartReceiver(receiver) =>
// Old scheduled executors minus the ones that are not active any more
// When a Receiver is first scheduled it gets a list of candidate Executors;
// after a failure, candidates that are no longer active are dropped from that list.
val oldScheduledExecutors = getStoredScheduledExecutors(receiver.streamId)
val scheduledLocations = if (oldScheduledExecutors.nonEmpty) {
// Try global scheduling again
oldScheduledExecutors
} else {
// If the candidate Executors are exhausted, rescheduleReceiver is called to obtain new ones.
val oldReceiverInfo = receiverTrackingInfos(receiver.streamId)
// Clear "scheduledLocations" to indicate we are going to do local scheduling
val newReceiverInfo = oldReceiverInfo.copy(
state = ReceiverState.INACTIVE, scheduledLocations = None)
receiverTrackingInfos(receiver.streamId) = newReceiverInfo
schedulingPolicy.rescheduleReceiver(
receiver.streamId,
receiver.preferredLocation,
receiverTrackingInfos,
getExecutors)
}
// Assume there is one receiver restarting at one time, so we don't need to update
// receiverTrackingInfos
// Call startReceiver again.
startReceiver(receiver, scheduledLocations)
case c: CleanupOldBlocks =>
receiverTrackingInfos.values.flatMap(_.endpoint).foreach(_.send(c))
case UpdateReceiverRateLimit(streamUID, newRate) =>
for (info <- receiverTrackingInfos.get(streamUID); eP <- info.endpoint) {
eP.send(UpdateRateLimit(newRate))
}
// Remote messages
case ReportError(streamId, message, error) =>
reportError(streamId, message, error)
}
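As the comments show, Spark Streaming itself decides which Executors each receiver runs on, instead of leaving the placement to Task scheduling in Spark Core.
Spark launches Receivers through submitJob. An application may have many Receivers, so does one Job launch a single Receiver, or are all the Receivers launched by one Job?
In ReceiverTrackerEndpoint.receive, the first parameter of startReceiver is a single receiver, and the for loop takes the receivers one at a time and calls startReceiver for each. It follows that each Job launches exactly one Receiver.
If a Receiver fails to start, the application is not treated as failed. Instead, a message is sent to the ReceiverTrackerEndpoint to restart that Receiver, which ensures the Receivers do get started. This differs from starting Receivers as ordinary Tasks, where a failure would be bounded by the Task retry limit.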
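The "one job per receiver, resubmit on exit" idea can be modeled with plain Scala futures. This is a sketch under stated assumptions, not Spark's implementation: `runReceivers`, `shouldStart`, and the `job` callback are hypothetical, a thread pool plays the role of job submission, and the latch plays the role of the bookkeeping done when a receiver job finishes for good.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object ReceiverJobSketch {
  // Runs one independent "job" per receiver id. Whenever a job ends, with or
  // without an error, it is submitted again as long as shouldStart() is true.
  // Returns true if every receiver finished before the timeout.
  def runReceivers(ids: Seq[Int], shouldStart: () => Boolean, timeoutMs: Long)(
      job: Int => Unit): Boolean = {
    val pool = Executors.newCachedThreadPool()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    val exitLatch = new CountDownLatch(ids.size)

    def startReceiver(id: Int): Unit =
      if (!shouldStart()) {
        exitLatch.countDown() // this receiver will not be restarted
      } else {
        Future(job(id)).onComplete {
          case Success(_) => startReceiver(id) // job ended: submit a fresh one
          case Failure(_) => startReceiver(id) // job failed: also submit a fresh one
        }
      }

    // One submission per receiver, so a failing receiver never affects the others.
    ids.foreach(startReceiver)
    try exitLatch.await(timeoutMs, TimeUnit.MILLISECONDS)
    finally pool.shutdown() // on timeout, later resubmissions are rejected
  }
}
```

In this model a job that throws twice and then flips the stop condition is attempted exactly three times, and the failures never surface as a failure of the whole run.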
ReceiverTracker.startReceiver:
/**
* Start a receiver along with its scheduled executors
*/
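private def startReceiver(
receiver: Receiver[_],
// scheduledLocations specifies the physical machines on which the receiver should run.
scheduledLocations: Seq[TaskLocation]): Unit = {
// Check whether the tracker is in a state that allows starting the Receiver.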
def shouldStartReceiver: Boolean = {
// It's okay to start when trackerState is Initialized or Started
!(isTrackerStopping || isTrackerStopped)
}
val receiverId = receiver.streamId
if (!shouldStartReceiver) {
// If the Receiver should not be started, onReceiverJobFinish is called instead.
onReceiverJobFinish(receiverId)
return
}
val checkpointDirOption = Option(ssc.checkpointDir)
val serializableHadoopConf =
new SerializableConfiguration(ssc.sparkContext.hadoopConfiguration)
// startReceiverFunc wraps the action of starting the receiver on a worker.
// Function to start the receiver on the worker node
val startReceiverFunc: Iterator[Receiver[_]] => Unit =
(iterator: Iterator[Receiver[_]]) => {
if (!iterator.hasNext) {
throw new SparkException(
"Could not start receiver as object not found.")
}
if (TaskContext.get().attemptNumber() == 0) {
val receiver = iterator.next()
assert(iterator.hasNext == false)
// ReceiverSupervisorImpl supervises the Receiver and also handles tasks such as writing the received data.
val supervisor = new ReceiverSupervisorImpl(
receiver, SparkEnv.get, serializableHadoopConf.value, checkpointDirOption)
supervisor.start()
supervisor.awaitTermination()
} else {
// To restart a receiver, it has to go through the scheduling above again rather than being retried as a Task.
// It's restarted by TaskScheduler, but we want to reschedule it again. So exit it.
}
}
// Create the RDD using the scheduledLocations to run the receiver in a Spark job
val receiverRDD: RDD[Receiver[_]] =
if (scheduledLocations.isEmpty) {
ssc.sc.makeRDD(Seq(receiver), 1)
} else {
val preferredLocations = scheduledLocations.map(_.toString).distinct
ssc.sc.makeRDD(Seq(receiver -> preferredLocations))
}
// As receiverId shows, this RDD holds exactly one receiver.
receiverRDD.setName(s"Receiver $receiverId")
ssc.sparkContext.setJobDescription(s"Streaming job running receiver $receiverId")
ssc.sparkContext.setCallSite(Option(ssc.getStartSite()).getOrElse(Utils.getCallSite()))
// Starting each Receiver triggers its own Job, rather than one Job's Tasks starting all the Receivers.
// An application usually has many Receivers.
// SparkContext.submitJob launches a Spark job just to start this Receiver.
val future = ssc.sparkContext.submitJob[Receiver[_], Unit, Unit](
receiverRDD, startReceiverFunc, Seq(0), (_, _) => Unit, ())
// We will keep restarting the receiver job until ReceiverTracker is stopped
future.onComplete {
case Success(_) =>
if (!shouldStartReceiver) {
onReceiverJobFinish(receiverId)
} else {
logInfo(s"Restarting Receiver $receiverId")
self.send(RestartReceiver(receiver))
}
case Failure(e) =>
if (!shouldStartReceiver) {
onReceiverJobFinish(receiverId)
} else {
logError("Receiver has been stopped. Try to restart it.", e)
logInfo(s"Restarting Receiver $receiverId")
self.send(RestartReceiver(receiver))
}
// Jobs are submitted through a thread pool so that Receivers can be started concurrently.
}(submitJobThreadPool)
logInfo(s"Receiver ${receiver.streamId} started")
}
When a Receiver fails to start, the ReceiverTrackerEndpoint is triggered to launch a new Spark Job that starts the Receiver again.
/**
* This message will trigger ReceiverTrackerEndpoint to restart a Spark job for the receiver.
*/
private[streaming] case class RestartReceiver(receiver: Receiver[_]) extends ReceiverTrackerLocalMessage
// When a Receiver has been shut down, its Spark Job does not need to be restarted.
/**
* Call when a receiver is terminated. It means we won't restart its Spark job.
*/
private def onReceiverJobFinish(receiverId: Int): Unit = {
receiverJobExitLatch.countDown()
// remove(receiverId) drops the receiver from receiverTrackingInfos; foreach then inspects the removed entry, if any.
receiverTrackingInfos.remove(receiverId).foreach { receiverTrackingInfo =>
if (receiverTrackingInfo.state == ReceiverState.ACTIVE) {
logWarning(s"Receiver $receiverId exited but didn't deregister")
}
}
}
Now look back at supervisor.start() in ReceiverTracker.startReceiver. The subclass ReceiverSupervisorImpl does not define a start method, so the call goes to the start method of its parent class, ReceiverSupervisor.
ReceiverSupervisor.start:
/** Start the supervisor */
def start() {
onStart() // The concrete implementation is provided by the subclass.
startReceiver()
}
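The fixed ordering in start() is a template-method arrangement: the parent class decides the sequence and the subclass fills in the hook. The sketch below uses hypothetical classes (`MiniSupervisor`, `MiniSupervisorImpl`) and records events in a list purely so the order can be observed.

```scala
import scala.collection.mutable.ListBuffer

object SupervisorStartSketch {
  // Hypothetical stand-in for the supervisor base class.
  abstract class MiniSupervisor {
    val events: ListBuffer[String] = ListBuffer.empty

    // Hook for subclasses; the default does nothing.
    protected def onStart(): Unit = {}

    protected def startReceiver(): Unit = {
      events += "receiver started"
      ()
    }

    // Fixed order: supervisor-side preparation first, then the receiver itself.
    final def start(): Unit = {
      onStart()
      startReceiver()
    }
  }

  // Hypothetical stand-in for the concrete supervisor.
  class MiniSupervisorImpl extends MiniSupervisor {
    override protected def onStart(): Unit = {
      events += "block generators started"
      ()
    }
  }
}
```

Calling start() on a MiniSupervisorImpl records "block generators started" before "receiver started", which is the ordering the supervisor needs so that data produced by the receiver always has somewhere to go.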
The source of ReceiverSupervisor.onStart is as follows:
/**
* Called when supervisor is started.
* Note that this must be called before the receiver.onStart() is called to ensure
* things like [[BlockGenerator]]s are started before the receiver starts sending data.
*/
protected def onStart() { }
Its concrete implementation is the onStart method of the subclass ReceiverSupervisorImpl:
override protected def onStart() {
registeredBlockGenerators.foreach { _.start() }
}
The _.start() here is BlockGenerator.start:
/** Start block generating and pushing threads. */
def start(): Unit = synchronized {
if (state == Initialized) {
state = Active
blockIntervalTimer.start()
blockPushingThread.start()
logInfo("Started BlockGenerator")
} else {
throw new SparkException(
s"Cannot start BlockGenerator as its not in the Initialized state [state = $state]")
}
}
Now return to the startReceiver() call in ReceiverSupervisor.start.
ReceiverSupervisor.startReceiver:
/** Start receiver */
def startReceiver(): Unit = synchronized {
try {
if (onReceiverStart()) {
logInfo("Starting receiver")
receiverState = Started
receiver.onStart()
logInfo("Called receiver onStart")
} else {
// The driver refused us
stop("Registered unsuccessfully because Driver refused to start receiver " + streamId, None)
}
} catch {
case NonFatal(t) =>
stop("Error starting receiver " + streamId, Some(t))
}
}
Again taking SocketReceiver, a subclass of Receiver, as the example, here is its onStart method:
SocketReceiver.onStart:
def onStart() {
// Start the thread that receives data over a connection
new Thread("Socket Receiver") {
setDaemon(true)
override def run() { receive() }
}.start()
}
The receive method called by onStart belongs to SocketReceiver, which is defined alongside SocketInputDStream, a subclass of ReceiverInputDStream.
SocketReceiver.receive:
/** Create a socket connection and receive data until receiver is stopped */
def receive() {
var socket: Socket = null
try {
logInfo("Connecting to " + host + ":" + port)
socket = new Socket(host, port)
logInfo("Connected to " + host + ":" + port)
val iterator = bytesToObjects(socket.getInputStream())
while(!isStopped && iterator.hasNext) {
store(iterator.next)
}
if (!isStopped()) {
restart("Socket data stream had no more data")
} else {
logInfo("Stopped receiving")
}
} catch {
case e: java.net.ConnectException =>
restart("Error connecting to " + host + ":" + port, e)
case NonFatal(e) =>
logWarning("Error receiving data", e)
restart("Error receiving data", e)
} finally {
if (socket != null) {
socket.close()
logInfo("Closed socket to " + host + ":" + port)
}
}
}
}
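The core of the receive loop can be tried out without a network connection. In the sketch below an in-memory stream stands in for the socket; `bytesToLines`, `isStopped`, and `store` are hypothetical stand-ins for the bytesToObjects converter and the Receiver methods of the same names, and restart handling is left out.

```scala
import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets

object ReceiveLoopSketch {
  // Plays the role of bytesToObjects: turns a byte stream into an iterator of lines.
  def bytesToLines(in: InputStream): Iterator[String] = {
    val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
    Iterator.continually(reader.readLine()).takeWhile(_ != null)
  }

  // Reads records until the source is exhausted or the receiver is stopped,
  // handing each record to `store`; the stream is always closed at the end.
  def receive(in: InputStream, isStopped: () => Boolean, store: String => Unit): Unit = {
    try {
      val iterator = bytesToLines(in)
      while (!isStopped() && iterator.hasNext) {
        store(iterator.next())
      }
    } finally {
      in.close()
    }
  }
}
```

Feeding it a ByteArrayInputStream with three lines and a stop condition that becomes true after two stored records leaves exactly the first two lines in the store.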