Client submits the job:

YarnRunner.submitApplication()
    YarnClientImpl.submitApplication
       ApplicationClientProtocol.submitApplication()  submits to the RM. On the server side, the ResourceManager component ClientRMService implements ApplicationClientProtocol
ClientRMService.submitApplication
    RMAppManager.submitApplication
    Creates the app, adds it to the RMApps map, and fires RMAppEventType.START; the handler for this event is the ApplicationEventDispatcher registered in ResourceManager.
    ApplicationEventDispatcher uses the appId in the event to look up the RMAppImpl in RMApps and passes the event to RMAppImpl.handle, which drives RMAppImpl's state machine.
        RMAppImpl's state machine calls StateStore.storeNewApplication(app)
            fires RMStateStoreAppEvent: RMStateStoreEventType.STORE_APP
            RMStateStore also has a state machine; after saving the appState it fires RMAppEventType.APP_NEW_SAVED
                RMAppImpl's state machine handles it and fires AppAddedSchedulerEvent: SchedulerEventType.APP_ADDED
                The ResourceManager's Scheduler handles SchedulerEventType.APP_ADDED. To keep the main flow simple, this analysis uses FifoScheduler.
                    FifoScheduler fires RMAppEventType.APP_ACCEPTED
                        RMAppImpl's state machine calls app.createAndStartNewAttempt(), which creates an RMAppAttempt (in state NEW) and fires RMAppAttemptEventType.START
                            RMAppAttemptImpl's state machine registers the RMAppAttempt with ApplicationMasterService and fires AppAttemptAddedSchedulerEvent: SchedulerEventType.APP_ATTEMPT_ADDED
                                FifoScheduler creates a FiCaSchedulerApp, sets it as the application's current appAttempt, and fires RMAppAttemptEventType.ATTEMPT_ADDED
                                    RMAppAttemptImpl handles the event and calls scheduler.allocate to obtain an Allocation; allocate calls SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens
                                       SchedulerApplicationAttempt.pullNewlyAllocatedContainersAndNMTokens()
                                       If containers are available, it fires RMContainerEventType.ACQUIRED. The newlyAllocatedContainers of SchedulerApplicationAttempt are filled in when NodeManagers heartbeat to the ResourceManager; this is analyzed in detail later.
                                          RMContainerImpl handles RMContainerEventType.ACQUIRED and fires RMAppRunningOnNodeEvent: RMAppEventType.APP_RUNNING_ON_NODE
                                             RMAppImpl handles RMAppEventType.APP_RUNNING_ON_NODE
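
The trace above is essentially a chain of events routed by a central dispatcher to handlers that each fire the next event. The following is a minimal single-threaded sketch of that pattern, not YARN code: MiniDispatcher, ToyApp, Event and the enums are hypothetical stand-ins, the store and scheduler handlers are reduced to one-line lambdas, and the toy state names are illustrative only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

public class DispatcherSketch {
    // Toy event types: constant names follow the trace, the enums themselves are hypothetical.
    enum AppEventType { START, APP_NEW_SAVED, APP_ACCEPTED }
    enum StoreEventType { STORE_APP }
    enum SchedulerEventType { APP_ADDED }
    enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED }

    /** An event is an enum-typed tag plus the id of the application it targets. */
    static final class Event {
        final Enum<?> type;
        final String appId;
        Event(Enum<?> type, String appId) { this.type = type; this.appId = appId; }
    }

    /** Stand-in for a central dispatcher: one handler per event-type enum class. */
    static final class MiniDispatcher {
        private final Map<Class<?>, Consumer<Event>> handlers = new HashMap<>();
        private final Queue<Event> queue = new ArrayDeque<>();
        final List<String> log = new ArrayList<>();

        void register(Class<? extends Enum<?>> typeClass, Consumer<Event> handler) {
            handlers.put(typeClass, handler);
        }
        void post(Event e) { queue.add(e); }
        /** Delivers queued events in order until no handler posts a new one. */
        void drain() {
            Event e;
            while ((e = queue.poll()) != null) {
                log.add(e.type.name());
                handlers.get(e.type.getDeclaringClass()).accept(e);
            }
        }
    }

    /** Toy application whose handle() advances a tiny state machine and fires the next event. */
    static final class ToyApp {
        final String id;
        final MiniDispatcher dispatcher;
        AppState state = AppState.NEW;
        ToyApp(String id, MiniDispatcher dispatcher) { this.id = id; this.dispatcher = dispatcher; }

        void handle(Event e) {
            switch ((AppEventType) e.type) {
                case START:
                    state = AppState.NEW_SAVING;
                    dispatcher.post(new Event(StoreEventType.STORE_APP, id));
                    break;
                case APP_NEW_SAVED:
                    state = AppState.SUBMITTED;
                    dispatcher.post(new Event(SchedulerEventType.APP_ADDED, id));
                    break;
                case APP_ACCEPTED:
                    state = AppState.ACCEPTED;
                    break;
            }
        }
    }

    /** Submits one toy app and returns the ordered event names plus the final state. */
    static List<String> submit(String appId) {
        Map<String, ToyApp> apps = new HashMap<>();
        MiniDispatcher d = new MiniDispatcher();
        // App events are routed to the app looked up by id, as the trace describes.
        d.register(AppEventType.class, e -> apps.get(e.appId).handle(e));
        // The store "saves" and reports back; the scheduler "accepts" immediately.
        d.register(StoreEventType.class, e -> d.post(new Event(AppEventType.APP_NEW_SAVED, e.appId)));
        d.register(SchedulerEventType.class, e -> d.post(new Event(AppEventType.APP_ACCEPTED, e.appId)));
        apps.put(appId, new ToyApp(appId, d));
        d.post(new Event(AppEventType.START, appId));
        d.drain();
        List<String> trace = new ArrayList<>(d.log);
        trace.add("state=" + apps.get(appId).state);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(submit("app_1"));
    }
}
```

Keying handlers by the enum class of the event type is what lets a single dispatcher serve several independent state machines; the real dispatcher is asynchronous, which this sketch deliberately leaves out.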
                                    
                            
How does the NodeManager send heartbeats to the ResourceManager?
The NodeManager sends them through its NodeStatusUpdaterImpl component.

 NodeStatusUpdaterImpl runs a thread, statusUpdaterRunnable, that periodically sends heartbeats to the ResourceManager over the ResourceTracker protocol.
How the ResourceManager handles NodeManager heartbeats:
ResourceTrackerService implements the ResourceTracker protocol
 ResourceTrackerService.nodeHeartbeat fires RMNodeEventType.STATUS_UPDATE
 The state machine in RMNodeImpl handles RMNodeEventType.STATUS_UPDATE and fires SchedulerEventType.NODE_UPDATE
     Scheduler.nodeUpdate
         completedContainer: fires RMContainerEventType.FINISHED
         assignContainers(node)
             FiCaSchedulerApp.allocate()
                rmContainer = new RMContainerImpl()
                newlyAllocatedContainers.add(rmContainer)
                fires RMContainerEventType.START
                    RMContainerImpl handles RMContainerEventType.START
                        fires RMAppAttemptEventType.CONTAINER_ALLOCATED
                            RMAppAttemptImpl handles RMAppAttemptEventType.CONTAINER_ALLOCATED
                                appAttempt.scheduler.allocate(...)
                                    FiCaSchedulerApp application = getApplicationAttempt(...)
                                    allocation = application.pullNewlyAllocatedContainersAndNMTokens();
                                        for (RMContainer rmContainer : newlyAllocatedContainers)
                                        {
                                            rmContainer.handle(new RMContainerEvent(rmContainer.getContainerId(), RMContainerEventType.ACQUIRED))
                                            fires RMAppEventType.APP_RUNNING_ON_NODE
                                        }
                                 appAttempt.storeAttempt()
                                     rmContext.getStateStore().storeNewApplicationAttempt(this);
                                     fires new RMStateStoreAppAttemptEvent(attemptState): RMStateStoreEventType.STORE_APP_ATTEMPT
                                         RMStateStore handles the event and emits RMAppAttemptEventType.ATTEMPT_NEW_SAVED
                                            RMAppAttemptImpl handles it and calls appAttempt.launchAttempt(), which fires AMLauncherEventType.LAUNCH
                                               ApplicationMasterLauncher handles it: createRunnableLauncher
                                                    AMLauncher.launch(): connect() builds the ContainerManagementProtocol proxy, a ContainerLaunchContext is created, and proxy.startContainers(..) is called
                                                    ContainerManagerImpl implements ContainerManagementProtocol
                                                    fires RMAppAttemptEventType.LAUNCHED
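
The key point of the heartbeat section is that containers are handed out in two steps: a node heartbeat fills newlyAllocatedContainers, and a later allocate call pulls them. A rough sketch of that hand-off follows, assuming a simplified model in which a container is just a string and a node reports a number of free slots; ToyAttempt, allocateOn and pullNewlyAllocated are hypothetical names, not the scheduler's API.

```java
import java.util.ArrayList;
import java.util.List;

public class PullAllocationSketch {
    /** Toy stand-in for a scheduler-side application attempt. */
    static final class ToyAttempt {
        private int pendingRequests;
        private int nextId = 1;
        private final List<String> newlyAllocated = new ArrayList<>();

        ToyAttempt(int pendingRequests) { this.pendingRequests = pendingRequests; }

        /** Heartbeat path: place pending requests on the node while it has free slots. */
        void allocateOn(String node, int freeSlots) {
            while (freeSlots > 0 && pendingRequests > 0) {
                newlyAllocated.add("container_" + nextId++ + "@" + node);
                freeSlots--;
                pendingRequests--;
            }
        }

        /** Allocate path: hand over everything assigned since the last pull, then reset. */
        List<String> pullNewlyAllocated() {
            List<String> pulled = new ArrayList<>(newlyAllocated);
            newlyAllocated.clear();
            return pulled;
        }
    }

    public static void main(String[] args) {
        ToyAttempt attempt = new ToyAttempt(3);
        System.out.println(attempt.pullNewlyAllocated()); // nothing yet: no node has reported in
        attempt.allocateOn("nodeA", 2);
        System.out.println(attempt.pullNewlyAllocated());
        attempt.allocateOn("nodeB", 5);
        System.out.println(attempt.pullNewlyAllocated());
    }
}
```

This is why the first allocate call in the submission flow can come back empty: nothing is assigned until some node has sent a heartbeat.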
                                                    
                                            
                                     
                                     
                                                                     
                                            
                                            
                    
           
         
     
Starting the container:
ContainerManagerImpl implements ContainerManagementProtocol
ContainerManagerImpl.startContainers()
     startContainersInternal
         Application application = new ApplicationImpl(...)
         context.getNMStateStore().storeApplication();
         fires ApplicationInitEvent: ApplicationEventType.INIT_APPLICATION
             ApplicationImpl handles it and fires LogHandlerAppStartedEvent: LogHandlerEventType.APPLICATION_STARTED
                 LogAggregationService handles it (initApp) and fires ApplicationEventType.APPLICATION_LOG_HANDLING_INITED
                     ApplicationImpl handles it and fires LocalizationEventType.INIT_APPLICATION_RESOURCES
                        ResourceLocalizationService.handle(...)
                            handleInitApplicationResources(...)
                                privateRsrc.putIfAbsent(userName, new LocalResourcesTrackerImpl(...))
                                appRsrc.putIfAbsent(appIdStr, new LocalResourcesTrackerImpl(...))
                                fires ApplicationEventType.APPLICATION_INITED
                                    ApplicationImpl handles it:
                                    for (Container container : app.containers.values()) {
                                        app.dispatcher.getEventHandler().handle(new ContainerInitEvent(container.getContainerId()))
                                    }
                                    
                                       ContainerImpl
                                           builds a LocalResourceRequest req
                                           container.dispatcher.getEventHandler().handle(new ContainerLocalizationRequestEvent(container, req))  LocalizationEventType.INIT_CONTAINER_RESOURCES
                                               ResourceLocalizationService.handleInitContainerResources(...)
                                                   for (LocalResourceRequest req : event.getRequestedResources())
                                                   {
                                                       LocalResourcesTracker tracker = getLocalResourcesTracker(...);
                                                       tracker.handle(new ResourceRequestEvent(req, ...)) : ResourceEventType.REQUEST
                                                   }
                                                   
                                                        LocalizedResource handles ResourceEventType.REQUEST
                                                        rsrc.dispatcher.getEventHandler().handle(new LocalizerResourceRequestEvent(...))  LocalizationEventType.REQUEST_RESOURCE_LOCALIZATION
                                                             ResourceLocalizationService handles the event
                                                                if PUBLIC: publicLocalizer.addResource(req);
                                                                if PRIVATE or APPLICATION: privLocalizers.get(locId); if absent, create one: localizer = new LocalizerRunner(...);
                                                                  LocalizerRunner.run(){
                                                                     ContainerExecutor.startLocalizer(...)  the ContainerExecutor implementation here is LinuxContainerExecutor
                                                                     LinuxContainerExecutor.startLocalizer(...)
                                                                     builds a command that invokes ContainerLocalizer.main(String[] args)
                                                                        ContainerLocalizer.runLocalization(...)
                                                                            localizeFiles(...)
                                                                                a while loop periodically calls nodemanager.heartbeat(status) to tell the NodeManager which resources have finished downloading (LocalizationProtocol)
                                                                                threadpool.submit(download(new Path(....)))
                                                                                   download() returns a new FSDownload task
                                                                                      FSDownload.call(...) downloads the file from HDFS to local disk
                                                                                      
                                                                             ResourceLocalizationService implements LocalizationProtocol
                                                                                heartbeat(LocalizerStatus status)
                                                                                    localizerTracker.processHeartbeat(status);
                                                                                       privLocalizers.get(locId).processHeartbeat(status);
                                                                                          for(LocalResourceStatus stat : remoteResourceStatuses) {
                                                                                              case FETCH_SUCCESS: 
                                                                                                  fires new ResourceLocalizedEvent(...)  ResourceEventType.LOCALIZED
                                                                                                      LocalResourcesTrackerImpl handles it
                                                                                                          LocalizedResource rsrc = localrsrc.get(req);
                                                                                                          rsrc.handle(event);
                                                                                                              for(ContainerId container: refs.ref) {
                                                                                                                  rsrc.dispatcher.getEventHandler().handle(new ContainerResourceLocalizedEvent(...))  ContainerEventType.RESOURCE_LOCALIZED
                                                                                                              }
                                                                                                              
                                                                                                                  ContainerImpl handles ContainerEventType.RESOURCE_LOCALIZED
                                                                                                                  LocalizedTransition.transition()
                                                                                                                      List<String> syms = container.pendingResources.remove(rsrcEvent.getResource());
                                                                                                                      once pendingResources is empty, fires LocalizationEventType.CONTAINER_RESOURCES_LOCALIZED
                                                                                                                           ResourceLocalizationService handles LocalizationEventType.CONTAINER_RESOURCES_LOCALIZED
                                                                                                                               ResourceLocalizationService.handleContainerResourcesLocalized(...);
                                                                                                                                   localizerTracker.endContainerLocalization(locId);
                                                                                                                                       privLocalizers.get(locId).endContainerLocalization();
                                                                                                                     container.sendLaunchEvent();
                                                                                                                          ContainersLauncherEventType.LAUNCH_CONTAINER
                                                                                                                          
                                                                                                                             ContainersLauncher handles it:
                                                                                                                               Application app = context.getApplications().get(applicationId);
                                                                                                                               ContainerLaunch launch = new ContainerLaunch(....);
                                                                                                                               containerLauncher.submit(launch); // containerLauncher is a thread pool
                                                                                                                                 ContainerLaunch.call(){
                                                                                                                                     builds the command to run
                                                                                                                                     LinuxContainerExecutor.launchContainer(....);
                                                                                                                                 }
                                                                                                                                   
                                                                                                          
                                                                                          }
                                                                                       
                                                                                   
                                                                             
                                                                     
                                                                  }
                                                                
                                                             
                                                        
                               
                           
         context.getNMStateStore().storeContainer(...);
         fires ApplicationContainerInitEvent(container): ApplicationEventType.INIT_CONTAINER
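
The localization part of the trace boils down to a countdown: the container keeps a set of pending resources, each RESOURCE_LOCALIZED event removes one, and the launch is sent only when none are left. A minimal sketch of that bookkeeping, assuming resources are identified by plain strings; ToyContainer, request and onResourceLocalized are hypothetical names, and the launched flag stands in for sending the launch event.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalizationSketch {
    /** Toy container that launches only after every requested resource is localized. */
    static final class ToyContainer {
        // resource name -> symlink names to create for it once it is on local disk
        private final Map<String, List<String>> pendingResources = new HashMap<>();
        final List<String> links = new ArrayList<>();
        boolean launched = false;

        void request(String resource, String... symlinks) {
            pendingResources.put(resource, Arrays.asList(symlinks));
        }

        /** Called once per localized resource; unknown or repeated names are ignored. */
        void onResourceLocalized(String resource) {
            List<String> syms = pendingResources.remove(resource);
            if (syms == null) {
                return;
            }
            links.addAll(syms);
            if (pendingResources.isEmpty()) {
                launched = true; // stands in for sending the launch event
            }
        }
    }

    public static void main(String[] args) {
        ToyContainer c = new ToyContainer();
        c.request("job.jar", "job.jar");
        c.request("job.xml", "job.xml");
        c.onResourceLocalized("job.jar");
        System.out.println("launched after first resource: " + c.launched);
        c.onResourceLocalized("job.xml");
        System.out.println("launched after last resource: " + c.launched);
    }
}
```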
MRAppMaster.main
    MRAppMaster appMaster = new MRAppMaster(...) // builds an MRAppMaster from applicationAttemptId, containerId, ... taken from the parameters
    JobConf conf = new JobConf(new YarnConfiguration())
    conf.addResource(new Path(job.xml));
    initAndStartAppMaster(appMaster, conf, jobUserName)
        appMaster.init(conf);
        appMaster.start() == appMaster.serviceStart()
           createJob(); // job = new JobImpl(...)
           fires JobEventType.JOB_INIT
           if initialization succeeds, calls startJob() == sends JobEventType.JOB_START
           JobImpl.StartTransition.transition()
               fires new CommitterJobSetupEvent(): CommitterEventType.JOB_SETUP
                  CommitterEventHandler.handleJobSetup(); // sets up the job committer's environment
                     fires JobEventType.JOB_SETUP_COMPLETED
                        JobImpl.SetupCompletedTransition.transition()
                            job.scheduleTasks(job.mapTasks, job.numReduceTasks == 0) // fires new TaskEvent(taskID, TaskEventType.T_SCHEDULE)
                            job.scheduleTasks(job.reduceTasks, true)  // fires new TaskEvent(taskID, TaskEventType.T_SCHEDULE)
                                TaskImpl.InitialScheduleTransition.transition()
                                   task.addAndScheduleAttempt(Avataar.VIRGIN);
                                       TaskAttempt attempt = addAttempt(avataar);
                                       eventHandler.handle(new TaskAttemptEvent(attempt.getID(), TaskAttemptEventType.TA_SCHEDULE));
                                          TaskAttemptImpl.RequestContainerTransition.transition()
                                              fires new ContainerRequestEvent(...): ContainerAllocator.EventType.CONTAINER_REQ
                                                  RMContainerAllocator.handle() adds the event to a queue; a while loop then takes events off the queue and processes them.
                                                      RMContainerAllocator.handleEvent(event);
                                                      for a map: handleMapContainerRequest();
                                                      for a reduce: handleReduceContainerRequest()
                                                          ScheduledRequests.addMap(reqEvent);
                                                             addContainerReq(request);
                                                             RMCommunicator has a ContainerAllocatorThread that periodically calls heartbeat(); RMContainerAllocator is a subclass of RMCommunicator, so the heartbeat() that runs is the one RMContainerAllocator provides
                                                                List<Container> allocatedContainers = getResources(); // fetch containers from the RM
                                                                scheduledRequests.assign(allocatedContainers)
                                                                   assignContainers(allocatedContainers)
                                                                   release containers that could not be assigned
                                                                      assignMapsWithLocality(allocatedContainers);
                                                                         containerAssigned(allocated, assigned)
                                                                            fires new TaskAttemptContainerAssignedEvent: TaskAttemptEventType.TA_ASSIGNED
                                                                               TaskAttemptImpl.ContainerAssignedTransition.transition()
                                                                                   creates a ContainerLaunchContext and fires new ContainerRemoteLaunchEvent(): ContainerLauncher.EventType.CONTAINER_REMOTE_LAUNCH
                                                                                       ContainerLauncherImpl handles it:
                                                                                           Container c = getContainer(event); 
                                                                                           ContainerRemoteLaunchEvent launchEvent = (ContainerRemoteLaunchEvent)event;
                                                                                           c.launch(launchEvent)
                                                                                              ContainerManagementProtocol proxy = getProxy(...);
                                                                                              proxy.startContainers();
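
On the ApplicationMaster side, the allocator queues container requests and, on each heartbeat, matches whatever the RM returned against them, releasing anything it cannot use. A rough sketch of that matching step, assuming first-come-first-served assignment with no locality preference (the trace's assignMapsWithLocality is not modelled); ToyAllocator, handleRequest and its fields are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AllocatorSketch {
    /** Toy allocator: requests wait in a queue until a heartbeat brings containers. */
    static final class ToyAllocator {
        private final Deque<String> scheduledMaps = new ArrayDeque<>();
        final Map<String, String> assigned = new LinkedHashMap<>(); // task attempt -> container
        final List<String> released = new ArrayList<>();

        /** Request path: remember that this task attempt still needs a container. */
        void handleRequest(String taskAttemptId) {
            scheduledMaps.add(taskAttemptId);
        }

        /** Heartbeat path: pair each returned container with the oldest waiting attempt. */
        void heartbeat(List<String> allocatedContainers) {
            for (String container : allocatedContainers) {
                String attempt = scheduledMaps.poll();
                if (attempt == null) {
                    released.add(container); // nobody is waiting, so give it back
                } else {
                    assigned.put(attempt, container);
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyAllocator allocator = new ToyAllocator();
        allocator.handleRequest("attempt_m_0");
        allocator.handleRequest("attempt_m_1");
        allocator.heartbeat(Arrays.asList("c1", "c2", "c3"));
        System.out.println("assigned=" + allocator.assigned + " released=" + allocator.released);
    }
}
```

Each assignment in the real flow then triggers TA_ASSIGNED for the task attempt, which leads to the remote launch through proxy.startContainers().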