Flink's naming of variables, methods, and the like really is excellent.

Flink's execution graph can be divided into four layers: StreamGraph -> JobGraph -> ExecutionGraph -> physical execution graph.

StreamGraph: the initial graph generated from the code the user writes with the Stream API. It represents the topology of the program.

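JobGraph: the StreamGraph is optimized into a JobGraph, which is the data structure submitted to the JobManager. The main optimization is chaining multiple eligible nodes together into a single node, which reduces the serialization, deserialization, and transfer cost of moving data between nodes.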

ExecutionGraph: the JobManager generates the ExecutionGraph from the JobGraph. The ExecutionGraph is the parallelized version of the JobGraph and the core data structure of the scheduling layer.

Physical execution graph: the "graph" formed once the JobManager has scheduled the Job according to the ExecutionGraph and deployed Tasks on the TaskManagers. It is not a concrete data structure.


StreamGraph is generated on the Client

In the execute method of the StreamExecutionEnvironment class

public JobExecutionResult execute(String jobName) throws Exception {
   ...
       
   return execute(getStreamGraph(jobName));
}
public StreamGraph getStreamGraph(String jobName, boolean clearTransformations) {
   StreamGraph streamGraph = getStreamGraphGenerator().setJobName(jobName).generate();  // generate()
   ...
}

Follow generate into the StreamGraphGenerator class

public StreamGraph generate() {
   ...

   /*TODO transformations is a list that holds the operators from the user code, in order*/
   for (Transformation<?> transformation: transformations) {
      transform(transformation);
   }

   ...
}

transformations is a list that holds the operators from the user code, in the order they were declared

Let's see how operators get added to this list

Take map as an example

Find the map method in the DataStream class

public <R> SingleOutputStreamOperator<R> map(MapFunction<T, R> mapper) {

   ...

   return map(mapper, outType);
}

Follow map until you reach the doTransform method

protected <R> SingleOutputStreamOperator<R> doTransform(
      String operatorName,
      TypeInformation<R> outTypeInfo,
      StreamOperatorFactory<R> operatorFactory) {

   // read the output type of the input Transform to coax out errors about MissingTypeInfo
   transformation.getOutputType();

   OneInputTransformation<T, R> resultTransform = new OneInputTransformation<>(
         this.transformation,
         operatorName,
         operatorFactory,
         outTypeInfo,
         environment.getParallelism());

   @SuppressWarnings({"unchecked", "rawtypes"})
   SingleOutputStreamOperator<R> returnStream = new SingleOutputStreamOperator(environment, resultTransform);

   getExecutionEnvironment().addOperator(resultTransform);

   return returnStream;
}

Follow addOperator into the StreamExecutionEnvironment class; this is where the transformation is added to the list

public void addOperator(Transformation<?> transformation) {
   Preconditions.checkNotNull(transformation, "transformation must not be null.");
   this.transformations.add(transformation);
}
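
The recording step above can be sketched with a toy model. This is not Flink code: MiniEnvironment and MiniStream are hypothetical stand-ins that only show the pattern, where every API call registers a transformation with the environment and generate later walks the list in insertion order.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for StreamExecutionEnvironment: it only records transformations.
class MiniEnvironment {
    private final List<String> transformations = new ArrayList<>();

    // Mirrors addOperator: reject null, then append to the list.
    void addOperator(String transformation) {
        if (transformation == null) {
            throw new NullPointerException("transformation must not be null.");
        }
        transformations.add(transformation);
    }

    // Mirrors the loop in generate(): visit transformations in insertion order.
    List<String> generate() {
        List<String> visited = new ArrayList<>();
        for (String transformation : transformations) {
            visited.add("transform(" + transformation + ")");
        }
        return visited;
    }

    // Hypothetical pipeline with two chained API calls.
    static List<String> demo() {
        MiniEnvironment env = new MiniEnvironment();
        new MiniStream(env).apply("map").apply("filter");
        return env.generate();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Toy stand-in for DataStream: every API call registers itself with the environment.
class MiniStream {
    private final MiniEnvironment env;

    MiniStream(MiniEnvironment env) {
        this.env = env;
    }

    MiniStream apply(String operatorName) {
        env.addOperator(operatorName); // plays the role of the doTransform step
        return new MiniStream(env);
    }
}
```

Nothing is executed while the pipeline is being declared; the list is just a record that the graph generator consumes later.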

Back in generate, follow transform down to the translate method, which decides between batch and stream processing

private Collection<Integer> translate(...) {
   ...

   return shouldExecuteInBatchMode
         ? translator.translateForBatch(transform, context)
         : translator.translateForStreaming(transform, context);
}

Keep going: follow translateForStreaming into the SimpleTransformationTranslator class

public Collection<Integer> translateForStreaming(final T transformation, final Context context) {
   ...

   // distinguishes transformation operators such as map (OneInput) from partitioning operators such as keyBy (partition)
   final Collection<Integer> transformedIds =
         translateForStreamingInternal(transformation, context);
   ...
}

translateForStreamingInternal is an abstract method of the abstract class SimpleTransformationTranslator, and it has several implementing classes


① First, the OneInputTransformationTranslator class

Operators such as map go through OneInputTransformationTranslator

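Keep following until the translateInternal method. This time we only look at what touches the streamGraph object, such as the addXXX calls; the other implementing classes work the same way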

protected Collection<Integer> translateInternal(...) {
   ...
      
   final StreamGraph streamGraph = context.getStreamGraph();
   
   ...

   /*TODO add a StreamNode*/
   streamGraph.addOperator(
      transformationId,
      slotSharingGroup,
      transformation.getCoLocationGroupKey(),
      operatorFactory,
      inputType,
      transformation.getOutputType(),
      transformation.getName());
   ...
       
   /*TODO add a StreamEdge*/
   for (Integer inputId: context.getStreamNodeIds(parentTransformations.get(0))) {
      streamGraph.addEdge(inputId, transformationId, 0);
   }

   ...
}

Follow addOperator down to the addNode method of the StreamGraph class

protected StreamNode addNode(
      Integer vertexID,
      @Nullable String slotSharingGroup,
      @Nullable String coLocationGroup,
      Class<? extends AbstractInvokable> vertexClass,
      StreamOperatorFactory<?> operatorFactory,
      String operatorName) {

   if (streamNodes.containsKey(vertexID)) {
      throw new RuntimeException("Duplicate vertexID " + vertexID);
   }

   StreamNode vertex = new StreamNode(
         vertexID,
         slotSharingGroup,
         coLocationGroup,
         operatorFactory,
         operatorName,
         vertexClass);

   streamNodes.put(vertexID, vertex);

   return vertex;
}

It is worth looking at parts of StreamGraph, StreamNode, and StreamEdge

Some member variables of the StreamGraph class

private String jobName;
private ScheduleMode scheduleMode;
private boolean chaining;

private Map<Integer, StreamNode> streamNodes;
private Set<Integer> sources;
private Set<Integer> sinks;
private Map<Integer, Tuple2<Integer, OutputTag>> virtualSideOutputNodes;
private Map<Integer, Tuple3<Integer, StreamPartitioner<?>, ShuffleMode>> virtualPartitionNodes;

protected Map<Integer, String> vertexIDtoBrokerID;
protected Map<Integer, Long> vertexIDtoLoopTimeout;
private StateBackend stateBackend;
private Set<Tuple2<StreamNode, StreamNode>> iterationSourceSinkPairs;
private InternalTimeServiceManager.Provider timerServiceProvider;

Some member variables and methods of the StreamNode class

private List<StreamEdge> inEdges = new ArrayList<StreamEdge>();
private List<StreamEdge> outEdges = new ArrayList<StreamEdge>();

public void addInEdge(StreamEdge inEdge) {
	...
	inEdges.add(inEdge);
}

public void addOutEdge(StreamEdge outEdge) {
	...
	outEdges.add(outEdge);	
}

Some member variables and methods of the StreamEdge class

private final String edgeId;
private final int sourceId;
private final int targetId;

public StreamEdge(
	StreamNode sourceVertex,
	StreamNode targetVertex,
	int typeNumber,
	StreamPartitioner<?> outputPartitioner,
	OutputTag outputTag,
	ShuffleMode shuffleMode) {

	...
}

public int getSourceId() {
   return sourceId;
}

public int getTargetId() {
   return targetId;
}

Back in the translateInternal method of the AbstractOneInputTransformationTranslator class, follow addEdge down to the addEdgeInternal method of the StreamGraph class

private void addEdgeInternal(Integer upStreamVertexID,
      Integer downStreamVertexID,
      int typeNumber,
      StreamPartitioner<?> partitioner,
      List<String> outputNames,
      OutputTag outputTag,
      ShuffleMode shuffleMode) {

   /*TODO when the upstream is a side output, recurse and pass along the side-output info*/
   if (virtualSideOutputNodes.containsKey(upStreamVertexID)) {
      int virtualId = upStreamVertexID;
      upStreamVertexID = virtualSideOutputNodes.get(virtualId).f0;
      if (outputTag == null) {
         outputTag = virtualSideOutputNodes.get(virtualId).f1;
      }
      addEdgeInternal(upStreamVertexID, downStreamVertexID, typeNumber, partitioner, null, outputTag, shuffleMode);
   } else if (virtualPartitionNodes.containsKey(upStreamVertexID)) {
      /*TODO when the upstream is a partition, recurse and pass along the partitioner info*/
      int virtualId = upStreamVertexID;
      upStreamVertexID = virtualPartitionNodes.get(virtualId).f0;
      if (partitioner == null) {
         partitioner = virtualPartitionNodes.get(virtualId).f1;
      }
      shuffleMode = virtualPartitionNodes.get(virtualId).f2;
      addEdgeInternal(upStreamVertexID, downStreamVertexID, typeNumber, partitioner, outputNames, outputTag, shuffleMode);
   } else {
      /*TODO this is where the StreamEdge is actually built*/
      StreamNode upstreamNode = getStreamNode(upStreamVertexID);
      StreamNode downstreamNode = getStreamNode(downStreamVertexID);

      // If no partitioner was specified and the parallelism of upstream and downstream
      // operator matches use forward partitioning, use rebalance otherwise.
      /*TODO if no partitioner was specified, choose forward or rebalance partitioning*/
      if (partitioner == null && upstreamNode.getParallelism() == downstreamNode.getParallelism()) {
         partitioner = new ForwardPartitioner<Object>();
      } else if (partitioner == null) {
         partitioner = new RebalancePartitioner<Object>();
      }

      // sanity check: forward partitioning requires equal upstream and downstream parallelism
      if (partitioner instanceof ForwardPartitioner) {
         if (upstreamNode.getParallelism() != downstreamNode.getParallelism()) {
            throw new UnsupportedOperationException("Forward partitioning does not allow " +
                  "change of parallelism. Upstream operation: " + upstreamNode + " parallelism: " + upstreamNode.getParallelism() +
                  ", downstream operation: " + downstreamNode + " parallelism: " + downstreamNode.getParallelism() +
                  " You must use another partitioning strategy, such as broadcast, rebalance, shuffle or global.");
         }
      }

      if (shuffleMode == null) {
         shuffleMode = ShuffleMode.UNDEFINED;
      }

      /*TODO create the StreamEdge*/
      StreamEdge edge = new StreamEdge(upstreamNode, downstreamNode, typeNumber,
         partitioner, outputTag, shuffleMode);

      /*TODO add the StreamEdge to the upstream node's out edges and the downstream node's in edges*/
      getStreamNode(edge.getSourceId()).addOutEdge(edge);
      getStreamNode(edge.getTargetId()).addInEdge(edge);
   }
}
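
The two ideas in addEdgeInternal, resolving a virtual upstream node to the real one and picking a default partitioner, can be sketched in isolation. The class below is a simplified toy, not Flink's StreamGraph: partitioners are plain strings, side outputs and shuffle modes are left out, and all names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class EdgeSketch {
    private final Map<Integer, Integer> parallelism = new HashMap<>();
    // virtual id -> original upstream id, and virtual id -> partitioner name
    private final Map<Integer, Integer> virtualOriginal = new HashMap<>();
    private final Map<Integer, String> virtualPartitioner = new HashMap<>();

    void addNode(int id, int nodeParallelism) {
        parallelism.put(id, nodeParallelism);
    }

    void addVirtualPartitionNode(int originalId, int virtualId, String partitioner) {
        virtualOriginal.put(virtualId, originalId);
        virtualPartitioner.put(virtualId, partitioner);
    }

    String addEdge(int upstreamId, int downstreamId, String partitioner) {
        if (virtualOriginal.containsKey(upstreamId)) {
            // Upstream is virtual: recurse on the real node, carrying the partitioner along.
            if (partitioner == null) {
                partitioner = virtualPartitioner.get(upstreamId);
            }
            return addEdge(virtualOriginal.get(upstreamId), downstreamId, partitioner);
        }
        int up = parallelism.get(upstreamId);
        int down = parallelism.get(downstreamId);
        if (partitioner == null) {
            // Default: forward when the parallelism matches, rebalance otherwise.
            partitioner = (up == down) ? "FORWARD" : "REBALANCE";
        }
        if (partitioner.equals("FORWARD") && up != down) {
            throw new UnsupportedOperationException(
                    "Forward partitioning does not allow change of parallelism.");
        }
        return upstreamId + " -" + partitioner + "-> " + downstreamId;
    }

    static String demo() {
        EdgeSketch graph = new EdgeSketch();
        graph.addNode(1, 2); // source
        graph.addNode(2, 2); // map
        graph.addNode(3, 1); // sink
        graph.addVirtualPartitionNode(1, 100, "HASH"); // e.g. a keyBy on the source
        return graph.addEdge(1, 2, null) + "; "
                + graph.addEdge(100, 2, null) + "; "
                + graph.addEdge(2, 3, null);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The edge that starts at virtual id 100 ends up attached to real node 1, which is why a partitioning step never shows up as a node of its own.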

② Next, the PartitionTransformationTranslator class

keyBy goes through this class

private Collection<Integer> translateInternal(...) {
   ...

   for (Integer inputId: context.getStreamNodeIds(input)) {
      /*TODO generate a new virtual id*/
      final int virtualId = Transformation.getNewNodeId();
      /*TODO add a virtual partition node; no StreamNode is created*/
      streamGraph.addVirtualPartitionNode(
            inputId,
            virtualId,
            transformation.getPartitioner(),
            transformation.getShuffleMode());
      ...
   }
   ...
}

Follow addVirtualPartitionNode into the StreamGraph class

public void addVirtualPartitionNode(
      Integer originalId,
      Integer virtualId,
      StreamPartitioner<?> partitioner,
      ShuffleMode shuffleMode) {

   if (virtualPartitionNodes.containsKey(virtualId)) {
      throw new IllegalStateException("Already has virtual partition node with id " + virtualId);
   }

   virtualPartitionNodes.put(virtualId, new Tuple3<>(originalId, partitioner, shuffleMode));
}

③ Next, the SourceTransformationTranslator class

private Collection<Integer> translateInternal(...) {
   ...
       
   streamGraph.addSource(
         transformationId,
         slotSharingGroup,
         transformation.getCoLocationGroupKey(),
         operatorFactory,
         null,
         transformation.getOutputType(),
         "Source: " + transformation.getName());

   ...
}

Follow addSource into the StreamGraph class; this is where the vertex id is added to sources

public <IN, OUT> void addSource(...) {
   ...
   sources.add(vertexID);
}

④ The other implementing classes are much the same

JobGraph is generated on the Client

Converting the StreamGraph into a JobGraph also happens on the Client. It mainly does three things:

⚫ StreamNode is converted into JobVertex.

⚫ StreamEdge is converted into JobEdge.

⚫ An IntermediateDataSet is created between JobEdge and JobVertex to connect them.

It is a tangle, so let's just hold on to the StreamGraph object and see what forms it gets converted into

In the AbstractJobClusterExecutor class

public CompletableFuture<JobClient> execute(...) throws Exception {
    
   /*TODO convert the StreamGraph into a JobGraph*/
   final JobGraph jobGraph = PipelineExecutorUtils.getJobGraph(pipeline, configuration);
   
   ...
}

Follow getJobGraph down to the static createJobGraph method of the StreamingJobGraphGenerator class, which creates a new StreamingJobGraphGenerator object

public static JobGraph createJobGraph(StreamGraph streamGraph, @Nullable JobID jobID) {
	return new StreamingJobGraphGenerator(streamGraph, jobID).createJobGraph();
}

Follow createJobGraph; in this method the streamGraph is first turned into hashes

// the method name traverseStreamGraphAndGenerateHashes means "traverse the stream graph and generate hashes"

private JobGraph createJobGraph() {
   ...

    // Generate deterministic hashes for the nodes in order to identify them across
	// submission iff they didn't change.
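	// traverse the StreamGraph breadth-first and generate a hash id for each StreamNode,
	// so that the hashes come out the same on every submission as long as the topology is unchanged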
	Map<Integer, byte[]> hashes = defaultStreamGraphHasher.traverseStreamGraphAndGenerateHashes(streamGraph);

	// Generate legacy version hashes for backwards compatibility
	List<Map<Integer, byte[]>> legacyHashes = new ArrayList<>(legacyStreamGraphHashers.size());
	for (StreamGraphHasher hasher : legacyStreamGraphHashers) {
		legacyHashes.add(hasher.traverseStreamGraphAndGenerateHashes(streamGraph));
	}
       
   /* TODO the most important function: creates the JobVertex, JobEdge, etc., and chains as many nodes together as possible*/
   setChaining(hashes, legacyHashes);

   /*TODO also serialize each JobVertex's in-edge collection into that JobVertex's StreamConfig (the out-edge collection was already written during setChaining)*/
   setPhysicalEdges();

   /*TODO assign each JobVertex its SlotSharingGroup by group name, and set the CoLocationGroup for the heads and tails of iterations*/
   setSlotSharingAndCoLocation();

   ...
   /*TODO serialize the StreamGraph's ExecutionConfig into the JobGraph's configuration*/
   jobGraph.setExecutionConfig(streamGraph.getExecutionConfig());
   ...
   return jobGraph;
}
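
To see why a hash can stay stable across submissions, here is a minimal sketch under my own assumptions. It is not Flink's hasher, which is more involved. The toy hash of a node depends only on how many nodes were hashed before it and on the hashes of its inputs, never on the node id, so two graphs with the same shape and different ids yield the same sequence of hashes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class HashSketch {
    // outEdges: node id -> downstream node ids. Returns node id -> toy hash, in visit order.
    static Map<Integer, Integer> hashes(Map<Integer, List<Integer>> outEdges, List<Integer> sources) {
        Map<Integer, List<Integer>> inputs = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> entry : outEdges.entrySet()) {
            for (Integer target : entry.getValue()) {
                inputs.computeIfAbsent(target, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        Map<Integer, Integer> result = new LinkedHashMap<>();
        Deque<Integer> queue = new ArrayDeque<>(sources);
        Set<Integer> visited = new HashSet<>(sources);
        while (!queue.isEmpty()) {
            Integer node = queue.poll();
            List<Integer> nodeInputs = inputs.getOrDefault(node, Collections.<Integer>emptyList());
            if (!result.keySet().containsAll(nodeInputs)) {
                // An input is not hashed yet; forget the node so that input re-queues it later.
                visited.remove(node);
                continue;
            }
            int inputMix = 0;
            for (Integer input : nodeInputs) {
                inputMix ^= result.get(input); // order-independent combination of input hashes
            }
            int hash = 31 * (31 * result.size() + 17) + inputMix;
            result.put(node, hash);
            for (Integer target : outEdges.getOrDefault(node, Collections.<Integer>emptyList())) {
                if (visited.add(target)) {
                    queue.add(target);
                }
            }
        }
        return result;
    }

    // Hypothetical diamond-shaped graph: a -> b, a -> c, b -> d, c -> d.
    static Map<Integer, List<Integer>> diamond(int a, int b, int c, int d) {
        Map<Integer, List<Integer>> graph = new HashMap<>();
        graph.put(a, Arrays.asList(b, c));
        graph.put(b, Collections.singletonList(d));
        graph.put(c, Collections.singletonList(d));
        return graph;
    }

    // Same shape, different node ids: the hash sequences should match.
    static boolean demo() {
        List<Integer> first = new ArrayList<>(
                hashes(diamond(1, 2, 3, 4), Collections.singletonList(1)).values());
        List<Integer> second = new ArrayList<>(
                hashes(diamond(7, 8, 9, 10), Collections.singletonList(7)).values());
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```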

Follow setChaining. Here hashes is turned into chainEntryPoints, the values of chainEntryPoints are copied into initialEntryPoints, and the elements are then iterated

private void setChaining(Map<Integer, byte[]> hashes, List<Map<Integer, byte[]>> legacyHashes) {
   // we separate out the sources that run as inputs to another operator (chained inputs)
   // from the sources that needs to run as the main (head) operator.
   final Map<Integer, OperatorChainInfo> chainEntryPoints = buildChainedInputsAndGetHeadInputs(hashes, legacyHashes);
   final Collection<OperatorChainInfo> initialEntryPoints = new ArrayList<>(chainEntryPoints.values());

   // iterate over a copy of the values, because this map gets concurrently modified
   /*TODO build the node chains starting from the sources*/
   for (OperatorChainInfo info : initialEntryPoints) {
      /*TODO build the node chains and return the current node's physical out edges; startNodeId != currentNodeId means currentNode is a child node inside the chain*/
      createChain(
            info.getStartNodeId(),
            1,  // operators start at position 1 because 0 is for chained source inputs
            info,
            chainEntryPoints);
   }
}

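Follow buildChainedInputsAndGetHeadInputs. It returns a Map keyed by source node id whose values are new OperatorChainInfo(sourceNodeId, hashes, legacyHashes, chainedSources, streamGraph), one entry per source that runs as a head operator (sources chained as inputs to another operator are separated out, as the English comment above notes). My understanding is that it splits the original graph into chains by source. The key piece is sourceNodeId: the return value is named "entry points", and sourceNodeId is exactly that entry point. Likewise, when createChain is called recursively below, newChain(nonChainable.getTargetId()) does new OperatorChainInfo(startNodeId, hashes, legacyHashes, chainedSources, streamGraph), only with startNodeId set to nonChainable.getTargetId(), that is, the target node of the non-chainable edge. In other words, it starts a new chain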

private  Map<Integer, OperatorChainInfo> buildChainedInputsAndGetHeadInputs(
      final Map<Integer, byte[]> hashes,
      final List<Map<Integer, byte[]>> legacyHashes) {

   ...

   for (Integer sourceNodeId : streamGraph.getSourceIDs()) {

      ...

      chainEntryPoints.put(
         sourceNodeId,
         new OperatorChainInfo(sourceNodeId, hashes, legacyHashes, chainedSources, streamGraph));
   }

   return chainEntryPoints;
}

First, some member variables of the StreamingJobGraphGenerator class

private final StreamGraph streamGraph;
// id -> JobVertex
private final Map<Integer, JobVertex> jobVertices;
private final JobGraph jobGraph;
// ids of the JobVertex instances already built
private final Collection<Integer> builtVertices;
// physical edges (edges inside a chain excluded), in creation order
private final List<StreamEdge> physicalEdgesInOrder;
// chain info used at deployment to build the OperatorChain, startNodeId -> (currentNodeId -> StreamConfig)
private final Map<Integer, Map<Integer, StreamConfig>> chainedConfigs;
// configuration of every node, id -> StreamConfig
private final Map<Integer, StreamConfig> vertexConfigs;
// name of each node, id -> chainedName
private final Map<Integer, String> chainedNames;

Follow createChain; this is the main method to analyze

private List<StreamEdge> createChain(
      final Integer currentNodeId,
      final int chainIndex,
      final OperatorChainInfo chainInfo,
      final Map<Integer, OperatorChainInfo> chainEntryPoints) {

   Integer startNodeId = chainInfo.getStartNodeId();
   if (!builtVertices.contains(startNodeId)) {
      /*TODO transitive out edges used to generate the final JobEdges; edges inside the chain are not included*/
      List<StreamEdge> transitiveOutEdges = new ArrayList<StreamEdge>();

      List<StreamEdge> chainableOutputs = new ArrayList<StreamEdge>();
      List<StreamEdge> nonChainableOutputs = new ArrayList<StreamEdge>();

      StreamNode currentNode = streamGraph.getStreamNode(currentNodeId);

       // isChainable is analyzed further down
      /*TODO split the current node's out edges into chainable and nonChainable*/
      for (StreamEdge outEdge : currentNode.getOutEdges()) {
         if (isChainable(outEdge, streamGraph)) {
            chainableOutputs.add(outEdge);
         } else {
            nonChainableOutputs.add(outEdge);
         }
      }

       // the two recursive calls are explained further down
      /*TODO recursive call to createChain*/
      for (StreamEdge chainable : chainableOutputs) {
         transitiveOutEdges.addAll(  // the return value is needed here
               createChain(chainable.getTargetId(), chainIndex + 1, chainInfo, chainEntryPoints));
      }

      /*TODO recursive call to createChain*/
      for (StreamEdge nonChainable : nonChainableOutputs) {
         transitiveOutEdges.add(nonChainable);
         createChain(    // the return value is not needed here
               nonChainable.getTargetId(),
               1, // operators start at position 1 because 0 is for chained source inputs
               chainEntryPoints.computeIfAbsent(
                  nonChainable.getTargetId(),
                  (k) -> chainInfo.newChain(nonChainable.getTargetId())),
               chainEntryPoints);
      }

      ...

      /*TODO if the current node is the start node, create the JobVertex and return its StreamConfig; otherwise create an empty StreamConfig first*/
      StreamConfig config = currentNodeId.equals(startNodeId)
            ? createJobVertex(startNodeId, chainInfo)
            : new StreamConfig(new Configuration());

      /*TODO set the JobVertex's StreamConfig, which mostly serializes the StreamNode's configuration into the StreamConfig*/
      setVertexConfig(currentNodeId, config, chainableOutputs, nonChainableOutputs, chainInfo.getChainedSources());

      if (currentNodeId.equals(startNodeId)) {
         /*TODO if this is the start node of a chain, mark it as chain start (a node that is not chained with anything is also marked as chain start)*/
         config.setChainStart();
         config.setChainIndex(chainIndex);
         config.setOperatorName(streamGraph.getStreamNode(currentNodeId).getOperatorName());

         /*TODO connect the current node (headOfChain) to all of its out edges*/
         for (StreamEdge edge : transitiveOutEdges) {
            /*TODO build a JobEdge from the StreamEdge and create the IntermediateDataSet that links the JobVertex and the JobEdge*/
            connect(startNodeId, edge);
         }

         /*TODO write the physical out edges into the config; they are used at deployment*/
         config.setOutEdgesInOrder(transitiveOutEdges);
         /*TODO write the StreamConfig of every child node in the chain into the CHAINED_TASK_CONFIG of the headOfChain node*/
         config.setTransitiveChainedTaskConfigs(chainedConfigs.get(startNodeId));

      } else {
         /*TODO otherwise this is a child node inside the chain*/
         chainedConfigs.computeIfAbsent(startNodeId, k -> new HashMap<Integer, StreamConfig>());

         config.setChainIndex(chainIndex);
         StreamNode node = streamGraph.getStreamNode(currentNodeId);
         config.setOperatorName(node.getOperatorName());
         /*TODO add the current node's StreamConfig to the chain's config map*/
         chainedConfigs.get(startNodeId).put(currentNodeId, config);
      }

      config.setOperatorID(currentOperatorId);

      if (chainableOutputs.isEmpty()) {
         config.setChainEnd();
      }
      /*TODO return the out edges that lead outside the chain*/
      return transitiveOutEdges;

   } else {
      return new ArrayList<>();
   }
}

A. The isChainable method used in the classification step

public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) {
   StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);

   return downStreamVertex.getInEdges().size() == 1
         && isChainableInput(edge, streamGraph);
}

Then follow isChainableInput; this is the core logic that decides whether two nodes can be chained

private static boolean isChainableInput(StreamEdge edge, StreamGraph streamGraph) {
   StreamNode upStreamVertex = streamGraph.getSourceVertex(edge);
   StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);

   if (!(upStreamVertex.isSameSlotSharingGroup(downStreamVertex)
      && areOperatorsChainable(upStreamVertex, downStreamVertex, streamGraph)
      && (edge.getPartitioner() instanceof ForwardPartitioner)
      && edge.getShuffleMode() != ShuffleMode.BATCH
      && upStreamVertex.getParallelism() == downStreamVertex.getParallelism()
      && streamGraph.isChainingEnabled())) {

      return false;
   }

   // check that we do not have a union operation, because unions currently only work
   // through the network/byte-channel stack.
   // we check that by testing that each "type" (which means input position) is used only once
   for (StreamEdge inEdge : downStreamVertex.getInEdges()) {
      if (inEdge != edge && inEdge.getTypeNumber() == edge.getTypeNumber()) {
         return false;
      }
   }
   return true;
}

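The conditions are:

1. The downstream node has exactly one input edge
2. The upstream and downstream operators are in the same slotSharingGroup
3. The upstream and downstream operators can be chained together
4. The partitioner between the upstream and downstream operators is ForwardPartitioner
5. The shuffleMode of the edge is not BATCH
6. The upstream and downstream operators have the same parallelism
7. Chaining is enabled

On top of these, isChainableInput returns false if another input edge of the downstream node uses the same type number, which is the union check in the loop above.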
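
The seven conditions listed above amount to one boolean conjunction. The sketch below flattens an edge and its two endpoints into plain parameters; the parameter names and the string partitioner are assumptions for illustration, not Flink's types, and the union check is left out.

```java
class ChainableSketch {
    // Hypothetical flattened view of one edge and its upstream and downstream nodes.
    static boolean isChainable(
            int downstreamInEdges,
            String upstreamSlotGroup,
            String downstreamSlotGroup,
            boolean operatorsChainable,
            String partitioner,
            boolean batchShuffle,
            int upstreamParallelism,
            int downstreamParallelism,
            boolean chainingEnabled) {
        return downstreamInEdges == 1                              // 1. single input edge
                && upstreamSlotGroup.equals(downstreamSlotGroup)   // 2. same slot sharing group
                && operatorsChainable                              // 3. chaining strategies agree
                && "FORWARD".equals(partitioner)                   // 4. forward partitioner
                && !batchShuffle                                   // 5. shuffle mode is not BATCH
                && upstreamParallelism == downstreamParallelism    // 6. same parallelism
                && chainingEnabled;                                // 7. chaining enabled
    }

    public static void main(String[] args) {
        // map -> filter with default settings: chainable
        System.out.println(isChainable(1, "default", "default", true, "FORWARD", false, 4, 4, true));
        // a keyBy in between switches the partitioner, which breaks the chain
        System.out.println(isChainable(1, "default", "default", true, "HASH", false, 4, 4, true));
    }
}
```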

The code behind condition 3

static boolean areOperatorsChainable(
      StreamNode upStreamVertex,
      StreamNode downStreamVertex,
      StreamGraph streamGraph) {
   StreamOperatorFactory<?> upStreamOperator = upStreamVertex.getOperatorFactory();
   StreamOperatorFactory<?> downStreamOperator = downStreamVertex.getOperatorFactory();
   if (downStreamOperator == null || upStreamOperator == null) {
      return false;
   }

   // yielding operators cannot be chained to legacy sources
   // unfortunately the information that vertices have been chained is not preserved at this point
   if (downStreamOperator instanceof YieldingOperatorFactory &&
         getHeadOperator(upStreamVertex, streamGraph).isLegacySource()) {
      return false;
   }

   // we use switch/case here to make sure this is exhaustive if ever values are added to the
   // ChainingStrategy enum
   boolean isChainable;

   switch (upStreamOperator.getChainingStrategy()) {
      case NEVER:
         isChainable = false;
         break;
      case ALWAYS:
      case HEAD:
      case HEAD_WITH_SOURCES:
         isChainable = true;
         break;
      default:
         throw new RuntimeException("Unknown chaining strategy: " + upStreamOperator.getChainingStrategy());
   }

   switch (downStreamOperator.getChainingStrategy()) {
      case NEVER:
      case HEAD:
         isChainable = false;
         break;
      case ALWAYS:
         // keep the value from upstream
         break;
      case HEAD_WITH_SOURCES:
         // only if upstream is a source
         isChainable &= (upStreamOperator instanceof SourceOperatorFactory);
         break;
      default:
         throw new RuntimeException("Unknown chaining strategy: " + upStreamOperator.getChainingStrategy());
   }

   return isChainable;
}
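
The two switch statements boil down to a small decision table. Here is a condensed sketch with a local enum standing in for ChainingStrategy; the null-factory check and the yielding-operator check from the method above are deliberately left out, and upstreamIsSource is a hypothetical flag replacing the instanceof test.

```java
class StrategySketch {
    enum Strategy { ALWAYS, NEVER, HEAD, HEAD_WITH_SOURCES }

    static boolean chainable(Strategy upstream, Strategy downstream, boolean upstreamIsSource) {
        // Upstream side: only NEVER forbids chaining something after it.
        boolean upstreamAllows = upstream != Strategy.NEVER;
        switch (downstream) {
            case ALWAYS:
                return upstreamAllows; // keep the value from upstream
            case HEAD_WITH_SOURCES:
                return upstreamAllows && upstreamIsSource; // only behind a source
            case NEVER:
            case HEAD:
            default:
                return false; // a HEAD operator always starts its own chain
        }
    }

    public static void main(String[] args) {
        for (Strategy up : Strategy.values()) {
            for (Strategy down : Strategy.values()) {
                System.out.println(up + " -> " + down + ": " + chainable(up, down, false));
            }
        }
    }
}
```

Read this way, HEAD means "may start a chain but never joins one", which is why it appears on the true side for the upstream switch and on the false side for the downstream switch.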

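B. The recursive calls. Each recursive call adds zero or more edges to transitiveOutEdges, and these are always non-chainable edges, that is, edges that leave the current chain. For example, if every vertex after some vertex (startNode) in the StreamGraph can be chained, the method ends up returning an empty ArrayList, and the current vertex has no need to build an IntermediateDataSet or a JobEdge. If a run of chainable vertices is followed by one that cannot be chained, the return value contains the non-chainable edge in front of that vertex, and the current vertex (startNode) has to build an IntermediateDataSet and a JobEdge for it. So the return values are only fully collected once the recursion reaches the end of the graph or hits a non-chainable edge. The StreamConfig of each child node in the chain is put into the chain's chainedConfigs. I think that is the idea, but I am not experienced enough to be sure.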

for (StreamEdge chainable : chainableOutputs) {
   transitiveOutEdges.addAll(   // the return value is needed here
         createChain(chainable.getTargetId(), chainIndex + 1, chainInfo, chainEntryPoints));
}

for (StreamEdge nonChainable : nonChainableOutputs) {
   transitiveOutEdges.add(nonChainable); 
   createChain(...);     // the return value is not needed here
}
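
To check this reading of the recursion, here is a toy version that keeps only the control flow. It is a sketch under simplifying assumptions: each edge carries a precomputed chainable flag, StreamConfig handling is dropped, and chains and jobEdges are hypothetical fields used to make the result visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class ChainSketch {
    static final class Edge {
        final int source;
        final int target;
        final boolean chainable;

        Edge(int source, int target, boolean chainable) {
            this.source = source;
            this.target = target;
            this.chainable = chainable;
        }
    }

    private final Map<Integer, List<Edge>> outEdges = new HashMap<>();
    private final Set<Integer> builtVertices = new HashSet<>();
    final Map<Integer, List<Integer>> chains = new TreeMap<>(); // head id -> chain members
    final List<String> jobEdges = new ArrayList<>();

    void addEdge(int source, int target, boolean chainable) {
        outEdges.computeIfAbsent(source, k -> new ArrayList<>())
                .add(new Edge(source, target, chainable));
    }

    List<Edge> createChain(int currentNodeId, int startNodeId) {
        if (builtVertices.contains(startNodeId)) {
            return new ArrayList<>();
        }
        List<Edge> transitiveOutEdges = new ArrayList<>();
        List<Edge> edges = outEdges.getOrDefault(currentNodeId, Collections.<Edge>emptyList());
        for (Edge edge : edges) {
            if (edge.chainable) {
                // Same chain: whatever leaves the chain further down also leaves it here.
                transitiveOutEdges.addAll(createChain(edge.target, startNodeId));
            }
        }
        for (Edge edge : edges) {
            if (!edge.chainable) {
                // The edge itself leaves the chain; its target starts a new chain.
                transitiveOutEdges.add(edge);
                createChain(edge.target, edge.target);
            }
        }
        chains.computeIfAbsent(startNodeId, k -> new ArrayList<>()).add(0, currentNodeId);
        if (currentNodeId == startNodeId) {
            // Only the head of the chain becomes a vertex and gets connected.
            builtVertices.add(startNodeId);
            for (Edge edge : transitiveOutEdges) {
                jobEdges.add(startNodeId + "->" + edge.target);
            }
        }
        return transitiveOutEdges;
    }

    // Hypothetical graph: 1 -> 2 -> 3 chainable, 3 -> 4 not chainable, 4 -> 5 chainable.
    static String demo() {
        ChainSketch sketch = new ChainSketch();
        sketch.addEdge(1, 2, true);
        sketch.addEdge(2, 3, true);
        sketch.addEdge(3, 4, false);
        sketch.addEdge(4, 5, true);
        sketch.createChain(1, 1);
        return sketch.chains + " " + sketch.jobEdges;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Five stream nodes collapse into two chains, and the single non-chainable edge 3 -> 4 surfaces at the head of the first chain as the job edge 1 -> 4.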

With the logic clear, the rest is simple. To see how the IntermediateDataSet and JobEdge are built, follow the connect method

private void connect(Integer headOfChain, StreamEdge edge) {
    
   Integer downStreamVertexID = edge.getTargetId();

   JobVertex headVertex = jobVertices.get(headOfChain);
   JobVertex downStreamVertex = jobVertices.get(downStreamVertexID);

   ...

   JobEdge jobEdge;
   if (isPointwisePartitioner(partitioner)) {
      jobEdge = downStreamVertex.connectNewDataSetAsInput(
         headVertex,
         DistributionPattern.POINTWISE,
         resultPartitionType);
   } else {
      jobEdge = downStreamVertex.connectNewDataSetAsInput(
            headVertex,
            DistributionPattern.ALL_TO_ALL,
            resultPartitionType);
   }
   // set strategy name so that web interface can show it.
   jobEdge.setShipStrategyName(partitioner.toString());
   jobEdge.setDownstreamSubtaskStateMapper(partitioner.getDownstreamSubtaskStateMapper());
   jobEdge.setUpstreamSubtaskStateMapper(partitioner.getUpstreamSubtaskStateMapper());

   ...
}

Follow connectNewDataSetAsInput into the JobVertex class

public JobEdge connectNewDataSetAsInput(
      JobVertex input,
      DistributionPattern distPattern,
      ResultPartitionType partitionType) {

   IntermediateDataSet dataSet = input.createAndAddResultDataSet(partitionType);

   JobEdge edge = new JobEdge(dataSet, this, distPattern);
   this.inputs.add(edge);
   dataSet.addConsumer(edge);
   return edge;
}

Then follow createAndAddResultDataSet

public IntermediateDataSet createAndAddResultDataSet(ResultPartitionType partitionType) {
   return createAndAddResultDataSet(new IntermediateDataSetID(), partitionType);
}

Then follow the overloaded createAndAddResultDataSet

public IntermediateDataSet createAndAddResultDataSet(
      IntermediateDataSetID id,
      ResultPartitionType partitionType) {

   IntermediateDataSet result = new IntermediateDataSet(id, partitionType, this);
   this.results.add(result);
   return result;
}
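
The wiring done by these three methods is easier to see with the bookkeeping stripped away. The classes below are toy stand-ins (Vertex, DataSet, and Edge are names chosen for this sketch, and the distribution pattern is a plain string). The point is the direction of the links: the producer owns the data set, the consumer owns the edge, and the edge points back at the data set.

```java
import java.util.ArrayList;
import java.util.List;

class JobGraphSketch {
    static final class Vertex {
        final String name;
        final List<DataSet> results = new ArrayList<>(); // produced data sets
        final List<Edge> inputs = new ArrayList<>();     // incoming edges

        Vertex(String name) {
            this.name = name;
        }

        // Called on the downstream vertex, with the upstream vertex as argument.
        Edge connectNewDataSetAsInput(Vertex input, String distributionPattern) {
            DataSet dataSet = new DataSet(input);
            input.results.add(dataSet);
            Edge edge = new Edge(dataSet, this, distributionPattern);
            this.inputs.add(edge);
            dataSet.consumers.add(edge);
            return edge;
        }
    }

    static final class DataSet {
        final Vertex producer;
        final List<Edge> consumers = new ArrayList<>();

        DataSet(Vertex producer) {
            this.producer = producer;
        }
    }

    static final class Edge {
        final DataSet source;
        final Vertex target;
        final String distributionPattern;

        Edge(DataSet source, Vertex target, String distributionPattern) {
            this.source = source;
            this.target = target;
            this.distributionPattern = distributionPattern;
        }
    }

    static String demo() {
        Vertex head = new Vertex("source-chain");
        Vertex sink = new Vertex("sink");
        Edge edge = sink.connectNewDataSetAsInput(head, "ALL_TO_ALL");
        return edge.source.producer.name + " -> [data set] -> " + edge.target.name
                + ", results=" + head.results.size() + ", inputs=" + sink.inputs.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```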

ExecutionGraph is generated in the JobManager (DefaultScheduler)

This happens when the JobMaster is created, at the new JobMaster call covered in the previous post

public JobMaster createJobMasterService(
      JobGraph jobGraph,
      ...) throws Exception {

   return new JobMaster(
      ...
       
      jobGraph,
       
      ...);
}

Go into the JobMaster constructor (in the JobMaster class) and look for

/*TODO create the scheduler, DefaultScheduler; creating it converts the JobGraph into the ExecutionGraph*/
this.schedulerNG = createScheduler(executionDeploymentTracker, jobManagerJobMetricGroup);

Keep following until the constructor of the SchedulerBase class and look for

this.executionGraph = createAndRestoreExecutionGraph(jobManagerJobMetricGroup, checkNotNull(shuffleMaster), checkNotNull(partitionTracker), checkNotNull(executionDeploymentTracker), initializationTimestamp);

Keep following until the buildGraph method of the ExecutionGraphBuilder class

public static ExecutionGraph buildGraph(
		@Nullable ExecutionGraph prior,
		JobGraph jobGraph,
		...)... { 
    
    ...

	// create a new execution graph, if none exists so far
    final ExecutionGraph executionGraph;

    // if no execution graph exists yet, create a new one
    executionGraph = (prior != null) ? prior : new ExecutionGraph(...);

    ...

    // topologically sort the job vertices and attach the graph to the existing one
    /*TODO topologically sort the JobGraph to get the list of all JobVertex instances*/
    List<JobVertex> sortedTopology = jobGraph.getVerticesSortedTopologicallyFromSources();
    
    ...
        
    /*TODO core logic: attach the topologically sorted JobGraph to the executionGraph data structure*/
    executionGraph.attachJobGraph(sortedTopology);  
    
    ...
}

Follow attachJobGraph into the ExecutionGraph class

public void attachJobGraph(List<JobVertex> topologiallySorted) throws JobException {

   ...
    // iterate over the JobVertex list
	for (JobVertex jobVertex : topologiallySorted) {

          // create the execution job vertex and attach it to the graph
          /*TODO instantiate the execution graph node: create the corresponding ExecutionJobVertex for each JobVertex*/
          ExecutionJobVertex ejv = new ExecutionJobVertex(
                this,
                jobVertex,
                1,
                maxPriorAttemptsHistoryLength,
                rpcTimeout,
                globalModVersion,
                createTimestamp);

          /*TODO core logic: connect the new ExecutionJobVertex to its upstream IntermediateResults*/
          ejv.connectToPredecessors(this.intermediateResults);

          for (IntermediateResult res : ejv.getProducedDataSets()) {
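              // intermediateResults starts out empty and this is the only place anything is added to it. I missed this at first and spent ages wondering how the previous step could read from an empty map, since I thought the IntermediateDataSets all lived in the JobEdges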
                IntermediateResult previousDataSet = this.intermediateResults.putIfAbsent(res.getId(), res);
                ...
          }

          this.verticesInCreationOrder.add(ejv);
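          // the total vertex count grows by the current node's parallelism, because the execution graph is the parallelized version of the job graph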
          this.numVerticesTotal += ejv.getParallelism();
          /*TODO add the current execution graph node to the graph*/
          newExecJobVertices.add(ejv);
       ...
    }
}

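① Go into the ExecutionJobVertex constructor. It instantiates one ExecutionVertex per unit of parallelism and one IntermediateResult per IntermediateDataSet. There is also code that checks things like whether the parallelism exceeds the maximum parallelism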

...

int vertexParallelism = jobVertex.getParallelism();
int numTaskVertices = vertexParallelism > 0 ? vertexParallelism : defaultParallelism;
...
this.taskVertices = new ExecutionVertex[numTaskVertices];
...

    
// create the intermediate results
this.producedDataSets = new IntermediateResult[jobVertex.getNumberOfProducedIntermediateDataSets()];
for (int i = 0; i < jobVertex.getProducedDataSets().size(); i++) {
	final IntermediateDataSet result = jobVertex.getProducedDataSets().get(i);  // the IntermediateDataSet that follows the vertex
	this.producedDataSets[i] = new IntermediateResult(
			result.getId(),
			this,
			numTaskVertices,
			result.getResultType());
}

// create all task vertices
for (int i = 0; i < numTaskVertices; i++) {
	ExecutionVertex vertex = new ExecutionVertex(
			this,
			i,
			producedDataSets,
			timeout,
			initialGlobalModVersion,
			createTimestamp,
			maxPriorAttemptsHistoryLength);

	this.taskVertices[i] = vertex;
}

taskVertices is the array of ExecutionVertex

private final ExecutionVertex[] taskVertices;

The ExecutionVertex constructor runs the line below; the array is filled in by this class's connectSource method

this.inputEdges = new ExecutionEdge[jobVertex.getJobVertex().getInputs().size()][];

The IntermediateResult constructor runs the line below

this.partitions = new IntermediateResultPartition[numParallelProducers];  // numParallelProducers is the numTaskVertices passed in the call above

The array is filled in by the ExecutionVertex constructor

for (IntermediateResult result : producedDataSets) {
   IntermediateResultPartition irp = new IntermediateResultPartition(result, this, subTaskIndex);
   result.setPartition(subTaskIndex, irp);

   resultPartitions.put(irp.getPartitionId(), irp);
}

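Untangling all that: instantiating an ExecutionJobVertex instantiates one ExecutionVertex per unit of parallelism and one IntermediateResult per IntermediateDataSet. Instantiating an IntermediateResult allocates the IntermediateResultPartition[] array, which is filled in as each ExecutionVertex is instantiated. Instantiating an ExecutionVertex also allocates the two-dimensional ExecutionEdge[][] array, which is filled in by that class's connectSource method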
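
A quick way to sanity-check the counts is to mimic the two loops with plain arrays. This is only a counting sketch with made-up names: strings stand in for ExecutionVertex and IntermediateResultPartition, and nothing here is Flink API.

```java
class ExpansionSketch {
    // Returns {ExecutionVertex count, IntermediateResult count, filled partition slots}.
    static int[] expand(int parallelism, int producedDataSets) {
        // One partition array per IntermediateResult, sized by the producer's parallelism.
        String[][] partitions = new String[producedDataSets][parallelism];
        String[] taskVertices = new String[parallelism];
        int filled = 0;
        for (int subtask = 0; subtask < parallelism; subtask++) {
            taskVertices[subtask] = "ExecutionVertex#" + subtask;
            // Each ExecutionVertex fills its own slot in every result it produces.
            for (int result = 0; result < producedDataSets; result++) {
                partitions[result][subtask] = "result" + result + "/partition" + subtask;
                filled++;
            }
        }
        return new int[] {taskVertices.length, partitions.length, filled};
    }

    public static void main(String[] args) {
        int[] counts = expand(3, 2);
        System.out.println(counts[0] + " vertices, " + counts[1] + " results, "
                + counts[2] + " partitions");
    }
}
```

With parallelism 3 and 2 produced data sets, that gives 3 vertices, 2 results, and 6 partitions, one partition per (result, subtask) pair.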

② Follow connectToPredecessors into the ExecutionJobVertex class

public void connectToPredecessors(Map<IntermediateDataSetID, IntermediateResult> intermediateDataSets) throws JobException {

   /* TODO get the list of input JobEdges */
   List<JobEdge> inputs = jobVertex.getInputs();

   ...

   // iterate over each JobEdge
   for (int num = 0; num < inputs.size(); num++) {
      JobEdge edge = inputs.get(num);

      ...

      // fetch the intermediate result via ID. if it does not exist, then it either has not been created, or the order
      // in which this method is called for the job vertices is not a topological order
      /*TODO look up, by ID, the IntermediateResult that feeds the current JobEdge*/
      IntermediateResult ires = intermediateDataSets.get(edge.getSourceId());
      ...

      /*TODO add the IntermediateResult to the inputs of the current ExecutionJobVertex*/
      this.inputs.add(ires);

      /*TODO register a consumer on the IntermediateResult, namely the current node*/
      int consumerIndex = ires.registerConsumer();

      // each parallel subtask has its own vertex, so every one of them must be connected to the upstream intermediate result
      for (int i = 0; i < parallelism; i++) {
         ExecutionVertex ev = taskVertices[i];
         /*TODO connect the ExecutionVertex to the IntermediateResult*/
         ev.connectSource(num, ires, edge, consumerIndex);
      }
   }
}

Follow connectSource into the ExecutionVertex class

public void connectSource(int inputNumber, IntermediateResult source, JobEdge edge, int consumerNumber) {

   // the pattern is POINTWISE only for pointwise partitioners (forward and rescale, see isPointwisePartitioner in connect); otherwise it is ALL_TO_ALL
   final DistributionPattern pattern = edge.getDistributionPattern();
   final IntermediateResultPartition[] sourcePartitions = source.getPartitions();

   ExecutionEdge[] edges;

   switch (pattern) {
      case POINTWISE:
         edges = connectPointwise(sourcePartitions, inputNumber);
         break;

      case ALL_TO_ALL:
         edges = connectAllToAll(sourcePartitions, inputNumber);
         break;

      default:
         throw new RuntimeException("Unrecognized distribution pattern.");

   }

   inputEdges[inputNumber] = edges;

   // add the consumers to the source
   // for now (until the receiver initiated handshake is in place), we need to register the
   // edges as the execution graph
   /*TODO add the consumer to the IntermediateResultPartition, i.e. link it to the ExecutionEdge (a consumer was already registered on the IntermediateResult earlier)*/
   for (ExecutionEdge ee : edges) {
      ee.getSource().addConsumer(ee, consumerNumber);
   }
}

Follow the connectAllToAll method

private ExecutionEdge[] connectAllToAll(IntermediateResultPartition[] sourcePartitions, int inputNumber) {
   ExecutionEdge[] edges = new ExecutionEdge[sourcePartitions.length];

   for (int i = 0; i < sourcePartitions.length; i++) {
      IntermediateResultPartition irp = sourcePartitions[i];
      edges[i] = new ExecutionEdge(irp, this, inputNumber);
   }

   return edges;
}
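
To compare the two distribution patterns by edge count, here is a small sketch. connectAllToAll follows the loop above with partition indices in place of ExecutionEdge objects. connectPointwise is my own simplification that only covers equal upstream and downstream parallelism; the real method is not shown in this walkthrough and handles the unequal cases as well.

```java
class ConnectSketch {
    // Indices of the source partitions that one consumer subtask reads: all of them.
    static int[] connectAllToAll(int sourcePartitions) {
        int[] edges = new int[sourcePartitions];
        for (int i = 0; i < sourcePartitions; i++) {
            edges[i] = i;
        }
        return edges;
    }

    // Simplified pointwise case: with equal parallelism, subtask i reads partition i only.
    static int[] connectPointwise(int sourcePartitions, int consumers, int subtaskIndex) {
        if (sourcePartitions != consumers) {
            throw new IllegalArgumentException("this sketch only covers equal parallelism");
        }
        return new int[] {subtaskIndex};
    }

    // Total number of execution edges across all consumer subtasks.
    static int totalEdges(boolean pointwise, int sourcePartitions, int consumers) {
        int total = 0;
        for (int subtask = 0; subtask < consumers; subtask++) {
            total += pointwise
                    ? connectPointwise(sourcePartitions, consumers, subtask).length
                    : connectAllToAll(sourcePartitions).length;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("all-to-all 3x2: " + totalEdges(false, 3, 2));
        System.out.println("pointwise 3x3: " + totalEdges(true, 3, 3));
    }
}
```

All-to-all grows with the product of the two parallelisms, while the pointwise case in this sketch grows linearly.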