Contents of this article:

1. Introduction
2. StreamPark Project Import and Debugging
|____Step 1: Prepare the materials
|____Step 2: Import the project
|____Step 3: Configure and package
|____Step 4: Start and debug
3. Demo (Create a Job and Release It)
|____Step 1: Download the Flink distribution and start a cluster
|____Step 2: Configure the Flink version and cluster
|____Step 3: Configure the job and release it
4. Source Code Analysis
|____4.1 Job Start Endpoint (ApplicationController#start)
|____4.2 Job Start Method (ApplicationActionService#start)
|____4.3 Job Start Implementation (FlinkClientTrait#submit)
|____4.4 Source Code Summary
5. Closing

1. Introduction

StreamPark official website:

https://streampark.apache.org/

Real-time is the future. StreamPark is a remarkable piece of software that makes stream processing easier, and more and more users and even enterprises are adopting it, so as developers it is well worth learning and using.

In this article I'll take you from zero: we will run StreamPark locally and debug it. The code we debug is mainly the typical "job start" endpoint, to see how StreamPark submits different jobs to different clusters.

2. StreamPark Project Import and Debugging

To import and debug the StreamPark project, go through the following steps:

  • Step 1: Prepare the materials
  • Step 2: Import the project
  • Step 3: Configure and package
  • Step 4: Start and debug

Step 1: Prepare the Materials

Prepare the following first:

[Screenshot: list of required software]

I assume you already have all of the above installed, so I won't go into the details here; search Baidu or Google for installation instructions if needed.

Step 2: Import the Project

Clone the project locally with git (if you cannot reach GitHub, find your own workaround). The command is:

git clone https://github.com/apache/incubator-streampark.git

In IDEA, import the project via the menu "File -> Open...":

[Screenshot]

After the import you will see that the project has automatically switched to the dev branch by default. This article uses that branch, so there is no need to switch to any other one.

Step 3: Configure and Package

Open the configuration file /incubator-streampark/streampark-console/streampark-console-service/src/main/resources/application.yml; the following two settings need attention:

① You may change the default database here; we keep the default H2 database (H2 is an in-memory database, so data is lost after a restart), and using the default is recommended.

② Most importantly, set the streampark.workspace.local configuration, otherwise the application will not start.

[Screenshot]
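For reference, the relevant part of application.yml looks roughly like this; the key name comes straight from the dotted setting above, while the path is only an example, so point it at any writable local directory:

streampark:
  workspace:
    local: /opt/streampark_workspace   # example only; use any writable local path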

We could package with the build.sh script in the project root, but it may fail (I won't paste the failure here). Instead, use the most common approach, our locally configured Maven: in the project root, run the mvn package command (this builds both the front end and the back end):

mvn -Pshaded,webapp,dist -DskipTests clean package

[Screenshot]

OK, after a successful build you should see something like this:

[Screenshot]

Step 4: Start and Debug

In the project root you will see an automatically created dist directory that contains the complete distribution archive. Extract it, as shown below:

[Screenshot]

Copy the full path of the directory extracted above. Before starting, we need to configure the startup parameters: open the StreamParkConsoleBootstrap class and edit its run configuration as shown below:

[Screenshot]

Then click to edit the configuration:

[Screenshot]

Select "Add VM Options":

[Screenshot]

Add the startup parameter in the format "-Dapp.home=${full path of the extracted distribution}", as shown below:

[Screenshot]

Next we can start the project; start the front end first and then the back end.

Open the /incubator-streampark/streampark-console/streampark-console-webapp/package.json file; you will see a run button there, so just click it:

[Screenshot]

Once it has started successfully, you can see in the console:

[Screenshot]

Then start the back end by running StreamParkConsoleBootstrap in debug mode, resume from any breakpoints, and open the home page at http://localhost:10001 in a browser. As expected, the StreamPark login page appears (the login credentials are admin/streampark):

[Screenshot]

From here we can set breakpoints and debug whatever source code we want to look at.

3. Demo (Create a Job and Release It)

To create and run a simple job, there are a few steps:

  • Step 1: Download the Flink distribution and start a cluster
  • Step 2: Configure the Flink version and the Flink cluster
  • Step 3: Configure the Flink job and release it

Step 1: Download the Flink Distribution and Start a Cluster


Flink download page (this article uses flink-1.13.6-bin-scala_2.12.tgz as the example): https://flink.apache.org/downloads/

[Screenshot]

Extract it, go into the bin directory, and start the Flink cluster:

./start-cluster.sh

Once it has started, open http://localhost:8081 in a browser; you can see the cluster is up:

[Screenshot]

Next, log in to StreamPark to configure the Flink version and cluster.

Step 2: Configure the Flink Version and Cluster


After logging in to StreamPark (credentials: admin/streampark), choose the Flink version entry in the left-hand menu and add one:

[Screenshot]

Once that is added, go on to add a cluster:

[Screenshot]

After the cluster is added, StreamPark automatically refreshes its status for us:

[Screenshot]


Step 3: Configure the Job and Release It


I'm going to create a job that runs the TopSpeedWindowing.jar example shipped with the Flink distribution (examples/streaming/TopSpeedWindowing.jar).

First, upload the jar in StreamPark's resource management page:

[Screenshot]


Then create a new job with the following configuration:

[Screenshot]


After saving, click Release:

[Screenshot]


Once it is released, click Start:

[Screenshot]


Once it is running, the main page gives a clear view of the job's resource usage:

[Screenshot]


Meanwhile, the Flink cluster UI also shows that our job has been submitted and is running:

[Screenshot]


At this point we have successfully run a job; next, let's look at the source code behind it.

4. Source Code Analysis

From the previous steps we know that after "Release" we click "Start" to submit the job to the Flink cluster. So what happens in the code when we click "Start"? Let's read through it.

4.1 Job Start Endpoint (ApplicationController#start)


Using the browser dev tools (F12), we can see that clicking Start on the job calls the endpoint "/basic-api/flink/app/start":

[Screenshot]


Tracing the request, we find it enters the ApplicationController#start method (comments added to the code):

/**
 * Job start endpoint.
 *
 * @author : YangLinWei
 * @createTime: 2024/1/12 22:55
 * @version: 1.0.0
 */
@PostMapping(value = "start")
@RequiresPermissions("app:start")
public RestResponse start(@Parameter(hidden = true) Application app) {
  try {
    // delegate the start to ApplicationActionService
    applicationActionService.start(app, false);
    return RestResponse.success(true);
  } catch (Exception e) {
    return RestResponse.success(false).message(e.getMessage());
  }
}

4.2 Job Start Method (ApplicationActionService#start)


Let's follow it into the ApplicationActionService#start method (comments added to the code):

/**
 * Start the job.
 * 
 * @author : YangLinWei
 * @createTime: 2024/1/12 22:59
 * @version: 1.0.0
 */
@Override
public void start(Application appParam, boolean auto) throws Exception {


    // validation: the application must exist and must not be started repeatedly
    final Application application = getById(appParam.getId());
    Utils.requireNotNull(application);
    ApiAlertException.throwIfTrue(
        !application.isCanBeStart(), "[StreamPark] The application cannot be started repeatedly.");


    // validation for Remote mode and Session mode
    if (FlinkExecutionMode.isRemoteMode(application.getFlinkExecutionMode())
        || FlinkExecutionMode.isSessionMode(application.getFlinkExecutionMode())) {
        checkBeforeStart(application);
    }


    // validation: the same job name must not already be running in the YARN queue
    if (FlinkExecutionMode.isYarnMode(application.getFlinkExecutionMode())) {


        ApiAlertException.throwIfTrue(
            !applicationInfoService.getYarnAppReport(application.getJobName()).isEmpty(),
            "[StreamPark] The same task name is already running in the yarn queue");
    }


    // validation: the job must already have been released (built)
    AppBuildPipeline buildPipeline = appBuildPipeService.getById(application.getId());
    Utils.requireNotNull(buildPipeline);

    // validation: the Flink version (flink env) configuration must not be empty
    FlinkEnv flinkEnv = flinkEnvService.getByIdOrDefault(application.getVersionId());


    ApiAlertException.throwIfNull(flinkEnv, "[StreamPark] can no found flink version");


    // decide from the parameter whether this is a restart, and update the application's restart count
    if (!auto) {
        application.setRestartCount(0);
    } else {
        if (!application.isNeedRestartOnFailed()) {
            return;
        }
        appParam.setSavePointed(true);
        application.setRestartCount(application.getRestartCount() + 1);
    }


    // update the job state to "starting"
    starting(application);


    String jobId = new JobID().toHexString();


    // initialize the application log
    ApplicationLog applicationLog = new ApplicationLog();
    applicationLog.setOptionName(OperationEnum.START.getValue());
    applicationLog.setAppId(application.getId());
    applicationLog.setOptionTime(new Date());
    applicationLog.setUserId(commonService.getUserId());


    // make the latest application configuration effective (the latest version under version control)
    applicationManageService.toEffective(application);


    Map<String, Object> extraParameter = new HashMap<>(0);
    if (application.isFlinkSqlJob()) { // for a Flink SQL job, fetch the latest effective FlinkSQL and replace the global variables
        FlinkSql flinkSql = flinkSqlService.getEffective(application.getId(), true);
        // Get the sql of the replaced placeholder
        String realSql = variableService.replaceVariable(application.getTeamId(), flinkSql.getSql());
        flinkSql.setSql(DeflaterUtils.zipString(realSql));
        extraParameter.put(ConfigKeys.KEY_FLINK_SQL(null), flinkSql.getSql());
    }


    // build the Kubernetes submit parameters (this feels a bit out of place here)
    KubernetesSubmitParam kubernetesSubmitParam =
        KubernetesSubmitParam.apply(
            application.getClusterId(),
            application.getK8sName(),
            application.getK8sNamespace(),
            application.getFlinkImage(),
            application.getK8sRestExposedTypeEnum(),
            flinkK8sDataTypeConverter.genDefaultFlinkDeploymentIngressDef());


    // get the user jar and application configuration used to start the job
    Tuple2<String, String> userJarAndAppConf = getUserJarAndAppConf(flinkEnv, application);
    String flinkUserJar = userJarAndAppConf.f0;
    String appConf = userJarAndAppConf.f1;


    BuildResult buildResult = buildPipeline.getBuildResult();
    if (FlinkExecutionMode.YARN_APPLICATION == application.getFlinkExecutionMode()) {
        buildResult = new ShadedBuildResponse(null, flinkUserJar, true);
    }


    // Get the args after placeholder replacement
    String applicationArgs =
        variableService.replaceVariable(application.getTeamId(), application.getArgs());


    // build the submit request
    SubmitRequest submitRequest =
        new SubmitRequest(
            flinkEnv.getFlinkVersion(),
            FlinkExecutionMode.of(application.getExecutionMode()),
            getProperties(application),
            flinkEnv.getFlinkConf(),
            FlinkDevelopmentMode.of(application.getJobType()),
            application.getId(),
            jobId,
            application.getJobName(),
            appConf,
            application.getApplicationType(),
            getSavePointed(appParam),
            appParam.getRestoreMode() == null
                ? null
                : FlinkRestoreMode.of(appParam.getRestoreMode()),
            applicationArgs,
            application.getHadoopUser(),
            buildResult,
            kubernetesSubmitParam,
            extraParameter);


    // start the submission
    CompletableFuture<SubmitResponse> future =
        CompletableFuture.supplyAsync(() -> FlinkClient.submit(submitRequest), executorService);


    startFutureMap.put(application.getId(), future);


    // wait for the result asynchronously
    future.whenComplete(
        (response, throwable) -> {
            // 1) remove Future
            startFutureMap.remove(application.getId());


            // 2) exception
            if (throwable != null) {
                String exception = ExceptionUtils.stringifyException(throwable);
                applicationLog.setException(exception);
                applicationLog.setSuccess(false);
                applicationLogService.save(applicationLog);
                if (throwable instanceof CancellationException) {
                    doStopped(application);
                } else {
                    Application app = getById(appParam.getId());
                    app.setState(FlinkAppStateEnum.FAILED.getValue());
                    app.setOptionState(OptionStateEnum.NONE.getValue());
                    updateById(app);
                    if (isKubernetesApp(app)) {
                        k8SFlinkTrackMonitor.unWatching(toTrackId(app));
                    } else {
                        FlinkAppHttpWatcher.unWatching(appParam.getId());
                    }
                }
                return;
            }


            // 3) success
            applicationLog.setSuccess(true);
            if (response.flinkConfig() != null) {
                String jmMemory = response.flinkConfig().get(ConfigKeys.KEY_FLINK_JM_PROCESS_MEMORY());
                if (jmMemory != null) {
                    application.setJmMemory(MemorySize.parse(jmMemory).getMebiBytes());
                }
                String tmMemory = response.flinkConfig().get(ConfigKeys.KEY_FLINK_TM_PROCESS_MEMORY());
                if (tmMemory != null) {
                    application.setTmMemory(MemorySize.parse(tmMemory).getMebiBytes());
                }
            }
            application.setAppId(response.clusterId());
            if (StringUtils.isNoneEmpty(response.jobId())) {
                application.setJobId(response.jobId());
            }


            if (StringUtils.isNoneEmpty(response.jobManagerUrl())) {
                application.setJobManagerUrl(response.jobManagerUrl());
                applicationLog.setJobManagerUrl(response.jobManagerUrl());
            }
            applicationLog.setYarnAppId(response.clusterId());
            application.setStartTime(new Date());
            application.setEndTime(null);


            // if start completed, will be added task to tracking queue
            if (isKubernetesApp(application)) {
                application.setRelease(ReleaseStateEnum.DONE.get());
                k8SFlinkTrackMonitor.doWatching(toTrackId(application));
                if (FlinkExecutionMode.isKubernetesApplicationMode(application.getExecutionMode())) {
                    String domainName = settingService.getIngressModeDefault();
                    if (StringUtils.isNotBlank(domainName)) {
                        try {
                            IngressController.configureIngress(
                                domainName, application.getClusterId(), application.getK8sNamespace());
                        } catch (KubernetesClientException e) {
                            log.info("Failed to create ingress, stack info:{}", e.getMessage());
                            applicationLog.setException(e.getMessage());
                            applicationLog.setSuccess(false);
                            applicationLogService.save(applicationLog);
                            application.setState(FlinkAppStateEnum.FAILED.getValue());
                            application.setOptionState(OptionStateEnum.NONE.getValue());
                        }
                    }
                }
            } else {
                FlinkAppHttpWatcher.setOptionState(appParam.getId(), OptionStateEnum.STARTING);
                FlinkAppHttpWatcher.doWatching(application);
            }
            // update app
            updateById(application);
            // save log
            applicationLogService.save(applicationLog);
        });
}

That's quite a lot of code, but it essentially does "validate" → "assemble the submit parameters" → "submit the job" → "wait asynchronously for the submit result". The part we mainly care about is the actual submission, i.e. FlinkClient.submit(submitRequest), and this is also where we cross into the wonderful world of Scala. The code is as follows (comments added):

/**
 * Submit the job.
 * @param submitRequest the submit request
 * 
 * @author : YangLinWei
 * @createTime: 2024/1/12 22:59
 * @version: 1.0.0
 */
 def submit(submitRequest: SubmitRequest): SubmitResponse = {
   proxy[SubmitResponse](submitRequest, submitRequest.flinkVersion, SUBMIT_REQUEST)
 }


 /**
 * Invoke the submit method through a proxy.
 *  
 * 
 * @author : YangLinWei
 * @createTime: 2024/1/12 22:59
 * @version: 1.0.0
 */
 private[this] def proxy[T: ClassTag](
                                       request: Object,
                                       flinkVersion: FlinkVersion,
                                       requestBody: (String, String)): T = {
   flinkVersion.checkVersion()
   FlinkShimsProxy.proxy(
     flinkVersion,
     (classLoader: ClassLoader) => {
       val submitClass = classLoader.loadClass(FLINK_CLIENT_ENDPOINT_CLASS)
       val requestClass = classLoader.loadClass(requestBody._1)
       val method = submitClass.getDeclaredMethod(requestBody._2, requestClass)
       method.setAccessible(true)
        // invoke the submit method reflectively
       val obj = method.invoke(null, FlinkShimsProxy.getObject(classLoader, request))
       if (obj == null) null.asInstanceOf[T]
       else {
         FlinkShimsProxy.getObject[T](this.getClass.getClassLoader, obj)
       }
     }
   )
 }

Stepping through with the debugger, we can see that the code uses a proxy plus reflection, and is about to call the submit method of the FlinkClientEndpoint class:

[Screenshot]
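The mechanism itself is just "load a class through a dedicated class loader, look up a method, and invoke it reflectively". Here is a minimal, self-contained Scala sketch of that idea; all names are invented for illustration and are not StreamPark's:

object ReflectiveInvoker {

  // Load `className` with the given class loader, look up a static method that
  // takes a single String argument, and invoke it reflectively (the receiver is
  // null because the target method is static).
  def invokeStatic(
      classLoader: ClassLoader,
      className: String,
      methodName: String,
      arg: String): AnyRef = {
    val clazz = classLoader.loadClass(className)
    val method = clazz.getDeclaredMethod(methodName, classOf[String])
    method.setAccessible(true)
    method.invoke(null, arg)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in target: java.lang.System#getProperty(String), resolved and called via reflection.
    val javaVersion = invokeStatic(getClass.getClassLoader, "java.lang.System", "getProperty", "java.version")
    println(javaVersion)
  }
}

FlinkShimsProxy does the same kind of thing with a class loader built for the selected Flink version, which is presumably how one StreamPark instance can submit to several different Flink versions.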

4.3 Job Start Implementation (FlinkClientTrait#submit)


Entering the submit method of the FlinkClientEndpoint class, we can see that this class first initializes a submit client for each execution mode; before submitting, it picks the matching client and delegates the submission to it (a simple use of the strategy pattern):

[Screenshot]
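The dispatch is essentially a lookup table from execution mode to a client implementation. A minimal Scala sketch of that strategy-style dispatch, with all types and names invented for illustration (they are not StreamPark's):

// Strategy-style dispatch by execution mode; every name here is illustrative only.
sealed trait ExecMode
case object Remote extends ExecMode
case object YarnApplication extends ExecMode

trait SubmitClient {
  def submit(jobName: String): String
}

object RemoteSubmitClient extends SubmitClient {
  def submit(jobName: String): String = s"submitted $jobName to a standalone cluster"
}

object YarnSubmitClient extends SubmitClient {
  def submit(jobName: String): String = s"submitted $jobName to YARN"
}

object ClientEndpoint {
  // The "strategy table": each execution mode maps to its own client.
  private val clients: Map[ExecMode, SubmitClient] =
    Map(Remote -> RemoteSubmitClient, YarnApplication -> YarnSubmitClient)

  def submit(mode: ExecMode, jobName: String): String =
    clients.get(mode) match {
      case Some(client) => client.submit(jobName)
      case None => throw new UnsupportedOperationException(s"Unsupported execution mode: $mode")
    }
}

object StrategyDemo extends App {
  // Remote resolves to RemoteSubmitClient, mirroring what we see in the debugger.
  println(ClientEndpoint.submit(Remote, "TopSpeedWindowing"))
}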


We already know the execution mode here is Remote, so the corresponding client is RemoteClient. That class extends FlinkClientTrait, and the submit method itself lives in FlinkClientTrait:

/**
 * Submit the job.
 * @author : YangLinWei
 * @createTime: 2024/1/12 22:59
 * 
 * @param submitRequest the submit request
 */
@throws[Exception]
def submit(submitRequest: SubmitRequest): SubmitResponse = {
  logInfo(
    s"""
       |--------------------------------------- flink job start ---------------------------------------
       |    userFlinkHome    : ${submitRequest.flinkVersion.flinkHome}
       |    flinkVersion     : ${submitRequest.flinkVersion.version}
       |    appName          : ${submitRequest.appName}
       |    devMode          : ${submitRequest.developmentMode.name()}
       |    execMode         : ${submitRequest.executionMode.name()}
       |    k8sNamespace     : ${submitRequest.k8sSubmitParam.kubernetesNamespace}
       |    flinkExposedType : ${submitRequest.k8sSubmitParam.flinkRestExposedType}
       |    clusterId        : ${submitRequest.k8sSubmitParam.clusterId}
       |    applicationType  : ${submitRequest.applicationType.getName}
       |    savePoint        : ${submitRequest.savePoint}
       |    properties       : ${submitRequest.properties.mkString(" ")}
       |    args             : ${submitRequest.args}
       |    appConf          : ${submitRequest.appConf}
       |    flinkBuildResult : ${submitRequest.buildResult}
       |-------------------------------------------------------------------------------------------
       |""".stripMargin)


  val (commandLine, flinkConfig) = getCommandLineAndFlinkConfig(submitRequest)
  // set the Flink configuration according to the job type
  submitRequest.developmentMode match {
    case FlinkDevelopmentMode.PYFLINK =>
      val flinkOptPath: String = System.getenv(ConfigConstants.ENV_FLINK_OPT_DIR)
      if (StringUtils.isBlank(flinkOptPath)) {
        logWarn(s"Get environment variable ${ConfigConstants.ENV_FLINK_OPT_DIR} fail")
        val flinkHome = submitRequest.flinkVersion.flinkHome
        SystemPropertyUtils.setEnv(ConfigConstants.ENV_FLINK_OPT_DIR, s"$flinkHome/opt")
        logInfo(
          s"Set temporary environment variables ${ConfigConstants.ENV_FLINK_OPT_DIR} = $flinkHome/opt")
      }
    case _ =>
      if (submitRequest.userJarFile != null) {
        val uri = PackagedProgramUtils.resolveURI(submitRequest.userJarFile.getAbsolutePath)
        val programOptions = ProgramOptions.create(commandLine)
        val executionParameters = ExecutionConfigAccessor.fromProgramOptions(
          programOptions,
          Collections.singletonList(uri.toString))
        executionParameters.applyToConfiguration(flinkConfig)
      }
  }


  // set some common Flink configuration (everything below populates the Flink config)
  flinkConfig
    .safeSet(PipelineOptions.NAME, submitRequest.effectiveAppName)
    .safeSet(DeploymentOptions.TARGET, submitRequest.executionMode.getName)
    .safeSet(SavepointConfigOptions.SAVEPOINT_PATH, submitRequest.savePoint)
    .safeSet(ApplicationConfiguration.APPLICATION_MAIN_CLASS, submitRequest.appMain)
    .safeSet(ApplicationConfiguration.APPLICATION_ARGS, extractProgramArgs(submitRequest))
    .safeSet(PipelineOptionsInternal.PIPELINE_FIXED_JOB_ID, submitRequest.jobId)


  if (
    !submitRequest.properties.containsKey(CheckpointingOptions.MAX_RETAINED_CHECKPOINTS.key())
  ) {
    val flinkDefaultConfiguration = getFlinkDefaultConfiguration(
      submitRequest.flinkVersion.flinkHome)
    // state.checkpoints.num-retained
    val retainedOption = CheckpointingOptions.MAX_RETAINED_CHECKPOINTS
    flinkConfig.safeSet(retainedOption, flinkDefaultConfiguration.get(retainedOption))
  }


  // set the savepoint parameters
  if (submitRequest.savePoint != null) {
    flinkConfig.safeSet(SavepointConfigOptions.SAVEPOINT_PATH, submitRequest.savePoint)
    flinkConfig.setBoolean(
      SavepointConfigOptions.SAVEPOINT_IGNORE_UNCLAIMED_STATE,
      submitRequest.allowNonRestoredState)
    if (
      submitRequest.flinkVersion.checkVersion(
        FlinkRestoreMode.SINCE_FLINK_VERSION) && submitRequest.restoreMode != null
    ) {
      flinkConfig.setString(FlinkRestoreMode.RESTORE_MODE, submitRequest.restoreMode.getName);
    }
  }


  // set JVM-related options
  if (MapUtils.isNotEmpty(submitRequest.properties)) {
    submitRequest.properties.foreach(
      x =>
        javaEnvOpts.find(_.key == x._1.trim) match {
          case Some(p) => flinkConfig.set(p, x._2.toString)
          case _ =>
        })
  }


  // subclasses can also contribute their own Flink configuration here
  setConfig(submitRequest, flinkConfig)


  // the subclass performs the actual job submission
  doSubmit(submitRequest, flinkConfig)


}
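The submit method above is a template method: common preparation lives in the base trait, and the actual submission is deferred to subclasses. A minimal Scala sketch of that shape, with all names invented for illustration (they are not StreamPark's):

// Template-method structure: the base trait owns the skeleton, subclasses fill in the hooks.
trait SubmitClientTrait {

  // The template method: shared configuration first, then the subclass-specific submit.
  final def submit(jobName: String): String = {
    val config = buildCommonConfig(jobName) // step shared by every client
    doSubmit(jobName, config)               // step deferred to the concrete client
  }

  private def buildCommonConfig(jobName: String): Map[String, String] =
    Map("pipeline.name" -> jobName, "execution.target" -> target)

  // Hooks that concrete clients must provide.
  protected def target: String
  protected def doSubmit(jobName: String, config: Map[String, String]): String
}

object RemoteLikeClient extends SubmitClientTrait {
  protected def target: String = "remote"
  protected def doSubmit(jobName: String, config: Map[String, String]): String =
    s"submitted $jobName with ${config.size} config entries to a standalone cluster"
}

object TemplateDemo extends App {
  println(RemoteLikeClient.submit("TopSpeedWindowing"))
}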

From the code above we can see that the FlinkClientTrait base class mainly prepares the Flink configuration; the final step is doSubmit, where the subclass performs the actual submission (the template method pattern). Let's see how the RemoteClient class actually implements doSubmit:

/**
 * Perform the submission.
 * @author : YangLinWei
 * @createTime: 2024/1/12 22:59
 * 
 * @param submitRequest the submit request
 * @param flinkConfig the Flink configuration for the submission
 */
override def doSubmit(
    submitRequest: SubmitRequest,
    flinkConfig: Configuration): SubmitResponse = {
  // submit via jobGraphSubmit or via the REST API (restApiSubmit)
  super.trySubmit(submitRequest, flinkConfig)(jobGraphSubmit, restApiSubmit)
}


/**
  * Submit via a JobGraph.
  * 
  * @author : YangLinWei
  * @createTime: 2024/1/12 22:59
  * @param submitRequest the submit request
  * @param flinkConfig the Flink configuration
  */
 @throws[Exception]
 def jobGraphSubmit(submitRequest: SubmitRequest, flinkConfig: Configuration): SubmitResponse = {
   var clusterDescriptor: StandaloneClusterDescriptor = null;
   var packageProgram: PackagedProgram = null
   var client: ClusterClient[StandaloneClusterId] = null
   try {
     val standAloneDescriptor = getStandAloneClusterDescriptor(flinkConfig)
     clusterDescriptor = standAloneDescriptor._2
     // build JobGraph
     val programJobGraph = super.getJobGraph(submitRequest, flinkConfig)
     packageProgram = programJobGraph._1
     val jobGraph = programJobGraph._2
     client = clusterDescriptor.retrieve(standAloneDescriptor._1).getClusterClient
     val jobId = client.submitJob(jobGraph).get().toString
     logInfo(
       s"${submitRequest.executionMode} mode submit by jobGraph, WebInterfaceURL ${client.getWebInterfaceURL}, jobId: $jobId")
     val result = SubmitResponse(null, flinkConfig.toMap, jobId, client.getWebInterfaceURL)
     result
   } catch {
     case e: Exception =>
       logError(s"${submitRequest.executionMode} mode submit by jobGraph fail.")
       e.printStackTrace()
       throw e
   } finally {
     if (submitRequest.safePackageProgram) {
       Utils.close(packageProgram)
     }
     Utils.close(client, clusterDescriptor)
   }
 } 


/**
  * Submit via the REST API.
  * 
  * @author : YangLinWei
  * @createTime: 2024/1/12 22:59
  * @param submitRequest the submit request
  * @param flinkConfig the Flink configuration
  */
@throws[Exception]
 def restApiSubmit(submitRequest: SubmitRequest, flinkConfig: Configuration): SubmitResponse = {
   // retrieve standalone session cluster and submit flink job on session mode
   var clusterDescriptor: StandaloneClusterDescriptor = null;
   var client: ClusterClient[StandaloneClusterId] = null
   Try {
     val standAloneDescriptor = getStandAloneClusterDescriptor(flinkConfig)
     val yarnClusterId: StandaloneClusterId = standAloneDescriptor._1
     clusterDescriptor = standAloneDescriptor._2


     client = clusterDescriptor.retrieve(yarnClusterId).getClusterClient
     val jobId =
       FlinkSessionSubmitHelper.submitViaRestApi(
         client.getWebInterfaceURL,
         submitRequest.userJarFile,
         flinkConfig)
     logInfo(
       s"${submitRequest.executionMode} mode submit by restApi, WebInterfaceURL ${client.getWebInterfaceURL}, jobId: $jobId")
     SubmitResponse(null, flinkConfig.toMap, jobId, client.getWebInterfaceURL)
   } match {
     case Success(s) => s
     case Failure(e) =>
       logError(s"${submitRequest.executionMode} mode submit by restApi fail.")
       throw e
   }
 }

StreamPark then takes the submission result and updates the job state, records the log, and so on. That concludes the source code walkthrough.

4.4 Source Code Summary


From the source analysis above, the main flow of running a job in StreamPark and its core classes can be summarized as follows:

  • ApplicationController: the job start endpoint (the entry point);
  • ApplicationActionService: "validate the job" → "assemble the submit parameters" → "submit the job" → "wait asynchronously for the submit result";
  • FlinkClientTrait: the template base class for submit clients; its submit method assembles the Flink parameters to be submitted and, via the template method pattern, leaves the actual submission to its subclasses;
  • RemoteClient: extends the FlinkClientTrait template class and submits the job either via a JobGraph or via the REST API.

Besides RemoteClient, the classes shown below also encapsulate the submission logic for their respective modes:

[Screenshot]


5. Closing

That wraps it up: starting from scratch, we successfully started and debugged StreamPark locally, walked through the common operations, and analyzed its core source code.

I hope this helps, and I hope more and more people get to know StreamPark. Thanks for reading, and that's the end of this article!

This article is reposted from A-Gan's personal blog: https://mp.weixin.qq.com/s/b4GACZ9-mXp0TgPMy2729g
