1. Cluster Node Plan and Notes

Host            Deployed services                           Notes
172.16.53.179   masterserver,alertserver,zookeeper,mysql    also hosts the database
172.16.53.180   zookeeper                                   ZooKeeper only
172.16.53.181   apiserver,zookeeper                         web UI port: 12345
172.16.53.182   masterserver
172.16.53.183   workerserver
172.16.53.189   workerserver
172.16.53.193   workerserver

2. Installation Notes

Installation package: https://dlcdn.apache.org/dolphinscheduler/1.3.8/apache-dolphinscheduler-1.3.8-bin.tar.gz

  1. Create the dolphinscheduler user (run on every host in the cluster)
    useradd dolphinscheduler
  2. Create the deployment directory and grant the dolphinscheduler user ownership (run on every host in the cluster)
    mkdir -p /opt/dolphinscheduler
    chown -R dolphinscheduler:dolphinscheduler /opt/dolphinscheduler
  3. Configure passwordless sudo (run on every host in the cluster)
    echo 'dolphinscheduler  ALL=(ALL)  NOPASSWD: ALL' >> /etc/sudoers
  4. Configure passwordless SSH for the dolphinscheduler user
    Note: 172.16.53.179 needs passwordless login to every host in the cluster.
    Run the following on 172.16.53.179:
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys

    Append the public key of 172.16.53.179 to ~/.ssh/authorized_keys on every other host.
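Distributing the key to the remaining six hosts can be scripted. A minimal sketch, assuming the node02–node07 aliases configured in /etc/hosts in the next step and that ssh-copy-id is available; it only prints the commands so you can review them before running:

```shell
# Dry-run helper: print the ssh-copy-id command for each remaining node.
# Review the output, then pipe it to sh to actually copy the key.
print_copy_cmds() {
  for h in node02 node03 node04 node05 node06 node07; do
    echo "ssh-copy-id -i ~/.ssh/id_rsa.pub dolphinscheduler@$h"
  done
}

print_copy_cmds        # once it looks right: print_copy_cmds | sh
```

Each ssh-copy-id invocation appends the local public key to the remote user's ~/.ssh/authorized_keys and fixes its permissions.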

  5. Configure hosts mappings (run on every host in the cluster)
    Add the following to /etc/hosts on every host:
    172.16.53.179 node01
    172.16.53.180 node02
    172.16.53.181 node03
    172.16.53.182 node04
    172.16.53.183 node05
    172.16.53.189 node06
    172.16.53.193 node07
  6. Create the task execution directory and grant it 777 permissions (run on every host in the cluster)
    mkdir -p /tmp/dolphinscheduler
    chmod -R 777 /tmp/dolphinscheduler
  7. Create the HDFS resource center directory (on any one host)
    su - hdfs
    hdfs dfs -mkdir /dolphinscheduler
    hdfs dfs -chmod -R 777 /dolphinscheduler
  8. Download and unpack apache-dolphinscheduler-1.3.8-bin.tar.gz
    Run the following as the dolphinscheduler user on 172.16.53.179:
    cd /opt/dolphinscheduler
    wget https://dlcdn.apache.org/dolphinscheduler/1.3.8/apache-dolphinscheduler-1.3.8-bin.tar.gz
    tar -zxvf apache-dolphinscheduler-1.3.8-bin.tar.gz -C /opt/dolphinscheduler
    mv apache-dolphinscheduler-1.3.8-bin dolphinschedulerinstall
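Optionally verify the download before unpacking. Apache publishes a .sha512 digest alongside each release artifact (an assumption; check the download page for the exact URL). A small helper to compare a file against an expected digest:

```shell
# Compare a file's SHA-512 digest against an expected value taken from the
# release's .sha512 file; prints OK or MISMATCH.
verify_sha512() {
  file="$1"
  expected="$2"
  actual=$(sha512sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file"
  fi
}
```

Usage: verify_sha512 apache-dolphinscheduler-1.3.8-bin.tar.gz '<digest from the .sha512 file>'.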

  9. Install the JDK on every node (run as root) and configure environment variables as follows:
    vim /etc/profile.d/jdk.sh

    JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
    JRE_HOME=/usr/java/jdk1.8.0_181-cloudera/jre
    PATH=$PATH:$JAVA_HOME/bin
    export JAVA_HOME JRE_HOME PATH

3. Database Initialization

Note: run everything below as the dolphinscheduler user.

  1. Copy the MySQL driver jar into DolphinScheduler's lib directory first:
    [dolphinscheduler@node01 lib]$ ls mysql-connector-java-5.1.47.jar
    mysql-connector-java-5.1.47.jar
  2. Log in to the database and create the dolphinscheduler database and user:
    mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
    mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';
    mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'localhost' IDENTIFIED BY 'dolphinscheduler';
    mysql> flush privileges;
  3. Edit the datasource.properties configuration:
    vim conf/datasource.properties
    # postgre
    # spring.datasource.driver-class-name=org.postgresql.Driver
    # spring.datasource.url=jdbc:postgresql://localhost:5432/dolphinscheduler
    # mysql
    spring.datasource.driver-class-name=com.mysql.jdbc.Driver
    spring.datasource.url=jdbc:mysql://172.16.53.179:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&allowMultiQueries=true
    spring.datasource.username=dolphinscheduler
    spring.datasource.password=dolphinscheduler
  4. Create the tables and import the seed data:
    sh script/create-dolphinscheduler.sh

    Note: if the script above fails with "/bin/java: No such file or directory", configure the JAVA_HOME and PATH variables in /etc/profile.
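If the schema script cannot reach the database, it helps to confirm you are pointing at the right server. A minimal sketch that parses the host:port back out of datasource.properties (the sed pattern assumes the MySQL URL format shown above):

```shell
# Extract "host:port" from the spring.datasource.url line of a
# datasource.properties file (MySQL JDBC URLs only).
jdbc_host() {
  sed -n 's|^spring\.datasource\.url=jdbc:mysql://\([^/]*\)/.*|\1|p' "$1"
}

# Example: jdbc_host conf/datasource.properties
```

With the host in hand, a quick `mysql -h <host> -u dolphinscheduler -p` confirms connectivity and credentials before re-running the script.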

4. Modify Runtime Parameters

  1. Edit dolphinscheduler_env.sh in the conf/env directory (point each variable at your real environment).
    The values below are from my deployment:
    export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
    export HADOOP_CONF_DIR=/etc/hadoop/conf
    export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
    export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
    export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
    export FLINK_HOME=/opt/flink-1.12.2
    export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$PATH
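A typo in any of these paths tends to surface later as confusing task failures, so it is worth checking that every *_HOME directory actually exists on the host. A minimal sketch; point it at whichever dolphinscheduler_env.sh you just edited:

```shell
# Print every *_HOME path exported in an env file that does not exist
# as a directory on this host.
check_env_homes() {
  envfile="$1"
  grep -o '[A-Z_]*_HOME=[^ ]*' "$envfile" | while IFS='=' read -r name dir; do
    [ -d "$dir" ] || echo "missing: $name=$dir"
  done
}

# Example: check_env_homes conf/env/dolphinscheduler_env.sh
```

Silence means every directory was found; any "missing:" line points at a path to fix before deploying.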
  2. Edit the one-click deployment config file:
    vim conf/config/install_config.conf
    
    # postgresql or mysql
    dbtype="mysql"

    # db address and port
    dbhost="172.16.53.179:3306"

    # db username
    username="dolphinscheduler"

    # db name
    dbname="dolphinscheduler"

    # db password
    password="dolphinscheduler"

    # zk address
    zkQuorum="172.16.53.179:2181,172.16.53.180:2181,172.16.53.181:2181"

    # installation path
    installPath="/opt/dolphinscheduler"

    # deployment user
    deployUser="dolphinscheduler"

    # mail alert settings (left unconfigured by default)
    mailServerHost="smtp.exmail.qq.com"

    # mail server port
    # note: different protocols and encryption methods use different ports; when SSL/TLS is enabled, make sure the port is correct
    mailServerPort="25"

    # sender
    mailSender="xxxxxxxxxx"

    # user
    mailUser="xxxxxxxxxx"

    # sender password
    # note: mailPassword is the email service authorization code, not the email login password
    mailPassword="xxxxxxxxxx"

    # TLS mail protocol support
    starttlsEnable="true"

    # SSL mail protocol support
    # note: only one of TLS and SSL may be true
    sslEnable="false"

    # note: sslTrust is the same as mailServerHost
    sslTrust="smtp.exmail.qq.com"

    # local directory used by running tasks
    dataBasedirPath="/tmp/dolphinscheduler"

    # resource storage type: HDFS, S3, NONE
    resourceStorageType="HDFS"

    # root path for resource uploads; since HDFS also supports the local file system, make sure the local directory exists and is readable/writable
    resourceUploadPath="/dolphinscheduler"

    # if uploaded resources are stored on Hadoop and the NameNode is HA-enabled, copy the Hadoop config files core-site.xml and hdfs-site.xml
    # into the conf directory under the installation path (here /opt/dolphinscheduler/conf) and configure the NameNode cluster name;
    # if the NameNode is not HA, just replace the cluster name with the actual IP or hostname
    defaultFS="hdfs://tfdservice:8020"

    # if resourceStorageType is S3, the following three settings are required; otherwise ignore them
    s3Endpoint="http://192.168.xx.xx:9010"
    s3AccessKey="xxxxxxxxxx"
    s3SecretKey="xxxxxxxxxx"

    # resourcemanager HTTP port
    resourceManagerHttpAddressPort="8088"

    # if resourcemanager HA is enabled, set the HA IPs; for a single resourcemanager, leave this empty
    yarnHaIps="172.16.53.179,172.16.53.180"

    # if ResourceManager is HA, or Yarn is not used, keep the default value; for a single ResourceManager, set the real ResourceManager hostname or IP
    singleYarnIp="yarnIp1"

    # user with permission to create resourceUploadPath
    hdfsRootUser="hdfs"

    # kerberos (disabled by default; no configuration needed)
    kerberosStartUp="false"
    # kdc krb5 config file path
    krb5ConfPath="$installPath/conf/krb5.conf"
    # keytab username
    keytabUserName="hdfs-mycluster@ESZ.COM"
    # username keytab path
    keytabPath="$installPath/conf/hdfs.headless.keytab"
    # kerberos expire time, in hours
    kerberosExpireTime="2"

    # api server port (default)
    apiServerPort="12345"

    # hosts on which to deploy DS services; use localhost for a single machine
    ips="172.16.53.179,172.16.53.181,172.16.53.182,172.16.53.183,172.16.53.189,172.16.53.193"

    # ssh port, default 22
    sshPort="22"

    # hosts running the master service
    masters="172.16.53.179,172.16.53.182"

    # hosts running the worker service, each tagged with the worker group it belongs to; "default" below is the group name
    workers="172.16.53.183:default,172.16.53.189:default,172.16.53.193:default"

    # host running the alert service
    alertServer="172.16.53.179"

    # host running the backend api service
    apiServers="172.16.53.181"

**Important: if you want to upload resources to a Hadoop cluster and the cluster's NameNode has HA configured, enable HDFS-type resource storage and copy core-site.xml and hdfs-site.xml from the Hadoop cluster into /opt/dolphinscheduler/conf; skip this step if the NameNode is not HA.**

5. One-Click Deployment (run as the dolphinscheduler user)

    sh install.sh

After the deployment succeeds, you can inspect the logs; they are all collected under the logs directory:

logs/
    ├── dolphinscheduler-alert-server.log
    ├── dolphinscheduler-master-server.log
    ├── dolphinscheduler-worker-server.log
    ├── dolphinscheduler-api-server.log
    └── dolphinscheduler-logger-server.log
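On each host you can also cross-check the logs against the running JVMs. On a 1.3.8 deployment, jps reports the services under the class names used below (an assumption worth verifying against your own jps output); only the services actually deployed on a given host are expected there:

```shell
# Report any expected DolphinScheduler service missing from jps output.
# Pass in "$(jps)"; ignore reports for services not deployed on this host.
check_ds_services() {
  jps_out="$1"
  for svc in MasterServer WorkerServer ApiApplicationServer AlertServer LoggerServer; do
    echo "$jps_out" | grep -qw "$svc" || echo "not running: $svc"
  done
}

# Example: check_ds_services "$(jps)"
```

Any "not running:" line for a service that should be on that host means its log file above is the first place to look.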

6. Log In to the System

Open the web UI at http://172.16.53.181:12345/dolphinscheduler
Default login user: admin, initial password: dolphinscheduler123

7. Starting and Stopping Services

  1. Stop all cluster services with one command
    sh ./bin/stop-all.sh
  2. Start all cluster services with one command
    sh ./bin/start-all.sh
  3. Start/stop the Master
    sh ./bin/dolphinscheduler-daemon.sh start master-server
    sh ./bin/dolphinscheduler-daemon.sh stop master-server
  4. Start/stop a Worker
    sh ./bin/dolphinscheduler-daemon.sh start worker-server
    sh ./bin/dolphinscheduler-daemon.sh stop worker-server
  5. Start/stop the Api server
    sh ./bin/dolphinscheduler-daemon.sh start api-server
    sh ./bin/dolphinscheduler-daemon.sh stop api-server
  6. Start/stop the Logger
    sh ./bin/dolphinscheduler-daemon.sh start logger-server
    sh ./bin/dolphinscheduler-daemon.sh stop logger-server
  7. Start/stop the Alert server
    sh ./bin/dolphinscheduler-daemon.sh start alert-server
    sh ./bin/dolphinscheduler-daemon.sh stop alert-server