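Fully Offline Installation of CDH 5.13.1
In a recent project the data center had no access to external networks, so the only option was an offline deployment.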
Installation Preparation
- Installation environment
OS | vCPU | Memory/GB | System disk/GB | Data disk/GB | JDK | CDH | Presto |
CentOS-7.4 | 16 | 32 | 60 | 500 | jdk1.8.0_261 | 5.13.1 | 0.216 |
- Cluster plan
IP | Hostname | Role |
10.0.0.1 | cdh01 | master |
10.0.0.2 | cdh02 | core |
10.0.0.3 | cdh03 | core |
10.0.0.4 | cdh04 | core |
10.0.0.5 | cdh05 | core |
10.0.0.6 | cdh06 | core |
10.0.0.7 | cdh07 | core |
10.0.0.8 | cdh08 | core |
10.0.0.9 | cdh09 | core |
10.0.0.10 | cdh10 | core |
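Basic Configuration
1. (Optional) Download the following base packages, bundle them, and upload the bundle to all machines. This step must be run on a machine with Internet access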
yum install --downloadonly --downloaddir=./rpm/ rpcbind vim wget lrzsz zip unzip bzip2 cronolog git nc sysstat gcc gcc-c++ ntp
tar -zcvf rpm.tar.gz rpm
2. All machines: extract and bulk-install the base packages
tar -zxvf rpm.tar.gz
cd rpm
rpm -Uvh --force --nodeps *.rpm
systemctl start rpcbind
3. All machines: set the hostname and update hosts
- Run the following command to update /etc/hosts
cat >> /etc/hosts << EOF
10.0.0.1 cdh01 master
10.0.0.2 cdh02
10.0.0.3 cdh03
10.0.0.4 cdh04
10.0.0.5 cdh05
10.0.0.6 cdh06
10.0.0.7 cdh07
10.0.0.8 cdh08
10.0.0.9 cdh09
10.0.0.10 cdh10
EOF
- Set the hostname
hostnamectl set-hostname cdh01 ## on each machine use its own hostname from the cluster plan (cdh01 ~ cdh10)
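The hosts entries above follow a regular pattern, so they can be generated instead of typed. The following is a minimal sketch, assuming the addresses stay contiguous as in the cluster plan; `gen_hosts_entries` is a hypothetical helper name introduced here for illustration.

```shell
# Print one /etc/hosts line per node in the form "10.0.0.N cdhNN".
# The first node also gets the alias "master", as in the heredoc above.
gen_hosts_entries() {
  local count=$1 i
  for i in $(seq 1 "$count"); do
    if [ "$i" -eq 1 ]; then
      printf '10.0.0.%d cdh%02d master\n' "$i" "$i"
    else
      printf '10.0.0.%d cdh%02d\n' "$i" "$i"
    fi
  done
}

# Preview only; to apply, redirect instead: gen_hosts_entries 10 >> /etc/hosts
gen_hosts_entries 10
```

Previewing first makes it easy to compare the output against the cluster plan before anything is appended.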
5. All machines: disable the firewall and SELinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
setenforce 0
systemctl stop firewalld
systemctl disable firewalld
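6. All machines: run the following three groups of commands to disable transparent huge pages and adjust kernel parameters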
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
- Add the commands to system startup
cat << EOF >> /etc/rc.local
# Disable transparent_hugepage
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
EOF
chmod +x /etc/rc.d/rc.local ## on CentOS 7, rc.local only runs at boot if it is executable
- Update the kernel parameters
cat << EOF >> /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
vm.swappiness = 10
EOF
- Load the settings
sysctl -p
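To confirm that the settings from this step took effect, the current values can be read back. In the THP files the kernel marks the active choice with square brackets, so the sketch below extracts the bracketed word; `thp_active_value` is a hypothetical helper name.

```shell
# Extract the active (bracketed) value from a THP status line,
# e.g. "always madvise [never]" -> "never".
thp_active_value() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report both THP files on this host; files that do not exist are skipped.
for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
  if [ -r "$f" ]; then
    echo "$f: $(thp_active_value "$(cat "$f")")"
  fi
done

# vm.swappiness is exposed under /proc; it should read 10 after `sysctl -p`.
if [ -r /proc/sys/vm/swappiness ]; then
  echo "vm.swappiness: $(cat /proc/sys/vm/swappiness)"
fi
```

Both THP files should report `never` once the echo commands above have been run.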
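7. All machines: enable the NTP service, with cdh01 acting as the NTP server
- On cdh01: edit /etc/ntp.conf, comment out the server 0~n lines, and add the following configuration
server 127.127.1.0 ## use the local clock of this machine as the time source
fudge 127.127.1.0 stratum 10 ## with no external time source, the local time is authoritative
restrict 10.0.0.0 mask 255.255.255.0 nomodify notrap ## allow every host in the 10.0.0.0/24 network to request time from this machine
- On cdh02~cdh10: edit /etc/ntp.conf, comment out the server 0~n lines, and add the following configuration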
server cdh01
- On all machines, restart ntpd and enable it at boot
systemctl restart ntpd
systemctl enable ntpd.service
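- Use the following commands to check that the NTP service is working. The first check may lag, so wait a few minutes after ntpd has started before checking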
ntpstat
ntpq -p
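With nine client machines, the manual edit of /etc/ntp.conf can be scripted. The sketch below comments out the active `server` lines and appends a single `server cdh01` entry. It assumes GNU sed (as shipped with CentOS 7), and `point_ntp_at` is a hypothetical helper name. The demo runs on a scratch file so nothing under /etc is touched.

```shell
# Comment out every active "server ..." line in an ntp.conf-style file,
# then append one "server <host>" line. Edits the file in place (GNU sed).
point_ntp_at() {
  local conf=$1 server=$2
  sed -i -E 's/^[[:space:]]*server[[:space:]]+/#&/' "$conf"
  printf 'server %s\n' "$server" >> "$conf"
}

# Demo on a scratch copy with two sample pool entries.
demo=$(mktemp)
printf '%s\n' \
  'driftfile /var/lib/ntp/drift' \
  'server 0.centos.pool.ntp.org iburst' \
  'server 1.centos.pool.ntp.org iburst' > "$demo"
point_ntp_at "$demo" cdh01
cat "$demo"
rm -f "$demo"
```

On a real client the call would be `point_ntp_at /etc/ntp.conf cdh01`, followed by the ntpd restart described above.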
8. All machines: remove the JDK bundled with the system and install the specified JDK version
- Remove the bundled JDK
yum remove -y '*jdk*' ## quoted so the shell does not expand the pattern against local files such as the JDK rpm
yum remove -y '*java*'
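- Download jdk-8u261 from the Oracle website and install it on all machines. Other JDK versions can also be used, but this deployment needs presto-0.216, which requires a relatively recent JDK. OpenJDK is not supported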
rpm -ivh jdk-8u261-linux-x64.rpm
9. Passwordless SSH trust from cdh01 to cdh02~cdh10
- On cdh01
ssh-keygen -t rsa
ssh-copy-id cdh02
. . . . .
ssh-copy-id cdh10
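The per-host commands elided above can be produced with a small loop. This is a sketch for bash, assuming the hostnames cdh02 through cdh10 from the cluster plan. `worker_hosts`, `for_each_worker`, and the `DRY_RUN` switch are hypothetical names introduced here. By default the helper only prints the commands.

```shell
# Print the hostnames cdh02 .. cdh10 (zero-padded), one per line.
worker_hosts() {
  local i
  for i in $(seq 2 10); do
    printf 'cdh%02d\n' "$i"
  done
}

# Print (default) or run a command template once per host.
# The literal token HOST in the template is replaced by the hostname.
# Set DRY_RUN=0 to actually execute the commands.
for_each_worker() {
  local template=$1 host cmd
  for host in $(worker_hosts); do
    cmd=${template//HOST/$host}
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$cmd"
    else
      eval "$cmd"
    fi
  done
}

for_each_worker 'ssh-copy-id HOST'
```

Once the printed commands look right, `DRY_RUN=0 for_each_worker 'ssh-copy-id HOST'` runs them. The same helper fits any other step that repeats one command per node.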
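MySQL Installation
1. Download MySQL: https://downloads.mysql.com/archives/get/p/23/file/mysql-5.7.30-1.el7.x86_64.rpm-bundle.tar
2. Upload it to cdh01 and extract it, then delete the unneeded components and keep only the following 6 packages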
mysql-community-client-5.7.30-1.el7.x86_64.rpm
mysql-community-common-5.7.30-1.el7.x86_64.rpm
mysql-community-devel-5.7.30-1.el7.x86_64.rpm
mysql-community-libs-5.7.30-1.el7.x86_64.rpm
mysql-community-libs-compat-5.7.30-1.el7.x86_64.rpm
mysql-community-server-5.7.30-1.el7.x86_64.rpm
3. Install MySQL
rpm -Uvh --force --nodeps *rpm
4. Initialize MySQL
- Create users and grant privileges
mysql> uninstall plugin validate_password;
mysql> alter user 'root'@'localhost' identified by 'rootpasswd';
mysql> grant all privileges on *.* to 'root'@'cdh01' identified by 'cdh01' with grant option;
mysql> grant all privileges on hive.* to 'hiveuser'@'%' identified by 'hivepasswd';
mysql> grant all privileges on monitor.* to 'monitoruser'@'%' identified by 'monitorpasswd';
mysql> CREATE DATABASE hive CHARACTER SET 'utf8' COLLATE 'utf8_general_ci';
mysql> CREATE DATABASE monitor CHARACTER SET 'utf8' COLLATE 'utf8_general_ci';
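The statements above are typed at the `mysql>` prompt, which requires a first login as root. On an RPM install of MySQL 5.7 the server writes a one-time root password to its error log on first start. The sketch below extracts it, assuming the default log location /var/log/mysqld.log (adjust the path if `log-error` points elsewhere). `mysql_temp_password` is a hypothetical helper name, and the password in the demo is made up.

```shell
# Print the most recent temporary root password recorded in a mysqld log.
mysql_temp_password() {
  local log=${1:-/var/log/mysqld.log}
  grep 'temporary password' "$log" | tail -n 1 | awk '{print $NF}'
}

# Demo on a sample log line written to a scratch file.
sample=$(mktemp)
echo '2020-08-01T08:00:00.000000Z 1 [Note] A temporary password is generated for root@localhost: Ab3dEf-GhIjK' > "$sample"
mysql_temp_password "$sample"
rm -f "$sample"
```

On the real host this would typically follow `systemctl start mysqld`, for example `mysql -uroot -p"$(mysql_temp_password)"`.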
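Cloudera Manager Installation
1. Download the following packages and upload them to cdh01:/opt/tmp/
These can no longer be downloaded directly; fortunately I had kept copies.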
- cdh-parcels: https://archive.cloudera.com/cdh5/parcels/5.13.1.2/CDH-5.13.1-1.cdh5.13.1.p0.2-el7.parcel
- cm-manager: http://archive.cloudera.com/cm5/cm/5/cloudera-manager-centos7-cm5.13.1_x86_64.tar.gz
- cdh-sha: https://archive.cloudera.com/cdh5/parcels/5.13.1.2/CDH-5.13.1-1.cdh5.13.1.p0.2-el7.parcel.sha1
- manifest.json: https://archive.cloudera.com/cdh5/parcels/5.13.1.2/manifest.json (open manifest.json in a browser, copy the entire content, paste it into a new txt file on the desktop, then change the file extension to .json)
- presto-server: https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.216/presto-server-0.216.tar.gz
- presto-cli: https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.216/presto-cli-0.216-executable.jar
- mysql-connector: https://cdn.mysql.com//archives/mysql-connector-java-5.1/mysql-connector-java-5.1.47.tar.gz
2. On cdh01: install cloudera-manager
## Extract cloudera-manager into /opt/
tar -zxf /opt/tmp/cloudera-manager-centos7-cm5.13.1_x86_64.tar.gz -C /opt/
## Move the main CDH files into /opt/cloudera/parcel-repo/
mv /opt/tmp/CDH-5.13.1-1.cdh5.13.1.p0.2-el7.parcel /opt/cloudera/parcel-repo/
mv /opt/tmp/CDH-5.13.1-1.cdh5.13.1.p0.2-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.13.1-1.cdh5.13.1.p0.2-el7.parcel.sha ## the suffix must be changed to ".sha"
mv /opt/tmp/manifest.json /opt/cloudera/parcel-repo/
## Create a symlink
ln -s /opt/cm-5.13.1 /opt/cm
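Because the parcel sits next to a .sha file that holds its SHA-1 hash, it is worth checking that the two agree after an offline transfer. This is a minimal sketch using `sha1sum` from coreutils; `verify_parcel_sha` is a hypothetical helper name, and it assumes the hash is the first field of the .sha file.

```shell
# Compare a parcel's SHA-1 with the hash stored in its .sha file.
# Returns 0 when they match and 1 otherwise.
verify_parcel_sha() {
  local parcel=$1 sha_file=$2 expected actual
  expected=$(awk 'NR==1 {print $1}' "$sha_file")
  actual=$(sha1sum "$parcel" | awk '{print $1}')
  [ "$expected" = "$actual" ]
}

# Check the parcel from this guide if it is present on this machine.
parcel=/opt/cloudera/parcel-repo/CDH-5.13.1-1.cdh5.13.1.p0.2-el7.parcel
if [ -f "$parcel" ]; then
  if verify_parcel_sha "$parcel" "$parcel.sha"; then
    echo "parcel checksum OK"
  else
    echo "parcel checksum MISMATCH"
  fi
fi
```

A mismatch usually points to a truncated or corrupted upload, in which case the parcel should be copied again.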
3. Copy the MySQL connector driver
- On cdh01
mkdir -p /usr/share/java/
tar -zxf /opt/tmp/mysql-connector-java-5.1.47.tar.gz
cp mysql-connector-java-5.1.47/mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar
## Copy to all other nodes
scp -r /usr/share/java cdh02:/usr/share/
.......
scp -r /usr/share/java cdh10:/usr/share/
4. Initialize the CDH database
/opt/cm/share/cmf/schema/scm_prepare_database.sh mysql scm -hcdh01 -uroot -p'cdh01' --scm-host cdh01 scm scm123 scm123 ## connects over the network as 'root'@'cdh01', whose password was set to 'cdh01' in the MySQL grants
5. Edit the cloudera-scm-agent configuration: **/opt/cm-5.13.1/etc/cloudera-scm-agent/config.ini**
server_host=cdh01
## default install path; point it at a partition with more free space if needed (keep this comment on its own line)
parcel_dir=/opt/cloudera/parcels
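The two settings can also be applied non-interactively. The sketch below replaces an existing line for a key, including one that is commented out with `#`. It assumes GNU sed and that the key already appears somewhere in the file. `set_ini_key` is a hypothetical helper name, and the demo file only imitates the layout of config.ini.

```shell
# Set key=value in an ini-style file, replacing the existing line for that key
# even when it is commented out with "#". Returns 1 if the key is not present.
set_ini_key() {
  local file=$1 key=$2 value=$3
  grep -Eq "^#?[[:space:]]*${key}=" "$file" || return 1
  sed -i -E "s|^#?[[:space:]]*${key}=.*|${key}=${value}|" "$file"
}

# Demo on a scratch file.
demo=$(mktemp)
printf '%s\n' '[General]' 'server_host=localhost' '# parcel_dir=/opt/cloudera/parcels' > "$demo"
set_ini_key "$demo" server_host cdh01
set_ini_key "$demo" parcel_dir /opt/cloudera/parcels
cat "$demo"
rm -f "$demo"
```

Against the real file the first call would be `set_ini_key /opt/cm-5.13.1/etc/cloudera-scm-agent/config.ini server_host cdh01`.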
6. Copy the cm directory to all other nodes cdh02~cdh10
scp -r /opt/cm cdh02:/opt/
........
scp -r /opt/cm cdh10:/opt/
7. All machines: add the user
useradd --system --home=/opt/cm/run/cloudera-scm-server/ --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
8. On cdh01: set ownership of the **/opt/cloudera/** directory
chown -R cloudera-scm:cloudera-scm /opt/cloudera/
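9. All machines: start **cloudera-scm-agent**; on cdh01, also start **cloudera-scm-server**
## Master node cdh01
/opt/cm/etc/init.d/cloudera-scm-server restart ## the first start has to initialize the database, so the listening port appears after a delay
## All nodes cdh01~cdh10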
/opt/cm/etc/init.d/cloudera-scm-agent restart
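10. Start installing the CDH cluster
- Open cdh01:7180 in a browser (username/password: admin/admin)
- Installation steps: see "CDH安装步骤详解" (CDH Installation Steps in Detail)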
Presto Installation
- CDH does not include Presto, so it is installed separately here
- Presto cluster role plan
IP | Hostname | Role |
10.0.0.1 | cdh01 | coordinator |
10.0.0.2 | cdh02 | worker |
10.0.0.3 | cdh03 | worker |
10.0.0.4 | cdh04 | worker |
10.0.0.5 | cdh05 | worker |
10.0.0.6 | cdh06 | worker |
1. Install Presto on cdh01
- Extract presto into /opt/cloudera/parcels/
tar -zxf /opt/tmp/presto-server-0.216.tar.gz -C /opt/cloudera/parcels
ln -s /opt/cloudera/parcels/presto-server-0.216 /opt/cloudera/parcels/presto
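2. Prepare the configuration files
Note:
query.max-memory: the maximum total memory that a single query may use across all participating nodes.
query.max-memory-per-node: the maximum user memory that a single query may use on one node. It follows from the definition that query.max-memory-per-node must be less than query.max-total-memory-per-node.
Likewise, query.max-memory must be less than query.max-total-memory.
In addition, the sum of query.max-total-memory-per-node and memory.heap-headroom-per-node must be less than the JVM maximum heap, that is, the -Xmx value set in jvm.config.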
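- Go to the configuration directory /opt/cloudera/parcels/presto/etc (create it if it does not exist)
- Create node.properties with the following content (comments must sit on their own lines; a trailing comment would become part of the value)
## Environment (cluster) name; every Presto node in the same cluster must use the same name
node.environment=myprestocluster
## Identifier of this Presto node; it must be unique across nodes
node.id=cdh01
## Location of the data directory
node.data-dir=/opt/cloudera/parcels/presto/data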
- Create jvm.config with the following content
-server
-Xmx60G
-XX:+UseConcMarkSweepGC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:ReservedCodeCacheSize=150M
- Create config.properties with the following content
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8889
query.max-memory=50GB
query.max-memory-per-node=6GB
discovery-server.enabled=true
discovery.uri=http://cdh01:8889
- Create log.properties with the following content
com.facebook.presto=INFO
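The memory rules noted at the start of this step are simple inequalities, so they can be checked before Presto is started. The sketch below takes whole-gigabyte values as arguments. The 6 and 60 come from config.properties and jvm.config, while the two 18 values are assumptions chosen only for illustration, because this guide does not set query.max-total-memory-per-node or memory.heap-headroom-per-node. `check_presto_memory` is a hypothetical helper name.

```shell
# Check the per-node memory constraints; all arguments are whole gigabytes.
#   $1 query.max-memory-per-node
#   $2 query.max-total-memory-per-node
#   $3 memory.heap-headroom-per-node
#   $4 JVM -Xmx
check_presto_memory() {
  local user=$1 total=$2 headroom=$3 xmx=$4
  if [ "$user" -ge "$total" ]; then
    echo "query.max-memory-per-node (${user}GB) must be less than query.max-total-memory-per-node (${total}GB)"
    return 1
  fi
  if [ $((total + headroom)) -ge "$xmx" ]; then
    echo "total (${total}GB) + headroom (${headroom}GB) must be less than -Xmx (${xmx}GB)"
    return 1
  fi
  echo "memory settings are consistent"
}

# 6 and 60 are taken from the files above; 18 and 18 are assumed values.
check_presto_memory 6 18 18 60
```

If either inequality fails, the function prints which rule was violated and returns a non-zero status.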
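3. Integrate Presto with Hive
- Go to the configuration directory /opt/cloudera/parcels/presto/etc
- Create a catalog directory
- Inside the catalog directory, create the hive.properties configuration file with the following content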
connector.name=hive-hadoop2
hive.metastore.uri=thrift://cdh01:9083
hive.parquet.fail-on-corrupted-statistics=false
hive.metastore-cache-ttl=20m
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
hive.allow-drop-table=true
hive.compression-codec=NONE
hive.metastore-refresh-interval=1m
4. Deploy the worker nodes
- Distribute the Presto directory to the worker nodes
scp -r /opt/cloudera/parcels/presto cdh02:/opt/cloudera/parcels/
.................
scp -r /opt/cloudera/parcels/presto cdh06:/opt/cloudera/parcels/
- On each worker node in turn, edit node.properties and config.properties
- Edit node.properties
## node.id must be unique on every node
node.id=cdh04
- Edit config.properties
## set to false on every worker node
coordinator=false
## comment out the following line
#discovery-server.enabled=true
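The per-worker edits above can be scripted so that each node gets a unique node.id without hand editing. This is a sketch assuming GNU sed and the two files created on the coordinator. `make_worker_config` is a hypothetical helper name, and the demo works on scratch copies.

```shell
# Turn a copy of the coordinator's etc/ directory into a worker's:
# unique node.id, coordinator=false, discovery-server.enabled commented out.
make_worker_config() {
  local etc_dir=$1 node_id=$2
  sed -i -E "s|^node\.id=.*|node.id=${node_id}|" "$etc_dir/node.properties"
  sed -i -E \
    -e 's|^coordinator=.*|coordinator=false|' \
    -e 's|^(discovery-server\.enabled=.*)|#\1|' \
    "$etc_dir/config.properties"
}

# Demo on scratch files that mirror the coordinator's settings.
demo=$(mktemp -d)
printf '%s\n' 'node.environment=myprestocluster' 'node.id=cdh01' > "$demo/node.properties"
printf '%s\n' 'coordinator=true' 'discovery-server.enabled=true' \
  'discovery.uri=http://cdh01:8889' > "$demo/config.properties"
make_worker_config "$demo" cdh04
cat "$demo/node.properties" "$demo/config.properties"
rm -rf "$demo"
```

On a worker, passing the short hostname keeps the ids unique: `make_worker_config /opt/cloudera/parcels/presto/etc "$(hostname -s)"`.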
5. Start Presto
/opt/cloudera/parcels/presto/bin/launcher restart
6. Verify
presto --server localhost:8889 --catalog=hive --schema=default
SELECT * FROM system.runtime.nodes;
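The verification command above assumes that a `presto` executable is on the PATH. The CLI was downloaded earlier as presto-cli-0.216-executable.jar, so it first has to be copied under that name and made executable. This is a minimal sketch; `install_presto_cli` is a hypothetical helper name, and /usr/local/bin is an assumed target directory.

```shell
# Copy the executable CLI jar into a bin directory under the name "presto"
# and mark it executable.
install_presto_cli() {
  local jar=$1 bin_dir=$2
  cp "$jar" "$bin_dir/presto" && chmod +x "$bin_dir/presto"
}

# On cdh01 (not executed here):
#   install_presto_cli /opt/tmp/presto-cli-0.216-executable.jar /usr/local/bin
```

After that, the `presto --server localhost:8889 ...` command should start the interactive client, where the query lists the coordinator and the workers that have registered.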