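Deploying CDH6 Big Data Services on CentOS 7
A CDH cluster needs at least 3 machines; for production, 8 or more are recommended.
Official installation steps: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/install_cm_cdh.html
Preparing the base environment and installation packages
Installation packages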
1. Python (CentOS ships with Python 2.7+)
2. mysql-connector-java package (the author recommends 5.7; the steps below use 5.1.46)
3. Scala package (2.13.0). Official Scala download: https://www.scala-lang.org/
4. CM packages (pick the version you need; the author uses CDH 6.3). Official CM download: https://archive.cloudera.com/cm6/
allkeys.asc
cloudera-manager.repo
oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
cloudera-manager-server-db-2-6.3.0-1281944.el7.x86_64.rpm
cloudera-manager-server-6.3.0-1281944.el7.x86_64.rpm
cloudera-manager-daemons-6.3.0-1281944.el7.x86_64.rpm
cloudera-manager-agent-6.3.0-1281944.el7.x86_64.rpm
CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
manifest.json
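Basic network environment
Network
On a new machine, set the local IP address, enable the interface at boot by changing ONBOOT=no to ONBOOT=yes, and switch the network to a static address. On CentOS 7 the default network configuration file is typically ifcfg-ens33.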
vim /etc/sysconfig/network-scripts/ifcfg-ens***
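To illustrate the static-network change described above, here is a minimal sketch that writes an ifcfg file. The function name write_static_ifcfg, the device name ens33 and every address are placeholder assumptions, not values from this setup. It writes to whatever path you pass in, so you can review the result on a scratch file first.

```shell
#!/bin/sh
# Sketch: generate a static-IP ifcfg file. All values passed in are placeholders.
# usage: write_static_ifcfg <out_file> <device> <ip> <prefix> <gateway> <dns>
write_static_ifcfg() {
    cat > "$1" <<EOF
TYPE=Ethernet
BOOTPROTO=static
NAME=$2
DEVICE=$2
ONBOOT=yes
IPADDR=$3
PREFIX=$4
GATEWAY=$5
DNS1=$6
EOF
}

# Write to a scratch file for review before copying it into
# /etc/sysconfig/network-scripts/.
out=$(mktemp)
write_static_ifcfg "$out" ens33 192.168.1.101 24 192.168.1.1 192.168.1.1
cat "$out"
```

After reviewing the output, copy it over the real ifcfg file for your interface and restart the network.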
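Restart the network:
service network restart
Check the network status:
ifconfig or ip addr
Setting the hostname and hosts table
Rename the machine. New machines are all named localhost by default and can simply be renamed. On an old machine, or one already running other services, change the name with caution. If you do rename, a scheme such as xxx-cdh-1, xxx-cdh-2 is recommended, where xxx is the project or company name.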
vim /etc/hostname
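Edit the hosts table and add every host in the cluster, one entry per line, in the following format (substitute each host's real IP address):
127.0.0.1 xxx-cdh-1
127.0.0.2 xxx-cdh-2
127.0.0.3 xxx-cdh-4
127.0.0.4 xxx-cdh-mysql-1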
vim /etc/hosts
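Since the same entries have to be added on every machine, a small helper can make the hosts edit repeatable. The sketch below is one possible approach, not the author's method. The function name add_host_entry and the 192.168.1.x addresses are assumptions for illustration. It skips hostnames that are already present, so running it twice does not duplicate lines.

```shell
#!/bin/sh
# Sketch: add "IP hostname" pairs to a hosts file, skipping names already present.
# usage: add_host_entry <hosts_file> <ip> <hostname>
add_host_entry() {
    # -w matches the hostname as a whole word, so xxx-cdh-1 does not match xxx-cdh-10;
    # -F treats the hostname as a fixed string rather than a regex.
    if grep -qwF -- "$3" "$1" 2>/dev/null; then
        echo "skip: $3 already in $1"
    else
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}

# Demonstrate on a scratch file; the addresses are placeholders.
hosts=$(mktemp)
add_host_entry "$hosts" 192.168.1.101 xxx-cdh-1
add_host_entry "$hosts" 192.168.1.102 xxx-cdh-2
add_host_entry "$hosts" 192.168.1.101 xxx-cdh-1   # second call is skipped
cat "$hosts"
```

On a real host, pass /etc/hosts as the first argument and run it as root.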
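Setting the time zone
Look up the full name of the China time zone (it is named after Shanghai, not Beijing):
timedatectl list-timezones |grep Shanghai
Set this machine to the Shanghai time zone:
timedatectl set-timezone Asia/Shanghai
Other time zones work the same way. Check the time to confirm the change:
date
Installing common system tools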
yum install -y expect bc net-tools iotop zip unzip telnet wget iperf3 fio ntfs-3g lzo iftop vim
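Installing the JDK
Because of network restrictions the author could not use wget, so the JDK package was uploaded to /tmp on the host with an FTP tool. Rename it to something shorter:
mv /tmp/oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm /tmp/oracle-j2sdk1.8.rpm
Install the JDK under /usr/java/:
yum localinstall -y /tmp/oracle-j2sdk1.8.rpm && ln -s /usr/java/jdk1.8.0_181-cloudera /usr/java/default
Installing Scala
Create the scala directory: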
rm -rf /usr/scala && mkdir -p /usr/scala
Copy the package into it:
cp /tmp/scala-2.13.0.tgz /usr/scala
Unpack Scala:
cd /usr/scala && tar -zxvf scala-2.13.0.tgz && rm -rf scala-2.13.0.tgz
Editing the environment variables
Create and edit a temporary environment file named env under /tmp:
vim /tmp/env
Put the following in it:
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
export SCALA_HOME=/usr/scala/scala-2.13.0
export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$PATH
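Append the contents of env to the system profile:
cat /tmp/env >> /etc/profile
Reload the profile so the variables take effect:
source /etc/profile
On some Linux systems source /etc/profile may not take effect; if so, reboot the machine so the environment variables are picked up.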
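A quick way to confirm that the profile change took effect is to check that the expected commands resolve on PATH. The following is a minimal sketch; the function name check_on_path is made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether each named command can be found on PATH.
# Returns non-zero if any command is missing.
# usage: check_on_path <cmd> [<cmd> ...]
check_on_path() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd -> $(command -v "$cmd")"
        else
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return $missing
}

# sh itself is always present, so this demonstrates the "found" branch.
check_on_path sh
```

After sourcing /etc/profile, running check_on_path java scala should report both as found; if either is missing, recheck the paths written to /tmp/env.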
Checking the Python version
python --version
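Installing the MySQL driver
First remove the MariaDB libraries that ship with the system. Check the installed MariaDB version:
rpm -qa|grep -i mariadb
Remove it, using the package name printed by the previous command:
rpm -e mariadb-libs-5.5.64-1.el7.x86_64 --nodeps
Upload the MySQL driver tarball to /tmp, then unpack it there:
tar zxvf /tmp/mysql-connector-java-5.1.46.tar.gz -C /tmp
Create the target directory:
mkdir -p /usr/share/java/
Copy the unpacked jar to the new path. Shorten the name while copying: the final jar name must not contain version information.
cp /tmp/mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
Then delete the MySQL files under /tmp: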
rm -rf /tmp/mysql-connector-java-5.1.46*
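System tuning
Fixing garbled characters by setting a Chinese locale
Edit locale.conf:
vim /etc/locale.conf
Delete the existing contents and write the following. The last line really has nothing after the = sign; that is expected.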
LANG=zh_CN.UTF-8
LC_CTYPE=zh_CN.UTF-8
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_PAPER="zh_CN.UTF-8"
LC_NAME="zh_CN.UTF-8"
LC_ADDRESS="zh_CN.UTF-8"
LC_TELEPHONE="zh_CN.UTF-8"
LC_MEASUREMENT="zh_CN.UTF-8"
LC_IDENTIFICATION="zh_CN.UTF-8"
LC_ALL=
Reload the locale settings:
source /etc/locale.conf
Disabling tuned
First start tuned and check its status, then turn off the active tuned profile with tuned-adm off, list the profiles with tuned-adm list to confirm, and finally stop the tuned service and disable it at boot:
systemctl start tuned
systemctl status tuned
tuned-adm off
tuned-adm list
systemctl stop tuned
systemctl disable tuned
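Disabling transparent huge pages
Check whether THP (transparent huge pages) is enabled. The value in square brackets is the active one: [always] means enabled, [never] means disabled. Check the enabled setting: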
cat /sys/kernel/mm/transparent_hugepage/enabled
Check the defrag setting:
cat /sys/kernel/mm/transparent_hugepage/defrag
Disable THP:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
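Because the active THP value is the one shown in square brackets, it is easy to misread by eye. The sketch below extracts just that value. The function name thp_state is an assumption for illustration, and the demo runs on a sample file in the kernel's format rather than on /sys.

```shell
#!/bin/sh
# Sketch: print the currently selected value (the one in square brackets)
# from a transparent_hugepage control file.
# usage: thp_state <file>
thp_state() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# Demonstrate on a sample line in the same format the kernel uses.
sample=$(mktemp)
echo 'always madvise [never]' > "$sample"
thp_state "$sample"    # prints: never
```

On a real host, thp_state /sys/kernel/mm/transparent_hugepage/enabled and thp_state /sys/kernel/mm/transparent_hugepage/defrag should both print never after the two echo commands above.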
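Disable it at boot as well. On CentOS 7, /etc/rc.d/rc.local must be executable (chmod +x /etc/rc.d/rc.local) for these lines to run: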
echo "echo never > /sys/kernel/mm/transparent_hugepage/defrag" >> /etc/rc.local
echo "echo never > /sys/kernel/mm/transparent_hugepage/enabled" >> /etc/rc.local
Edit /etc/default/grub and add the parameter transparent_hugepage=never to the end of the GRUB_CMDLINE_LINUX entry:
vim /etc/default/grub
Regenerate grub.cfg:
grub2-mkconfig -o /boot/grub2/grub.cfg
Adjusting swappiness
Check the current swappiness value:
cat /proc/sys/vm/swappiness
Set the value to 1 on every machine:
sysctl -w vm.swappiness=1
Append vm.swappiness=1 to /etc/sysctl.conf so the setting persists:
echo "vm.swappiness=1" >> /etc/sysctl.conf
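Appending with echo works once, but repeated runs leave duplicate lines in sysctl.conf. The following is a minimal sketch of an idempotent alternative. The function name set_conf_key is an assumption, and the demo edits a scratch file instead of /etc/sysctl.conf.

```shell
#!/bin/sh
# Sketch: set "key=value" in a sysctl-style file without creating duplicates.
# Dots in the key are treated as regex wildcards, which is acceptable here.
# usage: set_conf_key <file> <key> <value>
set_conf_key() (
    if grep -q "^[[:space:]]*$2[[:space:]]*=" "$1" 2>/dev/null; then
        # Key already present: rewrite that line in place via a temp file.
        tmp=$(mktemp)
        sed "s|^[[:space:]]*$2[[:space:]]*=.*|$2=$3|" "$1" > "$tmp" && cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        # Key absent: append it.
        echo "$2=$3" >> "$1"
    fi
)

# Demonstrate on a scratch file that already has a different value.
conf=$(mktemp)
echo "vm.swappiness = 30" > "$conf"
set_conf_key "$conf" vm.swappiness 1
set_conf_key "$conf" vm.swappiness 1   # running twice still leaves one line
cat "$conf"
```

On a real host, set_conf_key /etc/sysctl.conf vm.swappiness 1 followed by sysctl -p applies the persisted value.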
Session timeout
Append a timeout setting to /etc/profile:
echo "TMOUT=900">>/etc/profile
Kernel tuning
Append the following to /etc/sysctl.conf. A here-document is used instead of echo -e because echo -e interprets backslash sequences such as \n and \v, which garbles some of the lines. Check sysctl.conf after running the command.
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.ip_local_port_range = 10000 65000
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 5000
fs.file-max = 655350
net.ipv4.route.gc_timeout = 100
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_synack_retries = 1
net.core.netdev_max_backlog = 16384
net.ipv4.tcp_max_orphans = 16384
net.ipv4.tcp_fin_timeout = 2
net.core.somaxconn=32768
kernel.threads-max=196605
kernel.pid_max=196605
vm.max_map_count=393210
EOF
Raising the maximum number of open files
Check the current limits:
ulimit -a
Set the maximum number of open files and processes:
echo "* soft nofile 196605" >> /etc/security/limits.conf
echo "* hard nofile 196605" >> /etc/security/limits.conf
echo "* soft nproc 196605" >> /etc/security/limits.conf
echo "* hard nproc 196605" >> /etc/security/limits.conf
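Installing MySQL
When building a CDH cluster, MySQL can be installed on any host in the cluster, but this is not recommended. A dedicated MySQL machine outside the cluster is easier to maintain and causes less trouble. Installing MySQL on CentOS 7 is covered in a separate article; this one covers only its use. Edit the MySQL configuration file. The configuration below is the template recommended in the official CDH documentation; adjust it to your needs.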
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Create the databases and users for CM
mysql -u root -p
# Configure the Cloudera Manager Server, Activity Monitor, Reports Manager, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server databases to support the utf8mb4 character set encoding.
# Configure all other databases to use the utf8 character set.
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm@DW';
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon@DW';
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman@DW';
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue@DW';
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive@DW';
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry@DW';
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav@DW';
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms@DW';
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie@DW';
SHOW DATABASES;
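The statements above follow one pattern per service, so they can be generated rather than typed. The sketch below prints them for a list of database:user pairs, reusing the <user>@DW password pattern from this article. The function name gen_cdh_sql is an assumption. It only prints SQL and does not connect to MySQL. Note that GRANT ... IDENTIFIED BY is accepted by MySQL 5.7 but was removed in MySQL 8.0.

```shell
#!/bin/sh
# Sketch: print CREATE DATABASE / GRANT statements for "database:user" pairs.
# Passwords follow the <user>@DW pattern used in this article.
# usage: gen_cdh_sql <db:user> [<db:user> ...]
gen_cdh_sql() {
    for pair in "$@"; do
        db=${pair%%:*}      # text before the colon
        user=${pair##*:}    # text after the colon
        echo "CREATE DATABASE $db DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;"
        echo "GRANT ALL ON $db.* TO '$user'@'%' IDENTIFIED BY '$user@DW';"
    done
}

# The metastore database is owned by the hive user, as in the statements above.
gen_cdh_sql scm:scm amon:amon metastore:hive
```

Review the output first; it can then be redirected to a file and loaded with the mysql client.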
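Advice collected from other experienced users. Notes:
In large clusters, the databases used by Activity Monitor and Service Monitor should be placed on separate disk volumes.
In clusters with more than 50 nodes, do not put the databases of all services on one node, or that node's database load will be very high. Ideally, give each service a database on a different node.
A dedicated database server is not required, but the databases of the individual services should be spread across different nodes.
If the cluster has more than 1000 nodes, set MySQL's max_allowed_packet to 16M.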
For MySQL 5.6 and 5.7, you must install the MySQL-shared-compat or MySQL-shared package. This is required for the Cloudera Manager Agent package installation
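Installing the Cloudera Manager service
Grant passwordless root privileges to the cloudera-scm group. Edit /etc/sudoers, make sure it contains the line "Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin", then add this line: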
%cloudera-scm ALL=(ALL) NOPASSWD:ALL
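Create a cm folder under /tmp and put all CM installation packages into it. On the master node, install all packages:
rpm -ivh --force --nodeps /tmp/cm/*.rpm
On every worker node, install only daemons and agent:
rpm -ivh --force --nodeps /tmp/cm/cloudera-manager-daemons*.rpm
rpm -ivh --force --nodeps /tmp/cm/cloudera-manager-agent*.rpm
CDH parcel: installing online takes too long and is unreliable, so offline installation is recommended. Files needed:
manifest.json
CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
On the master node, create the parcel repository directory. Parcels go here; when you later add other services from the CDH ecosystem, their parcels and SHA checksum files go in this directory too.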
mkdir -p /opt/cloudera/parcel-repo
Move the parcel from cm into the parcel-repo directory:
mv /tmp/cm/CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel /opt/cloudera/parcel-repo/
Move manifest.json from cm into the parcel-repo directory:
mv /tmp/cm/manifest.json /opt/cloudera/parcel-repo
Create the checksum file:
cd /opt/cloudera/parcel-repo && sha1sum CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel | awk '{ print $1 }' > CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
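A wrong or stale .sha file is easy to miss, so it can help to verify it before starting CM. The following is a minimal sketch of such a check. The function name verify_parcel_sha is an assumption, and the demo uses a dummy file rather than a real parcel.

```shell
#!/bin/sh
# Sketch: check that <parcel>.sha holds the SHA-1 of <parcel>.
# usage: verify_parcel_sha <parcel_file>
verify_parcel_sha() {
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    expected=$(cat "$1.sha")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1" >&2
        return 1
    fi
}

# Demonstrate on a dummy file, creating the .sha the same way as above.
demo=$(mktemp)
echo "dummy parcel content" > "$demo"
sha1sum "$demo" | awk '{ print $1 }' > "$demo.sha"
verify_parcel_sha "$demo"
```

On the master node, run it against the parcel in /opt/cloudera/parcel-repo; a MISMATCH usually means the parcel was changed or truncated after the .sha file was written.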
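Starting the CM service
First edit the CM server's database settings: database host, port, database name, and database password. For the last item, keep the default value INIT if the database is on the master node itself; change it to EXTERNAL if the database is on another host.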
vim /etc/cloudera-scm-server/db.properties
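For orientation, the sketch below writes a db.properties for an external MySQL database. The key names are written from memory of a CM 6 install and should be treated as an assumption; compare them with the file already on your host. The function name write_db_properties, the host:port form of the host value and the port 3306 are also assumptions; the hostname, database, user and password reuse the examples from this article.

```shell
#!/bin/sh
# Sketch: write a db.properties for an external MySQL database.
# Key names are assumed from a CM 6 install; verify them against your own file.
# usage: write_db_properties <out_file> <host:port> <db_name> <user> <password>
write_db_properties() {
    cat > "$1" <<EOF
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=$2
com.cloudera.cmf.db.name=$3
com.cloudera.cmf.db.user=$4
com.cloudera.cmf.db.password=$5
com.cloudera.cmf.db.setupType=EXTERNAL
EOF
}

# Write to a scratch file for comparison with the real one.
out=$(mktemp)
write_db_properties "$out" xxx-cdh-mysql-1:3306 scm scm 'scm@DW'
cat "$out"
```

Use the output as a reference while editing the real file rather than overwriting it blindly.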
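Then edit the agent configuration file on every worker node so the agent reports to the master. Set server_host to the master's hostname or IP address, and make sure port 7182 on the master is open and not blocked by a firewall. Using the hostname is recommended; the author ran into cases where the master could not be found when an IP was used. If the agent configuration is not changed, the worker nodes will not be found when installing services in the web UI.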
vim /etc/cloudera-scm-agent/config.ini
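With many worker nodes, editing config.ini by hand on each one is tedious. The sketch below rewrites the server_host line with sed. The function name set_server_host is an assumption, and the demo runs on a scratch file containing a sample server_host line rather than on the real config.

```shell
#!/bin/sh
# Sketch: point a CM agent config at the master by rewriting server_host.
# usage: set_server_host <config_ini> <cm_server_hostname>
set_server_host() (
    tmp=$(mktemp)
    # Replace the whole server_host line; other lines pass through unchanged.
    sed "s|^server_host=.*|server_host=$2|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
)

# Demonstrate on a scratch file with a sample of the relevant lines.
demo=$(mktemp)
printf '[General]\nserver_host=localhost\nserver_port=7182\n' > "$demo"
set_server_host "$demo" xxx-cdh-1
cat "$demo"
```

On a worker node, set_server_host /etc/cloudera-scm-agent/config.ini xxx-cdh-1 makes the same change as the manual edit.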
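Start the CM server on the master node. This is slow, taking about a minute to come up, so be patient. Watching the log during startup is recommended.
systemctl start cloudera-scm-server
View the log:
tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
After a normal start, the log shows that port 7180 is up. Then start the agent service on every machine (the master's agent as well as the workers'):
systemctl start cloudera-scm-agent
Then watch the agent log on every machine:
tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log
Done! Now open port 7180 on the master in a browser and install the services from the web UI.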