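CDH Getting Started Tutorial 3
Chapter 5 Uninstalling CDH (for reference)
Follow the steps in this chapter only when the cluster runs into errors. Uninstalling and reinstalling CDH resolves only some errors; a few stubborn ones may persist. So if you hit errors while installing CDH, I recommend simply releasing the Alibaba Cloud cluster, purchasing three new machines, and installing again from scratch.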
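5.1 Stop all services
1) Stop all cluster services
2) Stop the Cloudera Management Service (CM service)
5.2 Deactivate and remove Parcels
1) Deactivate (choose "Deactivate Only")
2) Remove from hosts
5.3 Delete the cluster and CM
5.4 Stop the services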
[root@hadoop102 parcel-repo]# /opt/module/cm/cm-5.16.2/etc/init.d/cloudera-scm-agent stop
Stopping cloudera-scm-agent: [OK]
[root@hadoop103 parcel-repo]# /opt/module/cm/cm-5.16.2/etc/init.d/cloudera-scm-agent stop
Stopping cloudera-scm-agent: [OK]
[root@hadoop104 parcel-repo]# /opt/module/cm/cm-5.16.2/etc/init.d/cloudera-scm-agent stop
Stopping cloudera-scm-agent: [OK]
[root@hadoop102 parcel-repo]# /opt/module/cm/cm-5.16.2/etc/init.d/cloudera-scm-server stop
Stopping cloudera-scm-server: [OK]
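The same four stop commands can be driven from a single node. The following is a minimal sketch, assuming passwordless ssh between the three hosts (as the script in section 5.8 also assumes). The DRY_RUN switch and the run_on and stop_cm helper names are hypothetical additions for illustration; by default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: stop the CM agent on every node, then the CM server on hadoop102.
CM_INIT=/opt/module/cm/cm-5.16.2/etc/init.d   # path taken from the commands above
DRY_RUN=${DRY_RUN:-1}                          # hypothetical switch: 1 = print only, 0 = execute

# run_on HOST COMMAND...  (hypothetical helper)
run_on() {
    host=$1
    shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "ssh $host $*"
    else
        ssh "$host" "$@"
    fi
}

stop_cm() {
    # Agents first, on all three nodes
    for h in hadoop102 hadoop103 hadoop104; do
        run_on "$h" "$CM_INIT/cloudera-scm-agent" stop
    done
    # The server runs only on hadoop102
    run_on hadoop102 "$CM_INIT/cloudera-scm-server" stop
}

stop_cm
```

Run it once as-is to review the printed commands, then again with DRY_RUN=0 to actually execute them.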
5.5 Delete CM data (on all three machines)
[root@hadoop102 ~]# umount cm_processes
[root@hadoop102 ~]# rm -rf /var/lib/cloudera* /var/log/cloudera* /var/run/cloudera*
# Delete the user
[root@hadoop102 cm]# userdel cloudera-scm
# Delete the CM package
[root@hadoop102 ~]# rm -rf /opt/module/cm/
5.6 Remove user data (on all three machines)
# User data directories
[root@hadoop102 /]# rm -rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
[root@hadoop102 /]# rm -rf /usr/lib/hadoop /usr/lib/hadoop* /usr/lib/hive /usr/lib/hbase /usr/lib/oozie /usr/lib/sqoop* /usr/lib/zookeeper /usr/lib/bigtop* /usr/lib/flume-ng /usr/lib/hcatalog
[root@hadoop102 /]# rm -rf /var/run/hadoop* /var/run/flume-ng /var/run/cloudera* /var/run/oozie /var/run/sqoop2 /var/run/zookeeper /var/run/hbase /var/run/impala /var/run/hive /var/run/hdfs-sockets
# Service directories
[root@hadoop102 /]# rm -rf /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie
# Configuration file directories
[root@hadoop102 /]# rm -rf /etc/cloudera* /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog
[root@hadoop102 /]# rm -rf /etc/alternatives/avro-tools /etc/alternatives/beeline /etc/alternatives/catalogd /etc/alternatives/cli_* /etc/alternatives/flume* /etc/alternatives/hadoop* /etc/alternatives/hbase* /etc/alternatives/hcat /etc/alternatives/hdfs /etc/alternatives/hive* /etc/alternatives/hue* /etc/alternatives/impala* /etc/alternatives/llama* /etc/alternatives/load_gen /etc/alternatives/mahout* /etc/alternatives/mapred /etc/alternatives/oozie* /etc/alternatives/pig* /etc/alternatives/pyspark /etc/alternatives/sentry* /etc/alternatives/solr* /etc/alternatives/spark* /etc/alternatives/sqoop* /etc/alternatives/statestored /etc/alternatives/whirr /etc/alternatives/yarn /etc/alternatives/zookeeper*
[root@hadoop102 /]# rm -rf /var/lib/alternatives/avro-tools /var/lib/alternatives/beeline /var/lib/alternatives/catalogd /var/lib/alternatives/cli_* /var/lib/alternatives/flume* /var/lib/alternatives/hadoop* /var/lib/alternatives/hbase* /var/lib/alternatives/hcat /var/lib/alternatives/hdfs /var/lib/alternatives/hive* /var/lib/alternatives/hue* /var/lib/alternatives/impala* /var/lib/alternatives/llama* /var/lib/alternatives/load_gen /var/lib/alternatives/mahout* /var/lib/alternatives/mapred /var/lib/alternatives/oozie* /var/lib/alternatives/pig* /var/lib/alternatives/pyspark /var/lib/alternatives/sentry* /var/lib/alternatives/solr* /var/lib/alternatives/spark* /var/lib/alternatives/sqoop* /var/lib/alternatives/statestored /var/lib/alternatives/whirr /var/lib/alternatives/yarn /var/lib/alternatives/zookeeper*
# Hadoop data directories
[root@hadoop102 /]# rm -rf /dfs /yarn
# Installation directory and offline repository directory
[root@hadoop102 /]# rm -rf /opt/cloudera/
5.7 Stop and remove the database
# Stop the service
[root@hadoop102 /]# service mysql stop
# Uninstall the database
[root@hadoop102 /]# yum remove MySQL*
# Delete the data directory
[root@hadoop102 ~]# rm -rf /var/lib/mysql/
[root@hadoop102 ~]# rm -rf /usr/my.cnf
5.8 One-step removal script
The script in this section performs the steps that come after section 5.4.
[root@hadoop102 bin]# pwd
/root/bin
[root@hadoop102 bin]# vim delete-cloudera.sh
Add the following content:
#!/bin/bash
# Run the cleanup steps that follow section 5.4 on each node over ssh.
for i in hadoop102 hadoop103 hadoop104
do
    echo "--------- $i ----------"
    # Keep the remote command on a single line: a line break inside the quotes would end the rm command early.
    # umount is followed by ';' so that an already unmounted cm_processes does not skip the rest of the cleanup.
    ssh $i "source /etc/profile && umount cm_processes; rm -rf /var/lib/cloudera* /var/log/cloudera* /var/run/cloudera* /opt/module/cm/ /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper /usr/lib/hadoop /usr/lib/hadoop* /usr/lib/hive /usr/lib/hbase /usr/lib/oozie /usr/lib/sqoop* /usr/lib/zookeeper /usr/lib/bigtop* /usr/lib/flume-ng /usr/lib/hcatalog /var/run/hadoop* /var/run/flume-ng /var/run/cloudera* /var/run/oozie /var/run/sqoop2 /var/run/zookeeper /var/run/hbase /var/run/impala /var/run/hive /var/run/hdfs-sockets /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie /etc/cloudera* /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog /etc/alternatives/avro-tools /etc/alternatives/beeline /etc/alternatives/catalogd /etc/alternatives/cli_* /etc/alternatives/flume* /etc/alternatives/hadoop* /etc/alternatives/hbase* /etc/alternatives/hcat /etc/alternatives/hdfs /etc/alternatives/hive* /etc/alternatives/hue* /etc/alternatives/impala* /etc/alternatives/llama* /etc/alternatives/load_gen /etc/alternatives/mahout* /etc/alternatives/mapred /etc/alternatives/oozie* /etc/alternatives/pig* /etc/alternatives/pyspark /etc/alternatives/sentry* /etc/alternatives/solr* /etc/alternatives/spark* /etc/alternatives/sqoop* /etc/alternatives/statestored /etc/alternatives/whirr /etc/alternatives/yarn /etc/alternatives/zookeeper* /var/lib/alternatives/avro-tools /var/lib/alternatives/beeline /var/lib/alternatives/catalogd /var/lib/alternatives/cli_* /var/lib/alternatives/flume* /var/lib/alternatives/hadoop* /var/lib/alternatives/hbase* /var/lib/alternatives/hcat /var/lib/alternatives/hdfs /var/lib/alternatives/hive* /var/lib/alternatives/hue* /var/lib/alternatives/impala* /var/lib/alternatives/llama* /var/lib/alternatives/load_gen /var/lib/alternatives/mahout* /var/lib/alternatives/mapred /var/lib/alternatives/oozie* /var/lib/alternatives/pig* /var/lib/alternatives/pyspark /var/lib/alternatives/sentry* /var/lib/alternatives/solr* /var/lib/alternatives/spark* /var/lib/alternatives/sqoop* /var/lib/alternatives/statestored /var/lib/alternatives/whirr /var/lib/alternatives/yarn /var/lib/alternatives/zookeeper* /dfs /yarn /opt/cloudera/ && userdel cloudera-scm && service mysql stop && yum remove MySQL* && rm -rf /var/lib/mysql/ /usr/my.cnf"
done
[root@hadoop102 bin]# chmod 777 delete-cloudera.sh
[root@hadoop102 bin]# ./delete-cloudera.sh
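After the removal script has run, it can be useful to confirm that nothing was left behind. The following is a rough sketch of such a check; the list_leftovers helper is a hypothetical name, and the paths passed to it are only a sample of the directories removed in sections 5.5 to 5.7. Run it on each node.

```shell
#!/bin/sh
# Sketch: print every given path that still exists after the cleanup.
list_leftovers() {
    for p in "$@"; do
        # An unmatched glob stays a literal pattern, which fails the -e test
        if [ -e "$p" ]; then
            echo "$p"
        fi
    done
}

# Sample of the paths removed above; no output means nothing is left.
list_leftovers /opt/cloudera /opt/module/cm /var/lib/cloudera* /var/log/cloudera* \
    /etc/hadoop* /dfs /yarn /var/lib/mysql
```

Any path that is printed still exists on that node and can be removed by hand with the matching command from the earlier sections.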
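Chapter 6 Hands-on Project: Configuration Changes
6.1 Configure HDFS access by hostname
On Alibaba Cloud, the Hadoop cluster must be accessed by hostname rather than by IP address, so enable the following setting: dfs.client.use.datanode.hostname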
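One way to confirm the setting from a cluster node is to ask the HDFS client for the effective value. This is a small sketch; the check_hostname_access function name is hypothetical, while hdfs getconf -confKey is the standard command for reading a configuration key. It assumes the client configuration has already been deployed to the node.

```shell
#!/bin/sh
# Sketch: read the effective value of dfs.client.use.datanode.hostname.
check_hostname_access() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs command not found; run this on a cluster node"
        return 0
    fi
    hdfs getconf -confKey dfs.client.use.datanode.hostname
}

check_hostname_access
```

If the setting is enabled and deployed, the command should report true.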
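6.2 Set the ratio of physical cores to virtual cores
The Alibaba Cloud machines purchased here have 6 physical cores in total. For demonstration purposes the number of virtual cores is doubled; in real deployments the ratio of physical cores to virtual cores is usually 1:1 or 1:2.
Go to the YARN configuration, search for 'yarn.nodemanager.resource.cpu-vcores', and change the value so that the 2 physical cores on each machine are exposed as 4 virtual cores.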
6.3 Change the maximum CPU request for a single container
Set the yarn.scheduler.maximum-allocation-vcores parameter to 4 cores.
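The core settings in sections 6.2 and 6.3 follow from simple arithmetic. The sketch below works through it with the numbers from the text (3 nodes, 2 physical cores each, a 1:2 ratio); the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: derive the vcore settings from the cluster size described above.
nodes=3
physical_cores_per_node=2     # 6 physical cores across 3 machines
vcore_multiplier=2            # 1:2 physical-to-virtual ratio used for the demo

vcores_per_node=$((physical_cores_per_node * vcore_multiplier))
cluster_vcores=$((vcores_per_node * nodes))

echo "yarn.nodemanager.resource.cpu-vcores=$vcores_per_node"
# A single container cannot usefully request more vcores than one node offers
echo "yarn.scheduler.maximum-allocation-vcores=$vcores_per_node"
echo "total vcores in the cluster: $cluster_vcores"
```

With these inputs each node offers 4 vcores, which is also the largest request a single container is allowed to make.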
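6.4 Set the memory size per task container and per node
Increase the default size of each task container from 1G to 4G. In the current cluster each node has 8G of physical memory, so set the memory that YARN may use on each node to 7G.
Change yarn.scheduler.maximum-allocation-mb, the memory size for each task container.
Change yarn.nodemanager.resource.memory-mb, the memory size for each node.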
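Both parameters take values in megabytes, so the sizes given in gigabytes need converting. A minimal sketch of the conversion, using the 4G and 7G figures from the text (the gb_to_mb helper is a hypothetical name):

```shell
#!/bin/sh
# Sketch: convert the sizes from section 6.4 into the MB values YARN expects.
gb_to_mb() {
    echo $(( $1 * 1024 ))
}

container_mb=$(gb_to_mb 4)   # per-container limit
node_mb=$(gb_to_mb 7)        # memory YARN may use on each node

echo "yarn.scheduler.maximum-allocation-mb=$container_mb"
echo "yarn.nodemanager.resource.memory-mb=$node_mb"
# How many maximum-size containers fit on one node
echo "max-size containers per node: $((node_mb / container_mb))"
```

Note that with these values only one full-size 4G container fits on a node at a time; smaller containers can share the remaining memory.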
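6.5 Disable Spark dynamic resource allocation
Disable the spark.dynamicAllocation.enabled parameter; otherwise the amount of resources allocated is not under your control.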
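With dynamic allocation off, the resources of a job are whatever is requested at submit time. The sketch below only assembles and prints such a spark-submit command line; the executor counts and sizes are hypothetical example values, not recommendations from this tutorial, and the application jar is left as a placeholder.

```shell
#!/bin/sh
# Sketch: a spark-submit invocation with fixed, explicitly requested resources.
num_executors=2        # hypothetical example value
executor_cores=2       # hypothetical example value
executor_memory=2g     # hypothetical example value

spark_submit_args="--master yarn \
--conf spark.dynamicAllocation.enabled=false \
--num-executors $num_executors \
--executor-cores $executor_cores \
--executor-memory $executor_memory"

# Print the command instead of running it; append your own application jar.
echo "spark-submit $spark_submit_args <application-jar>"
```

Setting the property in the cluster configuration, as described above, has the same effect for every job; the --conf flag shown here applies it per submission.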
6.6 Change the HDFS replication factor
Set the replication factor (dfs.replication) to 1.
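6.7 Set up the Capacity Scheduler
CDH uses the Fair Scheduler by default; switch it to the Capacity Scheduler.
By default there is a root queue, which can be modified.
Add two queues, spark and hive: give spark 80% of the YARN cluster resources and hive 20%.
After finishing the configuration, restart the services and check the scheduler in the YARN web UI: it has changed and now shows the hive and spark queues.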
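For reference, the queue layout described above corresponds to a handful of standard Capacity Scheduler properties. The sketch below prints them from a queue list and checks that the capacities add up to 100, which the scheduler requires for the children of a queue. The property names are the standard Hadoop ones; the queues variable and its name:percent format are made up for this illustration.

```shell
#!/bin/sh
# Sketch: print the Capacity Scheduler properties for the spark/hive layout.
queues="spark:80 hive:20"   # hypothetical "name:percent" list

total=0
names=""
echo "yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler"
for q in $queues; do
    name=${q%%:*}
    pct=${q##*:}
    names="${names:+$names,}$name"
    total=$((total + pct))
    echo "yarn.scheduler.capacity.root.$name.capacity=$pct"
done
echo "yarn.scheduler.capacity.root.queues=$names"

# Capacities of the queues directly under root must sum to 100
if [ "$total" -ne 100 ]; then
    echo "queue capacities sum to $total, expected 100" >&2
fi
```

In Cloudera Manager these values are entered through the YARN configuration pages rather than by editing files by hand.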
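6.8 Modify the hive-site.xml configuration
We removed YARN's default queue, but SQL executed in Hive still goes to the default queue unless told otherwise, so running SQL in Hive fails if nothing is configured. We therefore need to set three parameters in Hive.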
set mapreduce.job.queuename=hive;
set mapred.job.queue.name=hive;
set mapred.queue.names=hive;
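Note that when these three parameters are set in a Hive session, they apply only to that session, so the effect is temporary. How do we make them permanent? By modifying the hive-site.xml configuration file. With the native Apache version of Hive, we could edit hive-site.xml directly in the conf directory under Hive's installation directory on the server. How is this done in a CDH environment? See the steps below.
Go to Hive, select the Configuration tab, then search for hive-site.xml.
In 'Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml', add the three parameters listed above (mapreduce.job.queuename, mapred.job.queue.name, mapred.queue.names).
In 'Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml', add them again.
Then restart the stale services and redeploy the stale client configuration. After entering Hive again, Hive SQL works normally.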
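The safety valve fields accept entries in the usual hive-site.xml property form. As a rough sketch, assuming the safety valve is edited in its XML view, the small helper below prints the three property blocks for a given queue so they can be pasted into both fields; hive_queue_properties is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the three queue properties as hive-site.xml property blocks.
hive_queue_properties() {
    queue=$1
    for key in mapreduce.job.queuename mapred.job.queue.name mapred.queue.names; do
        printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$key" "$queue"
    done
}

hive_queue_properties hive
```

The same output goes into both the service and the client safety valve, since each generates its own copy of hive-site.xml.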