Environment

Three 天翼云 (China Telecom cloud) hosts (node229, node452, node440)

OS: CentOS 6.5, 64-bit

JDK: Oracle JDK 1.7.0_45

Install ZooKeeper (cluster mode)

Node Type:

node229, node452, node440



 



1. Install zookeeper and zookeeper-server on all nodes:

yum install -y zookeeper zookeeper-server



 



2. Edit the ZooKeeper configuration file on all nodes:

vi /etc/zookeeper/conf/zoo.cfg

Add the server entries:

server.1=node229:2888:3888
server.2=node452:2888:3888
server.3=node440:2888:3888
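For reference, the full zoo.cfg might then look roughly like this (a sketch assuming the defaults shipped by the zookeeper-server package; dataDir in particular may differ on your install):

# /etc/zookeeper/conf/zoo.cfg (sketch, default packaging paths assumed)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=node229:2888:3888
server.2=node452:2888:3888
server.3=node440:2888:3888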



 



3. Initialize zookeeper-server on all nodes

Each node's myid must be unique:

node229: service zookeeper-server init --myid=1

node452: service zookeeper-server init --myid=2

node440: service zookeeper-server init --myid=3
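The init command writes the id into the myid file under dataDir; to double-check it on each node (path assumes the default dataDir /var/lib/zookeeper):

cat /var/lib/zookeeper/myid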



 



4. Start ZooKeeper on all nodes:

service zookeeper-server start



 



5. Check ZooKeeper status:

zookeeper-server status
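You can also probe each server directly with ZooKeeper's four-letter commands over the client port (requires nc); in a healthy ensemble one node reports Mode: leader and the other two Mode: follower:

echo ruok | nc node229 2181          # should answer: imok
echo stat | nc node229 2181 | grep Mode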



 



Install CDH (cluster mode, HDFS + YARN)



Node Type:



namenode: node229



datanode: node229, node452, node440



yarn-resourcemanager: node452



yarn-nodemanager: node229, node452, node440



mapreduce-historyserver: node440



yarn-proxyserver: node440



 



node229:



yum install hadoop-hdfs-namenode



node452:



yum install hadoop-yarn-resourcemanager



node440:



yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver



All nodes:



yum install hadoop-client



yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce
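Once the packages are in place, a quick sanity check confirms the client is on the PATH and shows which CDH build was installed:

hadoop version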



 



Deploy CDH



1. Deploy HDFS



(1) Configuration files



core-site.xml

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://node229:8020</value>
</property>

<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>

 



hdfs-site.xml

<property>
  <name>dfs.permissions.superusergroup</name>
  <value>hadoop</value>
</property>

<property>
  <name>dfs.namenode.name.dir</name>
  <value>/hadoop/hdfs/namenode</value>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/hadoop/hdfs/datanode</value>
</property>

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>



slaves

node229
node452
node440



 



(2) Create the namenode and datanode directories



namenode:



mkdir -p /hadoop/hdfs/namenode
chown -R hdfs:hdfs /hadoop/hdfs/namenode
chmod 700 /hadoop/hdfs/namenode



datanode:
mkdir -p /hadoop/hdfs/datanode
chown -R hdfs:hdfs /hadoop/hdfs/datanode
chmod 700 /hadoop/hdfs/datanode



 



(3) Format the namenode (run once, on node229 only)



sudo -u hdfs hadoop namenode -format



 



(4) Start HDFS



namenode (node229):



service hadoop-hdfs-namenode start



datanode (node229, node452, node440):



service hadoop-hdfs-datanode start



(Alternatively, to start every HDFS daemon installed on a node: for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done)



 



(5) Check HDFS status



sudo -u hdfs hdfs dfsadmin -report



sudo -u hdfs hadoop fs -ls -R -h /
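Because dfs.webhdfs.enabled is set to true in hdfs-site.xml above, the same information is also reachable over HTTP; for example, listing the root directory through the WebHDFS REST API:

curl -i "http://node229:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hdfs"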



 



(6) Create the HDFS /tmp directory



sudo -u hdfs hadoop fs -mkdir /tmp



sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
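A listing of / should now show /tmp with the sticky bit set (mode drwxrwxrwt):

sudo -u hdfs hadoop fs -ls / | grep tmp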



 



NameNode web UI: http://101.227.253.62:50070



 



2. Deploy YARN



(1) Configure YARN



mapred-site.xml:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

<property>
  <name>mapreduce.jobhistory.address</name>
  <value>node440:10020</value>
</property>

<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>node440:19888</value>
</property>



yarn-site.xml

<property>
  <name>yarn.resourcemanager.address</name>
  <value>node452:8032</value>
</property>

<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>node452:8030</value>
</property>

<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>node452:8088</value>
</property>

<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>node452:8031</value>
</property>

<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>node452:8033</value>
</property>

<property>
  <description>Classpath for typical applications.</description>
  <name>yarn.application.classpath</name>
  <value>
    $HADOOP_CONF_DIR,
    $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
    $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
    $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
    $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
  </value>
</property>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/hadoop/data/yarn/local</value>
</property>

<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/hadoop/data/yarn/logs</value>
</property>

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>

<property>
  <description>Where to aggregate logs</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/var/log/hadoop-yarn/apps</value>
</property>

<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/user</value>
</property>


 



(2) Create local directories on all nodemanager nodes



sudo mkdir -p /hadoop/data/yarn/local
sudo chown -R yarn:yarn /hadoop/data/yarn/local




sudo mkdir -p /hadoop/data/yarn/logs
sudo chown -R yarn:yarn /hadoop/data/yarn/logs



 



(3) Create HDFS directories



sudo -u hdfs hadoop fs -mkdir -p /user/history
sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
sudo -u hdfs hadoop fs -chown yarn /user/history



 



sudo -u hdfs hadoop fs -mkdir -p /var/log/hadoop-yarn
sudo -u hdfs hadoop fs -chown yarn:mapred /var/log/hadoop-yarn



 



(4) Start YARN

ResourceManager (node452):
sudo service hadoop-yarn-resourcemanager start

NodeManager (node229, node452, node440):
sudo service hadoop-yarn-nodemanager start

MapReduce JobHistory Server (node440):
sudo service hadoop-mapreduce-historyserver start



 



(5) Create the YARN user's HDFS home directory



sudo -u hdfs hadoop fs -mkdir -p /user/$USER
sudo -u hdfs hadoop fs -chown $USER /user/$USER
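Note that $USER is expanded by your shell before sudo runs, so it names the account you are logged in as, not hdfs. To prepare a home directory for another account, substitute the name explicitly (alice below is just a placeholder):

sudo -u hdfs hadoop fs -mkdir -p /user/alice   # 'alice' is a placeholder user name
sudo -u hdfs hadoop fs -chown alice /user/alice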



 



(6) Test

Check node status:

yarn node -list -all



hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomwriter input
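The bundled pi estimator makes another quick end-to-end test (2 map tasks, 100 samples each; any small numbers will do):

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 100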



 



(7) Stop the services



sudo service hadoop-yarn-resourcemanager stop



sudo service hadoop-yarn-nodemanager stop



sudo service hadoop-mapreduce-historyserver stop



 



ResourceManager web UI: http://101.227.253.63:8088/



 



Install and deploy HBase



Node Type:



hbase-master: node229, node440
hbase-regionserver: node229, node452, node440
hbase-thrift: node440
hbase-rest: node229, node452, node440



 



1. Install HBase



(1) Adjust system limits

In /etc/security/limits.conf, add:

hdfs - nofile 32768
hbase - nofile 32768
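The new limits apply only to fresh login sessions (they are enforced by PAM at login). One way to verify, since the service accounts normally have no login shell:

su - hdfs -s /bin/bash -c 'ulimit -n'    # should print 32768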



 



In hdfs-site.xml, add:

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>



(2) Install HBase



hbase-master:

sudo yum install hbase hbase-master

hbase-regionserver:

sudo yum install hbase hbase-regionserver

hbase-thrift:

sudo yum install hbase-thrift

hbase-rest:

sudo yum install hbase-rest



 



(3) Configure HBase



hbase-site.xml

<property>
  <name>hbase.rest.port</name>
  <value>60050</value>
</property>

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>node229,node452,node440</value>
</property>

<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>

<property>
  <name>hbase.tmp.dir</name>
  <value>/hadoop/hbase</value>
</property>

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://node229:8020/hbase</value>
</property>


(4) Create the local directory



mkdir -p /hadoop/hbase



chown -R hbase:hbase /hadoop/hbase



 



(5) Create the HBase directory in HDFS



sudo -u hdfs hadoop fs -mkdir /hbase/



sudo -u hdfs hadoop fs -chown hbase /hbase



 



(6) Start HBase



On each node, start the services that are installed there (per the Node Type list above):

sudo service hbase-master start

sudo service hbase-regionserver start

sudo service hbase-thrift start

sudo service hbase-rest start
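A quick smoke test from any node with the HBase client installed (table and column family names here are arbitrary examples):

# create a table, write one cell, read it back, then clean up
echo "create 't1', 'f1'
put 't1', 'r1', 'f1:c1', 'v1'
scan 't1'
disable 't1'
drop 't1'" | hbase shell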



 



HBase Master web UI: http://101.227.253.62:60010