Hadoop Pseudo-Distributed Mode Deployment


Hadoop 2.x:

Official site: hadoop.apache.org
Three components:


HDFS: distributed file system, for storage


MapReduce: distributed computing


YARN: resources (CPU + memory) and job scheduling/monitoring



Documentation:


    http://hadoop.apache.org/docs/r2.8.2/



Deployment modes:


1. Standalone mode: a single Java process



2. Pseudo-Distributed Mode: for development/learning; multiple Java processes on one machine



3. Cluster Mode: for production; multiple Java processes across multiple machines





Pseudo-distributed deployment: HDFS
1. Create a user for the Hadoop service



# useradd hadoop



# id hadoop



# vi /etc/sudoers



    hadoop ALL=(root) NOPASSWD:ALL
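
A quick check that the sudoers entry took effect (assuming the hadoop user created above):

    # su - hadoop
    $ sudo ls /root    # should succeed without prompting for a password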





2. Install Java


Use: Oracle JDK 1.8



Try not to use: OpenJDK



    2.1 Extract the archive + set environment variables



    2.2 Install under /usr/java (the convention used in the CDH course)



        Best to verify with: which java
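
A minimal sketch of steps 2.1/2.2; the archive name jdk-8u45-linux-x64.tar.gz is an assumption, while /usr/java/jdk1.8.0_45 matches the JAVA_HOME set later in hadoop-env.sh:

    # mkdir -p /usr/java
    # tar -zxvf jdk-8u45-linux-x64.tar.gz -C /usr/java    # archive name is hypothetical
    # vi /etc/profile
        export JAVA_HOME=/usr/java/jdk1.8.0_45
        export PATH=$JAVA_HOME/bin:$PATH
    # source /etc/profile
    # which java
    /usr/java/jdk1.8.0_45/bin/java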



3. Set up SSH and make sure it is running



Check (it is usually installed by default):



    service sshd status
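
(On systemd-based distributions such as CentOS 7, the equivalent check is: systemctl status sshd)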




4. Extract Hadoop

Extract:



    software]# tar -zxvf hadoop-2.8.1.tar.gz
Change ownership:



    chown -R root:root hadoop-2.8.1



Configure environment variables

vi /etc/profile
    export HADOOP_HOME=/opt/software/hadoop-2.8.1
    export PATH=$HADOOP_HOME/bin:$PROTOC_HOME/bin:$FINDBUGS_HOME/bin:$MAVEN_HOME/bin:$JAVA_HOME/bin:$PATH
    # (PROTOC_HOME/FINDBUGS_HOME/MAVEN_HOME are left over from a source-build setup; harmless if unset)
    # source /etc/profile    # reload the profile so the new PATH takes effect
    # which hadoop
    /opt/software/hadoop-2.8.1/bin/hadoop



Create a symlink:



software]# ln -s /opt/software/hadoop-2.8.1 hadoop
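
(The symlink gives a stable path, /opt/software/hadoop, so a future upgrade only has to re-point it at the new version directory.)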



Change the user and group:



  

software]# chown -R hadoop:hadoop hadoop



changes the symlink itself



software]# chown -R hadoop:hadoop hadoop/*



changes the contents inside the symlinked directory


software]# chown -R hadoop:hadoop hadoop-2.8.1



changes the original directory



 



    chown -R hadoop:hadoop <directory>: changes the directory and everything inside it



    chown -R hadoop:hadoop <symlink>: changes the symlink itself, not the contents behind it



    chown -R hadoop:hadoop <symlink>/*: leaves the symlink unchanged, changes only the contents behind it
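
To confirm which owner each variant changed, compare the link with the real directory (output abbreviated):

    software]# ls -ld hadoop hadoop-2.8.1
    lrwxrwxrwx ... hadoop hadoop ... hadoop -> /opt/software/hadoop-2.8.1
    drwxr-xr-x ... hadoop hadoop ... hadoop-2.8.1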



   



    software]# cd hadoop-2.8.1



    # Remove the Windows .cmd scripts to avoid confusion


    hadoop-2.8.1]# rm -f bin/*.cmd
    hadoop-2.8.1]# rm -f sbin/*.cmd


        bin: shell scripts that run Hadoop commands



        etc: configuration files



        lib: libraries



        sbin: scripts that start and stop the Hadoop components (daemons)
        hadoop-2.8.1/share/hadoop/hdfs: jar packages




5. Configure the Hadoop component files (core-site.xml and friends)

hadoop-env.sh     : Hadoop environment configuration
core-site.xml     : Hadoop core configuration file
hdfs-site.xml     : configuration for the daemons the HDFS service starts



[mapred-site.xml: configuration needed for MapReduce computation] only used when running jar jobs



yarn-site.xml     : configuration for the daemons the YARN service starts
slaves            : hostnames of the cluster machines



Configure core-site.xml

etc/hadoop/core-site.xml:
 
  
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration> 
  
etc/hadoop/hdfs-site.xml:
 
  
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration> 
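
Optionally, Hadoop stores its data under hadoop.tmp.dir, which defaults to /tmp/hadoop-${user.name} and is wiped on reboot. A persistent directory can be set in core-site.xml; the path below is only an example:

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/software/hadoop-2.8.1/tmp</value>
    </property>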
  
6. Configure the SSH trust relationship (passwordless login) for the root user


~]# rm -rf .ssh
 
  
~]# ssh-keygen
~]# cd .ssh
 .ssh]# cat id_rsa.pub >> authorized_keys    # add our own public key to the trusted list
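
If ssh localhost still asks for a password afterwards, the usual culprit is file permissions (sshd rejects a group/world-writable key file); tightening them is a common fix, though not always required:

    .ssh]# chmod 600 authorized_keys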
  
# First run
 
  
 ~]# ssh localhost date
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is ec:85:86:32:22:94:d1:a9:f2:0b:c5:12:3f:ba:e2:61.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Sun May 13 21:49:14 CST 2018 
  
# Second run
~]# ssh localhost date
Sun May 13 21:49:17 CST 2018


7. Format the file system


   hadoop-2.8.1]# bin/hdfs namenode -format
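
A successful format prints a line like this near the end of the output (the path depends on hadoop.tmp.dir):

    INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.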

8. Configure JAVA_HOME

hadoop]# vi  hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_45
9.Start NameNode daemon and DataNode daemon:
hadoop-2.8.1]# sbin/start-dfs.sh    # run the startup shell script
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-namenode-hadoop000.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-datanode-hadoop000.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is ec:85:86:32:22:94:d1:a9:f2:0b:c5:12:3f:ba:e2:61.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-root-secondarynamenode-hadoop000.out

/]# jps
9861 DataNode
9768 NameNode
10056 SecondaryNameNode
10492 Jps
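
With all three daemons up, a quick smoke test following the Hadoop 2.x single-node guide; port 50070 is the default NameNode web UI in 2.x:

    # NameNode web UI
    http://localhost:50070/

    # create a home directory in HDFS and list the root
    hadoop-2.8.1]# bin/hdfs dfs -mkdir -p /user/root
    hadoop-2.8.1]# bin/hdfs dfs -ls /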