This chapter walks step by step through installing Hadoop 3.0.0 on Windows 10, then starts its services and performs some simple HDFS operations.

Preparation

1. Official Hadoop download page: http://hadoop.apache.org/releases.html

After selecting the 3.0.0 release you are taken to a mirror page; click the link highlighted in the red box to download. A dedicated download tool is recommended for better speed.

2. Extract the tar.gz archive to the root of drive D:

![Screenshot: hadoop-3.0.0 extracted to the root of drive D](//img-blog.csdn.net/20180315224259680?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

Note: the archive must be extracted with administrator privileges.

3. Configure the environment variables:

  • Add a HADOOP_HOME variable:

![Screenshot: HADOOP_HOME environment variable](//img-blog.csdn.net/20180315224305785?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

  • Append the following entry to Path:

![Screenshot: Path entry](//img-blog.csdn.net/20180315224311319?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

4. Hadoop depends on the JDK, and the JDK path must not contain spaces, so install it directly into a directory such as the following:

![Screenshot: JDK install directory](//img-blog.csdn.net/20180315224317260?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)
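The no-space rule can be checked mechanically. The sketch below is illustrative only (the helper name and example paths are made up, not part of any Hadoop tooling):

```python
import ntpath  # Windows path semantics, regardless of the host OS

def is_safe_hadoop_path(path):
    """Return True if *path* is usable for JAVA_HOME/HADOOP_HOME on Windows:
    it must be an absolute Windows path containing no spaces, since spaces
    break the Hadoop .cmd launcher scripts."""
    return ntpath.isabs(path) and " " not in path

print(is_safe_hadoop_path(r"C:\Program Files\Java\jdk1.8.0_151"))  # False
print(is_safe_hadoop_path(r"D:\hadoop-3.0.0\jdk1.8.0_151"))        # True
```

This is why the default `C:\Program Files\Java` location is avoided in this guide.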

Hadoop Configuration

1. Edit the D:/hadoop-3.0.0/etc/hadoop/core-site.xml configuration:

<configuration>
    <!-- fs.defaultFS is the preferred key in Hadoop 3; fs.default.name is a deprecated alias that still works -->
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
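As a quick sanity check, the property can be read back with a few lines of Python; this sketch embeds the XML inline rather than reading the file from disk:

```python
import xml.etree.ElementTree as ET

CORE_SITE = """
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
"""

def get_property(xml_text, key):
    """Look up a Hadoop property value in a *-site.xml document."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == key:
            return prop.findtext("value")
    return None

print(get_property(CORE_SITE, "fs.default.name"))  # hdfs://localhost:9000
```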

2. Edit the D:/hadoop-3.0.0/etc/hadoop/mapred-site.xml configuration:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

3. Create a data directory under D:/hadoop-3.0.0 to serve as the data storage path:

![Screenshot: data directory under D:/hadoop-3.0.0](//img-blog.csdn.net/20180315224324872?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

  • Create a datanode directory under D:/hadoop-3.0.0/data;
  • Create a namenode directory under D:/hadoop-3.0.0/data;

4. Edit the D:/hadoop-3.0.0/etc/hadoop/hdfs-site.xml configuration:

<configuration>
    <!-- Set replication to 1 because this is a single-node Hadoop setup -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <!-- Disable permission checks for local experimentation -->
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/D:/hadoop-3.0.0/data/namenode</value>
    </property>
    <property>
        <name>fs.checkpoint.dir</name>
        <value>/D:/hadoop-3.0.0/data/snn</value>
    </property>
    <property>
        <name>fs.checkpoint.edits.dir</name>
        <value>/D:/hadoop-3.0.0/data/snn</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/D:/hadoop-3.0.0/data/datanode</value>
    </property>
</configuration>

5. Edit the D:/hadoop-3.0.0/etc/hadoop/yarn-site.xml configuration:

<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

6. Edit the D:/hadoop-3.0.0/etc/hadoop/hadoop-env.cmd configuration: find the line "set JAVA_HOME=%JAVA_HOME%" and replace it with "set JAVA_HOME=D:\hadoop-3.0.0\jdk1.8.0_151"

7. Replace the bin directory: download and unzip https://github.com/steveloughran/winutils

![Screenshot: winutils repository](//img-blog.csdn.net/20180315224352154?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

Find the matching Hadoop version and replace the entire bin directory with its contents.

![Screenshot: replaced bin directory](//img-blog.csdn.net/20180315224357646?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

This completes the configuration.

Starting the Services

1. Format the namenode: D:\hadoop-3.0.0\bin> hdfs namenode -format

![Screenshot: namenode format output](//img-blog.csdn.net/20180315224405136?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

2. Start the services with start-all.cmd:

![Screenshot: start-all.cmd output](//img-blog.csdn.net/20180315224409341?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

3. The following four services are now running:

  • Hadoop Namenode
  • Hadoop Datanode
  • YARN Resource Manager
  • YARN Node Manager

![Screenshot: the four service windows](//img-blog.csdn.net/20180315224414146?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

Working with HDFS

1. Open http://127.0.0.1:8088/ to view the status of all cluster nodes:

![Screenshot: YARN cluster overview](//img-blog.csdn.net/20180315224421485?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

2. Open http://localhost:9870/ to reach the file management page:

  • Enter the file management page:

![Screenshot: NameNode web UI](//img-blog.csdn.net/20180315224426666?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

  • Create a directory:

![Screenshot: creating a directory](//img-blog.csdn.net/20180315224431741?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

  • Upload a file:

![Screenshot: uploading a file](//img-blog.csdn.net/20180315224436281?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

  • Upload succeeded:

![Screenshot: upload completed](//img-blog.csdn.net/20180315224443247?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

Note: in earlier versions the file management UI was served on port 50070; in 3.0.0 it moved to port 9870. The change is documented in the official HDFS user guide:

http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Web_Interface

![Screenshot: official documentation of the web interface ports](//img-blog.csdn.net/20180315224450428?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)
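The same address also serves the WebHDFS REST interface, so the port change affects REST clients as well. As an illustrative sketch (the helper function is hypothetical, though the URL layout follows the documented WebHDFS REST API):

```python
def webhdfs_url(path, op, host="localhost", port=9870):
    """Build a WebHDFS REST URL for *path*; in Hadoop 2.x the default
    NameNode HTTP port was 50070, in 3.0.0 it is 9870."""
    return "http://{}:{}/webhdfs/v1{}?op={}".format(host, port, path, op)

print(webhdfs_url("/user", "LISTSTATUS"))
# http://localhost:9870/webhdfs/v1/user?op=LISTSTATUS
```

Fetching that URL against a running cluster returns the directory listing as JSON.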

3. Work with files from the hadoop command line:

  • Create a directory with mkdir: hadoop fs -mkdir hdfs://localhost:9000/user

![Screenshot: mkdir command output](//img-blog.csdn.net/20180315224458915?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

The newly created user directory:

![Screenshot: user directory in the file browser](//img-blog.csdn.net/20180315224502361?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

  • Upload a file with put: hadoop fs -put C:\Users\songhaifeng\Desktop\<file>.txt hdfs://localhost:9000/user/

![Screenshot: put command output](//img-blog.csdn.net/2018031522450742?watermark/2/text/Ly9ibG9nLmNzZG4ubmV0L3NvbmdoYWlmZW5nc2h1YWlnZQ==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70)

The uploaded file now appears under /user.

  • List the files in a directory with ls: hadoop fs -ls hdfs://localhost:9000/user/
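The three commands above share the same shape: hadoop fs -&lt;op&gt; [local-source] &lt;hdfs-uri&gt;. A sketch of a helper that assembles the argument list, suitable for passing to subprocess.run against a running cluster (the function name is made up for illustration):

```python
def hadoop_fs_cmd(op, *args, fs="hdfs://localhost:9000"):
    """Build the argv list for a `hadoop fs` file operation, qualifying
    the final path argument with the HDFS URI from core-site.xml."""
    *local, target = args  # any leading args are local paths, the last is the HDFS path
    return ["hadoop", "fs", "-" + op, *local, fs + target]

print(hadoop_fs_cmd("mkdir", "/user"))
# ['hadoop', 'fs', '-mkdir', 'hdfs://localhost:9000/user']
print(hadoop_fs_cmd("put", "a.txt", "/user/"))
# ['hadoop', 'fs', '-put', 'a.txt', 'hdfs://localhost:9000/user/']
```

Passing the list form avoids shell quoting issues with Windows paths.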
