Preface

For work I need to modify the Hadoop source code, so the first step is to build the source myself. The build is covered in two environments:

  1. Linux (CentOS 7.0)
  2. macOS

Building the Hadoop Source

Linux Environment

  • Environment (a yum install sketch for these prerequisites follows the list):
1. OS: CentOS 7.0
2. Hadoop: hadoop-2.8.4-src.tar.gz
3. JDK: 1.8.0_201
4. Maven: 3.6.1
5. cmake
6. protobuf: protobuf-2.5.0.tar.gz (it must be version 2.5, and it must be installed as root)
Commands:
	$ tar zvxf protobuf-2.5.0.tar.gz
	$ cd protobuf-2.5.0
	$ ./configure
	$ make  # this takes quite a while
	$ make check
	$ make install
	$ protoc --version
7. ant (install it first: yum install ant -y)
8. findbugs: findbugs-3.0.1.tar.gz
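
For reference, here is a minimal sketch of pulling in the build toolchain and native-library headers on CentOS 7 with yum. The exact package list is an assumption (add or drop packages based on what your build actually complains about), and protobuf 2.5.0 still has to be built from source as shown above.

# toolchain and headers typically needed for the -Pnative build
yum install -y gcc gcc-c++ make cmake autoconf automake libtool
yum install -y openssl-devel zlib-devel snappy-devel bzip2-devel
# quick sanity check of the tools listed in the environment above
java -version && mvn -version && cmake --version && protoc --version
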
Download and Build

Download the source tarball hadoop-2.8.4-src.tar.gz from the official Hadoop site and extract it:

$ tar -zxvf hadoop-2.8.4-src.tar.gz
$ cd hadoop-2.8.4-src
$ mvn clean package -Pdist,native -DskipTests -Dtar  # build command

The build output looks like this:

[INFO] Reactor Summary for Apache Hadoop Main 2.8.4:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  0.833 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  0.956 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  0.839 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  2.756 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.159 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.267 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  3.034 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  4.256 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [  5.216 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  3.084 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [01:29 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [  4.358 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [ 14.050 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.037 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [05:32 min]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [02:19 min]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [ 11.032 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 38.962 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [01:37 min]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [  3.173 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.032 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  0.031 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [ 13.887 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [06:24 min]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  0.032 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [  6.388 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 16.040 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [  2.983 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [01:13 min]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 19.631 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [  1.028 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [  4.461 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [  2.890 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [  2.961 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  0.029 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [  2.342 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [  1.708 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  0.034 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [  4.434 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [  4.995 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [  0.112 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 18.892 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 14.019 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [  3.091 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [  7.572 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [  4.576 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [ 25.799 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [  1.607 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [  4.430 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [  3.405 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [ 37.153 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [ 13.869 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  1.936 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  1.991 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [  4.302 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [  3.270 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  2.109 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [  1.742 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  2.373 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  9.168 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  3.655 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [01:42 min]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [ 20.256 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [  8.162 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  0.835 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [  4.830 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [01:25 min]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [  7.986 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.025 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 53.581 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  29:06 min
[INFO] Finished at: 2019-04-15T17:48:04Z
[INFO] ------------------------------------------------------------------------

Then go into the hadoop-2.8.4-src/hadoop-dist/target directory:

[hadoop@bigdata-01 target]$ ll
total 582764
drwxrwxr-x. 2 hadoop hadoop        28 Apr 15 17:47 antrun
drwxrwxr-x. 3 hadoop hadoop        22 Apr 15 17:47 classes
-rw-rw-r--. 1 hadoop hadoop      2118 Apr 15 17:47 dist-layout-stitching.sh
-rw-rw-r--. 1 hadoop hadoop       651 Apr 15 17:47 dist-tar-stitching.sh
drwxrwxr-x. 9 hadoop hadoop       149 Apr 15 17:47 hadoop-2.8.4
-rw-rw-r--. 1 hadoop hadoop 198465540 Apr 15 17:47 hadoop-2.8.4.tar.gz
-rw-rw-r--. 1 hadoop hadoop     30421 Apr 15 17:47 hadoop-dist-2.8.4.jar
-rw-rw-r--. 1 hadoop hadoop 398181013 Apr 15 17:48 hadoop-dist-2.8.4-javadoc.jar
-rw-rw-r--. 1 hadoop hadoop     27738 Apr 15 17:48 hadoop-dist-2.8.4-sources.jar
-rw-rw-r--. 1 hadoop hadoop     27738 Apr 15 17:48 hadoop-dist-2.8.4-test-sources.jar
drwxrwxr-x. 2 hadoop hadoop        51 Apr 15 17:47 javadoc-bundle-options
drwxrwxr-x. 2 hadoop hadoop        28 Apr 15 17:47 maven-archiver
drwxrwxr-x. 3 hadoop hadoop        22 Apr 15 17:47 maven-shared-archive-resources
drwxrwxr-x. 3 hadoop hadoop        22 Apr 15 17:47 test-classes
drwxrwxr-x. 2 hadoop hadoop         6 Apr 15 17:47 test-dir
[hadoop@bigdata-01 target]$ pwd
/home/hadoop/complier/hadoop-2.8.4-src/hadoop-dist/target

The hadoop-2.8.4.tar.gz file there is the compiled distribution.
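
If you want to confirm that the native (-Pnative) libraries actually made it into the distribution, a quick check like the following can help (a sketch; it assumes you are still in the target directory shown above and that JAVA_HOME is set):

# list the native libraries bundled into the freshly built tarball
tar -tzf hadoop-2.8.4.tar.gz | grep 'lib/native'
# or ask the exploded distribution itself which native codecs it can load
./hadoop-2.8.4/bin/hadoop checknative -a
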

⚠️ Note: everyone's environment is different, so treat any problems you hit case by case.

Mac Environment

macOS does not ship yum; the closest equivalent is Homebrew (brew). Install it, then use it to install packages:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install xxx
  • Environment (a brew install sketch for these prerequisites follows the list):
1. OS: macOS 10.14 (18A391)
2. Hadoop: hadoop-2.8.4-src.tar.gz
3. JDK: 1.8.0_201
4. Maven: 3.6.1
5. cmake
6. protobuf: protobuf-2.5.0.tar.gz (it must be version 2.5, and it must be installed as root)
Commands:
	$ tar zvxf protobuf-2.5.0.tar.gz
	$ cd protobuf-2.5.0
	$ ./configure
	$ make  # this takes quite a while
	$ make check
	$ make install
	$ protoc --version
7. ant ([installation steps](https://www.jianshu.com/p/34bdfb5943e0 "installation steps"))
8. findbugs: findbugs-3.0.tar.gz
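
As on Linux, here is a minimal sketch of installing the remaining build dependencies with brew. The formula names are assumptions (adjust to your setup), and protobuf 2.5.0 generally still needs to be built from source as above, since the current brew formula is far newer.

brew install cmake maven ant openssl snappy zlib
# verify the toolchain before kicking off the build
java -version && mvn -version && cmake --version && protoc --version
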

Build Result

[INFO] Reactor Summary for Apache Hadoop Main 2.8.4:
[INFO] 
[INFO] Apache Hadoop Main ................................. SUCCESS [  1.392 s]
[INFO] Apache Hadoop Build Tools .......................... SUCCESS [  0.767 s]
[INFO] Apache Hadoop Project POM .......................... SUCCESS [  0.962 s]
[INFO] Apache Hadoop Annotations .......................... SUCCESS [  2.552 s]
[INFO] Apache Hadoop Assemblies ........................... SUCCESS [  0.213 s]
[INFO] Apache Hadoop Project Dist POM ..................... SUCCESS [  1.536 s]
[INFO] Apache Hadoop Maven Plugins ........................ SUCCESS [  2.987 s]
[INFO] Apache Hadoop MiniKDC .............................. SUCCESS [  3.301 s]
[INFO] Apache Hadoop Auth ................................. SUCCESS [  4.738 s]
[INFO] Apache Hadoop Auth Examples ........................ SUCCESS [  2.686 s]
[INFO] Apache Hadoop Common ............................... SUCCESS [01:13 min]
[INFO] Apache Hadoop NFS .................................. SUCCESS [  3.900 s]
[INFO] Apache Hadoop KMS .................................. SUCCESS [  9.791 s]
[INFO] Apache Hadoop Common Project ....................... SUCCESS [  0.045 s]
[INFO] Apache Hadoop HDFS Client .......................... SUCCESS [ 18.451 s]
[INFO] Apache Hadoop HDFS ................................. SUCCESS [ 45.606 s]
[INFO] Apache Hadoop HDFS Native Client ................... SUCCESS [ 10.545 s]
[INFO] Apache Hadoop HttpFS ............................... SUCCESS [ 14.501 s]
[INFO] Apache Hadoop HDFS BookKeeper Journal .............. SUCCESS [  3.362 s]
[INFO] Apache Hadoop HDFS-NFS ............................. SUCCESS [  2.758 s]
[INFO] Apache Hadoop HDFS Project ......................... SUCCESS [  0.047 s]
[INFO] Apache Hadoop YARN ................................. SUCCESS [  0.048 s]
[INFO] Apache Hadoop YARN API ............................. SUCCESS [  9.414 s]
[INFO] Apache Hadoop YARN Common .......................... SUCCESS [ 21.098 s]
[INFO] Apache Hadoop YARN Server .......................... SUCCESS [  0.053 s]
[INFO] Apache Hadoop YARN Server Common ................... SUCCESS [  4.650 s]
[INFO] Apache Hadoop YARN NodeManager ..................... SUCCESS [ 14.975 s]
[INFO] Apache Hadoop YARN Web Proxy ....................... SUCCESS [  2.196 s]
[INFO] Apache Hadoop YARN ApplicationHistoryService ....... SUCCESS [  4.730 s]
[INFO] Apache Hadoop YARN ResourceManager ................. SUCCESS [ 15.117 s]
[INFO] Apache Hadoop YARN Server Tests .................... SUCCESS [  1.095 s]
[INFO] Apache Hadoop YARN Client .......................... SUCCESS [  3.677 s]
[INFO] Apache Hadoop YARN SharedCacheManager .............. SUCCESS [  2.377 s]
[INFO] Apache Hadoop YARN Timeline Plugin Storage ......... SUCCESS [  2.135 s]
[INFO] Apache Hadoop YARN Applications .................... SUCCESS [  0.045 s]
[INFO] Apache Hadoop YARN DistributedShell ................ SUCCESS [  1.953 s]
[INFO] Apache Hadoop YARN Unmanaged Am Launcher ........... SUCCESS [  1.366 s]
[INFO] Apache Hadoop YARN Site ............................ SUCCESS [  0.053 s]
[INFO] Apache Hadoop YARN Registry ........................ SUCCESS [  3.407 s]
[INFO] Apache Hadoop YARN Project ......................... SUCCESS [  4.462 s]
[INFO] Apache Hadoop MapReduce Client ..................... SUCCESS [  0.119 s]
[INFO] Apache Hadoop MapReduce Core ....................... SUCCESS [ 14.239 s]
[INFO] Apache Hadoop MapReduce Common ..................... SUCCESS [ 10.339 s]
[INFO] Apache Hadoop MapReduce Shuffle .................... SUCCESS [  2.236 s]
[INFO] Apache Hadoop MapReduce App ........................ SUCCESS [  6.136 s]
[INFO] Apache Hadoop MapReduce HistoryServer .............. SUCCESS [  3.624 s]
[INFO] Apache Hadoop MapReduce JobClient .................. SUCCESS [  4.289 s]
[INFO] Apache Hadoop MapReduce HistoryServer Plugins ...... SUCCESS [  1.404 s]
[INFO] Apache Hadoop MapReduce Examples ................... SUCCESS [  3.541 s]
[INFO] Apache Hadoop MapReduce ............................ SUCCESS [  2.939 s]
[INFO] Apache Hadoop MapReduce Streaming .................. SUCCESS [  2.875 s]
[INFO] Apache Hadoop Distributed Copy ..................... SUCCESS [  3.669 s]
[INFO] Apache Hadoop Archives ............................. SUCCESS [  1.444 s]
[INFO] Apache Hadoop Archive Logs ......................... SUCCESS [  2.097 s]
[INFO] Apache Hadoop Rumen ................................ SUCCESS [  3.473 s]
[INFO] Apache Hadoop Gridmix .............................. SUCCESS [  2.667 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  1.595 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [  1.338 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  1.980 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  8.916 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  2.845 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [  3.588 s]
[INFO] Apache Hadoop Azure support ........................ SUCCESS [  2.859 s]
[INFO] Apache Hadoop Client ............................... SUCCESS [  5.019 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  0.538 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [  3.115 s]
[INFO] Apache Hadoop Azure Data Lake support .............. SUCCESS [  1.993 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [  4.831 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.033 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 27.582 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  07:09 min
[INFO] Finished at: 2019-04-16T12:49:29+08:00
[INFO] ------------------------------------------------------------------------

Pitfalls

1. (Linux)

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-common: An Ant BuildException has occured: Execute failed: java.io.IOException: Cannot run program "cmake" (in directory "/home/hadoop/complier/hadoop-2.8.4/hadoop-common-project/hadoop-common/target/native"): error=2, No such file or directory
[ERROR] around Ant part ...<exec failonerror="true" dir="/home/hadoop/complier/hadoop-2.8.4/hadoop-common-project/hadoop-common/target/native" executable="cmake">... @ 4:138 in /home/hadoop/complier/hadoop-2.8.4/hadoop-common-project/hadoop-common/target/antrun/build-main.xml

Quite a few different errors show up around maven-antrun-plugin:1.7:run, so look closely at the actual cause. The one I hit was cmake-related (usually because cmake is not installed or the version is too old), while others are findbugs-related. A quick check for the cmake case is sketched below.
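
A minimal sketch of what I would check for the cmake case on CentOS 7 (the yum package name is an assumption about your repositories):

# is cmake on the PATH of the user running mvn, and is it recent enough?
command -v cmake && cmake --version
# if not, install it and re-run the build
yum install -y cmake
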

2. (Linux)

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (dist) on project hadoop-kms: An Ant BuildException has occured: java.net.SocketException: Connection reset
[ERROR] around Ant part ...<get skipexisting="true" src="http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.48/bin/apache-tomcat-6.0.48.tar.gz" dest="downloads/apache-tomcat-6.0.48.tar.gz" verbose="true"/>... @ 5:182 in /home/hadoop/complier/hadoop-2.8.4/hadoop-common-project/hadoop-kms/target/antrun/build-main.xml

For this hadoop-kms error, rebuild in a different directory or simply retry a few times; it is caused by the Tomcat download timing out. A pre-download workaround is sketched below.
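
Another option is to fetch the Tomcat tarball yourself before building, so the Ant get task (which runs with skipexisting="true") finds it already in place. This is a sketch; it assumes the relative downloads/ path resolves under the hadoop-kms module directory, which the build-main.xml in the error above suggests.

cd /home/hadoop/complier/hadoop-2.8.4/hadoop-common-project/hadoop-kms
mkdir -p downloads
wget -O downloads/apache-tomcat-6.0.48.tar.gz http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.48/bin/apache-tomcat-6.0.48.tar.gz
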

3. (Mac)

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 2
[ERROR] around Ant part ...<exec failonerror="true" dir="/Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/target/native" executable="make">... @ 8:159 in /Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/target/antrun/build-main.xml

This one took a long time to resolve. The steps below follow a reference solution I found:

Based on the error message, go to the corresponding path and look at the build-main.xml file:

<?xml version="1.0" encoding="UTF-8" ?>
<project name="maven-antrun-" default="main"  >
<target name="main">
  <mkdir dir="/Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/target/native"/>
  <exec failonerror="true" dir="/Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/target/native" executable="cmake">
    <arg line="/Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/src/ -DJVM_ARCH_DATA_MODEL=64"/>
  </exec>
  <exec failonerror="true" dir="/Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/target/native" executable="make">
    <arg line="VERBOSE=1"/>
  </exec>
  <exec failonerror="true" dir="/Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/target/native" executable="make"></exec>
</target>
</project>

Run in a terminal:

cmake /Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/src/ -DJVM_ARCH_DATA_MODEL=64

This may or may not print an error here. Next, configure the environment variables:

vim ~/.bash_profile
# add the following two lines (look up your own openssl version first with: openssl version)
export OPENSSL_ROOT_DIR=/usr/local/Cellar/openssl/1.1.1a
export OPENSSL_INCLUDE_DIR=/usr/local/Cellar/openssl/1.1.1a/include
source ~/.bash_profile

Then re-run cmake:

cmake /Users/lixuewei/workspace/private/complier-hadoop/hadoop-2.8.4-src/hadoop-tools/hadoop-pipes/src/ -DJVM_ARCH_DATA_MODEL=64

Finally, run the build again:

mvn clean package -Pdist,native -DskipTests -Dtar

That still did not solve the problem. It turned out the openssl being picked up locally was version 1.1.1b, installed by Anaconda, while the one I had installed myself was 1.0.2r, so the version the variables above pointed to was wrong. After commenting out the two variables added to the profile and recompiling, the build succeeded.
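
For reference, this is roughly what the final state looked like: the two exports commented out again, then a rebuild (a sketch; run openssl version, or brew --prefix openssl, to see which openssl your machine actually resolves to):

vim ~/.bash_profile
# comment out the two exports that pointed at the wrong openssl version
# export OPENSSL_ROOT_DIR=/usr/local/Cellar/openssl/1.1.1a
# export OPENSSL_INCLUDE_DIR=/usr/local/Cellar/openssl/1.1.1a/include
source ~/.bash_profile
mvn clean package -Pdist,native -DskipTests -Dtar
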

Notes on Problems When Testing a Demo with Hadoop on Windows

Problem 1: java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set. Reference: https://wiki.apache.org/hadoop/WindowsProblems

Pick the matching version from there, then copy its winutils.exe and hadoop.dll into C:\Windows.

If that still does not work, add the following to the demo:

System.setProperty("hadoop.home.dir", "D:\\winutils-master\\winutils-master\\hadoop-2.8.3");