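Having tested installing, configuring and starting Hadoop on Windows 7, I was not quite satisfied, so I decided to try the same on Linux. On Windows I had to go through several blog posts before finding a tutorial that worked, so this time I looked carefully before choosing which one to follow.
After reading several articles online, I settled on this tutorial, which is well written:
http://www.powerxing.com/install-hadoop/
So I followed it step by step.
VMware was already installed, so all that was needed was an Ubuntu 14.04 installation image from the web. The image is not that big, a little over 900 MB. It downloaded within minutes and the installation could begin.
There are plenty of tutorials on installing Ubuntu and the process went smoothly, so it is not described here.
Following the tutorial, I set the user name and password to hadoop during installation. Afterwards, give the hadoop user administrator rights, which makes deployment easier and avoids permission problems that can be tricky for beginners: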
sudo adduser hadoop sudo
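sudo is a Linux system administration command that lets a system administrator allow ordinary users to run some or all commands as root, such as halt, reboot and su. This reduces how often and how long anyone has to log in as root, which also improves security. sudo is not a replacement for the shell; it works on a per-command basis.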
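Whether the group change took effect can be checked from the shell. The helper below is a minimal sketch: the function name in_group is made up for illustration and relies only on the standard id command. A new group membership normally applies only to sessions started after the change, so log out and back in before checking.

```shell
#!/usr/bin/env bash
# in_group USER GROUP: succeed if USER belongs to GROUP.
# (in_group is an illustrative helper name, not a standard command.)
in_group() {
    local user="$1" group="$2"
    # id -nG prints the names of all groups the user belongs to, space separated
    id -nG "$user" | tr ' ' '\n' | grep -qxF -- "$group"
}

# Sanity check that always holds: the current user is in its own primary group.
if in_group "$(id -un)" "$(id -gn)"; then
    echo "$(id -un) is in group $(id -gn)"
fi

# On the machine from this post one would run: in_group hadoop sudo
```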
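In the Ubuntu terminal, the copy and paste shortcuts need an extra Shift, so paste is Ctrl+Shift+V.
After logging in as the hadoop user, first update the apt package lists. Software will be installed with apt later on, and some packages may fail to install without this update. Press Ctrl+Alt+T to open a terminal window and run: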
sudo apt-get update
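The Advanced Package Tool (apt) is a package management tool for Linux. apt-get is its command-line front end on systems that use deb packages; it searches for, installs, upgrades and removes software from online repositories, and can also upgrade the operating system itself.
Installing vim (an enhanced vi with the same basic usage) is recommended. If you are not comfortable with vi/vim, replace vim with gedit in the commands below to edit in a graphical text editor; in that case close the whole gedit program after each edit, otherwise it keeps the terminal occupied: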
sudo apt-get install vim
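Installing SSH and configuring passwordless SSH login
Both cluster and single-node modes use SSH login (similar to remote login: you log in to a Linux host and run commands on it). Ubuntu ships with the SSH client installed by default; the SSH server has to be installed as well: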
sudo apt-get install openssh-server
Once it is installed, log in to the local machine with:
ssh localhost
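On the first login SSH asks whether to continue connecting; type yes. Then enter the password hadoop when prompted, and you are logged in to the local machine.
Logging in this way requires the password every time, so it is more convenient to set up passwordless SSH login.
First exit the SSH session to return to the original terminal, then generate a key pair with ssh-keygen and add the public key to the authorized keys: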
exit # leave the ssh localhost session
cd ~/.ssh/ # if this directory does not exist, run ssh localhost once first
ssh-keygen -t rsa # press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys # authorize the key
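Two details are easy to trip over here: running the cat line twice appends the same key twice, and sshd with its default StrictModes setting ignores authorized_keys when the file or its directory is writable by other users. The sketch below wraps the steps so that they can be repeated safely. The function name authorize_own_key and the scratch directory are assumptions made for this illustration; on a real system the directory would be ~/.ssh.

```shell
#!/usr/bin/env bash
# authorize_own_key DIR: make sure DIR holds an RSA key pair whose public
# half is listed exactly once in DIR/authorized_keys.
# (authorize_own_key is an illustrative name; DIR would be ~/.ssh in practice.)
authorize_own_key() {
    local ssh_dir="$1"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        # -N "" = empty passphrase, -f = key file, -q = quiet (no prompts)
        ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa" || return 1
    fi
    touch "$ssh_dir/authorized_keys"
    # Append the public key only if that exact line is not present yet
    if ! grep -qxF -f "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"; then
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}

# Try it on a scratch directory so the real ~/.ssh is left alone.
scratch="$(mktemp -d)"
if command -v ssh-keygen >/dev/null 2>&1; then
    authorize_own_key "$scratch/.ssh"
    authorize_own_key "$scratch/.ssh"   # second run must not add a duplicate
    wc -l < "$scratch/.ssh/authorized_keys"
fi
```

Calling authorize_own_key ~/.ssh would apply the same steps to the real account.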
The meaning of ~
On Linux, ~ stands for the user's home directory, that is, /home/<user name>. If your user name is hadoop, ~ means /home/hadoop/. Also, text after a # in a command is a comment.
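Both points can be tried directly in a terminal. The variable names below are chosen only for this illustration.

```shell
#!/usr/bin/env bash
# Everything after an unquoted # is a comment and is ignored by the shell.
home_from_tilde=~            # the shell expands an unquoted ~ to $HOME
echo "$home_from_tilde"      # e.g. /home/hadoop for the user hadoop

# Inside quotes the tilde is NOT expanded and stays a literal character.
literal_tilde="~"
echo "$literal_tilde"
```

The second half is worth remembering when a path such as "~/.ssh" is written in quotes inside a script: the shell will then look for a directory literally named ~.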
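The RSA public-key encryption algorithm was proposed in 1977 by Ron Rivest, Adi Shamir and Leonard Adleman, all three of whom were working at the Massachusetts Institute of Technology at the time. The name RSA is formed from the initial letters of their surnames.
Now ssh localhost logs in directly without asking for a password, as shown below.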
hadoop@hadoop-virtual-machine:~$ ssh localhost
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)
* Documentation: https://help.ubuntu.com/
733 packages can be updated.
367 updates are security updates.
Last login: Fri Apr 7 10:06:38 2017 from localhost
hadoop@hadoop-virtual-machine:~$
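Installing the Java environment
For Java you can use either Oracle's JDK or OpenJDK. The page http://wiki.apache.org/hadoop/HadoopJavaVersions says that recent Hadoop versions work fine with OpenJDK 1.7. For convenience, OpenJDK 7 is installed here directly from the command line.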
sudo apt-get install openjdk-7-jre openjdk-7-jdk
After installing OpenJDK, find its installation path, which is needed for the JAVA_HOME environment variable. Run:
dpkg -L openjdk-7-jdk | grep '/bin/javac'
The command prints a path; remove the trailing /bin/javac and what remains is the path we need. For example, if the output is /usr/lib/jvm/java-7-openjdk-amd64/bin/javac, the required path is /usr/lib/jvm/java-7-openjdk-amd64.
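Stripping the suffix does not have to be done by hand. As a small sketch, shell parameter expansion can do it; the sample value is the one quoted above, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Sample value taken from the text above; on a real system it would come from
#   dpkg -L openjdk-7-jdk | grep '/bin/javac'
javac_path="/usr/lib/jvm/java-7-openjdk-amd64/bin/javac"

# ${var%suffix} removes the shortest matching suffix from the value
java_home="${javac_path%/bin/javac}"
echo "export JAVA_HOME=$java_home"
```

The echoed line is exactly what goes into ~/.bashrc in the next step.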
Next, set the JAVA_HOME environment variable. For convenience this is done in ~/.bashrc:
vim ~/.bashrc
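Add the following line on its own at the very top of the file (there must be no spaces around the = sign), replacing JDK_install_path with the path obtained from the command above, then save:
export JAVA_HOME=JDK_install_path
With .bashrc open in vim, press i to enter insert mode. After typing the line above, press Esc and then type :wq to save and quit. These are vim operations; see a vim tutorial for details.
See the figure below (the file may not have existed before, in which case it is empty; that does not matter).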
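The same edit can be made without an editor. A likely reason for putting the line at the top, though the post does not state it, is that Ubuntu's default ~/.bashrc returns early for non-interactive shells, so lines further down would be skipped when Hadoop's scripts run commands over SSH. The sketch below prepends a line only if it is not already present; prepend_line is a hypothetical helper, and it is demonstrated on a scratch file rather than the real ~/.bashrc.

```shell
#!/usr/bin/env bash
# prepend_line FILE LINE: put LINE at the top of FILE unless it is already there.
# (prepend_line is an illustrative helper, not a standard command.)
prepend_line() {
    local file="$1" line="$2" tmp
    touch "$file"                                # create the file if missing
    grep -qxF -- "$line" "$file" && return 0     # already present: nothing to do
    tmp="$(mktemp)" || return 1
    { printf '%s\n' "$line"; cat "$file"; } > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch file instead of the real ~/.bashrc
demo_rc="$(mktemp)"
printf '# existing settings\n' > "$demo_rc"
prepend_line "$demo_rc" 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64'
prepend_line "$demo_rc" 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64'
cat "$demo_rc"
```

Writing back with cat rather than mv keeps the original file's permissions intact.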
Then make the variable take effect by running:
source ~/.bashrc # apply the new setting
Once that is done, check that the setting is correct:
echo $JAVA_HOME # print the variable's value
java -version
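$JAVA_HOME/bin/java -version # same as running java -version directly
If the setting is correct, $JAVA_HOME/bin/java -version prints the Java version information, identical to the output of java -version, as shown in the figure below:
With that, the Java runtime environment required by Hadoop is installed.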
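Installing Hadoop 2
I chose version 2.7.3, which counts as a stable release.
On Ubuntu the archive can be downloaded directly in the browser; the file is saved to the following path (下载 is the Downloads folder on a Chinese-language desktop):
/home/hadoop/下载
On Linux, / is the root directory, the topmost directory; every other path lies beneath it.
As noted earlier, ~ stands for the user's home directory, here /home/hadoop (the user name on this machine is hadoop).
Paths that start with ./, such as ./bin/… and ./etc/…, are relative paths: ./ refers to the current working directory, so when the current directory is /home/hadoop/, ./ means /home/hadoop/.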
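Because ./ depends on the current directory, the same relative path can point to different places. The short experiment below illustrates this with a scratch directory; the file name note.txt is invented for the demonstration.

```shell
#!/usr/bin/env bash
work_dir="$(mktemp -d)"          # scratch directory standing in for /usr/local/hadoop
echo "hello" > "$work_dir/note.txt"

cd "$work_dir"
inside="$(cat ./note.txt)"       # works: ./ is the scratch directory right now
echo "read from scratch dir: $inside"

cd /
# The same text ./note.txt now refers to /note.txt, which should not exist
if [ -e ./note.txt ]; then found_from_root=yes; else found_from_root=no; fi
echo "found from /: $found_from_root"
```

This is why the Hadoop commands later in the post start with cd /usr/local/hadoop before using ./bin and ./input.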
Then extract it with the tar command:
sudo tar -zxf ~/下载/hadoop-2.7.3.tar.gz -C /usr/local
This extracts Hadoop into /usr/local. List /usr/local with ls to check, as shown in the figure below:
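The archive unpacks into a directory named after the release (hadoop-2.7.3), while the commands that follow use /usr/local/hadoop, so a rename is needed in between. The post does not show that step. The sketch below rehearses it in a scratch directory with a tiny stand-in archive; the helper name unpack_as is made up, and the sudo commands in the final comment are an assumption about what was run on the real system.

```shell
#!/usr/bin/env bash
# unpack_as ARCHIVE PARENT OLD NEW: extract ARCHIVE under PARENT, then rename
# the unpacked directory OLD to NEW. (Illustrative helper, not part of Hadoop.)
unpack_as() {
    local archive="$1" parent="$2" old="$3" new="$4"
    tar -zxf "$archive" -C "$parent" || return 1
    mv "$parent/$old" "$parent/$new"
}

# Rehearsal in a scratch directory with a tiny stand-in archive
scratch="$(mktemp -d)"
mkdir -p "$scratch/src/hadoop-2.7.3/bin" "$scratch/local"
tar -zcf "$scratch/hadoop-2.7.3.tar.gz" -C "$scratch/src" hadoop-2.7.3
unpack_as "$scratch/hadoop-2.7.3.tar.gz" "$scratch/local" hadoop-2.7.3 hadoop
ls "$scratch/local"

# Assumed real-system equivalent (needs root because /usr/local is root-owned):
#   sudo mv /usr/local/hadoop-2.7.3 /usr/local/hadoop
#   sudo chown -R hadoop /usr/local/hadoop   # let user hadoop write ./input and ./output
```

Without the ownership change, the mkdir ./input step in the examples would fail for the hadoop user, because the directory was extracted with sudo.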
Once the extracted directory (hadoop-2.7.3) has been renamed to /usr/local/hadoop, check that Hadoop works with:
cd /usr/local/hadoop
./bin/hadoop version
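Hadoop runs in non-distributed mode by default, with no further configuration needed. Non-distributed means a single Java process, which is convenient for debugging.
Now we can run some examples to get a feel for Hadoop.
The grep example: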
cd /usr/local/hadoop
mkdir ./input
cp ./etc/hadoop/*.xml ./input # use the configuration files as input
./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep ./input ./output 'dfs[a-z.]+'
cat ./output/* # show the result
17/04/07 13:49:51 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
17/04/07 13:49:51 INFO input.FileInputFormat: Total input paths to process : 8
17/04/07 13:49:51 INFO mapreduce.JobSubmitter: number of splits:8
17/04/07 13:49:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1632798238_0001
17/04/07 13:49:52 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/07 13:49:52 INFO mapreduce.Job: Running job: job_local1632798238_0001
17/04/07 13:49:52 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/07 13:49:52 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:52 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/04/07 13:49:52 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/07 13:49:52 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000000_0
17/04/07 13:49:52 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:52 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:52 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/hadoop-policy.xml:0+9683
17/04/07 13:49:53 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:53 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:53 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:53 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:53 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:53 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:53 INFO mapreduce.Job: Job job_local1632798238_0001 running in uber mode : false
17/04/07 13:49:53 INFO mapreduce.Job: map 0% reduce 0%
17/04/07 13:49:53 INFO mapred.LocalJobRunner:
17/04/07 13:49:53 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:53 INFO mapred.MapTask: Spilling map output
17/04/07 13:49:53 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600
17/04/07 13:49:53 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
17/04/07 13:49:53 INFO mapred.MapTask: Finished spill 0
17/04/07 13:49:53 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000000_0 is done. And is in the process of committing
17/04/07 13:49:53 INFO mapred.LocalJobRunner: map
17/04/07 13:49:53 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000000_0' done.
17/04/07 13:49:53 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000000_0
17/04/07 13:49:53 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000001_0
17/04/07 13:49:53 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:53 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:53 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/kms-site.xml:0+5511
17/04/07 13:49:54 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:54 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:54 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:54 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:54 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:54 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:54 INFO mapred.LocalJobRunner:
17/04/07 13:49:54 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:54 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000001_0 is done. And is in the process of committing
17/04/07 13:49:54 INFO mapred.LocalJobRunner: map
17/04/07 13:49:54 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000001_0' done.
17/04/07 13:49:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000001_0
17/04/07 13:49:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000002_0
17/04/07 13:49:54 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:54 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:54 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/capacity-scheduler.xml:0+4436
17/04/07 13:49:54 INFO mapreduce.Job: map 100% reduce 0%
17/04/07 13:49:54 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:54 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:54 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:54 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:54 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:54 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:54 INFO mapred.LocalJobRunner:
17/04/07 13:49:54 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:54 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000002_0 is done. And is in the process of committing
17/04/07 13:49:54 INFO mapred.LocalJobRunner: map
17/04/07 13:49:54 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000002_0' done.
17/04/07 13:49:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000002_0
17/04/07 13:49:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000003_0
17/04/07 13:49:54 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:54 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:54 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/kms-acls.xml:0+3518
17/04/07 13:49:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:55 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:55 INFO mapred.LocalJobRunner:
17/04/07 13:49:55 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:55 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000003_0 is done. And is in the process of committing
17/04/07 13:49:55 INFO mapred.LocalJobRunner: map
17/04/07 13:49:55 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000003_0' done.
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000003_0
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000004_0
17/04/07 13:49:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:55 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/hdfs-site.xml:0+775
17/04/07 13:49:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:55 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:55 INFO mapred.LocalJobRunner:
17/04/07 13:49:55 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:55 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000004_0 is done. And is in the process of committing
17/04/07 13:49:55 INFO mapred.LocalJobRunner: map
17/04/07 13:49:55 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000004_0' done.
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000004_0
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000005_0
17/04/07 13:49:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:55 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/core-site.xml:0+774
17/04/07 13:49:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:55 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:55 INFO mapred.LocalJobRunner:
17/04/07 13:49:55 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:55 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000005_0 is done. And is in the process of committing
17/04/07 13:49:55 INFO mapred.LocalJobRunner: map
17/04/07 13:49:55 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000005_0' done.
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000005_0
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000006_0
17/04/07 13:49:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:55 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/yarn-site.xml:0+690
17/04/07 13:49:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:55 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:55 INFO mapred.LocalJobRunner:
17/04/07 13:49:55 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:55 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000006_0 is done. And is in the process of committing
17/04/07 13:49:55 INFO mapred.LocalJobRunner: map
17/04/07 13:49:55 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000006_0' done.
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000006_0
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_m_000007_0
17/04/07 13:49:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:55 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/httpfs-site.xml:0+620
17/04/07 13:49:55 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:55 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:55 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:55 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:55 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:55 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:55 INFO mapred.LocalJobRunner:
17/04/07 13:49:55 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:55 INFO mapred.Task: Task:attempt_local1632798238_0001_m_000007_0 is done. And is in the process of committing
17/04/07 13:49:55 INFO mapred.LocalJobRunner: map
17/04/07 13:49:55 INFO mapred.Task: Task 'attempt_local1632798238_0001_m_000007_0' done.
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_m_000007_0
17/04/07 13:49:55 INFO mapred.LocalJobRunner: map task executor complete.
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/04/07 13:49:55 INFO mapred.LocalJobRunner: Starting task: attempt_local1632798238_0001_r_000000_0
17/04/07 13:49:55 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:55 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@37a1beff
17/04/07 13:49:55 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=333971456, maxSingleShuffleLimit=83492864, mergeThreshold=220421168, ioSortFactor=10, memToMemMergeOutputsThreshold=10
17/04/07 13:49:56 INFO reduce.EventFetcher: attempt_local1632798238_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000005_0 decomp: 2 len: 6 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1632798238_0001_m_000005_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->2
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000003_0 decomp: 2 len: 6 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1632798238_0001_m_000003_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 2, commitMemory -> 2, usedMemory ->4
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000002_0 decomp: 2 len: 6 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1632798238_0001_m_000002_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 3, commitMemory -> 4, usedMemory ->6
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000006_0 decomp: 2 len: 6 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1632798238_0001_m_000006_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 4, commitMemory -> 6, usedMemory ->8
17/04/07 13:49:56 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17/04/07 13:49:56 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000001_0 decomp: 2 len: 6 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1632798238_0001_m_000001_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 5, commitMemory -> 8, usedMemory ->10
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000007_0 decomp: 2 len: 6 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1632798238_0001_m_000007_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 6, commitMemory -> 10, usedMemory ->12
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000004_0 decomp: 2 len: 6 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 2 bytes from map-output for attempt_local1632798238_0001_m_000004_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2, inMemoryMapOutputs.size() -> 7, commitMemory -> 12, usedMemory ->14
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1632798238_0001_m_000000_0 decomp: 21 len: 25 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local1632798238_0001_m_000000_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 8, commitMemory -> 14, usedMemory ->35
17/04/07 13:49:56 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
17/04/07 13:49:56 INFO mapred.LocalJobRunner: 8 / 8 copied.
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: finalMerge called with 8 in-memory map-outputs and 0 on-disk map-outputs
17/04/07 13:49:56 INFO mapred.Merger: Merging 8 sorted segments
17/04/07 13:49:56 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: Merged 8 segments, 35 bytes to disk to satisfy reduce memory limit
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
17/04/07 13:49:56 INFO mapred.Merger: Merging 1 sorted segments
17/04/07 13:49:56 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 10 bytes
17/04/07 13:49:56 INFO mapred.LocalJobRunner: 8 / 8 copied.
17/04/07 13:49:56 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
17/04/07 13:49:56 INFO mapred.Task: Task:attempt_local1632798238_0001_r_000000_0 is done. And is in the process of committing
17/04/07 13:49:56 INFO mapred.LocalJobRunner: 8 / 8 copied.
17/04/07 13:49:56 INFO mapred.Task: Task attempt_local1632798238_0001_r_000000_0 is allowed to commit now
17/04/07 13:49:56 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1632798238_0001_r_000000_0' to file:/usr/local/hadoop/grep-temp-2146686112/_temporary/0/task_local1632798238_0001_r_000000
17/04/07 13:49:56 INFO mapred.LocalJobRunner: reduce > reduce
17/04/07 13:49:56 INFO mapred.Task: Task 'attempt_local1632798238_0001_r_000000_0' done.
17/04/07 13:49:56 INFO mapred.LocalJobRunner: Finishing task: attempt_local1632798238_0001_r_000000_0
17/04/07 13:49:56 INFO mapred.LocalJobRunner: reduce task executor complete.
17/04/07 13:49:56 INFO mapreduce.Job: map 100% reduce 100%
17/04/07 13:49:56 INFO mapreduce.Job: Job job_local1632798238_0001 completed successfully
17/04/07 13:49:56 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=2892349
FILE: Number of bytes written=4635812
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=745
Map output records=1
Map output bytes=17
Map output materialized bytes=67
Input split bytes=869
Combine input records=1
Combine output records=1
Reduce input groups=1
Reduce shuffle bytes=67
Reduce input records=1
Reduce output records=1
Spilled Records=2
Shuffled Maps =8
Failed Shuffles=0
Merged Map outputs=8
GC time elapsed (ms)=333
Total committed heap usage (bytes)=2652372992
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=26007
File Output Format Counters
Bytes Written=123
17/04/07 13:49:56 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
17/04/07 13:49:56 INFO input.FileInputFormat: Total input paths to process : 1
17/04/07 13:49:56 INFO mapreduce.JobSubmitter: number of splits:1
17/04/07 13:49:56 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1300395988_0002
17/04/07 13:49:56 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/07 13:49:56 INFO mapreduce.Job: Running job: job_local1300395988_0002
17/04/07 13:49:56 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/07 13:49:56 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:56 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/04/07 13:49:56 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/07 13:49:56 INFO mapred.LocalJobRunner: Starting task: attempt_local1300395988_0002_m_000000_0
17/04/07 13:49:56 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:56 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:56 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/grep-temp-2146686112/part-r-00000:0+111
17/04/07 13:49:56 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:49:56 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:49:56 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:49:56 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:49:56 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:49:56 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:49:56 INFO mapred.LocalJobRunner:
17/04/07 13:49:56 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:49:56 INFO mapred.MapTask: Spilling map output
17/04/07 13:49:56 INFO mapred.MapTask: bufstart = 0; bufend = 17; bufvoid = 104857600
17/04/07 13:49:56 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
17/04/07 13:49:56 INFO mapred.MapTask: Finished spill 0
17/04/07 13:49:56 INFO mapred.Task: Task:attempt_local1300395988_0002_m_000000_0 is done. And is in the process of committing
17/04/07 13:49:56 INFO mapred.LocalJobRunner: map
17/04/07 13:49:56 INFO mapred.Task: Task 'attempt_local1300395988_0002_m_000000_0' done.
17/04/07 13:49:56 INFO mapred.LocalJobRunner: Finishing task: attempt_local1300395988_0002_m_000000_0
17/04/07 13:49:56 INFO mapred.LocalJobRunner: map task executor complete.
17/04/07 13:49:56 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/04/07 13:49:56 INFO mapred.LocalJobRunner: Starting task: attempt_local1300395988_0002_r_000000_0
17/04/07 13:49:56 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:49:56 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:49:56 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@3e2c75a9
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=333971456, maxSingleShuffleLimit=83492864, mergeThreshold=220421168, ioSortFactor=10, memToMemMergeOutputsThreshold=10
17/04/07 13:49:56 INFO reduce.EventFetcher: attempt_local1300395988_0002_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
17/04/07 13:49:56 INFO reduce.LocalFetcher: localfetcher#2 about to shuffle output of map attempt_local1300395988_0002_m_000000_0 decomp: 21 len: 25 to MEMORY
17/04/07 13:49:56 INFO reduce.InMemoryMapOutput: Read 21 bytes from map-output for attempt_local1300395988_0002_m_000000_0
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 21, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->21
17/04/07 13:49:56 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
17/04/07 13:49:56 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
17/04/07 13:49:56 INFO mapred.Merger: Merging 1 sorted segments
17/04/07 13:49:56 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: Merged 1 segments, 21 bytes to disk to satisfy reduce memory limit
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: Merging 1 files, 25 bytes from disk
17/04/07 13:49:56 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
17/04/07 13:49:56 INFO mapred.Merger: Merging 1 sorted segments
17/04/07 13:49:56 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 11 bytes
17/04/07 13:49:56 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/07 13:49:56 INFO mapred.Task: Task:attempt_local1300395988_0002_r_000000_0 is done. And is in the process of committing
17/04/07 13:49:56 INFO mapred.LocalJobRunner: 1 / 1 copied.
17/04/07 13:49:56 INFO mapred.Task: Task attempt_local1300395988_0002_r_000000_0 is allowed to commit now
17/04/07 13:49:56 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1300395988_0002_r_000000_0' to file:/usr/local/hadoop/output/_temporary/0/task_local1300395988_0002_r_000000
17/04/07 13:49:56 INFO mapred.LocalJobRunner: reduce > reduce
17/04/07 13:49:56 INFO mapred.Task: Task 'attempt_local1300395988_0002_r_000000_0' done.
17/04/07 13:49:56 INFO mapred.LocalJobRunner: Finishing task: attempt_local1300395988_0002_r_000000_0
17/04/07 13:49:56 INFO mapred.LocalJobRunner: reduce task executor complete.
17/04/07 13:49:57 INFO mapreduce.Job: Job job_local1300395988_0002 running in uber mode : false
17/04/07 13:49:57 INFO mapreduce.Job: map 100% reduce 100%
17/04/07 13:49:57 INFO mapreduce.Job: Job job_local1300395988_0002 completed successfully
17/04/07 13:49:57 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=1248142
FILE: Number of bytes written=2056416
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=1
Map output records=1
Map output bytes=17
Map output materialized bytes=25
Input split bytes=121
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=25
Reduce input records=1
Reduce output records=1
Spilled Records=2
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=0
Total committed heap usage (bytes)=454033408
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=123
File Output Format Counters
Bytes Written=23
hadoop@hadoop-virtual-machine:/usr/local/hadoop$
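The grep example extracts every match of the regular expression dfs[a-z.]+ from the input files and counts how often each match occurs. To get a feel for that result without Hadoop, the same counting can be approximated locally with standard tools. This is only an analogy for what the job computes, not how Hadoop implements it, and the sample text is invented for the illustration.

```shell
#!/usr/bin/env bash
# Sample text invented for this illustration
sample='run dfsadmin -report
see the dfs.replication property
use dfsadmin again'

# -o prints each match on its own line, -E enables extended regular expressions
matches="$(printf '%s\n' "$sample" | grep -oE 'dfs[a-z.]+')"

# Count identical matches and list the most frequent first
printf '%s\n' "$matches" | sort | uniq -c | sort -rn
```

Each output line is a count followed by the matched string, the same shape as the result file the Hadoop job writes.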
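Viewing the result with cat gives the output shown in the figure:
Note that Hadoop does not overwrite result files by default, so running the example again produces an error; delete ./output first: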
rm -r ./output
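When an example is run repeatedly, it can be convenient to fold the cleanup into the command itself. The wrapper below is a sketch under that assumption; run_with_fresh_output is a hypothetical name, and plain mkdir stands in for the Hadoop job because it also fails when its target directory already exists.

```shell
#!/usr/bin/env bash
# run_with_fresh_output DIR CMD...: delete DIR if present, then run CMD.
# (Illustrative helper; DIR would be ./output in the examples above.)
run_with_fresh_output() {
    local out_dir="$1"
    shift
    rm -rf -- "$out_dir"
    "$@"
}

# Stand-in "job" that, like Hadoop, refuses to reuse an existing directory:
# plain mkdir fails when the target already exists.
scratch="$(mktemp -d)"
run_with_fresh_output "$scratch/output" mkdir "$scratch/output"
run_with_fresh_output "$scratch/output" mkdir "$scratch/output" && echo "second run ok"

# With Hadoop the call would look like:
#   run_with_fresh_output ./output ./bin/hadoop jar \
#       ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep ./input ./output 'dfs[a-z.]+'
```

Since the wrapper deletes the directory without asking, it should only be pointed at output directories that are safe to lose.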
The wordcount example:
$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount ./input ./output
The terminal session, starting with the cat of the earlier grep result, is as follows:
hadoop@hadoop-virtual-machine:/usr/local/hadoop$ cat ./output/*
1 dfsadmin
hadoop@hadoop-virtual-machine:/usr/local/hadoop$ rm -r ./output
hadoop@hadoop-virtual-machine:/usr/local/hadoop$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount ./input ./output
17/04/07 13:59:18 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
17/04/07 13:59:19 INFO input.FileInputFormat: Total input paths to process : 8
17/04/07 13:59:19 INFO mapreduce.JobSubmitter: number of splits:8
17/04/07 13:59:19 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1289677090_0001
17/04/07 13:59:19 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
17/04/07 13:59:19 INFO mapreduce.Job: Running job: job_local1289677090_0001
17/04/07 13:59:19 INFO mapred.LocalJobRunner: OutputCommitter set in config null
17/04/07 13:59:19 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:19 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/04/07 13:59:19 INFO mapred.LocalJobRunner: Waiting for map tasks
17/04/07 13:59:19 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000000_0
17/04/07 13:59:19 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:19 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:20 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/hadoop-policy.xml:0+9683
17/04/07 13:59:20 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:20 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:20 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:20 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:20 INFO mapred.LocalJobRunner:
17/04/07 13:59:20 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:20 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufend = 13520; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26209876(104839504); length = 4521/6553600
17/04/07 13:59:20 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:20 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000000_0 is done. And is in the process of committing
17/04/07 13:59:20 INFO mapred.LocalJobRunner: map
17/04/07 13:59:20 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000000_0' done.
17/04/07 13:59:20 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000000_0
17/04/07 13:59:20 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000001_0
17/04/07 13:59:20 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:20 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:20 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/kms-site.xml:0+5511
17/04/07 13:59:20 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:20 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:20 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:20 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:20 INFO mapred.LocalJobRunner:
17/04/07 13:59:20 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:20 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufend = 6892; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26212452(104849808); length = 1945/6553600
17/04/07 13:59:20 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:20 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000001_0 is done. And is in the process of committing
17/04/07 13:59:20 INFO mapred.LocalJobRunner: map
17/04/07 13:59:20 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000001_0' done.
17/04/07 13:59:20 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000001_0
17/04/07 13:59:20 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000002_0
17/04/07 13:59:20 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:20 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:20 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/capacity-scheduler.xml:0+4436
17/04/07 13:59:20 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:20 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:20 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:20 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:20 INFO mapred.LocalJobRunner:
17/04/07 13:59:20 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:20 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufend = 5653; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26212716(104850864); length = 1681/6553600
17/04/07 13:59:20 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:20 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000002_0 is done. And is in the process of committing
17/04/07 13:59:20 INFO mapred.LocalJobRunner: map
17/04/07 13:59:20 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000002_0' done.
17/04/07 13:59:20 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000002_0
17/04/07 13:59:20 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000003_0
17/04/07 13:59:20 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:20 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:20 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/kms-acls.xml:0+3518
17/04/07 13:59:20 INFO mapreduce.Job: Job job_local1289677090_0001 running in uber mode : false
17/04/07 13:59:20 INFO mapreduce.Job: map 100% reduce 0%
17/04/07 13:59:20 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:20 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:20 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:20 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:20 INFO mapred.LocalJobRunner:
17/04/07 13:59:20 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:20 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:20 INFO mapred.MapTask: bufstart = 0; bufend = 4413; bufvoid = 104857600
17/04/07 13:59:20 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213072(104852288); length = 1325/6553600
17/04/07 13:59:20 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:20 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000003_0 is done. And is in the process of committing
17/04/07 13:59:21 INFO mapred.LocalJobRunner: map
17/04/07 13:59:21 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000003_0' done.
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000003_0
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000004_0
17/04/07 13:59:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:21 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:21 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/hdfs-site.xml:0+775
17/04/07 13:59:21 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:21 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:21 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:21 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:21 INFO mapred.LocalJobRunner:
17/04/07 13:59:21 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:21 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufend = 1154; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213996(104855984); length = 401/6553600
17/04/07 13:59:21 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:21 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000004_0 is done. And is in the process of committing
17/04/07 13:59:21 INFO mapred.LocalJobRunner: map
17/04/07 13:59:21 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000004_0' done.
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000004_0
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000005_0
17/04/07 13:59:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:21 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:21 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/core-site.xml:0+774
17/04/07 13:59:21 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:21 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:21 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:21 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:21 INFO mapred.LocalJobRunner:
17/04/07 13:59:21 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:21 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufend = 1154; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213996(104855984); length = 401/6553600
17/04/07 13:59:21 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:21 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000005_0 is done. And is in the process of committing
17/04/07 13:59:21 INFO mapred.LocalJobRunner: map
17/04/07 13:59:21 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000005_0' done.
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000005_0
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000006_0
17/04/07 13:59:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:21 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:21 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/yarn-site.xml:0+690
17/04/07 13:59:21 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:21 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:21 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:21 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:21 INFO mapred.LocalJobRunner:
17/04/07 13:59:21 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:21 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufend = 1046; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214020(104856080); length = 377/6553600
17/04/07 13:59:21 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:21 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000006_0 is done. And is in the process of committing
17/04/07 13:59:21 INFO mapred.LocalJobRunner: map
17/04/07 13:59:21 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000006_0' done.
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000006_0
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_m_000007_0
17/04/07 13:59:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:21 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:21 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/input/httpfs-site.xml:0+620
17/04/07 13:59:21 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
17/04/07 13:59:21 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
17/04/07 13:59:21 INFO mapred.MapTask: soft limit at 83886080
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
17/04/07 13:59:21 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
17/04/07 13:59:21 INFO mapred.LocalJobRunner:
17/04/07 13:59:21 INFO mapred.MapTask: Starting flush of map output
17/04/07 13:59:21 INFO mapred.MapTask: Spilling map output
17/04/07 13:59:21 INFO mapred.MapTask: bufstart = 0; bufend = 939; bufvoid = 104857600
17/04/07 13:59:21 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214060(104856240); length = 337/6553600
17/04/07 13:59:21 INFO mapred.MapTask: Finished spill 0
17/04/07 13:59:21 INFO mapred.Task: Task:attempt_local1289677090_0001_m_000007_0 is done. And is in the process of committing
17/04/07 13:59:21 INFO mapred.LocalJobRunner: map
17/04/07 13:59:21 INFO mapred.Task: Task 'attempt_local1289677090_0001_m_000007_0' done.
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_m_000007_0
17/04/07 13:59:21 INFO mapred.LocalJobRunner: map task executor complete.
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Waiting for reduce tasks
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Starting task: attempt_local1289677090_0001_r_000000_0
17/04/07 13:59:21 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1
17/04/07 13:59:21 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
17/04/07 13:59:21 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@c11b00d
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=333971456, maxSingleShuffleLimit=83492864, mergeThreshold=220421168, ioSortFactor=10, memToMemMergeOutputsThreshold=10
17/04/07 13:59:21 INFO reduce.EventFetcher: attempt_local1289677090_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000002_0 decomp: 3979 len: 3983 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 3979 bytes from map-output for attempt_local1289677090_0001_m_000002_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 3979, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->3979
17/04/07 13:59:21 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000005_0 decomp: 1122 len: 1126 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 1122 bytes from map-output for attempt_local1289677090_0001_m_000005_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1122, inMemoryMapOutputs.size() -> 2, commitMemory -> 3979, usedMemory ->5101
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000003_0 decomp: 2376 len: 2380 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 2376 bytes from map-output for attempt_local1289677090_0001_m_000003_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2376, inMemoryMapOutputs.size() -> 3, commitMemory -> 5101, usedMemory ->7477
17/04/07 13:59:21 WARN io.ReadaheadPool: Failed readahead on ifile
EBADF: Bad file descriptor
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:267)
at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:146)
at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000000_0 decomp: 4637 len: 4641 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 4637 bytes from map-output for attempt_local1289677090_0001_m_000000_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 4637, inMemoryMapOutputs.size() -> 4, commitMemory -> 7477, usedMemory ->12114
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000007_0 decomp: 938 len: 942 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 938 bytes from map-output for attempt_local1289677090_0001_m_000007_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 938, inMemoryMapOutputs.size() -> 5, commitMemory -> 12114, usedMemory ->13052
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000006_0 decomp: 1019 len: 1023 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 1019 bytes from map-output for attempt_local1289677090_0001_m_000006_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1019, inMemoryMapOutputs.size() -> 6, commitMemory -> 13052, usedMemory ->14071
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000001_0 decomp: 4795 len: 4799 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 4795 bytes from map-output for attempt_local1289677090_0001_m_000001_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 4795, inMemoryMapOutputs.size() -> 7, commitMemory -> 14071, usedMemory ->18866
17/04/07 13:59:21 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1289677090_0001_m_000004_0 decomp: 1122 len: 1126 to MEMORY
17/04/07 13:59:21 INFO reduce.InMemoryMapOutput: Read 1122 bytes from map-output for attempt_local1289677090_0001_m_000004_0
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1122, inMemoryMapOutputs.size() -> 8, commitMemory -> 18866, usedMemory ->19988
17/04/07 13:59:21 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
17/04/07 13:59:21 INFO mapred.LocalJobRunner: 8 / 8 copied.
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: finalMerge called with 8 in-memory map-outputs and 0 on-disk map-outputs
17/04/07 13:59:21 INFO mapred.Merger: Merging 8 sorted segments
17/04/07 13:59:21 INFO mapred.Merger: Down to the last merge-pass, with 8 segments left of total size: 19940 bytes
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: Merged 8 segments, 19988 bytes to disk to satisfy reduce memory limit
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: Merging 1 files, 19978 bytes from disk
17/04/07 13:59:21 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
17/04/07 13:59:21 INFO mapred.Merger: Merging 1 sorted segments
17/04/07 13:59:21 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 19968 bytes
17/04/07 13:59:21 INFO mapred.LocalJobRunner: 8 / 8 copied.
17/04/07 13:59:21 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
17/04/07 13:59:21 INFO mapred.Task: Task:attempt_local1289677090_0001_r_000000_0 is done. And is in the process of committing
17/04/07 13:59:21 INFO mapred.LocalJobRunner: 8 / 8 copied.
17/04/07 13:59:21 INFO mapred.Task: Task attempt_local1289677090_0001_r_000000_0 is allowed to commit now
17/04/07 13:59:21 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1289677090_0001_r_000000_0' to file:/usr/local/hadoop/output/_temporary/0/task_local1289677090_0001_r_000000
17/04/07 13:59:21 INFO mapred.LocalJobRunner: reduce > reduce
17/04/07 13:59:21 INFO mapred.Task: Task 'attempt_local1289677090_0001_r_000000_0' done.
17/04/07 13:59:21 INFO mapred.LocalJobRunner: Finishing task: attempt_local1289677090_0001_r_000000_0
17/04/07 13:59:21 INFO mapred.LocalJobRunner: reduce task executor complete.
17/04/07 13:59:21 INFO mapreduce.Job: map 100% reduce 100%
17/04/07 13:59:21 INFO mapreduce.Job: Job job_local1289677090_0001 completed successfully
17/04/07 13:59:21 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=2932255
FILE: Number of bytes written=4795020
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=745
Map output records=2753
Map output bytes=34771
Map output materialized bytes=20020
Input split bytes=869
Combine input records=2753
Combine output records=1160
Reduce input groups=588
Reduce shuffle bytes=20020
Reduce input records=1160
Reduce output records=588
Spilled Records=2320
Shuffled Maps =8
Failed Shuffles=0
Merged Map outputs=8
GC time elapsed (ms)=235
Total committed heap usage (bytes)=2654994432
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=26007
File Output Format Counters
Bytes Written=10072
hadoop@hadoop-virtual-machine:/usr/local/hadoop$
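The counters above summarize the job: the map tasks emitted 2753 records, the combiner reduced them to 1160, and the reducer wrote 588 distinct words. As a rough cross-check, a similar count can be sketched with standard shell tools. This assumes the example splits its input on whitespace; the function name local_wordcount is hypothetical, and the result may not match the Hadoop output exactly.

```shell
# Local sketch of a word count (assumption: tokens are separated by whitespace).
# Prints "word<TAB>count" lines sorted by word, similar in shape to the job output.
local_wordcount() {
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | LC_ALL=C uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Example use against the same input directory as the Hadoop job:
#   local_wordcount ./input/*.xml | wc -l
```

If the whitespace assumption holds, the line count printed by the example use should be close to the Reduce output records counter.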
Use cat to view the result:
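cat is short for concatenate. It is a tool for viewing and joining text files, and has three main uses:
1. Display an entire file at once: $ cat filename
2. Create a file from keyboard input: $ cat > filename
This creates a new file (or replaces the content of an existing one); it cannot edit existing content.
3. Merge several files into one: $ cat file1 file2 > file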
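The three uses can be tried safely in a scratch directory. This is a small sketch: the file names are made up for illustration, and a pipe from printf stands in for keyboard input.

```shell
# Scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)

# 1. Display a whole file.
printf 'first\n' > "$demo_dir/file1"
cat "$demo_dir/file1"

# 2. Create a file from standard input (printf stands in for the keyboard).
printf 'second\n' | cat > "$demo_dir/file2"

# 3. Concatenate several files into one.
cat "$demo_dir/file1" "$demo_dir/file2" > "$demo_dir/merged"

# Redirecting with > replaces an existing file; >> appends to it instead.
printf 'third\n' | cat >> "$demo_dir/merged"
cat "$demo_dir/merged"
```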
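grep (the name comes from "globally search a regular expression and print") is a powerful text-search tool. It searches text for a given pattern, which may be a regular expression, and by default prints the lines that match.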
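A short sketch of typical grep usage follows. The sample text and the pattern dfs[a-z.]+ are chosen only for illustration; the -o option (print only the matched part) is a common extension available in GNU and BSD grep.

```shell
# Sample file with made-up content for the demo.
sample=$(mktemp)
printf 'dfsadmin tool\nyarn site\ndfs.replication\n' > "$sample"

# Default behaviour: print every line that contains a match.
grep 'dfs' "$sample"

# -c counts matching lines instead of printing them.
grep -c 'dfs' "$sample"

# -E enables extended regular expressions; -o prints only the matched text.
grep -oE 'dfs[a-z.]+' "$sample"
```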