Prerequisites: a virtual machine with Linux installed, with the host network and the VM network able to reach each other.
Also, a JDK installed on Linux.
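A quick sanity check before starting; this assumes the JDK's bin directory is already on your PATH (if not, the PATH export in the next step takes care of it):

java -version     # should print the installed JDK version, e.g. java version "1.7.0_55"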
1: On Linux, run vi /etc/profile and add JAVA_HOME and HADOOP_HOME:
export JAVA_HOME=/home/hadoop/export/jdk
export HADOOP_HOME=/home/hadoop/export/hadoop
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
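The exports only take effect in new shells; to apply them to the current shell and verify (using the paths set above):

source /etc/profile
echo $JAVA_HOME $HADOOP_HOME     # should echo the two directories just configured
hadoop version                   # should report Hadoop 1.2.1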
2: Edit line 9 of hadoop-env.sh under hadoop/conf:
export JAVA_HOME=/home/hadoop/export/jdk
3: Edit core-site.xml under hadoop/conf; hadoop.tmp.dir is the base directory for HDFS's local storage, and fs.default.name is the NameNode address:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/.../tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://127.0.0.1:9000</value>
  </property>
</configuration>
4: Edit hdfs-site.xml under hadoop/conf; replication is set to 1 because this is a single-node setup:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
5: Edit mapred-site.xml under hadoop/conf to point MapReduce at the JobTracker:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>127.0.0.1:9001</value>
  </property>
</configuration>
The configuration changes are complete.
Go to hadoop/bin and run hadoop namenode -format.
Output like the following appears (the "successfully formatted" line indicates success):
Warning: $HADOOP_HOME is deprecated.
14/07/15 16:06:27 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.7.0_55
************************************************************/
14/07/15 16:07:09 INFO util.GSet: Computing capacity for map BlocksMap
14/07/15 16:07:09 INFO util.GSet: VM type = 32-bit
14/07/15 16:07:09 INFO util.GSet: 2.0% max memory = 1013645312
14/07/15 16:07:09 INFO util.GSet: capacity = 2^22 = 4194304 entries
14/07/15 16:07:09 INFO util.GSet: recommended=4194304, actual=4194304
14/07/15 16:07:10 INFO namenode.FSNamesystem: fsOwner=hadoop
14/07/15 16:07:10 INFO namenode.FSNamesystem: supergroup=supergroup
14/07/15 16:07:10 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/07/15 16:07:10 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/07/15 16:07:10 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/07/15 16:07:10 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
14/07/15 16:07:10 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/07/15 16:07:10 INFO common.Storage: Image file /home/hadoop/tmp/dfs/name/current/fsimage of size 118 bytes saved in 0 seconds.
14/07/15 16:07:10 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/hadoop/tmp/dfs/name/current/edits
14/07/15 16:07:10 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/hadoop/tmp/dfs/name/current/edits
14/07/15 16:07:10 INFO common.Storage: Storage directory /home/hadoop/tmp/dfs/name has been successfully formatted.
14/07/15 16:07:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
Some people will hit a failure at this step. Be sure to check the logs under the hadoop directory; the exception output there is very detailed.
If the first format attempt fails, remember to delete the output under tmp before trying again, because leftover state from the failed attempt may be incompatible with the new format.
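A minimal cleanup sketch, assuming tmp is the hadoop.tmp.dir configured in core-site.xml (the logs on this machine show it as /home/hadoop/tmp):

rm -rf /home/hadoop/tmp      # wipe leftover state from the failed attempt
hadoop namenode -format      # then format again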
Then run start-all.sh:
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-namenode-ubuntu.out
localhost: starting datanode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-secondarynamenode-ubuntu.out
starting jobtracker, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /home/hadoop/export/hadoop/libexec/../logs/hadoop-hadoop-tasktracker-ubuntu.out
During the above process you may be prompted for a password repeatedly; you can set up passwordless SSH login (covered in another post on my blog).
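A minimal sketch of passwordless SSH to localhost using standard OpenSSH commands (assumes you have no existing key at ~/.ssh/id_rsa that you want to keep):

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa     # generate a key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                # should now log in without asking for a password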
Run jps; output like the following appears (the DataNode is missing: I deliberately introduced an error here):
10666 NameNode
11547 Jps
11445 TaskTracker
11130 SecondaryNameNode
11218 JobTracker
Check the logs.
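For example, to look at the tail of the DataNode log; the file name below is assumed to follow the same pattern as the .out files in the start-all.sh output above, with the detailed log going to the matching .log file:

tail -n 100 /home/hadoop/export/hadoop/logs/hadoop-hadoop-datanode-ubuntu.log

The DataNode log shows: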
2014-07-15 16:13:43,032 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2014-07-15 16:13:43,094 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2014-07-15 16:13:43,098 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2014-07-15 16:13:43,118 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2014-07-15 16:13:43,999 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2014-07-15 16:13:44,044 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-15 16:13:45,484 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/hadoop/tmp/dfs/data: namenode namespaceID = 224603228; datanode namespaceID = 566757162
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
	at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
The two namespaceIDs disagree because the NameNode was re-formatted (getting a new ID) while the DataNode kept state from an earlier format in its data directory. At this point you only need to delete the files under tmp and the problem is solved.
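A recovery sketch; note that this wipes everything stored in HDFS, which is fine on a fresh single-node setup (the tmp path is taken from the logs above):

stop-all.sh                  # stop any daemons that are still running
rm -rf /home/hadoop/tmp      # remove the stale DataNode state (and all HDFS data)
hadoop namenode -format      # format so NameNode and DataNode agree on the namespaceID
start-all.sh
jps                          # the DataNode should now appear in the list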
Now you can run an example. The detailed steps are as follows:
hadoop@ubuntu:~/export/hadoop$ ls
bin          hadoop-ant-1.2.1.jar           ivy          README.txt
build.xml    hadoop-client-1.2.1.jar        ivy.xml      sbin
c++          hadoop-core-1.2.1.jar          lib          share
CHANGES.txt  hadoop-examples-1.2.1.jar      libexec      src
conf         hadoop-minicluster-1.2.1.jar   LICENSE.txt  webapps
contrib      hadoop-test-1.2.1.jar          logs
docs         hadoop-tools-1.2.1.jar         NOTICE.txt
Upload a file to HDFS:
hadoop@ubuntu:~/export/hadoop$ hadoop fs -put README.txt /
Warning: $HADOOP_HOME is deprecated.
As above, nothing but the deprecation warning is printed, which means the upload succeeded.
Run the wordcount example program (to process the README.txt file):
hadoop@ubuntu:~/export/hadoop$ hadoop jar hadoop-examples-1.2.1.jar wordcount /README.txt /wordcountoutput
Warning: $HADOOP_HOME is deprecated.
14/07/15 15:23:01 INFO input.FileInputFormat: Total input paths to process : 1
14/07/15 15:23:01 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/07/15 15:23:01 WARN snappy.LoadSnappy: Snappy native library not loaded
14/07/15 15:23:02 INFO mapred.JobClient: Running job: job_201407141636_0001
14/07/15 15:23:03 INFO mapred.JobClient:  map 0% reduce 0%
14/07/15 15:23:15 INFO mapred.JobClient:  map 100% reduce 0%
14/07/15 15:23:30 INFO mapred.JobClient:  map 100% reduce 100%
14/07/15 15:23:32 INFO mapred.JobClient: Job complete: job_201407141636_0001
14/07/15 15:23:32 INFO mapred.JobClient: Counters: 29
14/07/15 15:23:32 INFO mapred.JobClient:   Job Counters
14/07/15 15:23:32 INFO mapred.JobClient:     Launched reduce tasks=1
14/07/15 15:23:32 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=12563
14/07/15 15:23:32 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/07/15 15:23:32 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/07/15 15:23:32 INFO mapred.JobClient:     Launched map tasks=1
14/07/15 15:23:32 INFO mapred.JobClient:     Data-local map tasks=1
14/07/15 15:23:32 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=14550
14/07/15 15:23:32 INFO mapred.JobClient:   File Output Format Counters
14/07/15 15:23:32 INFO mapred.JobClient:     Bytes Written=1306
14/07/15 15:23:32 INFO mapred.JobClient:   FileSystemCounters
14/07/15 15:23:32 INFO mapred.JobClient:     FILE_BYTES_READ=1836
14/07/15 15:23:32 INFO mapred.JobClient:     HDFS_BYTES_READ=1463
14/07/15 15:23:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=120839
14/07/15 15:23:32 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1306
14/07/15 15:23:32 INFO mapred.JobClient:   File Input Format Counters
14/07/15 15:23:32 INFO mapred.JobClient:     Bytes Read=1366
14/07/15 15:23:32 INFO mapred.JobClient:   Map-Reduce Framework
14/07/15 15:23:32 INFO mapred.JobClient:     Map output materialized bytes=1836
14/07/15 15:23:32 INFO mapred.JobClient:     Map input records=31
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce shuffle bytes=1836
14/07/15 15:23:32 INFO mapred.JobClient:     Spilled Records=262
14/07/15 15:23:32 INFO mapred.JobClient:     Map output bytes=2055
14/07/15 15:23:32 INFO mapred.JobClient:     Total committed heap usage (bytes)=212611072
14/07/15 15:23:32 INFO mapred.JobClient:     CPU time spent (ms)=2430
14/07/15 15:23:32 INFO mapred.JobClient:     Combine input records=179
14/07/15 15:23:32 INFO mapred.JobClient:     SPLIT_RAW_BYTES=97
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce input records=131
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce input groups=131
14/07/15 15:23:32 INFO mapred.JobClient:     Combine output records=131
14/07/15 15:23:32 INFO mapred.JobClient:     Physical memory (bytes) snapshot=177545216
14/07/15 15:23:32 INFO mapred.JobClient:     Reduce output records=131
14/07/15 15:23:32 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=695681024
14/07/15 15:23:32 INFO mapred.JobClient:     Map output records=179
hadoop@ubuntu:~/export/hadoop$ hadoop fs -ls /
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r--   1 hadoop supergroup       1366 2014-07-15 15:21 /README.txt
drwxr-xr-x   - hadoop supergroup          0 2014-07-14 16:36 /home
drwxr-xr-x   - hadoop supergroup          0 2014-07-15 15:23 /wordcountoutput
hadoop@ubuntu:~/export/hadoop$ hadoop fs -get /wordcountoutput /home/hadoop/
Warning: $HADOOP_HOME is deprecated.
You can open the downloaded output and take a look; it contains the following:
(see    1
5D002.C.1,    1
740.13)    1
<http://www.wassenaar.org/>    1
Administration    1
Apache    1
BEFORE    1
BIS    1
Bureau    1
Commerce,    1
Commodity    1
Control    1
Core    1
Department    1
ENC    1
Exception    1
Export    2
For    1
Foundation    1
Government    1
Hadoop    1
Hadoop,    1
Industry    1
Jetty    1
License    1
Number    1
Regulations,    1
SSL    1
Section    1
Security    1
See    1
Software    2
Technology    1
The    4
This    1
U.S.    1
Unrestricted    1
about    1
algorithms.    1
and    6
and/or    1
another    1
any    1
as    1
asymmetric    1
at:    2
both    1
by    1
check    1
classified    1
code    1
code.    1
concerning    1
country    1
country's    1
country,    1
cryptographic    3
currently    1
details    1
distribution    2
eligible    1
encryption    3
exception    1
export    1
following    1
for    3
form    1
from    1
functions    1
has    1
have    1
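Instead of downloading the output directory, you can also print the result straight from HDFS. The part file name below is an assumption: the wordcount example on Hadoop 1.x writes its result to part-r-00000 inside the output directory.

hadoop fs -cat /wordcountoutput/part-r-00000 | head -n 20    # show the first 20 word/count pairs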