Environment: OS: CentOS 6.6; Hadoop version: 1.0.3; Java runtime: JDK 1.6

Single-node configuration procedure:

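1. Configure SSH: Hadoop's start and stop scripts use ssh to reach each node, localhost included. Set up passwordless ssh access so that Hadoop never has to wait for someone to type a password by hand:

detail:
step 1: generate a key and authorize it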
[hjchaw@localhost ~]$ ssh-keygen -t rsa -P ""
[hjchaw@localhost ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
step 2: test ssh; if the connection succeeds without a password prompt, ssh is configured correctly
[hjchaw@localhost ~]$ ssh localhost
If ssh still asks for a password, the usual cause is the permissions on the ~/.ssh directory; set them to 700.
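The permission fix above can be sketched as follows. The 700 on ~/.ssh comes from the text; tightening authorized_keys to 600 is an added assumption (sshd commonly rejects key files that are writable by others), and SSH_DIR is a hypothetical variable used only so the snippet can be pointed at another directory.

```shell
# Directory holding the keys; defaults to the current user's ~/.ssh.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"

# Harmless if step 1 already created the directory and the file.
mkdir -p "$SSH_DIR"
touch "$SSH_DIR/authorized_keys"

# Only the owner may read, write or enter the directory.
chmod 700 "$SSH_DIR"
# Assumption: also restrict the key list to the owner.
chmod 600 "$SSH_DIR/authorized_keys"

# Show the resulting permissions.
ls -ld "$SSH_DIR" "$SSH_DIR/authorized_keys"
```

After this, `ssh localhost` should log in without a password prompt if the key from step 1 is in place.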

2. Hadoop configuration:
step1: configure hadoop-env.sh: set JAVA_HOME in it, e.g. export JAVA_HOME=/usr/local/jdk
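As a minimal sketch, the line in conf/hadoop-env.sh could look like the snippet below. The path /usr/local/jdk is the example value from this step; the check that follows is an added convenience, not part of Hadoop, and only reports whether a java executable exists under that path.

```shell
# conf/hadoop-env.sh: tell Hadoop which JDK to use.
export JAVA_HOME=/usr/local/jdk

# Optional sanity check (an addition for illustration).
if [ -x "$JAVA_HOME/bin/java" ]; then
  "$JAVA_HOME/bin/java" -version
else
  echo "no java executable found under $JAVA_HOME" >&2
fi
```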
step2: configure core-site.xml:
<configuration>
  <!-- No. 1: where Hadoop stores its data -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hjchaw/hadoop-datastore/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
    <final>true</final>
  </property>

  <!-- No. 2: name of the default file system -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system.  Either the
      literal string "local" or a host:port for NDFS.
    </description>
    <final>true</final>
  </property>
</configuration>

step3: configure hdfs-site.xml:
<configuration>
  <!-- file system properties -->
  <property>
    <name>dfs.name.dir</name>
    <value>${hadoop.tmp.dir}/dfs/name</value>
    <description>Determines where on the local filesystem the DFS name node
      should store the name table.  If this is a comma-delimited list
      of directories then the name table is replicated in all of the
      directories, for redundancy. </description>
    <final>true</final>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>${hadoop.tmp.dir}/dfs/data</value>
    <description>Determines where on the local filesystem a DFS data node
       should store its blocks.  If this is a comma-delimited
       list of directories, then data will be stored in all named
       directories, typically on different devices.
       Directories that do not exist are ignored.
    </description>
    <final>true</final>
  </property>

  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <final>true</final>
  </property>
</configuration>
step4: configure mapred-site.xml:
<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
</configuration>
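The steps above complete the single-node, pseudo-distributed Hadoop configuration.
3. Starting Hadoop:
You can add hadoop/bin to your PATH so that the commands below can be run from any directory.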
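A minimal sketch of the PATH change, assuming Hadoop is unpacked under /opt/hadoop/hadoop-1.0.3 (the install path that appears in the startup log of this walkthrough). HADOOP_INSTALL is a hypothetical variable name chosen for illustration; put the lines in ~/.bashrc to make them permanent.

```shell
# Assumed install location; adjust to where Hadoop was unpacked.
HADOOP_INSTALL=/opt/hadoop/hadoop-1.0.3

# Append Hadoop's bin directory so `hadoop` and `start-all.sh` are found.
export PATH="$PATH:$HADOOP_INSTALL/bin"
```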
step1: format the file system:
[hjchaw@localhost bin]$ hadoop namenode -format

12/05/27 04:25:19 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = localhost.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.3
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May  8 20:31:25 UTC 2012
************************************************************/
12/05/27 04:25:19 INFO util.GSet: VM type       = 32-bit
12/05/27 04:25:19 INFO util.GSet: 2% max memory = 19.33375 MB
12/05/27 04:25:19 INFO util.GSet: capacity      = 2^22 = 4194304 entries
12/05/27 04:25:19 INFO util.GSet: recommended=4194304, actual=4194304
12/05/27 04:25:20 INFO namenode.FSNamesystem: fsOwner=hjchaw
12/05/27 04:25:20 INFO namenode.FSNamesystem: supergroup=supergroup
12/05/27 04:25:20 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/05/27 04:25:20 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/05/27 04:25:20 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/05/27 04:25:20 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/05/27 04:25:21 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/05/27 04:25:21 INFO common.Storage: Storage directory /home/hjchaw/hadoop-datastore/hadoop-hjchaw/dfs/name has been successfully formatted.
12/05/27 04:25:21 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
************************************************************/
step2: start Hadoop:
[hjchaw@localhost bin]$ start-all.sh
starting namenode, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-namenode-localhost.localdomain.out
localhost: starting datanode, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-datanode-localhost.localdomain.out
localhost: starting secondarynamenode, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-secondarynamenode-localhost.localdomain.out
starting jobtracker, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-jobtracker-localhost.localdomain.out
localhost: starting tasktracker, logging to /opt/hadoop/hadoop-1.0.3/libexec/../logs/hadoop-hjchaw-tasktracker-localhost.localdomain.out

If you see output like the above, the configuration is now OK.
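One further way to confirm that all five daemons from the log above are running is the JDK's jps tool, which prints one "<pid> <ClassName>" line per Java process. The helper below is a sketch for illustration: missing_daemons is a hypothetical function name, and it assumes the daemons show up in jps under the names NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker.

```shell
# Print the name of every expected Hadoop daemon that is absent
# from the given jps-style listing ("<pid> <ClassName>" per line).
missing_daemons() {
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    printf '%s\n' "$1" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      printf '%s\n' "$d"
  done
}

# Run against the live process list when jps is available.
if command -v jps >/dev/null 2>&1; then
  missing_daemons "$(jps)"
fi
```

Empty output means every daemon was found; any name printed points to a daemon whose log file under logs/ is worth reading.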
4. Try the Hadoop command-line interface to work with the file system:
For example, create a directory:
[hjchaw@localhost bin]$ hadoop fs -mkdir input
List the files:
[hjchaw@localhost bin]$ hadoop fs -ls
Found 1 items
drwxr-xr-x   - hjchaw supergroup          0 2012-05-27 04:26 /user/hjchaw/input
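As a follow-up sketch, a small local file can be copied into the new directory and read back. The file name hello.txt and its contents are made up for illustration; the relative path input resolves to /user/hjchaw/input as in the listing above. The HDFS commands are skipped when hadoop is not on PATH.

```shell
# Create a small local file to upload (contents are arbitrary).
sample=$(mktemp)
echo "hello hdfs" > "$sample"

if command -v hadoop >/dev/null 2>&1; then
  # Copy the local file into the HDFS directory created earlier.
  hadoop fs -put "$sample" input/hello.txt
  # List the directory, then print the file back from HDFS.
  hadoop fs -ls input
  hadoop fs -cat input/hello.txt
else
  echo "hadoop is not on PATH; skipping the HDFS commands" >&2
fi

# Remove the local temporary file.
rm -f "$sample"
```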

This completes the single-node, pseudo-distributed Hadoop setup.