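There are four Linux servers: one is the master and the others are slaves. The servers run CentOS 6.5, with JDK 1.6 and Hadoop 1.0.4. Before using these scripts in a real environment, adapt them to your own setup.

On a freshly installed system, configure the IP address first. The shell script is as follows: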


    #!/bin/bash
    # Configure a static IP on eth0 and set the hostname to "master".
    IFCFG=/etc/sysconfig/network-scripts/ifcfg-eth0
    read -p "input ip: " ip
    echo 'the default hostname is master'
    # Append the address as the last line of the interface file.
    sed -i '$aIPADDR='"$ip" "$IFCFG"
    # "none" selects a static address (no DHCP).
    sed -i '/BOOTPROTO/cBOOTPROTO="none"' "$IFCFG"
    sed -i '/IPV6INIT/cIPV6INIT="no"' "$IFCFG"
    sed -i '/NM_CONTROLLED/cNM_CONTROLLED="no"' "$IFCFG"
    service network restart
    chkconfig network on
    # Disable the firewall and switch SELinux to permissive mode.
    service iptables stop
    chkconfig iptables off
    setenforce 0
    hostname master
    sed -i '/HOSTNAME/cHOSTNAME=master' /etc/sysconfig/network
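
The script above writes whatever the user types straight into the interface file. A minimal sketch of validating the input first is shown here; the function name `is_valid_ipv4` is hypothetical and not part of the original script, and it only checks the dotted-quad form, not whether the address is free on the network.

```shell
# Return 0 if $1 is a dotted-quad IPv4 address (four octets, each 0-255).
# is_valid_ipv4 is an illustrative helper, not part of the original script.
is_valid_ipv4() {
    case $1 in
        # Reject empty input, anything other than digits and dots,
        # and a leading or trailing dot.
        ''|*[!0-9.]*|.*|*.) return 1 ;;
    esac
    # Split on dots inside a subshell so the IFS change does not leak out.
    (
        IFS=.
        set -- $1
        [ $# -eq 4 ] || exit 1
        for octet in "$@"; do
            case $octet in
                ''|????*) exit 1 ;;   # empty octet, or more than three digits
            esac
            [ "$octet" -le 255 ] || exit 1
        done
    )
}

# Possible use in the script above:
#   read -p "input ip: " ip
#   is_valid_ipv4 "$ip" || { echo "invalid ip: $ip" >&2; exit 1; }
```

Rejecting bad input before the `sed` calls avoids leaving a half-edited ifcfg-eth0 behind.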


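The cluster IP addresses are configured as follows; adjust them for your own environment. The script first checks connectivity. Installation continues only if every host is reachable; otherwise it stops. Once connectivity is confirmed, the remaining steps modify /etc/hosts and the ssh configuration and then restart the ssh service. Change the password to match your environment.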

    masterip=192.168.2.254
    slave1ip=192.168.2.11
    slave2ip=192.168.2.2
    slave3ip=192.168.2.3
    # Rebuild the host list from scratch on every run.
    rm -f ip.txt
    printf '%s\n' "$masterip" "$slave1ip" "$slave2ip" "$slave3ip" > ip.txt
    NEWPASS=123456
    NETWORK=TRUE
    echo "before you install, please make sure the network is ok!!!"
    echo "now testing the network"
    for ip in $(cat ip.txt)
    do
        # Five pings per host; any failure marks the whole network as not ready.
        if ping -c 5 "$ip" &>/dev/null; then
            echo "${ip} is ok"
        else
            echo "${ip} cannot be reached"
            NETWORK=FALSE
        fi
    done
    echo "$NETWORK"
    if [ "$NETWORK" != FALSE ]; then
        : # ......... (the installation steps, split out in the sections that follow)
    fi
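
The connectivity check can also be wrapped in a function so the caller can stop immediately on failure. The following is a rough sketch under that assumption; `all_reachable` is a hypothetical name, and the probe command is passed in as arguments so the same loop works with `ping -c 5` or any other check.

```shell
# all_reachable HOSTFILE PROBE [ARGS...]
# Runs "PROBE ARGS... HOST" for each host listed in HOSTFILE.
# Returns 0 only if every probe succeeds. (Illustrative helper.)
all_reachable() {
    hostfile=$1
    shift
    failed=0
    while IFS= read -r host; do
        [ -n "$host" ] || continue              # skip blank lines
        # stdin is redirected so the probe cannot swallow the host list
        if "$@" "$host" </dev/null >/dev/null 2>&1; then
            echo "$host is ok"
        else
            echo "$host cannot be reached"
            failed=1
        fi
    done < "$hostfile"
    return "$failed"
}

# Possible use:
#   all_reachable ip.txt ping -c 5 || { echo "network not ready"; exit 1; }
```

Unreachable hosts are still all reported before the function returns, so one run shows every failure.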


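To make the explanation easier to follow, the steps are split up from here on.

First, as root on the master, set up passwordless (key-based) ssh login from the master's root user to the other hosts. After this, root can manage the other hosts conveniently.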


    PASS=123456
    yum -y install tcl --nogpgcheck
    yum -y install expect --nogpgcheck
    # Generate root's key pair, accepting the default path and an empty passphrase.
    expect <<EOF
        spawn ssh-keygen
        expect "Enter file in which to save the key (/root/.ssh/id_rsa):"
        send "\r"
        expect "Enter passphrase (empty for no passphrase):"
        send "\r"
        expect "Enter same passphrase again:"
        send "\r"
        expect eof
    EOF
    # Copy root's public key to root on every host listed in ip.txt.
    for ip in $(cat ip.txt)
    do
        expect <<EOF
            spawn ssh-copy-id root@${ip}
            expect "(yes/no)?"
            send "yes\r"
            expect "password:"
            send "${PASS}\r"
            expect eof
    EOF
    done
    # Copy root's key to the hadoop user on the master as well.
    # This needs the hadoop account and a resolvable "master" name,
    # both of which are created in the next step.
    expect <<EOF
        spawn ssh-copy-id hadoop@master
        expect "(yes/no)?"
        send "yes\r"
        expect "password:"
        send "${PASS}\r"
        expect eof
    EOF


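Once root can log in to the other hosts without a password, add the hadoop user on every host and update /etc/hosts there. On the master, also edit the sshd configuration file and restart the service.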

    # Uses $masterip and $slave1ip..$slave3ip from the connectivity script.
    for ip in $(cat ip.txt)
    do
        ssh root@$ip "useradd hadoop"
        ssh root@$ip "echo '123456' | passwd --stdin hadoop"
        ssh root@$ip "echo '$masterip  master' >>/etc/hosts"
        ssh root@$ip "echo '$slave1ip  slave1' >>/etc/hosts"
        ssh root@$ip "echo '$slave2ip  slave2' >>/etc/hosts"
        ssh root@$ip "echo '$slave3ip  slave3' >>/etc/hosts"
    done
    # Hand the host list and the key-setup script to the hadoop user, then run it as hadoop.
    cp ip.txt hadoopsshconf.sh /home/hadoop
    chown hadoop:hadoop /home/hadoop/ip.txt /home/hadoop/hadoopsshconf.sh
    ssh hadoop@localhost "sh hadoopsshconf.sh"
    # Enable public-key authentication in sshd on this host.
    sed -i '/#RSAAuthentication yes/cRSAAuthentication yes' /etc/ssh/sshd_config
    sed -i '/#PubkeyAuthentication yes/cPubkeyAuthentication yes' /etc/ssh/sshd_config
    sed -i '/#AuthorizedKeysFile/cAuthorizedKeysFile .ssh/authorized_keys' /etc/ssh/sshd_config
    service sshd restart
    chkconfig sshd on
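
Because the script appends to /etc/hosts unconditionally, running it twice leaves duplicate entries. One way to make that step repeatable is sketched below; `add_host_entry` is a hypothetical helper, and it assumes host names made of plain letters and digits such as master or slave1.

```shell
# add_host_entry FILE IP NAME
# Appends "IP<TAB>NAME" to FILE unless a line already ends in NAME
# or has NAME as a whitespace-delimited field. (Illustrative helper.)
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
}

# Possible use on the local host:
#   add_host_entry /etc/hosts "$masterip" master
#   add_host_entry /etc/hosts "$slave1ip" slave1
```

To use it on the other hosts, the function would have to be copied there first, for example as part of a small script sent with scp.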


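Passwordless login for root alone is not enough: the local hadoop user must also be able to log in to the hadoop user on the other hosts without a password. The script above has already copied hadoopsshconf.sh into the hadoop user's home directory, so it only needs to be run as hadoop through a remote ssh command. hadoopsshconf.sh is as follows: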


    #!/bin/bash
    PASS=123456
    # Generate the hadoop user's key pair. The prompt shows the current
    # user's home directory, so only the fixed part of it is matched.
    expect <<EOF
        spawn ssh-keygen
        expect "Enter file in which to save the key"
        send "\r"
        expect "Enter passphrase (empty for no passphrase):"
        send "\r"
        expect "Enter same passphrase again:"
        send "\r"
        expect eof
    EOF
    # Allow the hadoop user to log in to this host with its own key.
    cat ~/.ssh/id_rsa.pub >>~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    # Copy the public key to the hadoop user on every host in ip.txt.
    for ip in $(cat ip.txt)
    do
        expect <<EOF
            spawn ssh-copy-id hadoop@${ip}
            expect "(yes/no)?"
            send "yes\r"
            expect "password:"
            send "${PASS}\r"
            expect eof
    EOF
    done

    

The next script installs JDK 1.6. It first removes any JDK already installed on the system to avoid version conflicts.


    #!/bin/bash
    # Remove any Java packages that are already installed.
    for pkg in $(rpm -qa | grep java)
    do
        rpm -e "$pkg" --nodeps
    done
    # Unpack the self-extracting JDK installer under /usr/java.
    mkdir -p /usr/java
    cp ./jdk-6u32-linux-x64.bin /usr/java/
    cd /usr/java/
    chmod +x jdk-6u32-linux-x64.bin
    ./jdk-6u32-linux-x64.bin
    rm -f jdk-6u32-linux-x64.bin
    # Export the Java environment for all login shells.
    echo 'export JAVA_HOME=/usr/java/jdk1.6.0_32' >>/etc/profile
    echo 'export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib' >>/etc/profile
    echo 'export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin' >>/etc/profile
    source /etc/profile
    java -version
    cd -
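
The removal step depends on picking the right names out of the `rpm -qa` output, and a loose pattern can match unrelated packages. A possible stricter filter is sketched below. The pattern is an assumption about how the JDK packages are named (names starting with `java-` followed by a version digit, or with `jdk`), and `select_java_packages` is a hypothetical helper; check it against the package list on your own system before removing anything.

```shell
# select_java_packages: reads package names on stdin (one per line, as
# printed by "rpm -qa") and prints only those that look like a JDK/JRE.
# The pattern is an assumption, not taken from the original script.
select_java_packages() {
    grep -E '^(java-[0-9]|jdk)'
}

# Possible use:
#   for pkg in $(rpm -qa | select_java_packages); do
#       rpm -e "$pkg" --nodeps
#   done
```

Printing the selected names first, and removing them only after a look at the list, is a cheap safeguard when `--nodeps` is involved.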

    

Next, install Hadoop 1.0.4 under /usr. The script updates /etc/profile, edits the Hadoop configuration files (hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, masters, and slaves), and copies the installation to the other machines.

    # Unpack Hadoop into /usr/hadoop/hadoop1 and hand it to the hadoop user.
    mkdir -p /usr/hadoop
    tar -zxf hadoop-1.0.4.tar.gz -C /usr/hadoop
    mv /usr/hadoop/hadoop-1.0.4 /usr/hadoop/hadoop1
    chown -R hadoop:hadoop /usr/hadoop
    # Make HADOOP_HOME available to login shells.
    echo 'export HADOOP_HOME=/usr/hadoop/hadoop1' >> /etc/profile
    echo 'export PATH=$PATH:$HADOOP_HOME/bin' >>/etc/profile
    source /etc/profile
    # Tell Hadoop which JDK to use and where to keep its pid files.
    echo 'export JAVA_HOME=/usr/java/jdk1.6.0_32' >> /usr/hadoop/hadoop1/conf/hadoop-env.sh
    echo 'export HADOOP_PID_DIR=/usr/hadoop/hadoop1/pids' >>/usr/hadoop/hadoop1/conf/hadoop-env.sh



    # core-site.xml: every sed inserts directly after line 6, so the lines
    # are issued in reverse of the order they end up in the file.
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/core-site.xml
    sed -i '6a\\t<value>hdfs://master:9000</value>' /usr/hadoop/hadoop1/conf/core-site.xml
    sed -i '6a\\t<name>fs.default.name</name>' /usr/hadoop/hadoop1/conf/core-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/core-site.xml



    # hdfs-site.xml: same reverse-order insertion after line 6.
    # dfs.permissions = false
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<value>false</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<name>dfs.permissions</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    # hadoop.tmp.dir (normally a core-site.xml property)
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<value>/usr/hadoop/hadoop1/tmp/</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<name>hadoop.tmp.dir</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    # dfs.data.dir
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<value>/usr/hadoop/hadoop1/data/</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<name>dfs.data.dir</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    # dfs.name.dir
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<value>/usr/hadoop/hadoop1/namenode/</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<name>dfs.name.dir</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    # dfs.replication = 3
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<value>3</value>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<name>dfs.replication</name>' /usr/hadoop/hadoop1/conf/hdfs-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/hdfs-site.xml



    # mapred-site.xml: same reverse-order insertion after line 6.
    # mapred.tasktracker.reduce.tasks.maximum = 2
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<value>2</value>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<name>mapred.tasktracker.reduce.tasks.maximum</name>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    # mapred.tasktracker.map.tasks.maximum = 2
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<value>2</value>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<name>mapred.tasktracker.map.tasks.maximum</name>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    # mapred.job.tracker = master:9001
    sed -i '6a\\t</property>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<value>master:9001</value>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<name>mapred.job.tracker</name>' /usr/hadoop/hadoop1/conf/mapred-site.xml
    sed -i '6a\\t<property>' /usr/hadoop/hadoop1/conf/mapred-site.xml




    # masters: overwrite the file so that only "master" is listed.
    echo 'master' > /usr/hadoop/hadoop1/conf/masters
    # slaves: drop the first (default) line, then list the three slaves.
    sed -i '1d' /usr/hadoop/hadoop1/conf/slaves
    echo 'slave1' >> /usr/hadoop/hadoop1/conf/slaves
    echo 'slave2' >> /usr/hadoop/hadoop1/conf/slaves
    echo 'slave3' >> /usr/hadoop/hadoop1/conf/slaves




    # Create the data and tmp directories on the master.
    mkdir -p /usr/hadoop/hadoop1/data/ /usr/hadoop/hadoop1/tmp/
    chown -R hadoop:hadoop /usr/hadoop/
    chmod -R 755 /usr/hadoop/hadoop1/data/ /usr/hadoop/hadoop1/tmp/
    # Copy the configured installation to every slave and fix ownership there.
    for i in $(seq 3)
    do
        ssh slave$i "mkdir -p /usr/hadoop"
        scp -r /usr/hadoop/hadoop1 root@slave$i:/usr/hadoop
        ssh slave$i "chown -R hadoop:hadoop /usr/hadoop"
        ssh slave$i "chmod -R 755 /usr/hadoop/hadoop1/data/ /usr/hadoop/hadoop1/tmp/"
    done
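
The configuration edits above rely on fixed line numbers and on issuing each property's lines in reverse. As an alternative, here is a small sketch of a helper that inserts a whole property just before the closing `</configuration>` tag. The name `add_property` is hypothetical, and the sketch assumes values without XML special characters or backslashes, which holds for the paths and host:port values used here.

```shell
# add_property FILE NAME VALUE
# Inserts a <property> element just before the </configuration> line.
# (Illustrative helper; no XML escaping is performed.)
add_property() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v n="$name" -v v="$value" '
        /<\/configuration>/ {
            printf "\t<property>\n\t\t<name>%s</name>\n\t\t<value>%s</value>\n\t</property>\n", n, v
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the owner and mode of FILE
    rm -f "$tmp"
}

# Possible use:
#   add_property /usr/hadoop/hadoop1/conf/core-site.xml fs.default.name hdfs://master:9000
#   add_property /usr/hadoop/hadoop1/conf/hdfs-site.xml dfs.replication 3
```

With this approach the properties appear in the file in the order they are added, and the result does not depend on which line `<configuration>` sits on.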


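This completes the automated installation of the Hadoop cluster. What remains is to format the NameNode, start the cluster, and verify it.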
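
For the verification step, one common check is to look at the `jps` output on each node. The sketch below assumes the usual Hadoop 1.x daemon names and the `<pid> <name>` output format of `jps`; `missing_daemons` is a hypothetical helper, and the format and start commands in the comments are the standard Hadoop 1.x ones, to be run as the hadoop user.

```shell
# missing_daemons NAME...
# Reads "jps" output on stdin and prints every expected daemon that is
# not listed. Returns non-zero if any is missing. (Illustrative helper.)
missing_daemons() {
    jps_output=$(cat)
    missing=0
    for daemon in "$@"; do
        # jps prints "<pid> <name>" per line; compare the name exactly so
        # that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    return "$missing"
}

# Possible use, assuming $HADOOP_HOME/bin is on PATH:
#   hadoop namenode -format        # on the master, first start only
#   start-all.sh                   # on the master
#   jps | missing_daemons NameNode SecondaryNameNode JobTracker   # on the master
#   jps | missing_daemons DataNode TaskTracker                    # on each slave
```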