Hadoop 2.7.1 Setup



1. Firewall

       # Check firewall status
       service iptables status

       # Stop the firewall
       service iptables stop

       # Check whether the firewall starts on boot
       chkconfig iptables --list

       # Disable firewall autostart on boot
       chkconfig iptables off

 

  2. Passwordless SSH

      Install ssh and rsync (see the sketch after this step).

      Configure passwordless login:

            ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

            cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

            chmod 600 ~/.ssh/authorized_keys
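      The notes do not spell out the install commands; a minimal sketch for a CentOS/RHEL host (package names assumed, adjust for your distro):

            # install the SSH server/client and rsync, then make sure sshd is running
            yum install -y openssh-server openssh-clients rsync
            service sshd start

            # with the key in authorized_keys, this should log in without a password prompt
            ssh localhost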

 

  3. Edit the Hadoop configuration files

      # core-site.xml

      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://<ip>:9000</value>
      </property>

      <property>
          <name>hadoop.tmp.dir</name>
          <value><path></value>
          <description>A base for other temporary directories.</description>
      </property>

      <property>
          <name>ha.zookeeper.quorum</name>
          <value>[ip:port]</value>
      </property>

 

    # hdfs-site.xml

            <property>
                <name>dfs.replication</name>
                <value>1</value>
            </property>

            <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:[path]</value>
            </property>

            <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:[path]</value>
            </property>

 

      # mapred-site.xml

            <property>
                <name>mapred.job.tracker</name>
                <value><ip>:9001</value>
            </property>

            When running on YARN, change it to:

            <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
            </property>

            <property>
                <name>mapreduce.jobhistory.address</name>
                <value><ip>:10020</value>
            </property>

            <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value><ip>:19888</value>
            </property>

            <property>
                <name>yarn.app.mapreduce.am.staging-dir</name>
                <value><path></value>
            </property>

            <property>
                <name>mapreduce.jobhistory.done-dir</name>
                <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
            </property>

            <property>
                <name>mapreduce.jobhistory.intermediate-done-dir</name>
                <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
            </property>

 

      # yarn-site.xml (only needed when running on YARN)

            <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>[hostname]</value>
            </property>

            <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
            </property>

            <property>
                <name>yarn.resourcemanager.zk-address</name>
                <value>[ip:port]</value>
            </property>

            <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
            </property>
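  Before the first start, HDFS has to be formatted once; a one-liner for that (run it only on a fresh cluster, since it wipes the NameNode metadata):

      hdfs namenode -format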

 
 
 
 
 
 
 

  Recommended: add the history server to start-all.sh / stop-all.sh (a sketch follows):

  /usr/local/hadoop-2.7.1/sbin/mr-jobhistory-daemon.sh start/stop historyserver
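  What that looks like in practice, assuming Hadoop lives under /usr/local/hadoop-2.7.1 as above:

      # appended at the end of sbin/start-all.sh
      /usr/local/hadoop-2.7.1/sbin/mr-jobhistory-daemon.sh start historyserver

      # appended at the end of sbin/stop-all.sh
      /usr/local/hadoop-2.7.1/sbin/mr-jobhistory-daemon.sh stop historyserver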

 

  ============================================================================================ 

 

  Building 64-bit Hadoop



  1. Install the JDK and add it to the environment config (see the /etc/profile sketch after this list)

  2. Install Maven and add it to the environment config

  3. Install g++: yum install gcc gcc-c++

  4. Install Protocol Buffers: https://github.com/google/protobuf/

      Run the following commands:

              $ ./configure
              $ make
              $ make check
              $ make install
              $ ldconfig

      Add it to the environment config

  5. Install openssl-devel: yum install openssl-devel

  6. Install cmake: yum install cmake

  7. Install Ant and add it to the environment config

  8. Run: mvn package -Pdist,native -DskipTests -Dtar
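  The "add it to the environment config" steps above refer to /etc/profile; a minimal sketch, where the install paths below are example placeholders:

      # /etc/profile (adjust paths to your actual installs)
      export JAVA_HOME=/usr/local/jdk1.8
      export MAVEN_HOME=/usr/local/apache-maven
      export ANT_HOME=/usr/local/apache-ant
      export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$ANT_HOME/bin

      # reload the profile in the current shell
      source /etc/profile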

 
 
 
 

  ============================================================================================ 

 

  Uninstalling protoc



  1. make uninstall

  2. ./configure --prefix=/usr

  3. rm `which protoc`
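  A quick check afterwards (Hadoop 2.x's native build expects protoc 2.5.0, so the replacement install should report that version):

      # prints nothing once protoc is removed; after reinstalling, expect "libprotoc 2.5.0"
      which protoc
      protoc --version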

 
 
 
 

  ============================================================================================ 

 

  Hadoop commands



  1. Start Hadoop

      ./sbin/start-all.sh

      or

      ./sbin/start-dfs.sh

      ./sbin/start-yarn.sh

      ./sbin/mr-jobhistory-daemon.sh start historyserver

      After a successful start, the following processes are running (check with jps):

          ResourceManager
          NodeManager
          DataNode
          SecondaryNameNode
          NameNode
          JobHistoryServer

  2. Stop Hadoop

      ./sbin/stop-all.sh

      or

      ./sbin/stop-dfs.sh

      ./sbin/stop-yarn.sh

      ./sbin/mr-jobhistory-daemon.sh stop historyserver

 

  3. Access the web UIs

      HDFS: http://<ip>:50070

      YARN (MR): http://<ip>:8088

  4. Create a directory

      hdfs dfs -mkdir [path]

  5. List a directory

      hdfs dfs -ls [path]

  6. Upload files

      hdfs dfs -put [spath] [dpath]

  7. Run wordcount (an end-to-end sketch follows this list)

      hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount [inputpath] [outputpath]

  8. View results

      hadoop fs -cat [path]

  9. Check cluster status

      hdfs dfsadmin -report

  10. Delete files

      hdfs dfs -rm -r [path]

  11. Change permissions

      hdfs dfs -chmod 777 [path]
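  An end-to-end run of steps 4-8, assuming the cluster is up; /input and /output are example paths:

      # create an input directory and upload a local file
      hdfs dfs -mkdir /input
      hdfs dfs -put /etc/hosts /input

      # run the bundled wordcount example (the output directory must not exist yet)
      hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output

      # inspect the result, then clean up
      hadoop fs -cat /output/part-r-00000
      hdfs dfs -rm -r /output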

 
 
 
 

  ============================================================================================ 

 

  ZooKeeper 3.4.8 pseudo-cluster setup



  1. Copy zoo_sample.cfg twice and rename the copies zk1.cfg and zk2.cfg

  2. Edit the config files (zk1 as an example; a zk2 sketch follows this step)

      dataDir=/root/zookeeper-3.4.8/zk1/data

      dataLogDir=/root/zookeeper-3.4.8/zk1/logs

      clientPort=2181

      server.1=192.168.190.20:2888:3888

      server.2=192.168.190.20:2889:3889

      Create a file named myid under the data directory with content 1
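      The zk2.cfg counterpart only differs in its client port and directories; a sketch mirroring the values above (the server.* lines stay identical):

          dataDir=/root/zookeeper-3.4.8/zk2/data
          dataLogDir=/root/zookeeper-3.4.8/zk2/logs
          clientPort=2182
          server.1=192.168.190.20:2888:3888
          server.2=192.168.190.20:2889:3889

          # and under zk2's data directory, myid contains 2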

 

  3. Start/stop ZooKeeper

      ./bin/zkServer.sh start zk1.cfg

      ./bin/zkServer.sh stop zk1.cfg

  4. Check node status

      ./bin/zkServer.sh status zk1.cfg

 
 
 
 
 
 
 

  ZooKeeper client commands:

  The ZooKeeper command-line tool works much like a Linux shell: it lets you browse ZooKeeper and create or modify data. Connect with zkCli.sh -server 127.0.0.1:2181; on a successful connection it prints ZooKeeper's environment and configuration information.

  Some basic operations:

      1. List the root directory: ls / — shows what ZooKeeper currently contains

      2. List the root directory with stats: ls2 / — shows the node data plus metadata such as the update count

      3. Create a znode with initial content: create /zk "test" — creates a new znode "zk" with an associated string

      4. Read a znode: get /zk — confirms the znode contains the string we created

      5. Update a znode: set /zk "zkbak" — changes the string associated with zk

      6. Delete a znode: delete /zk — removes the znode created above

      7. Quit the client: quit

      8. Show help: help

 

  Common ZooKeeper four-letter commands:

  ZooKeeper also responds to a set of four-letter commands. Most are queries that report the current state of the ZooKeeper service; they can be sent from a client with telnet or nc.

      1. echo stat | nc 127.0.0.1 2181 — shows whether the node is acting as follower or leader

      2. echo ruok | nc 127.0.0.1 2181 — tests whether the server is up; it replies imok if running

      3. echo dump | nc 127.0.0.1 2181 — lists outstanding sessions and ephemeral nodes

      4. echo kill | nc 127.0.0.1 2181 — shuts down the server

      5. echo conf | nc 127.0.0.1 2181 — prints details of the serving configuration

      6. echo cons | nc 127.0.0.1 2181 — lists full connection/session details for every client connected to the server

      7. echo envi | nc 127.0.0.1 2181 — prints details of the serving environment (as opposed to conf)

      8. echo reqs | nc 127.0.0.1 2181 — lists outstanding requests

      9. echo wchs | nc 127.0.0.1 2181 — lists brief information on the server's watches

      10. echo wchc | nc 127.0.0.1 2181 — lists detailed watch information by session, i.e. a list of sessions with their watches

      11. echo wchp | nc 127.0.0.1 2181 — lists detailed watch information by path, i.e. paths with their associated sessions

 
 
 
 

  ============================================================================================ 

 

  Nginx installation



  1. ./configure --prefix=/usr/local/nginx --conf-path=/usr/local/nginx/nginx.conf

  2. make && make install
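  A minimal sketch for running it, assuming the --prefix above:

      # start nginx
      /usr/local/nginx/sbin/nginx

      # test the config, reload, or stop
      /usr/local/nginx/sbin/nginx -t
      /usr/local/nginx/sbin/nginx -s reload
      /usr/local/nginx/sbin/nginx -s stop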

 
 
 
 

  ============================================================================================ 

 

  Kafka cluster setup



  1. server.properties configuration (broker 1 shown; a broker 2 sketch follows)

      broker.id=1

      listeners=PLAINTEXT://virtuoso:9091

      log.dirs=/root/kafka-0.10/kafka-1-logs

      zookeeper.connect=192.168.190.20:2181,192.168.190.20:2182
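      For a second broker, only the id, listener port, and log directory change; a sketch mirroring the values above (port 9092 is an assumption, matching the producer command below):

          broker.id=2
          listeners=PLAINTEXT://virtuoso:9092
          log.dirs=/root/kafka-0.10/kafka-2-logs
          zookeeper.connect=192.168.190.20:2181,192.168.190.20:2182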

 

  2. Commands

      Start Kafka: ./bin/kafka-server-start.sh ./config/server.properties &

      Create a topic: ./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

      Describe a topic: ./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic

      List topics: ./bin/kafka-topics.sh --list --zookeeper localhost:2181

      Start a producer: ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic

      Start a consumer: ./bin/kafka-console-consumer.sh --zookeeper localhost:2182 --from-beginning --topic my-replicated-topic

 
 
 
 

  ============================================================================================ 

 
 
 
 

  Redis 3.2.3 installation

 
 
 
 

  redis.io 

 
 
 
 

  make && make install 

 
 
 
 

  Edit redis.conf:

       bind 127.0.0.1                     add this machine's IP here
       port 6379                          default is 6379
       daemonize no                       set to yes to run as a daemon
       pidfile /var/run/redis.pid         pid file used when daemonized
       logfile ""                         log file
       slaveof <masterip> <masterport>    master/slave replication
       slave-read-only yes                whether the slave is read-only
       appendonly no                      set to yes to append a log entry for every write (AOF)
       cluster-enabled yes                enable cluster mode
       cluster-config-file nodes.conf     cluster config file
       cluster-node-timeout 15000         cluster node timeout (ms)

 
 
 
 

  Start Redis:      redis-server redis.conf

  Connect to Redis: redis-cli -c -p <port>
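  The config above only enables cluster mode per instance; to actually form the cluster, Redis 3.x ships a Ruby helper. A sketch, assuming six instances on ports 7000-7005 (the ports are placeholders, and the script needs ruby plus the redis gem):

      ./src/redis-trib.rb create --replicas 1 \
          127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
          127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005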

 

    ============================================================================================ 

 
 
 
 
 
 
 

  HBase pseudo-distributed deployment



  1. Edit hbase-env.sh

       export JAVA_HOME=/usr/local/jdk1.8

       export HBASE_CLASSPATH=${HADOOP_HOME}/etc/hadoop

       export HBASE_LOG_DIR=/root/hbase-1.2.4/logs

  2. Edit hbase-site.xml

 

       <property>
             <name>hbase.rootdir</name>
             <value>hdfs://[ip|hostname]:9000/hbase</value>
       </property>

       <property>
             <name>hbase.cluster.distributed</name>
             <value>true</value>
       </property>

       <property>
             <name>hbase.zookeeper.quorum</name>
             <value>[ip:port]</value>
       </property>

       <property>
             <name>hbase.zookeeper.property.clientPort</name>
             <value>[zkPort]</value>
       </property>

       <property>
             <name>hbase.zookeeper.property.dataDir</name>
             <value>[zkDataDir]</value>
             <description>Property from zoo.cfg, the directory where the snapshot is stored.</description>
       </property>

 

  3. Edit regionservers

       list the cluster hostnames, one per line

  4. Start & stop

       start-hbase.sh

       stop-hbase.sh
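       A quick check after starting, assuming HBase's bin directory is on the PATH:

            # HMaster and HRegionServer should show up
            jps

            # open the shell and ask for cluster status
            hbase shell
            hbase> status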

 

        

 

    ============================================================================================ 
 
 
git clone & build
 
 
> git clone https://github.com/apache/incubator-rocketmq.git
 
 
> cd incubator-rocketmq
 
 
> mvn clean package install assembly:assembly -U
 
 > cd target/apache-rocketmq-broker/apache-rocketmq/ 
 

 
 
start namesrv
 
 
> nohup sh bin/mqnamesrv &
 
 > tail -f ~/logs/rocketmqlogs/namesrv.log 
 

 
 
start broker
 
 
> nohup sh bin/mqbroker -n localhost:9876 &
 
 > tail -f ~/logs/rocketmqlogs/broker.log




set the name server address


> export NAMESRV_ADDR=localhost:9876





start producer&consumer


> sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer 
 > sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer




shutdown servers


> sh bin/mqshutdown broker 
 > sh bin/mqshutdown namesrv


CLI Admin Tool Commands



     sh mqadmin



    updateTopic          Update or create topic
    deleteTopic          Delete topic from broker and NameServer.
    updateSubGroup       Update or create subscription group
    deleteSubGroup       Delete subscription group from broker.
    updateBrokerConfig   Update broker's config
    updateTopicPerm      Update topic perm
    topicRoute           Examine topic route info
    topicStatus          Examine topic Status info
    topicClusterList     get cluster info for topic
    brokerStatus         Fetch broker runtime status data
    queryMsgById         Query Message by Id
    queryMsgByKey        Query Message by Key
    queryMsgByUniqueKey  Query Message by Unique key
    queryMsgByOffset     Query Message by offset
    printMsg             Print Message Detail
    sendMsgStatus        send msg to broker.
    brokerConsumeStats   Fetch broker consume stats data
    producerConnection   Query producer's socket connection and client version
    consumerConnection   Query consumer's socket connection, client version and subscription
    consumerProgress     Query consumers's progress, speed
    consumerStatus       Query consumer's internal data structure
    cloneGroupOffset     clone offset from other group.
    clusterList          List all of clusters
    topicList            Fetch all topic list from name server
    updateKvConfig       Create or update KV config.
    deleteKvConfig       Delete KV config.
    wipeWritePerm        Wipe write perm of broker in all name server
    resetOffsetByTime    Reset consumer offset by timestamp(without client restart).
    updateOrderConf      Create or update or delete order conf
    cleanExpiredCQ       Clean expired ConsumeQueue on broker.
    cleanUnusedTopic     Clean unused topic on broker.
    startMonitoring      Start Monitoring
    statsAll             Topic and Consumer tps stats
    syncDocs             Synchronize wiki and issue to github.com
    allocateMQ           Allocate MQ
    checkMsgSendRT       check message send response time
    clusterRT            List All clusters Message Send RT
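    Most sub-commands take the name server address via -n; a usage sketch (TopicTest is the topic the quickstart Producer above writes to, adjust as needed):

        sh mqadmin topicList -n localhost:9876
        sh mqadmin topicRoute -n localhost:9876 -t TopicTest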