My Big Data Development, Chapter 3: Kafka Single-Node Pseudo-Cluster Installation

Kafka depends on ZooKeeper.

1. Deploying a single-machine ZooKeeper pseudo-cluster

Make sure the environment variables are set correctly:

# zookeeper
export ZK_HOME=$APACHE_ROOT/zookeeper-current
export ZK_CONF_DIR=$ZK_HOME/conf
export PATH=$PATH:$ZK_HOME/bin
 
# kafka and manager
export KAFKA_HOME=$APACHE_ROOT/kafka-current
export CMAK_HOME=$APACHE_ROOT/cmak-current
export KAFKA_CONF_DIR=$KAFKA_HOME/config
export PATH=$PATH:$KAFKA_HOME/bin

 

Copy $ZK_CONF_DIR/zoo_sample.cfg to zoo1.cfg, zoo2.cfg, and zoo3.cfg.

The changed settings are as follows:

# $ZK_CONF_DIR/zoo1.cfg
dataDir=/hacl/zookeeper/data1
dataLogDir=/hacl/zookeeper/log1
clientPort=2181

server.1=zk1:2881:3881
server.2=zk2:2882:3882
server.3=zk3:2883:3883
# $ZK_CONF_DIR/zoo2.cfg
dataDir=/hacl/zookeeper/data2
dataLogDir=/hacl/zookeeper/log2
clientPort=2182

server.1=zk1:2881:3881
server.2=zk2:2882:3882
server.3=zk3:2883:3883
# $ZK_CONF_DIR/zoo3.cfg
dataDir=/hacl/zookeeper/data3
dataLogDir=/hacl/zookeeper/log3
clientPort=2183

server.1=zk1:2881:3881
server.2=zk2:2882:3882
server.3=zk3:2883:3883
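The three files differ only in the instance number, so they can also be generated in one go. A minimal sketch (hypothetical: it writes complete files using the zoo_sample.cfg defaults instead of copying the sample; ZK_CONF_DIR falls back to /tmp/zk-conf for a dry run; zk1, zk2, and zk3 are assumed to resolve to 127.0.0.1 in /etc/hosts, since everything runs on one machine):

```shell
#!/bin/sh
# Write zoo1.cfg .. zoo3.cfg; only dataDir, dataLogDir, and clientPort
# differ between instances, the rest are the zoo_sample.cfg defaults.
ZK_CONF_DIR=${ZK_CONF_DIR:-/tmp/zk-conf}   # fallback for a dry run
mkdir -p "$ZK_CONF_DIR"
for N in 1 2 3; do
  cat > "$ZK_CONF_DIR/zoo$N.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/hacl/zookeeper/data$N
dataLogDir=/hacl/zookeeper/log$N
clientPort=218$N
server.1=zk1:2881:3881
server.2=zk2:2882:3882
server.3=zk3:2883:3883
EOF
done
```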

Create the data directories and write each instance's myid file:

mkdir -p /hacl/zookeeper/{data1,data2,data3,log1,log2,log3}

echo "1" > /hacl/zookeeper/data1/myid

echo "2" > /hacl/zookeeper/data2/myid

echo "3" > /hacl/zookeeper/data3/myid

Start, stop, or check each ZooKeeper instance:

zkServer.sh start|stop|status zoo1.cfg

zkServer.sh start|stop|status zoo2.cfg

zkServer.sh start|stop|status zoo3.cfg

At this point the ZooKeeper cluster (single machine, 3 nodes) is running.
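To confirm the ensemble actually formed, you can ask each instance for its role; one should report leader and the other two follower. A quick check, assuming the nc utility is available (on ZooKeeper 3.5+, the srvr command must also be whitelisted via 4lw.commands.whitelist):

```shell
# Query each instance with the ZooKeeper four-letter command "srvr";
# a healthy ensemble shows one "Mode: leader" and two "Mode: follower".
for PORT in 2181 2182 2183; do
  printf '%s: ' "$PORT"
  echo srvr | nc localhost "$PORT" | grep '^Mode'
done
```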

2. Deploying a single-machine Kafka pseudo-cluster

The ZooKeeper cluster must be set up as described above before starting the Kafka pseudo-cluster deployment. Create the Kafka data directories:

mkdir -p /hacl/kafka/{logs1,logs2,logs3,tmp/{log1,log2,log3}}

Write one configuration file per node, named serverN.properties (N=1,2,3). Only the settings that differ from the defaults are listed below:

# server1.properties
broker.id=1
host.name=devnode
port=9091
advertised.host.name=devnode
advertised.port=9091
listeners=PLAINTEXT://:9091
advertised.listeners=PLAINTEXT://devnode:9091
log.dir=/hacl/kafka/tmp/log1
log.dirs=/hacl/kafka/logs1
num.partitions=64
zookeeper.connect=zk1:2181,zk2:2182,zk3:2183
# server2.properties
broker.id=2
host.name=devnode
port=9092
advertised.host.name=devnode
advertised.port=9092
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://devnode:9092
log.dir=/hacl/kafka/tmp/log2
log.dirs=/hacl/kafka/logs2
num.partitions=64
zookeeper.connect=zk1:2181,zk2:2182,zk3:2183
# server3.properties
broker.id=3
host.name=devnode
port=9093
advertised.host.name=devnode
advertised.port=9093
listeners=PLAINTEXT://:9093
advertised.listeners=PLAINTEXT://devnode:9093
log.dir=/hacl/kafka/tmp/log3
log.dirs=/hacl/kafka/logs3
num.partitions=64
zookeeper.connect=zk1:2181,zk2:2182,zk3:2183
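Since the three property files differ only in the broker number, they can also be generated with a loop. A hypothetical sketch (KAFKA_CONF_DIR falls back to /tmp/kafka-conf so it can be dry-run; only the settings listed above are written, with everything else left to Kafka's defaults):

```shell
#!/bin/sh
# Write server1..3.properties; every per-broker field is derived
# from the broker number N, everything else keeps Kafka's defaults.
KAFKA_CONF_DIR=${KAFKA_CONF_DIR:-/tmp/kafka-conf}   # fallback for a dry run
mkdir -p "$KAFKA_CONF_DIR"
for N in 1 2 3; do
  cat > "$KAFKA_CONF_DIR/server$N.properties" <<EOF
broker.id=$N
host.name=devnode
port=909$N
advertised.host.name=devnode
advertised.port=909$N
listeners=PLAINTEXT://:909$N
advertised.listeners=PLAINTEXT://devnode:909$N
log.dir=/hacl/kafka/tmp/log$N
log.dirs=/hacl/kafka/logs$N
num.partitions=64
zookeeper.connect=zk1:2181,zk2:2182,zk3:2183
EOF
done
```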

Copy them into $KAFKA_CONF_DIR/, then start each Kafka node:

kafka-server-start.sh -daemon $KAFKA_CONF_DIR/server1.properties
kafka-server-start.sh -daemon $KAFKA_CONF_DIR/server2.properties
kafka-server-start.sh -daemon $KAFKA_CONF_DIR/server3.properties

The cluster is now up. Shutting it down requires killing each broker process, so edit $KAFKA_HOME/bin/kafka-server-stop.sh and change the third PIDS=... assignment as follows:

...

else
    PIDS=$(ps ax | grep 'kafka-current' | grep java | grep -v grep | awk '{print $1}')
fi

...

Run kafka-server-stop.sh and Kafka shuts down cleanly (ZooKeeper must still be running; shut ZooKeeper down only after Kafka has stopped).
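A quick smoke test for the new cluster, assuming an older Kafka whose tools still accept the --zookeeper flag (consistent with the host.name-style settings used above); the topic name smoke-test is arbitrary:

```shell
# Create a 3-way replicated topic, publish one message, read it back.
kafka-topics.sh --create --zookeeper zk1:2181,zk2:2182,zk3:2183 \
  --replication-factor 3 --partitions 3 --topic smoke-test
echo "hello kafka" | kafka-console-producer.sh \
  --broker-list devnode:9091,devnode:9092,devnode:9093 --topic smoke-test
kafka-console-consumer.sh --bootstrap-server devnode:9091 \
  --topic smoke-test --from-beginning --max-messages 1
```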

3. Kafka cluster manager

Download cmak-3.0.0.5.zip (Cluster Manager for Apache Kafka, https://github.com/yahoo/CMAK) and unpack it into $CMAK_HOME.

CMAK requires Java 11, so download jdk-11.0.9_linux-x64_bin.tar.gz and unpack it (symlinked as /usr/local/java/java11 -> /usr/local/java/jdk-11.0.9). Then edit cmak-3.0.0.5/conf/application.conf; only the following settings need to change:

# Settings prefixed with 'kafka-manager.' will be deprecated, use 'cmak.' instead.
# https://github.com/yahoo/CMAK/issues/713
kafka-manager.zkhosts="zk1:2181,zk2:2182,zk3:2183"
#kafka-manager.zkhosts=${?ZK_HOSTS}
cmak.zkhosts="zk1:2181,zk2:2182,zk3:2183"
#cmak.zkhosts=${?ZK_HOSTS}

Then run the manager (be sure to point it at the JDK 11 directory):

cd cmak-3.0.0.5/ && rm -f RUNNING_PID

bin/cmak -java-home /usr/local/java/java11 -Dhttp.port=8282

Or in background mode:

bin/cmak -java-home /usr/local/java/java11 -Dhttp.port=8282 > /tmp/cmak-3.log 2>&1 &

Open http://$IP:8282 in a browser to see the Kafka manager page.

