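I recently tried out the Rebalance feature in kudu 1.8 and was impressed. What is far less pleasant is the size of the tree after compiling and installing: an unheard-of, frightening 46 GB. Be prepared for that before you start.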
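Given that footprint, it may be worth checking free disk space before starting. The snippet below is a minimal sketch; the helper names and the 50 GB threshold in the usage comment are my own assumptions (a little headroom over the 46 GB mentioned above), not anything Kudu provides.

```shell
# free_gb PATH -> free space, in whole GiB, on the filesystem holding PATH
free_gb() {
    # -P forces one line per filesystem; -k reports sizes in 1 KiB blocks
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_free_gb PATH MIN_GB -> exit status 0 if at least MIN_GB GiB are free
has_free_gb() {
    [ "$(free_gb "$1")" -ge "$2" ]
}

# Usage (threshold is an assumption):
# has_free_gb /data 50 || echo "not enough free space under /data" >&2
```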
Installation environment
System: CentOS 7.6
Install the system dependencies
yum install autoconf automake cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain flex gcc gcc-c++ gdb git java-1.8.0-openjdk-devel krb5-server krb5-workstation libtool make openssl-devel patch pkgconfig redhat-lsb-core rsync unzip vim-common which -y
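Download the kudu 1.8.0 source (either the official download or GitHub works; note that a plain git clone checks out the default branch rather than the 1.8.0 release)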
cd /data
git clone https://github.com/apache/kudu
cd kudu
build-support/enable_devtoolset.sh
Download the third-party dependencies
mkdir thirdparty/src/
cd thirdparty/src/
wget http://d3dr9sfxru4sde.cloudfront.net/glog-0.3.5.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/googletest-release-1.8.0.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/gflags-2.2.0.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/gperftools-2.6.90.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/protobuf-3.4.1.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/cmake-3.9.0.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/snappy-1.1.4.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/zlib-1.2.8.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/libev-4.20.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/rapidjson-0.11.zip
wget http://d3dr9sfxru4sde.cloudfront.net/squeasel-9335b81317a6451d5a37c5dc7ec088eecbf68c82.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/mustache-87a592e8aa04497764c533acd6e887618ca7b8a8.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/google-styleguide-7a179d1ac2e08a5cc1622bec900d1e0452776713.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/gcovr-3.0.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/curl-7.59.0.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/crcutil-42148a6df6986a257ab21c80f8eca2e54544ac4d.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/libunwind-1.3-rc1.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/python-2.7.13.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/llvm-6.0.0-iwyu-0.9.src.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/lz4-lz4-r130.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/bitshuffle-55f9b4c.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/kudu-trace-viewer-21d76f8350fea2da2aa25cb6fd512703497d0c11.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/nvml-1.1.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/boost_1_61_0.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/breakpad-9eac2058b70615519b2c4d8c6bdbfca1bd079e39.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/sparsehash-c11-47a55825ca3b35eab1ca22b7ab82b9544e32a9af.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/sparsepp-824860bb76893d163efbcff330734b9f62eecb17.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/thrift-0.11.0.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/bison-3.0.4.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/hive-498021fa15186aee8b282d3c032fbd2cede6bec4-stripped.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/hadoop-2.8.5-stripped.tar.gz
wget http://d3dr9sfxru4sde.cloudfront.net/apache-sentry-2c9a927a9e87cba0e4c0f34fc0b55887c6636927-bin.tar.gz
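The long wget list above can also be driven from a list of archive names, which makes it easy to resume after a network failure and to see what is still missing. This is only a sketch: dep_url, missing_deps, and fetch_deps are hypothetical helper names, and the mirror URL is simply the one used in the commands above. The source tree also ships its own thirdparty/download-thirdparty.sh, which is probably the better choice when the build host has direct network access.

```shell
BASE_URL="http://d3dr9sfxru4sde.cloudfront.net"   # mirror used in the commands above

# dep_url NAME -> full download URL for one archive
dep_url() {
    printf '%s/%s\n' "$BASE_URL" "$1"
}

# missing_deps DIR NAME... -> prints the archives that are absent or empty in DIR
missing_deps() {
    dir=$1; shift
    for name in "$@"; do
        [ -s "$dir/$name" ] || printf '%s\n' "$name"
    done
}

# fetch_deps DIR NAME... -> downloads each archive; -c resumes partial files
fetch_deps() {
    dir=$1; shift
    for name in "$@"; do
        wget -c -P "$dir" "$(dep_url "$name")" || return 1
    done
}

# Usage:
# fetch_deps thirdparty/src glog-0.3.5.tar.gz gflags-2.2.0.tar.gz
# missing_deps thirdparty/src glog-0.3.5.tar.gz gflags-2.2.0.tar.gz
```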
Build the dependencies and prepare the kudu build
cd ../../
thirdparty/build-if-necessary.sh # builds and installs all of the dependencies downloaded above (the ./configure stage of the kudu build)
Create the build directory
mkdir build/release -p
cd build/release/
../../build-support/enable_devtoolset.sh
Configure, build, and install kudu
../../thirdparty/installed/common/bin/cmake -DCMAKE_BUILD_TYPE=release ../..
make -j4
make DESTDIR=/data/kudu/build/release/kudu install
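The -j4 above is a fixed value; on other machines the job count could be derived from the CPU count and available memory. The sketch below assumes roughly 2 GB of RAM per compiler job, which is my own rule of thumb rather than a figure from the Kudu documentation, and pick_jobs is a hypothetical helper name.

```shell
# pick_jobs CORES MEM_GB -> number of parallel make jobs
# Assumption: budget about 2 GB of RAM per job, and never go below 1 job.
pick_jobs() {
    n=$1
    by_mem=$(( $2 / 2 ))
    if [ "$by_mem" -lt "$n" ]; then n=$by_mem; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    echo "$n"
}

# Usage:
# make -j"$(pick_jobs "$(nproc)" "$(free -g | awk '/^Mem:/ { print $2 }')")"
```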
Symlink the installed headers, libraries, and shared files
ln -s /data/kudu/build/release/kudu/usr/local/include/* /usr/local/include/
ln -s /data/kudu/build/release/kudu/usr/local/lib64/* /usr/local/lib64/
ln -s /data/kudu/build/release/kudu/usr/local/share/* /usr/local/share/
Create the configuration files
======== MASTER ========
mkdir conf
cd conf
cat >master.gflagfile<<EOF
## Comma-separated list of the RPC addresses belonging to all Masters in this cluster.
## NOTE: if not specified, configures a non-replicated Master.
--master_addresses=kudu1:7051,kudu2:7051,kudu3:7051
--rpc_bind_addresses=kudu1:7051
--log_dir=/data/kudu_data/master/logs
--log_filename=kudu1
--fs_wal_dir=/data/kudu_data/master/wal
--fs_data_dirs=/data/kudu_data/master/data
--enable_process_lifetime_heap_profiling=true
--heap_profile_path=/data/kudu_data/master/heap
--rpc_encryption=disabled
--rpc_authentication=disabled
#--unlock_unsafe_flags=true
#--allow_unsafe_replication_factor=true
#--max_log_size=1800
--max_log_size=2048
#--memory_limit_hard_bytes=0
--memory_limit_hard_bytes=1073741824
--default_num_replicas=3
--max_clock_sync_error_usec=10000000
--consensus_rpc_timeout_ms=30000
--follower_unavailable_considered_failed_sec=300
--leader_failure_max_missed_heartbeat_periods=3
#--block_manager_max_open_files=10240
#--server_thread_pool_max_thread_count=-1
--tserver_unresponsive_timeout_ms=60000
--rpc_num_service_threads=10
--max_negotiation_threads=50
--min_negotiation_threads=10
--rpc_negotiation_timeout_ms=3000
--rpc_default_keepalive_time_ms=65000
#--rpc_num_acceptors_per_address=1
--rpc_num_acceptors_per_address=5
#--master_ts_rpc_timeout_ms=30000
--master_ts_rpc_timeout_ms=60000
#--remember_clients_ttl_ms=60000
--remember_clients_ttl_ms=3600000
#--remember_responses_ttl_ms=60000
--remember_responses_ttl_ms=600000
#--rpc_service_queue_length=50
--rpc_service_queue_length=1000
#--raft_heartbeat_interval_ms=500
--raft_heartbeat_interval_ms=60000
#--heartbeat_interval_ms=1000
--heartbeat_interval_ms=60000
--heartbeat_max_failures_before_backoff=3
## You can avoid the dependency on ntpd by running Kudu with --use-hybrid-clock=false
## This is not recommended for production environment.
## NOTE: If you run without hybrid time the tablet history GC will not work.
## Therefore when you delete or update a row the history of that data will be kept
## forever. Eventually you may run out of disk space.
--use_hybrid_clock=false
--webserver_enabled=true
--metrics_log_interval_ms=60000
--webserver_port=8051
#--webserver_doc_root=/data/kudu/www
EOF
======== TSERVER =========
cat >tserver.gflagfile<<EOF
## Comma-separated list of the RPC addresses belonging to all Masters in this cluster.
## NOTE: if not specified, configures a non-replicated Master.
--tserver_master_addrs=kudu1:7051,kudu2:7051,kudu3:7051
--rpc_bind_addresses=kudu:7050
--log_dir=/data/kudu_data/tserver/logs
--log_filename=kudu1
--fs_wal_dir=/data/kudu_data/tserver/wal
--fs_data_dirs=/data/kudu_data/tserver/data
--enable_process_lifetime_heap_profiling=true
--heap_profile_path=/data/kudu_data/tserver/heap
--rpc_encryption=disabled
--rpc_authentication=disabled
#--unlock_unsafe_flags=true
#--allow_unsafe_replication_factor=true
#--max_log_size=1800
--max_log_size=2048
#--memory_limit_hard_bytes=0
--memory_limit_hard_bytes=1073741824
--default_num_replicas=3
--max_clock_sync_error_usec=10000000
--consensus_rpc_timeout_ms=30000
--follower_unavailable_considered_failed_sec=300
--leader_failure_max_missed_heartbeat_periods=3
#--block_manager_max_open_files=10240
#--server_thread_pool_max_thread_count=-1
--tserver_unresponsive_timeout_ms=60000
--rpc_num_service_threads=10
--max_negotiation_threads=50
--min_negotiation_threads=10
--rpc_negotiation_timeout_ms=3000
--rpc_default_keepalive_time_ms=65000
#--rpc_num_acceptors_per_address=1
--rpc_num_acceptors_per_address=5
#--master_ts_rpc_timeout_ms=30000
--master_ts_rpc_timeout_ms=60000
#--remember_clients_ttl_ms=60000
--remember_clients_ttl_ms=3600000
#--remember_responses_ttl_ms=60000
--remember_responses_ttl_ms=600000
#--rpc_service_queue_length=50
--rpc_service_queue_length=1000
#--raft_heartbeat_interval_ms=500
--raft_heartbeat_interval_ms=60000
#--heartbeat_interval_ms=1000
--heartbeat_interval_ms=60000
--heartbeat_max_failures_before_backoff=3
## You can avoid the dependency on ntpd by running Kudu with --use-hybrid-clock=false
## This is not recommended for production environment.
## NOTE: If you run without hybrid time the tablet history GC will not work.
## Therefore when you delete or update a row the history of that data will be kept
## forever. Eventually you may run out of disk space.
--use_hybrid_clock=false
--webserver_enabled=true
--metrics_log_interval_ms=60000
--webserver_port=8050
#--webserver_doc_root=/data/kudu/www
EOF
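Because both flag files mix commented-out defaults with the active overrides, it can help to read back the value that is actually in effect, for example when creating the data directories later on. The helper below is a sketch only: flag_value is a hypothetical name, and it assumes that when a flag appears more than once the last active line is the one that counts.

```shell
# flag_value FILE NAME -> value of the last active "--NAME=value" line in FILE
flag_value() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                      # skip commented-out flags
        {
            line = $0
            sub(/^[ \t]*--/, "", line)           # drop the leading dashes
            eq = index(line, "=")
            if (eq == 0) next                    # not a flag=value line
            key = substr(line, 1, eq - 1)
            gsub(/-/, "_", key)                  # normalise dashes to underscores
            if (key == name) value = substr(line, eq + 1)
        }
        END { if (value != "") print value }
    ' "$1"
}

# Usage:
# flag_value tserver.gflagfile fs_wal_dir
# flag_value master.gflagfile max_log_size
```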
Configure the systemd units
========= MASTER =========
cat >/usr/lib/systemd/system/kudu-master.service<<EOF
[Unit]
Description=Apache Kudu Master Server
Documentation=http://kudu.apache.org
[Service]
Environment=KUDU_HOME=/data/kudu
ExecStart=/data/kudu/build/release/bin/kudu-master --flagfile=/data/kudu/build/release/conf/master.gflagfile
TimeoutStopSec=5
Restart=on-failure
User=kudu
#LimitNOFILE=65535
#LimitNPROC=10240
[Install]
WantedBy=multi-user.target
EOF
========= TSERVER =========
cat >/usr/lib/systemd/system/kudu-tserver.service<<EOF
[Unit]
Description=Apache Kudu Tablet Server
Documentation=http://kudu.apache.org
[Service]
Environment=KUDU_HOME=/data/kudu
ExecStart=/data/kudu/build/release/bin/kudu-tserver --flagfile=/data/kudu/build/release/conf/tserver.gflagfile
TimeoutStopSec=5
Restart=on-failure
User=kudu
#LimitNOFILE=65535
#LimitNPROC=10240
[Install]
WantedBy=multi-user.target
EOF
Create the service user
useradd kudu
Create the data directories (matching the paths in the configuration files)
mkdir -p /data/kudu_data/{master,tserver}/{data,wal,logs,heap}
chown -R kudu:kudu /data/kudu_data/
chown -R kudu:kudu /data/kudu/
cd /data
Configure the environment
cat >>/etc/profile<<'EOF'
export PATH=${PATH}:/data/kudu/build/release/bin
EOF
source /etc/profile
Start the services
systemctl daemon-reload
systemctl start kudu-master.service
systemctl start kudu-tserver.service
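systemctl start returns as soon as the process has been launched, so a quick way to confirm that a server is really accepting connections is to poll its RPC or web port (7051/8051 for the master and 7050/8050 for the tserver in the flag files above). The sketch below relies on bash's /dev/tcp pseudo-device and the coreutils timeout command; port_open and wait_for_port are hypothetical helper names.

```shell
# port_open HOST PORT -> exit status 0 if a TCP connection succeeds within 2 seconds
port_open() {
    # /dev/tcp is a bash feature, so bash is invoked explicitly here
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# wait_for_port HOST PORT TRIES -> retries once per second, up to TRIES attempts
wait_for_port() {
    tries=$3
    while [ "$tries" -gt 0 ]; do
        port_open "$1" "$2" && return 0
        tries=$(( tries - 1 ))
        sleep 1
    done
    return 1
}

# Usage:
# wait_for_port kudu1 7051 30 || systemctl status kudu-master.service
```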
Create a test table
# Create the table with the following console program
# kudu-shell-1.0-SNAPSHOT.jar
cat >kudu-shell.sh<<EOF
#!/usr/bin/env bash
java -cp ./kudu-shell-1.0-SNAPSHOT.jar org.laowang.kudushell.Main -s kudu:7050,kudu:7051,kudu:7052
EOF
chown kudu:kudu kudu-shell-1.0-SNAPSHOT.jar kudu-shell.sh
su - kudu
sh kudu-shell.sh
CREATE TABLE statement
CREATE TABLE test(
id string,
partition_month string,
contract_no string,
customer_name string,
product_root_name string,
product_category_name string,
business_mode_name string,
fee_flag string,
transaction_date string,
serial_no string,
client_date string,
bank_name string,
bank_serial_no string,
business_type string,
identity_no string,
pay_date string,
repayment_amount string,
indeed_pre_fee string,
indeed_amount string,
pre_fee_penalty string,
status string,
trust_company_name string,
trust_plan_name string,
clear_date string,
branch_name string,
before_clear string,
after_clear string,
data_month string,
custody_flag string,
print_contime string,
cash_subject string,
original_bank_serial string,
voucher_no string,
lease_way string,
remark string,
primary key(id, partition_month)
)
PARTITION BY HASH(partition_month) PARTITIONS 4;
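With PARTITIONS 4 each such table is split into 4 tablets, and with --default_num_replicas=3 from the flag files every tablet is stored 3 times. The arithmetic is simple enough to sketch in shell (the function names are hypothetical): nine tables like this one give 36 tablets and 108 replicas, which matches the totals in the ksck output further down.

```shell
# tablet_count TABLES PARTITIONS -> total tablets across all tables
tablet_count() { echo $(( $1 * $2 )); }

# replica_count TABLES PARTITIONS RF -> total tablet replicas in the cluster
replica_count() { echo $(( $1 * $2 * $3 )); }

# Usage:
# tablet_count 9 4      # 9 tables, 4 hash partitions each
# replica_count 9 4 3   # same tables with replication factor 3
```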
Rebalance test
To test rebalancing, start with 3 tserver nodes and create several tables; then add 1 or 2 more tserver nodes and run the following rebalance command to even out the data:
kudu cluster rebalance kudu:7050,kudu:7051,kudu:7052
Run the following command to rebalance only the specified tables:
kudu cluster rebalance --tables test,test1 kudu:7050,kudu:7051,kudu:7052
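Every kudu CLI call in this post repeats the same comma-separated master list, so it may be convenient to build it once and reuse it. A small sketch, assuming (as in this test setup) that all masters run on one host and differ only by port; join_masters is a hypothetical helper name.

```shell
# join_masters HOST PORT... -> "HOST:PORT,HOST:PORT,..." for the kudu CLI
join_masters() {
    host=$1; shift
    out=""
    for port in "$@"; do
        out="${out:+$out,}$host:$port"   # prepend a comma only after the first entry
    done
    echo "$out"
}

# Usage:
# MASTERS=$(join_masters kudu 7050 7051 7052)
# kudu cluster rebalance "$MASTERS"
# kudu cluster rebalance --tables test,test1 "$MASTERS"
```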
Check the kudu cluster status
[kudu@localhost ~]$ kudu cluster ksck kudu:7050,kudu:7051,kudu:7052
Master Summary
               UUID               |  Address  | Status
----------------------------------+-----------+---------
 5378708b53dc49cf9d8c0dd20e8a14f0 | kudu:7050 | HEALTHY
 791d7511e2384e2a9f530f343f7c14f2 | kudu:7052 | HEALTHY
 b5f447fde73a4426939152e5c9c5ea07 | kudu:7051 | HEALTHY

       Flag       | Value |  Tags  |         Master
------------------+-------+--------+-------------------------
 use_hybrid_clock | false | hidden | all 3 server(s) checked

Tablet Server Summary
               UUID               |      Address       | Status
----------------------------------+--------------------+---------
 7bd88f7cbd8947f5a4c440874240a026 | 10.143.252.21:7056 | HEALTHY
 9d93d1b805834e899a4535d285c8372d | 10.143.252.21:7053 | HEALTHY
 b9d7e065ad8347a983bcfb4e5c058c44 | 10.143.252.21:7055 | HEALTHY
 ce477c4f2cbd423c898078d34216b966 | 10.143.252.21:7054 | HEALTHY
 e67d662ff0da4b6d97741b7a8ec67682 | 10.143.252.21:7057 | HEALTHY

       Flag       | Value |  Tags  |      Tablet Server
------------------+-------+--------+-------------------------
 use_hybrid_clock | false | hidden | all 5 server(s) checked

Version Summary
 Version |         Servers
---------+-------------------------
 1.8.0   | all 8 server(s) checked

Summary by table
 Name  | RF | Status  | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable
-------+----+---------+---------------+---------+------------+------------------+-------------
 test  | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test1 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test2 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test3 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test4 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test5 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test6 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test7 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0
 test8 | 3  | HEALTHY | 4             | 4       | 0          | 0                | 0

                | Total Count
----------------+-------------
 Masters        | 3
 Tablet Servers | 5
 Tables         | 9
 Tablets        | 36
 Replicas       | 108

==================
Warnings:
==================
Some masters have unsafe, experimental, or hidden flags set
Some tablet servers have unsafe, experimental, or hidden flags set

OK
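With more than a handful of tables it is easier to filter the "Summary by table" section than to read it by eye. The awk filter below is a sketch that assumes the eight-column layout shown in the output above; unhealthy_tables is a hypothetical helper name, and it prints nothing when every table is HEALTHY.

```shell
# unhealthy_tables < KSCK_OUTPUT -> names of tables whose Status column is not HEALTHY
unhealthy_tables() {
    awk -F'|' '
        NF == 8 && $2 ~ /^ *[0-9]+ *$/ {     # 8 columns with a numeric RF: a table row
            name = $1; status = $3
            gsub(/^ +| +$/, "", name)        # trim the column padding
            gsub(/^ +| +$/, "", status)
            if (status != "HEALTHY") print name
        }
    '
}

# Usage:
# kudu cluster ksck kudu:7050,kudu:7051,kudu:7052 | unhealthy_tables
```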
List the master nodes
[kudu@localhost ~]$ kudu master list kudu:7050,kudu:7051,kudu:7052
               uuid               |   rpc-addresses
----------------------------------+--------------------
 5378708b53dc49cf9d8c0dd20e8a14f0 | 10.143.252.21:7050
 b5f447fde73a4426939152e5c9c5ea07 | 10.143.252.21:7051
 791d7511e2384e2a9f530f343f7c14f2 | 10.143.252.21:7052
List the tserver nodes
[kudu@localhost ~]$ kudu tserver list kudu:7050,kudu:7051,kudu:7052
               uuid               |   rpc-addresses
----------------------------------+--------------------
 e67d662ff0da4b6d97741b7a8ec67682 | 10.143.252.21:7057
 7bd88f7cbd8947f5a4c440874240a026 | 10.143.252.21:7056
 b9d7e065ad8347a983bcfb4e5c058c44 | 10.143.252.21:7055
 ce477c4f2cbd423c898078d34216b966 | 10.143.252.21:7054
 9d93d1b805834e899a4535d285c8372d | 10.143.252.21:7053
List the tables
[kudu@localhost ~]$ kudu table list kudu:7050,kudu:7051,kudu:7052
test1
test5
test6
test2
test
test3
test4
test8
test7
Show how a table's tablets are distributed
[kudu@localhost ~]$ kudu table list -tables test -list-tablets kudu:7050,kudu:7051,kudu:7052
test
  T 9cb2ec3477134d0396e33ab7acbb3545
    L b9d7e065ad8347a983bcfb4e5c058c44 10.143.252.21:7055
    V 9d93d1b805834e899a4535d285c8372d 10.143.252.21:7053
    V 7bd88f7cbd8947f5a4c440874240a026 10.143.252.21:7056
  T 4f24e4a009914ef7ba8d4352c12aaf63
    L 9d93d1b805834e899a4535d285c8372d 10.143.252.21:7053
    V e67d662ff0da4b6d97741b7a8ec67682 10.143.252.21:7057
    V 7bd88f7cbd8947f5a4c440874240a026 10.143.252.21:7056
  T 66d9ede2ba774f0787f7e07bc79223cf
    L b9d7e065ad8347a983bcfb4e5c058c44 10.143.252.21:7055
    V ce477c4f2cbd423c898078d34216b966 10.143.252.21:7054
    V 7bd88f7cbd8947f5a4c440874240a026 10.143.252.21:7056
  T a6d7bbb5e6ad45208dc794d7352eb1e4
    V 9d93d1b805834e899a4535d285c8372d 10.143.252.21:7053
    V ce477c4f2cbd423c898078d34216b966 10.143.252.21:7054
    L e67d662ff0da4b6d97741b7a8ec67682 10.143.252.21:7057
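To judge whether a rebalance actually evened things out, the listing above can be reduced to a replica count per tablet server. The filter below is a sketch that assumes the L/V line format shown in the listing; replicas_per_tserver is a hypothetical helper name. For the test table above it gives 3 replicas each on ports 7053 and 7056 and 2 each on 7054, 7055, and 7057.

```shell
# replicas_per_tserver < LISTING -> "COUNT ADDRESS" per tablet server, busiest first
replicas_per_tserver() {
    awk '
        ($1 == "L" || $1 == "V") && NF == 3 { count[$3]++ }   # one line per replica
        END { for (addr in count) print count[addr], addr }
    ' | sort -k1,1nr -k2,2
}

# Usage:
# kudu table list -tables test -list-tablets kudu:7050,kudu:7051,kudu:7052 | replicas_per_tserver
```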
Those are the common basic operations; for more, run kudu --help and explore.