1. Installation
2. Shell operations
3. Operating HBase from Python
1) Local operations
a. Create a table
b. Write data
c. Read data
2) Cluster operations
4. Operating HBase from Java (combined with Storm for real-time recommendation)
1) Local operations
1. Installation
1) Edit ~/.bashrc
vim ~/.bashrc
export HBASE_HOME=/usr/local/src/hbase-0.98.6-hadoop2
2) Edit conf/hbase-env.sh
vim conf/hbase-env.sh
export JAVA_HOME=/usr/local/src/jdk1.8.0_172
export HBASE_MANAGES_ZK=false
export HBASE_MANAGES_ZK=false: use an external ZooKeeper; the ZK cluster must already be installed and its processes running (an external ZK is recommended)
export HBASE_MANAGES_ZK=true: use the ZooKeeper bundled with HBase
3) Edit conf/hbase-site.xml
vim conf/hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
**** HBase creates this HDFS directory to store its data
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
**** false means standalone mode (HBase and ZK run in a single JVM process); true means distributed mode
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave1,slave2</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
4) Edit conf/regionservers (its contents are the slave node hostnames)
vim conf/regionservers
slave1
slave2
5) Distribute the configured HBase directory to the other nodes (scp)
Hands-on:
1. Start ZooKeeper
./bin/zkServer.sh start
2. Start HBase
./bin/start-hbase.sh
After HBase starts, an hbase directory is created on HDFS:
[root@master badou]# hadoop fs -ls /hbase
Found 7 items
drwxr-xr-x - root supergroup 0 2019-05-27 09:50 /hbase/.tmp
drwxr-xr-x - root supergroup 0 2019-05-27 09:50 /hbase/MasterProcWALs
drwxr-xr-x - root supergroup 0 2019-05-27 09:50 /hbase/WALs
drwxr-xr-x - root supergroup 0 2019-05-27 09:50 /hbase/data
-rw-r--r-- 2 root supergroup 42 2019-05-27 09:50 /hbase/hbase.id
-rw-r--r-- 2 root supergroup 7 2019-05-27 09:50 /hbase/hbase.version
drwxr-xr-x - root supergroup 0 2019-05-27 09:50 /hbase/oldWALs
Verify that startup succeeded:
1) jps: check that the HMaster and HRegionServer processes are running
2) Web UI: http://master:60010/master-status
3) Command line: run ./bin/hbase shell, then execute status inside the shell to check node status
[root@master hbase-1.3.2.1]# ./bin/hbase shell
...
hbase(main):001:0> status
1 active master, 0 backup masters, 2 servers, 0 dead, 1.0000 average load
Create a table:
hbase(main):009:0> create 'm_table', 'meta_data', 'action'
0 row(s) in 2.3010 seconds
=> Hbase::Table - m_table
Tip: newly created tables can be found under the corresponding ZooKeeper znode:
[root@master zookeeper-3.4.5]# ./bin/zkCli.sh
...
[zk: localhost:2181(CONNECTED) 0] ls /hbase/table
[m_table]
Delete a znode in ZooKeeper:
[zk: localhost:2181(CONNECTED) 10] rmr /hbase/table/m_table
Create a znode in ZooKeeper (a data argument must follow the path):
[zk: localhost:2181(CONNECTED) 15] create /hbase/table 1
Created /hbase/table
[zk: localhost:2181(CONNECTED) 16] ls /hbase/table
[]
Describe the table (table: m_table, CFs: action and meta_data):
hbase(main):010:0> desc 'm_table'
Table m_table is ENABLED
m_table
COLUMN FAMILIES DESCRIPTION
{NAME => 'action', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODI
NG => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICAT
ION_SCOPE => '0'}
{NAME => 'meta_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENC
ODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLI
CATION_SCOPE => '0'}
2 row(s) in 0.1900 seconds
Write data:
put 'm_table', '1001', 'meta_data:name', 'zhang3'
put 'm_table', '1002', 'meta_data:name', 'li4'
put 'm_table', '1001', 'meta_data:age', '18'
put 'm_table', '1002', 'meta_data:gender', 'man'
Read data:
Scan (batch read); note: this transcript was captured after later steps in this walkthrough, which is why it already shows value=girl:
hbase(main):007:0> scan 'm_table'
ROW COLUMN+CELL
1001 column=meta_data:age, timestamp=1558924454685, value=18
1001 column=meta_data:name, timestamp=1558924430160, value=zhang3
1002 column=action:click, timestamp=1558924551671, value=man
1002 column=meta_data:gender, timestamp=1558925567303, value=girl
1002 column=meta_data:name, timestamp=1558924445998, value=li4
2 row(s) in 0.1160 seconds
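The behavior of put and scan can be pictured as a sorted map from (rowkey, "family:qualifier") to value, with scan returning rows in lexicographic rowkey order. A minimal Python model of that layout (illustrative only, not the HBase API):

```python
# Minimal model of HBase's logical layout: a map keyed by rowkey,
# each row holding "family:qualifier" -> value. Illustrative only.
from collections import defaultdict

table = defaultdict(dict)  # rowkey -> {"family:qualifier": value}

def put(row, column, value):
    table[row][column] = value

def scan():
    # HBase returns cells sorted by rowkey, then by column
    for row in sorted(table):
        for column in sorted(table[row]):
            yield row, column, table[row][column]

put('1001', 'meta_data:name', 'zhang3')
put('1002', 'meta_data:name', 'li4')
put('1001', 'meta_data:age', '18')
put('1002', 'meta_data:gender', 'man')

for row, column, value in scan():
    print(row, column, value)
```

This also explains why `count` reports 2 rows while scan prints more lines: a "row" is one rowkey, and each rowkey can carry many columns.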
Read a single row:
hbase(main):001:0> get 'm_table', '1001'
COLUMN CELL
meta_data:age timestamp=1558924454685, value=18
meta_data:name timestamp=1558924430160, value=zhang3
1 row(s) in 0.4520 seconds
hbase(main):002:0> get 'm_table', '1001', 'meta_data:name'
COLUMN CELL
meta_data:name timestamp=1558924430160, value=zhang3
1 row(s) in 0.1050 seconds
Update a cell and view historical versions: put 'm_table', '1002', 'meta_data:gender', 'girl', then fetch the old value back via its timestamp (the old version is typically only readable until a flush/compaction prunes versions beyond the CF's VERSIONS limit):
hbase(main):003:0> get 'm_table' , '1002' , 'meta_data:gender'
COLUMN CELL
meta_data:gender timestamp=1558924464771, value=man
1 row(s) in 0.0720 seconds
hbase(main):004:0> put 'm_table', '1002', 'meta_data:gender', 'girl'
0 row(s) in 0.1750 seconds
hbase(main):005:0> get 'm_table' , '1002' , 'meta_data:gender'
COLUMN CELL
meta_data:gender timestamp=1558925567303, value=girl
1 row(s) in 0.0240 seconds
hbase(main):006:0> get 'm_table', '1002', {COLUMN=>'meta_data:gender', TIMESTAMP=>1558924464771}
COLUMN CELL
meta_data:gender timestamp=1558924464771, value=man
1 row(s) in 0.0250 seconds
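The timestamp lookup above works because each cell keeps timestamped versions. A rough Python model of that behavior (illustrative, not the HBase API):

```python
# Model of HBase cell versioning: each (row, column) cell stores a list
# of (timestamp, value) versions. Illustrative only.
from collections import defaultdict

cells = defaultdict(list)  # (row, column) -> [(timestamp, value), ...]

def put(row, column, value, ts):
    cells[(row, column)].append((ts, value))

def get(row, column, timestamp=None):
    versions = cells[(row, column)]
    if timestamp is None:
        return max(versions)         # newest version wins
    for ts, value in versions:
        if ts == timestamp:          # exact-timestamp lookup
            return ts, value
    return None

put('1002', 'meta_data:gender', 'man',  ts=1558924464771)
put('1002', 'meta_data:gender', 'girl', ts=1558925567303)
print(get('1002', 'meta_data:gender'))                 # newest
print(get('1002', 'meta_data:gender', 1558924464771))  # older version
```

In real HBase the number of retained versions is the column family's VERSIONS setting; with VERSIONS => '1' as in m_table, the older cell here survives only until a flush/compaction removes it.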
Count the rows (number of rowkeys):
hbase(main):008:0> count 'm_table'
2 row(s) in 0.0940 seconds
=> 2
Flush the table's MemStore out to HDFS:
hbase(main):009:0> flush 'm_table'
0 row(s) in 0.5980 seconds
HDFS path: hadoop fs -ls /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65
df524dd003feea204ed8778af07a9a65 is the region's encoded name (region id):
[root@master badou]# hadoop fs -ls /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65
Found 4 items
-rw-r--r-- 2 root supergroup 42 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/.regioninfo
drwxr-xr-x - root supergroup 0 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/action
drwxr-xr-x - root supergroup 0 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/meta_data
drwxr-xr-x - root supergroup 0 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/recovered.edits
Truncate (empty) the table:
hbase(main):010:0> truncate 'm_table'
Truncating 'm_table' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 3.8750 seconds
hbase(main):012:0> get 'm_table', '1001', 'meta_data:name'
COLUMN CELL
0 row(s) in 0.1740 seconds
Add a column family (CF) cf_new:
hbase(main):011:0> alter 'm_table', {NAME=>'cf_new', VERSIONS=>3, IN_MEMORY=>true}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.2850 seconds
Describe the table again: there is now an additional CF, cf_new
hbase(main):012:0> desc 'm_table'
Table m_table is ENABLED
m_table
COLUMN FAMILIES DESCRIPTION
{NAME => 'action', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODI
NG => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICAT
ION_SCOPE => '0'}
{NAME => 'cf_new', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'true', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODIN
G => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATI
ON_SCOPE => '0'}
{NAME => 'meta_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENC
ODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLI
CATION_SCOPE => '0'}
3 row(s) in 0.0460 seconds
Delete the action column family; describing the table again shows that action is gone
hbase(main):013:0> alter 'm_table', {NAME=>'action', METHOD=>'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.5760 seconds
hbase(main):014:0> desc "m_table"
Table m_table is ENABLED
m_table
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf_new', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'true', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODIN
G => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATI
ON_SCOPE => '0'}
{NAME => 'meta_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENC
ODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLI
CATION_SCOPE => '0'}
2 row(s) in 0.0820 seconds
Drop table m_table: disable it first, then drop:
hbase(main):015:0> disable 'm_table'
0 row(s) in 2.4790 seconds
hbase(main):016:0> drop 'm_table'
0 row(s) in 1.3740 seconds
hbase(main):017:0> list
TABLE
0 row(s) in 0.0350 seconds
=> []
2. Shell operations (the create/put/get/scan walkthrough above)
3. Operating HBase from Python
Python must go through the thrift service to operate HBase, so its performance is worse than operating HBase from Java
1) Local operations
Generating the hbase Python module
Download the thrift source package thrift-0.8.0.tar.gz, extract it to get the thrift-0.8.0 directory, enter it, and run the following commands to install:
make compiles; make install copies the compiled files into the system
[root@master thrift-0.8.0]# ./configure
[root@master thrift-0.8.0]# make
[root@master thrift-0.8.0]# make install
After installation, locate thrift:
[root@master thrift-0.8.0]# which thrift
/usr/local/bin/thrift
The thrift command is now available, but we still need thrift's Python module; look for it in the source tree:
[root@master thrift-0.8.0]# ls lib/py/build/lib.linux-x86_64-2.7/
thrift
That thrift directory is thrift's Python module; copy it into the same directory as create_table.py
1. Download the HBase source package hbase-0.98.24-src.tar.gz, extract it, and enter the directory below:
[root@master hbase_test]# tar xvzf hbase-0.98.24-src.tar.gz
[root@master hbase_test]# cd ../hbase-0.98.24/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift
[root@master thrift]# ls
Hbase.thrift
Run the following command to generate the Hbase module (the hbase directory under gen-py is the module):
[root@master thrift]# thrift --gen py Hbase.thrift
[root@master thrift]# ls
gen-py Hbase.thrift
[root@master thrift]# ls gen-py/
hbase __init__.py
Then move the hbase module into the same directory as create_table.py.
Start the thrift service:
[root@master hbase-1.3.2.1]# ./bin/hbase-daemon.sh start thrift
starting thrift, logging to /usr/local/src/hbase-1.3.2.1/logs/hbase-badou-thrift-master.out
(py27tf) [root@master hbase-1.3.2.1]# netstat -antup | grep 9090
tcp6 0 0 :::9090 :::* LISTEN 13626/java
1. Create a table
[root@master pyhon_hbase]# ls
create_table.py hbase hbase.tgz thrift
[root@master pyhon_hbase]# python create_table.py
['m_table', 'new_music_table']
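create_table.py itself is not reproduced in these notes; a sketch of what such a script typically looks like against the thrift1 API (host, port, and the RUN_AGAINST_CLUSTER switch are assumptions; it needs the generated hbase module next to it and a running thrift server):

```python
# Sketch of a create_table.py for the HBase thrift1 API. Hypothetical:
# only run the networked part where the generated "hbase" module exists
# and a thrift server listens on master:9090.

RUN_AGAINST_CLUSTER = False  # flip to True only on the cluster

def column_families(names):
    """Pure helper: thrift ColumnDescriptor names conventionally end with ':'."""
    return [n if n.endswith(':') else n + ':' for n in names]

if RUN_AGAINST_CLUSTER:
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import ColumnDescriptor

    transport = TTransport.TBufferedTransport(TSocket.TSocket('master', 9090))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()
    client.createTable('m_table',
                       [ColumnDescriptor(name=cf)
                        for cf in column_families(['meta_data', 'action'])])
    print(client.getTableNames())
    transport.close()
```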
2. Insert data
[root@master pyhon_hbase]# ls
create_table.py hbase insert_data.py thrift
(py27tf) [root@master pyhon_hbase]# python insert_data.py
To find what parameters each client function takes, look in Hbase.py inside the generated hbase module
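An insert_data.py of this kind usually boils down to one mutateRow call per rowkey. A hedged sketch (the helper and the RUN_AGAINST_CLUSTER switch are assumptions; it needs the generated hbase module and a running thrift server):

```python
# Sketch of an insert_data.py for the HBase thrift1 API. Hypothetical:
# the networked part only runs where the generated "hbase" module exists
# and a thrift server listens on master:9090.

RUN_AGAINST_CLUSTER = False  # flip to True only on the cluster

def to_mutation_args(record):
    """Pure helper: flatten {'cf:qualifier': value} into sorted (column, value) pairs."""
    return sorted(record.items())

if RUN_AGAINST_CLUSTER:
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase
    from hbase.ttypes import Mutation

    transport = TTransport.TBufferedTransport(TSocket.TSocket('master', 9090))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()

    record = {'meta-data:name': 'wangqingshui', 'flags:is_valid': 'TRUE'}
    mutations = [Mutation(column=c, value=v) for c, v in to_mutation_args(record)]
    client.mutateRow('new_music_table', '1100', mutations, {})
    transport.close()
```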
3. Read data
Read a single row:
[root@master pyhon_hbase]# python get_one_line.py
the row is 1100
the name is wangqingshui
the flag is TRUE
Read multiple rows (scan):
[root@master pyhon_hbase]# python scan_many_lines.py
======
the row is 1100
flags:is_valid TRUE
meta-data:name wangqingshui
meta-data:tag pop
2) Cluster operations
Run bash run.sh in the batch_insert directory and watch the HBase shell at the same time:
hbase(main):026:0> scan "new_music_table"
ROW COLUMN+CELL
00000cb9989b2238d6b6e2846e2f9e34 column=flags:is_valid, timestamp=1558940215081, value=TRUE
00000cb9989b2238d6b6e2846e2f9e34 column=meta-data:name, timestamp=1558940215081, value=00000cb9989b2238d6b6e2846e2f9e34
00001e0296367e9a2650dca709972e3f column=flags:is_valid, timestamp=1558940215133, value=TRUE
00001e0296367e9a2650dca709972e3f column=meta-data:name, timestamp=1558940215133, value=00001e0296367e9a2650dca709972e3f
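The rowkeys in this scan look like MD5 digests (and meta-data:name stores the same hex string). Hashing the natural key is a common way to spread writes evenly across regions and avoid hot-spotting on sequential rowkeys; assuming that is what the batch script does, the pattern looks like:

```python
import hashlib

def make_rowkey(name):
    # Hash the natural key so rowkeys distribute uniformly across regions,
    # instead of piling sequential keys onto one region server.
    return hashlib.md5(name.encode('utf-8')).hexdigest()

print(make_rowkey('some_song_name'))  # 32-character hex rowkey
```

The trade-off: hashed rowkeys balance load, but you lose the ability to do meaningful range scans over the original key order.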
4. Operating HBase from Java (combined with Storm for real-time recommendation)
1. Create the table:
hbase(main):027:0> create 'user_action_table' , 'action_log'
0 row(s) in 2.5800 seconds
=> Hbase::Table - user_action_table
2. Java code for writing, deleting, and reading data:
HbaseDelOneRecord.java
HbaseGetOneRecord.java
HbasePutOneRecord.java
HbaseScanManyRecords.java