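1. Installation

2. Shell operations

3. Operating HBase from Python

1) Local operations

      a. Create a table

      b. Write data

      c. Read data

2) Cluster operations

4. Operating HBase from Java (combined with Storm for real-time recommendation)

1) Local operations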

1. Installation

1) Edit ~/.bashrc

         

vim ~/.bashrc

export HBASE_HOME=/usr/local/src/hbase-0.98.6-hadoop2

2) Edit conf/hbase-env.sh

vim conf/hbase-env.sh
   
export JAVA_HOME=/usr/local/src/jdk1.8.0_172
export HBASE_MANAGES_ZK=false

HBASE_MANAGES_ZK=false: use an external ZooKeeper ensemble, which must already be installed and running on the cluster (recommended).

HBASE_MANAGES_ZK=true: let HBase start and stop its own bundled ZooKeeper.

3) Edit conf/hbase-site.xml

vim conf/hbase-site.xml

<configuration>
     <property>
         <name>hbase.rootdir</name>
         <value>hdfs://master:9000/hbase</value>
     </property>
     <!-- HBase creates this HDFS directory to store its data -->

     <property>
         <name>hbase.cluster.distributed</name>
         <value>true</value>
     </property>
     <!-- false: standalone mode (HBase and ZooKeeper run in one process); true: distributed mode -->

     <property>
         <name>hbase.zookeeper.quorum</name>
         <value>master,slave1,slave2</value>
     </property>
     <property>
         <name>dfs.replication</name>
         <value>2</value>
     </property>
 </configuration>
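A typo in hbase-site.xml usually only shows up when the daemons fail to start, so it can help to sanity-check the file first. The following is a minimal sketch, not part of HBase: load_properties and check_site are hypothetical helper names, and the embedded sample simply mirrors the four properties above (in practice you would read conf/hbase-site.xml instead).

```python
import xml.etree.ElementTree as ET

# Sample mirroring the settings above; in practice read conf/hbase-site.xml.
SAMPLE_SITE_XML = """
<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://master:9000/hbase</value></property>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.zookeeper.quorum</name><value>master,slave1,slave2</value></property>
  <property><name>dfs.replication</name><value>2</value></property>
</configuration>
"""


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_site(props):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not props.get("hbase.rootdir", "").startswith("hdfs://"):
        problems.append("hbase.rootdir should be an hdfs:// URI in distributed mode")
    if props.get("hbase.cluster.distributed") != "true":
        problems.append("hbase.cluster.distributed is not true")
    if not props.get("hbase.zookeeper.quorum"):
        problems.append("hbase.zookeeper.quorum is empty")
    return problems


props = load_properties(SAMPLE_SITE_XML)
print(props["hbase.zookeeper.quorum"].split(","))
print(check_site(props))
```

The checks are deliberately few and only cover the settings this post relies on.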

4) Edit conf/regionservers (one RegionServer hostname per line)

vim conf/regionservers
     
slave1
slave2

5) Copy the configured HBase directory to the other nodes (for example with scp).

 

 

Hands-on:

1. Start ZooKeeper (run this on each node of the quorum, from the ZooKeeper directory)

./bin/zkServer.sh start

2. Start HBase

./bin/start-hbase.sh

Once HBase is up, it creates the hbase directory in HDFS:

[root@master badou]# hadoop fs -ls /hbase
Found 7 items
drwxr-xr-x   - root supergroup          0 2019-05-27 09:50 /hbase/.tmp
drwxr-xr-x   - root supergroup          0 2019-05-27 09:50 /hbase/MasterProcWALs
drwxr-xr-x   - root supergroup          0 2019-05-27 09:50 /hbase/WALs
drwxr-xr-x   - root supergroup          0 2019-05-27 09:50 /hbase/data
-rw-r--r--   2 root supergroup         42 2019-05-27 09:50 /hbase/hbase.id
-rw-r--r--   2 root supergroup          7 2019-05-27 09:50 /hbase/hbase.version
drwxr-xr-x   - root supergroup          0 2019-05-27 09:50 /hbase/oldWALs

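Check that startup succeeded:

1) Run jps and confirm that the HMaster and HRegionServer processes exist (HMaster on master, HRegionServer on the nodes listed in conf/regionservers).

2) Web UI: http://master:60010/master-status (60010 is the default up to HBase 0.98; from HBase 1.0 on the default master UI port is 16010, so with the hbase-1.3.2.1 build used in the transcripts below try http://master:16010/master-status).

3) Command line: run ./bin/hbase shell, then execute status to see the cluster state.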
[root@master hbase-1.3.2.1]# ./bin/hbase shell
...
hbase(main):001:0> status
1 active master, 0 backup masters, 2 servers, 0 dead, 1.0000 average load
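The jps check in step 1 is easy to script once you have more than a couple of nodes. Here is a small sketch under stated assumptions: missing_daemons is a hypothetical helper, and the sample text (including the PIDs) is made up for illustration. It relies only on the fact that jps prints one "<pid> <main class>" pair per line.

```python
def running_java_processes(jps_output):
    """Parse `jps` output ("<pid> <main class>" per line) into a set of names."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            names.add(parts[1])
    return names


def missing_daemons(jps_output, required):
    """Return the required daemon names that do not appear in the jps output."""
    running = running_java_processes(jps_output)
    return [name for name in required if name not in running]


# Illustrative output only; the PIDs are invented.
sample = """13626 HMaster
2101 QuorumPeerMain
14002 Jps
"""
print(missing_daemons(sample, ["HMaster"]))
print(missing_daemons(sample, ["HRegionServer"]))
```

On a real node the text could come from subprocess.check_output(["jps"]), decoded to a string.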

Create a table:

hbase(main):009:0> create 'm_table', 'meta_data', 'action'
0 row(s) in 2.3010 seconds

=> Hbase::Table - m_table

Tip: a newly created table shows up under the /hbase/table znode in ZooKeeper:

[root@master zookeeper-3.4.5]# ./bin/zkCli.sh 
...
[zk: localhost:2181(CONNECTED) 0] ls /hbase/table
[m_table]

Deleting a znode in ZooKeeper (rmr removes it recursively):

[zk: localhost:2181(CONNECTED) 10] rmr /hbase/table/m_table

Creating a znode in ZooKeeper (the path must be followed by a data argument):

[zk: localhost:2181(CONNECTED) 15] create /hbase/table 1
Created /hbase/table
[zk: localhost:2181(CONNECTED) 16] ls /hbase/table      
[]

Describe the table. Table name: m_table; column families (CF): action and meta_data.

hbase(main):010:0> desc 'm_table'
Table m_table is ENABLED                                                                                                        
m_table                                                                                                                         
COLUMN FAMILIES DESCRIPTION                                                                                                     
{NAME => 'action', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODI
NG => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICAT
ION_SCOPE => '0'}                                                                                                               
{NAME => 'meta_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENC
ODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLI
CATION_SCOPE => '0'}                                                                                                            
2 row(s) in 0.1900 seconds

Write data:

put 'm_table', '1001', 'meta_data:name', 'zhang3'  
put 'm_table', '1002', 'meta_data:name', 'li4'
put 'm_table', '1001', 'meta_data:age', '18'
put 'm_table', '1002', 'meta_data:gender', 'man'

Read data:

Scan (read many rows):

hbase(main):007:0> scan 'm_table'
ROW                               COLUMN+CELL                                                                                   
 1001                             column=meta_data:age, timestamp=1558924454685, value=18                                       
 1001                             column=meta_data:name, timestamp=1558924430160, value=zhang3                                  
 1002                             column=action:click, timestamp=1558924551671, value=man                                       
 1002                             column=meta_data:gender, timestamp=1558925567303, value=girl                                  
 1002                             column=meta_data:name, timestamp=1558924445998, value=li4                                     
2 row(s) in 0.1160 seconds


Get (read a single row):
hbase(main):001:0> get 'm_table', '1001'
COLUMN                            CELL                                                                                          
 meta_data:age                    timestamp=1558924454685, value=18                                                             
 meta_data:name                   timestamp=1558924430160, value=zhang3                                                         
1 row(s) in 0.4520 seconds

hbase(main):002:0> get 'm_table', '1001', 'meta_data:name'
COLUMN                            CELL                                                                                          
 meta_data:name                   timestamp=1558924430160, value=zhang3                                                         
1 row(s) in 0.1050 seconds

Modify a value and look at its history: after put 'm_table', '1002', 'meta_data:gender', 'girl', the earlier value can still be retrieved by its timestamp.

hbase(main):003:0> get 'm_table' , '1002' , 'meta_data:gender'
COLUMN                            CELL                                                                                          
 meta_data:gender                 timestamp=1558924464771, value=man                                                            
1 row(s) in 0.0720 seconds

hbase(main):004:0> put 'm_table', '1002', 'meta_data:gender', 'girl'
0 row(s) in 0.1750 seconds

hbase(main):005:0> get 'm_table' , '1002' , 'meta_data:gender'
COLUMN                            CELL                                                                                          
 meta_data:gender                 timestamp=1558925567303, value=girl                                                           
1 row(s) in 0.0240 seconds

hbase(main):006:0> get 'm_table', '1002', {COLUMN=>'meta_data:gender', TIMESTAMP=>1558924464771}
COLUMN                            CELL                                                                                          
 meta_data:gender                 timestamp=1558924464771, value=man                                                            
1 row(s) in 0.0250 seconds
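What the transcript above demonstrates is that a cell is addressed by (row, column, timestamp), and a put adds a new version rather than overwriting in place. The toy model below is only an illustration of that idea in plain Python. ToyTable is a hypothetical class, not HBase code, and it keeps every version forever. Real HBase retains only as many versions as the column family's VERSIONS setting allows ('1' for meta_data here), so an older value like this one is not guaranteed to stay retrievable.

```python
import time


class ToyTable(object):
    """In-memory stand-in for an HBase table: (row, column) -> {timestamp: value}."""

    def __init__(self):
        self._cells = {}

    def put(self, row, column, value, timestamp=None):
        """Store a new version of the cell and return the timestamp used."""
        if timestamp is None:
            timestamp = int(time.time() * 1000)  # milliseconds, like HBase
        self._cells.setdefault((row, column), {})[timestamp] = value
        return timestamp

    def get(self, row, column, timestamp=None):
        """Return the newest value, or the value at an exact timestamp."""
        versions = self._cells.get((row, column), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)  # the newest version wins
        return versions.get(timestamp)


table = ToyTable()
table.put("1002", "meta_data:gender", "man", timestamp=1558924464771)
table.put("1002", "meta_data:gender", "girl", timestamp=1558925567303)
print(table.get("1002", "meta_data:gender"))                           # girl
print(table.get("1002", "meta_data:gender", timestamp=1558924464771))  # man
```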

Count rows (the number of row keys):

hbase(main):008:0> count 'm_table'
2 row(s) in 0.0940 seconds

=> 2

Flush the data to HDFS (df524dd003feea204ed8778af07a9a65 in the path below is the region's encoded name, i.e. its region ID, not a version number):

hbase(main):009:0> flush 'm_table'
0 row(s) in 0.5980 seconds

HDFS path: /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65

df524dd003feea204ed8778af07a9a65 is the region's encoded name (its region ID), not a version number; each column family gets its own subdirectory under it.

[root@master badou]# hadoop fs -ls /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65
Found 4 items
-rw-r--r--   2 root supergroup         42 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/.regioninfo
drwxr-xr-x   - root supergroup          0 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/action
drwxr-xr-x   - root supergroup          0 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/meta_data
drwxr-xr-x   - root supergroup          0 2019-05-27 11:03 /hbase/data/default/m_table/df524dd003feea204ed8778af07a9a65/recovered.edits
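The listing shows the on-disk layout: root directory, then data, namespace, table, encoded region name, and finally one directory per column family. A tiny sketch that rebuilds those paths, assuming the layout seen in this listing (region_dir and family_dir are hypothetical helper names, and "default" is the namespace that appears in the path above):

```python
import posixpath


def region_dir(table, encoded_region, namespace="default", root="/hbase"):
    """HDFS directory of one region, following the layout in the listing above."""
    return posixpath.join(root, "data", namespace, table, encoded_region)


def family_dir(table, encoded_region, family, namespace="default", root="/hbase"):
    """HDFS directory holding the store files of one column family in a region."""
    return posixpath.join(region_dir(table, encoded_region, namespace, root), family)


region = "df524dd003feea204ed8778af07a9a65"
print(region_dir("m_table", region))
print(family_dir("m_table", region, "meta_data"))
```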

 

Truncate the table (remove all rows):

hbase(main):010:0> truncate 'm_table'
Truncating 'm_table' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 3.8750 seconds


hbase(main):012:0>  get 'm_table', '1001', 'meta_data:name'
COLUMN                            CELL                                                                                          
0 row(s) in 0.1740 seconds

 

Add a column family: cf_new

hbase(main):011:0>  alter 'm_table', {NAME=>'cf_new', VERSIONS=>3, IN_MEMORY=>true}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.2850 seconds

Describe the table again; there is now an additional column family, cf_new:

hbase(main):012:0> desc 'm_table'
Table m_table is ENABLED                                                                                                        
m_table                                                                                                                         
COLUMN FAMILIES DESCRIPTION                                                                                                     
{NAME => 'action', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODI
NG => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICAT
ION_SCOPE => '0'}                                                                                                               
{NAME => 'cf_new', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'true', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODIN
G => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATI
ON_SCOPE => '0'}                                                                                                                
{NAME => 'meta_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENC
ODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLI
CATION_SCOPE => '0'}                                                                                                            
3 row(s) in 0.0460 seconds

Delete the column family action; describing the table again shows that it is gone:

hbase(main):013:0> alter 'm_table', {NAME=>'action', METHOD=>'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.5760 seconds

hbase(main):014:0> desc "m_table"
Table m_table is ENABLED                                                                                                        
m_table                                                                                                                         
COLUMN FAMILIES DESCRIPTION                                                                                                     
{NAME => 'cf_new', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'true', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODIN
G => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATI
ON_SCOPE => '0'}                                                                                                                
{NAME => 'meta_data', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENC
ODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLI
CATION_SCOPE => '0'}                                                                                                            
2 row(s) in 0.0820 seconds

Drop the table m_table: disable it first, then drop it.

hbase(main):015:0> disable 'm_table'
0 row(s) in 2.4790 seconds

hbase(main):016:0> drop 'm_table'
0 row(s) in 1.3740 seconds

hbase(main):017:0> list
TABLE                                                                                                                           
0 row(s) in 0.0350 seconds

=> []

 

 

 

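2. Shell operations (covered in the hands-on walkthrough above)

3. Operating HBase from Python

Python has no native HBase client, so it goes through a gateway such as the Thrift server; the extra hop makes it slower than the Java client.

1) Local operations

Generating the thrift and hbase Python modules

Download the Thrift source package thrift-0.8.0.tar.gz, extract it to get the thrift-0.8.0 directory, change into it, and run the commands below to install it.

make compiles the sources; make install copies the build output into the system directories.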

[root@master thrift-0.8.0]#  ./configure
[root@master thrift-0.8.0]#  make
[root@master thrift-0.8.0]#  make install

After installation, locate the thrift binary:

[root@master thrift-0.8.0]#  which thrift
/usr/local/bin/thrift

The thrift command is now available, but we also need Thrift's Python module, which was built inside the source tree:

[root@master thrift-0.8.0]# ls lib/py/build/lib.linux-x86_64-2.7/
thrift

This thrift directory is the Python module; copy it into the same directory as create_table.py.

 

1. Download the HBase source package hbase-0.98.24-src.tar.gz

      Extract it and change into the directory that contains Hbase.thrift:

[root@master hbase_test]# tar xvzf  hbase-0.98.24-src.tar.gz
[root@master hbase-0.98.24]# cd ../hbase-0.98.24/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift
[root@master thrift]# ls
Hbase.thrift

Run the following to generate the Hbase module (the hbase directory under gen-py is the module):

[root@master thrift]# thrift --gen py Hbase.thrift
[root@master thrift]# ls
gen-py  Hbase.thrift
[root@master thrift]# ls gen-py/
hbase  __init__.py

Then move the hbase module into the same directory as create_table.py.

 

Start the Thrift service (it listens on port 9090 by default):

[root@master hbase-1.3.2.1]#  ./bin/hbase-daemon.sh start thrift  
starting thrift, logging to /usr/local/src/hbase-1.3.2.1/logs/hbase-badou-thrift-master.out
(py27tf) [root@master hbase-1.3.2.1]# netstat -antup | grep 9090 
tcp6       0      0 :::9090                 :::*                    LISTEN      13626/java

 

1. Create a table

[root@master pyhon_hbase]# ls
create_table.py  hbase  hbase.tgz  thrift
[root@master pyhon_hbase]# python create_table.py 
['m_table', 'new_music_table']
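The post does not list create_table.py itself, so here is a minimal sketch of what such a script might look like. It is not the author's file. It assumes the thrift and hbase directories prepared above sit next to the script, that the Thrift server runs on master:9090, and that the table has the column families meta-data and flags (taken from the scan output further down). family_specs, connect and create_table are hypothetical helper names; createTable, getTableNames and ColumnDescriptor come from the module generated from Hbase.thrift.

```python
def family_specs(families):
    """Normalize family names to the 'name:' form used in HBase's Thrift demo client."""
    return [f if f.endswith(":") else f + ":" for f in families]


def connect(host="master", port=9090):
    """Open a Thrift connection; returns (client, transport)."""
    # Imported lazily: `thrift` and `hbase` are the two directories copied
    # next to this script in the steps above (an assumption of this sketch).
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()
    return client, transport


def create_table(table, families, host="master", port=9090):
    """Create `table` with the given column families and return all table names."""
    from hbase.ttypes import ColumnDescriptor

    client, transport = connect(host, port)
    try:
        descriptors = [ColumnDescriptor(name=n) for n in family_specs(families)]
        client.createTable(table, descriptors)  # raises AlreadyExists if present
        return client.getTableNames()
    finally:
        transport.close()


# Usage on a node that can reach the Thrift server:
#   print(create_table("new_music_table", ["meta-data", "flags"]))
```

The imports live inside the functions so the file can be loaded, and family_specs exercised, on a machine without the Thrift modules.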

2. Insert data

[root@master pyhon_hbase]# ls
create_table.py  hbase  insert_data.py  thrift
(py27tf) [root@master pyhon_hbase]# python insert_data.py

To find out which arguments each function takes, look in Hbase.py inside the generated hbase module.
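As an illustration of the write path, the sketch below shows one way insert_data.py could be structured; it is an assumption, not the author's script. insert_row is a hypothetical helper that expects an already connected Hbase.Client (built with TSocket, TBufferedTransport and TBinaryProtocol against port 9090). The four-argument mutateRow(tableName, row, mutations, attributes) form is the one declared in the 0.98 Hbase.thrift; the mutation_cls parameter exists only so the function can be tried without a server.

```python
def insert_row(client, table, row_key, cells, mutation_cls=None):
    """Write {"family:qualifier": value} pairs to one row; returns the cell count."""
    if mutation_cls is None:
        # Generated by `thrift --gen py Hbase.thrift`; imported lazily.
        from hbase.ttypes import Mutation as mutation_cls
    mutations = [mutation_cls(column=col, value=val)
                 for col, val in sorted(cells.items())]
    client.mutateRow(table, row_key, mutations, None)  # None: no extra attributes
    return len(mutations)


# Usage with a connected client (values mirror the read output in this post):
#   insert_row(client, "new_music_table", "1100", {
#       "meta-data:name": "wangqingshui",
#       "meta-data:tag": "pop",
#       "flags:is_valid": "TRUE",
#   })
```

All cells of a row go out in a single mutateRow call, which saves one round trip per column compared with writing them one by one.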


3. Read data

Read a single row

[root@master pyhon_hbase]# python get_one_line.py 
the row is  1100
the name is  wangqingshui
the flag is  TRUE

Read multiple rows

[root@master pyhon_hbase]# python scan_many_lines.py                                       
======
the row is  1100
flags:is_valid  TRUE
meta-data:name  wangqingshui
meta-data:tag   pop
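Neither get_one_line.py nor scan_many_lines.py is listed in the post, so the following is a hedged sketch of the two read patterns rather than the author's code. get_one_row and scan_rows are hypothetical helpers that take an already connected Hbase.Client; getRow, scannerOpen, scannerGet and scannerClose are the calls declared in the 0.98 Hbase.thrift, each taking a trailing attributes argument that is passed as None here.

```python
def get_one_row(client, table, row_key):
    """Return {column: value} for one row, or {} if the row does not exist."""
    results = client.getRow(table, row_key, None)
    if not results:
        return {}
    return dict((col, cell.value) for col, cell in results[0].columns.items())


def scan_rows(client, table, columns, start_row=""):
    """Yield (row_key, {column: value}) for every row from start_row onward."""
    scanner_id = client.scannerOpen(table, start_row, columns, None)
    try:
        while True:
            batch = client.scannerGet(scanner_id)
            if not batch:  # an empty list means the scanner is exhausted
                break
            for result in batch:
                yield result.row, dict(
                    (col, cell.value) for col, cell in result.columns.items())
    finally:
        client.scannerClose(scanner_id)  # always release the server-side scanner


# Usage with a connected client:
#   print(get_one_row(client, "new_music_table", "1100"))
#   for row_key, cells in scan_rows(client, "new_music_table", ["meta-data", "flags"]):
#       print(row_key, cells)
```

Closing the scanner in a finally block matters because scanners hold resources on the server until they are closed or time out.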

 

2) Cluster operations

Run bash run.sh in the batch_insert directory and watch the table change from the HBase shell:

hbase(main):026:0> scan "new_music_table"
ROW                               COLUMN+CELL                                                                                   
 00000cb9989b2238d6b6e2846e2f9e34 column=flags:is_valid, timestamp=1558940215081, value=TRUE                                    
 00000cb9989b2238d6b6e2846e2f9e34 column=meta-data:name, timestamp=1558940215081, value=00000cb9989b2238d6b6e2846e2f9e34        
 00001e0296367e9a2650dca709972e3f column=flags:is_valid, timestamp=1558940215133, value=TRUE                                    
 00001e0296367e9a2650dca709972e3f column=meta-data:name, timestamp=1558940215133, value=00001e0296367e9a2650dca709972e3f

 

 

4. Operating HBase from Java (combined with Storm for real-time recommendation)

1. Create the table:

hbase(main):027:0> create 'user_action_table' , 'action_log'
0 row(s) in 2.5800 seconds

=> Hbase::Table - user_action_table

2. Insert, delete, and read data from Java

HbaseDelOneRecord.java

HbaseGetOneRecord.java

HbasePutOneRecord.java

HbaseScanManyRecords.java