Contents
- Environment setup
- Data definition
- Creating a table
- Altering the table structure
- Dropping a table
- Data manipulation
- put
- get
- scan
- delete
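Environment setup
Linux + Java + ZooKeeper + Hadoop + HBase,
with the related processes started.
There are plenty of tutorials online for this, so I won't repeat them here.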
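Data definition
Unlike a relational database, HBase has no concept of multiple databases; the table is the basic unit (namespaces only group tables). To store data in HBase you therefore create a table first, and when you create it you specify its column families and their properties.
Command | Description |
create | Create a new table with the specified schema |
alter | Modify the table structure, for example add a new column family |
describe | Show the table structure, including its column families and their properties |
list | List the tables that exist in HBase |
disable/enable | Disable a table before dropping or changing it; enable it again after the change |
disable_all | Disable all tables whose names match a regular expression |
is_disabled | Check whether a table is disabled |
drop | Drop a table |
truncate | To delete the data but keep the table structure, use truncate, which disables the table, drops it, and re-creates it with the same structure |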
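Creating a table
First make sure the following commands give the expected results: jps lists the required processes and the HBase shell starts.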
[root@xwk1 bin]# jps
3184 Jps
2210 NameNode
2567 ResourceManager
2409 SecondaryNameNode
2969 HMaster
3086 HRegionServer
2031 QuorumPeerMain
[root@xwk1 bin]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/software/hbase-2.0.2/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/software/hadoop-2.8.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.2, r1cfab033e779df840d5612a85277f42a6a4e8172, Tue Aug 28 20:50:40 PDT 2018
Took 0.0025 seconds
hbase(main):001:0>
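Next, create a table and use list to check that it was created. The student table is left over from an earlier experiment of mine; the table we create now is Student.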
hbase(main):001:0> create 'Student','StuInfo','Grades'
Created table Student
Took 3.7318 seconds
=> Hbase::Table - Student
hbase(main):002:0> list
TABLE
Student
student
2 row(s)
Took 0.1358 seconds
=> ["Student", "student"]
hbase(main):003:0>
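This command creates a table named Student with two column families, StuInfo and Grades. In HBase Shell syntax, string arguments such as table and column family names must be quoted (single quotes are used throughout this article), and they are case-sensitive: student and Student are two completely different tables.
View the table's column family information: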
hbase(main):003:0> describe Student
NameError: uninitialized constant Student
hbase(main):004:0>
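Why did this command fail? I forgot the quotes. The HBase shell is a JRuby shell, so the unquoted word Student is read as a Ruby constant, which is not defined, hence the NameError. As noted above, string arguments must be quoted and are case-sensitive, so always watch the quotes and the capitalization.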
hbase(main):004:0> describe 'Student'
Table Student is ENABLED
Student
COLUMN FAMILIES DESCRIPTION
{NAME => 'Grades', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'StuInfo', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.3453 seconds
hbase(main):005:0>
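That works.
Altering the table structure
Column family parameters can be modified, for example the number of versions to keep. The output of describe 'Student' shows that VERSIONS for the Grades column family is 1. If you actually need to keep the 3 most recent versions, use the following command.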
hbase(main):007:0> alter 'Student',{NAME=>'Grades','VERSIONS'=>3}
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.3287 seconds
hbase(main):008:0> describe 'Student'
Table Student is ENABLED
Student
COLUMN FAMILIES DESCRIPTION
{NAME => 'Grades', VERSIONS => '3', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'StuInfo', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.0346 seconds
hbase(main):009:0>
The change has taken effect.
To add a new column family named hobby to the table, use the following command.
hbase(main):009:0> alter 'Student','hobby'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.2090 seconds
Check the table structure to confirm that the change succeeded:
hbase(main):010:0> describe 'Student'
Table Student is ENABLED
Student
COLUMN FAMILIES DESCRIPTION
{NAME => 'Grades', VERSIONS => '3', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'StuInfo', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'hobby', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
3 row(s)
Took 0.0411 seconds
hbase(main):011:0>
To delete an existing column family, for example hobby:
hbase(main):011:0> alter 'Student','delete'=>'hobby'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.0881 seconds
Alternatively, use the following form; either command deletes the column family.
hbase(main):017:0> alter 'Student',{NAME=>'hobby',METHOD=>'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.0877 seconds
As the output of describe shows, the hobby column family has been deleted.
hbase(main):018:0> describe 'Student'
Table Student is ENABLED
Student
COLUMN FAMILIES DESCRIPTION
{NAME => 'Grades', VERSIONS => '3', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'StuInfo', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.0313 seconds
hbase(main):019:0>
Dropping a table
A table must be disabled before it can be dropped.
Disable the table:
hbase(main):019:0> disable 'Student'
Took 0.8861 seconds
hbase(main):020:0>
Check that the table was disabled:
hbase(main):021:0> is_disabled 'Student'
true
Took 0.0173 seconds
=> 1
hbase(main):022:0>
Drop the table:
hbase(main):022:0> drop 'Student'
Took 0.8469 seconds
hbase(main):023:0>
Check that the table was dropped:
hbase(main):023:0> list
TABLE
student
1 row(s)
Took 0.0258 seconds
=> ["student"]
hbase(main):024:0>
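The Student table has been dropped, and only the student table remains.
To stress it once more: the HBase Shell is case-sensitive, so be careful not to drop the wrong table.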
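The disable-then-drop rule and the case-sensitive names can be illustrated with a small toy model. This is only a sketch in plain Python, not HBase code: the class ToyCatalog and its methods are hypothetical names made up for illustration, and the column family "info" given to the student table is an assumption, since the article never shows that table's schema.

```python
class ToyCatalog:
    """Toy model of table states (hypothetical; not the HBase API)."""

    def __init__(self):
        # table name -> {"families": [...], "enabled": bool}
        self.tables = {}

    def create(self, name, *families):
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        self.tables[name] = {"families": list(families), "enabled": True}

    def disable(self, name):
        self.tables[name]["enabled"] = False

    def is_disabled(self, name):
        return not self.tables[name]["enabled"]

    def drop(self, name):
        # Mirrors the rule from the text: an enabled table cannot be dropped.
        if self.tables[name]["enabled"]:
            raise RuntimeError(f"table {name!r} is enabled; disable it first")
        del self.tables[name]

    def list_tables(self):
        return sorted(self.tables)


catalog = ToyCatalog()
catalog.create("student", "info")               # "info" is an assumed family name
catalog.create("Student", "StuInfo", "Grades")  # a different table: names are case-sensitive
catalog.disable("Student")
catalog.drop("Student")
print(catalog.list_tables())
```

In this model, calling drop on a table that is still enabled raises an error, which is the behavior the disable step above is meant to avoid.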
Now re-create the table (with the same create command as before) and continue with the operations below.
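Data manipulation
Command | Description |
put | Write a value to the specified cell |
get | Fetch a row or cell by table name, row key, and other parameters |
scan | Iterate over the table and output the rows that match the given conditions |
count | Count the logical rows in the table |
delete | Delete a cell value at the specified table, row, and column |
put
Add a new row to a table, or overwrite data in the specified row.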
hbase(main):026:0> put 'Student','0001','StuInfo:Name','Tom Green',1
Took 0.6287 seconds
hbase(main):027:0>
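Here is what each argument means.
Student is the table name.
0001 is the row key.
StuInfo is the column family name.
Name is the column name (qualifier).
In StuInfo:Name the two are separated by a colon. The column family must already exist,
while the column is defined on the fly; the columns within a family can be extended freely.
Tom Green is the cell value. HBase stores all data as uninterpreted byte arrays; in the shell, values are entered as strings.
The final 1 is the timestamp. If it is omitted, the system uses the current time as the timestamp.
put writes a single cell at a time, so inserting a whole row takes several commands: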
hbase(main):026:0> put 'Student','0001','StuInfo:Name','Tom Green',1
Took 0.6287 seconds
hbase(main):027:0> put 'Student','0001','StuInfo:Age','18'
Took 0.3039 seconds
hbase(main):028:0> put 'Student','0001','StuInfo:Sex','Male'
Took 0.0545 seconds
hbase(main):029:0> put 'Student','0001','Grades:BigData','100'
Took 0.1997 seconds
hbase(main):030:0> put 'Student','0001','Grades:Math','99'
Took 0.0170 seconds
hbase(main):031:0> put 'Student','0001','Grades:Computer','98'
Took 0.0344 seconds
hbase(main):032:0>
Change the student's name to Jim Green:
put 'Student','0001','StuInfo:Name','Jim Green'
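If the column family's VERSIONS parameter is set to n, HBase retains up to n versions of each cell written by put, so up to n versions of the name of the student with row key 0001 can be read back, for example with get 'Student','0001',{COLUMN=>'StuInfo:Name',VERSIONS=>n}.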
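How VERSIONS limits what a cell retains can be sketched with a toy model. This is plain Python, not HBase code: the class ToyCell is a hypothetical name, and it trims old versions eagerly on every put, whereas a real store may clean up excess versions later. The values and timestamps are the ones from the session above.

```python
class ToyCell:
    """Keeps at most max_versions (timestamp, value) pairs, newest first."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.versions = []

    def put(self, value, timestamp):
        # A put with an existing timestamp replaces that version.
        self.versions = [(ts, v) for ts, v in self.versions if ts != timestamp]
        self.versions.append((timestamp, value))
        # Newest timestamp first, then drop anything beyond the limit.
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        del self.versions[self.max_versions:]

    def get(self, versions=1):
        return self.versions[:versions]


name = ToyCell(max_versions=3)
name.put("Tom Green", 1)              # explicit timestamp 1, as in the first put
name.put("Jim Green", 1635416672482)  # timestamp assigned at write time
print(name.get())                     # newest version only
print(name.get(versions=3))           # every retained version, newest first
```

With max_versions=1 the same two puts would leave only Jim Green readable, which matches what get shows for a column family created with VERSIONS => '1'.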
get
get is similar to a SELECT in a relational database.
Fetch the data of all column families for row key 0001 in the Student table:
hbase(main):033:0> get 'Student','0001'
COLUMN CELL
Grades:BigData timestamp=1635416535656, value=100
Grades:Computer timestamp=1635416577731, value=98
Grades:Math timestamp=1635416552108, value=99
StuInfo:Age timestamp=1635416449956, value=18
StuInfo:Name timestamp=1635416672482, value=Jim Green
StuInfo:Sex timestamp=1635416468797, value=Male
1 row(s)
Took 0.8590 seconds
hbase(main):034:0>
Change the student's name to Xwk Green:
put 'Student','0001','StuInfo:Name','Xwk Green'
Fetch the StuInfo column family data for row key 0001 in the Student table:
hbase(main):054:0> get 'Student','0001','StuInfo'
COLUMN CELL
StuInfo:Age timestamp=1635416449956, value=18
StuInfo:Name timestamp=1635417454520, value=Xwk Green
StuInfo:Sex timestamp=1635416468797, value=Male
1 row(s)
Took 0.0211 seconds
hbase(main):055:0>
Fetch the Grades column family data for row key 0001 in the Student table:
hbase(main):055:0> get 'Student','0001','Grades'
COLUMN CELL
Grades:BigData timestamp=1635416535656, value=100
Grades:Computer timestamp=1635416577731, value=98
Grades:Math timestamp=1635416552108, value=99
1 row(s)
Took 0.0235 seconds
hbase(main):056:0>
scan
Scan the whole table by giving only the table name:
hbase(main):056:0> scan 'Student'
ROW COLUMN+CELL
0001 column=Grades:BigData, timestamp=1635416535656, value=100
0001 column=Grades:Computer, timestamp=1635416577731, value=98
0001 column=Grades:Math, timestamp=1635416552108, value=99
0001 column=StuInfo:Age, timestamp=1635416449956, value=18
0001 column=StuInfo:Name, timestamp=1635417454520, value=Xwk Green
0001 column=StuInfo:Sex, timestamp=1635416468797, value=Male
1 row(s)
Took 0.1559 seconds
hbase(main):057:0>
Restrict the scan to a column family:
hbase(main):057:0> scan 'Student',{COLUMN=>'StuInfo'}
ROW COLUMN+CELL
0001 column=StuInfo:Age, timestamp=1635416449956, value=18
0001 column=StuInfo:Name, timestamp=1635417454520, value=Xwk Green
0001 column=StuInfo:Sex, timestamp=1635416468797, value=Male
1 row(s)
Took 0.2345 seconds
hbase(main):058:0>
Restrict the scan to a column family and column:
hbase(main):058:0> scan 'Student',{COLUMN=>'StuInfo:Name'}
ROW COLUMN+CELL
0001 column=StuInfo:Name, timestamp=1635417454520, value=Xwk Green
1 row(s)
Took 0.0673 seconds
hbase(main):059:0>
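Limit the number of rows returned. LIMIT counts logical rows rather than cells, so all six cells of row 0001 are still printed: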
hbase(main):059:0> scan 'Student',{LIMIT=>1}
ROW COLUMN+CELL
0001 column=Grades:BigData, timestamp=1635416535656, value=100
0001 column=Grades:Computer, timestamp=1635416577731, value=98
0001 column=Grades:Math, timestamp=1635416552108, value=99
0001 column=StuInfo:Age, timestamp=1635416449956, value=18
0001 column=StuInfo:Name, timestamp=1635417454520, value=Xwk Green
0001 column=StuInfo:Sex, timestamp=1635416468797, value=Male
1 row(s)
Took 0.0487 second
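The scan options used above (a column family, a single column, and a row limit) can be sketched as a filter over rows sorted by row key. This is a plain Python toy, not HBase code: toy_scan is a hypothetical function, and row 0002 is invented here only so that the limit has something to cut off.

```python
def toy_scan(table, column=None, limit=None):
    """table maps row key -> {'family:qualifier': value}."""
    results = []
    for row_key in sorted(table):  # rows come back in row-key order
        cells = table[row_key]
        if column is None:
            selected = cells
        elif ":" in column:
            # 'family:qualifier' selects exactly one column
            selected = {c: v for c, v in cells.items() if c == column}
        else:
            # a bare family name selects every column in that family
            selected = {c: v for c, v in cells.items()
                        if c.split(":", 1)[0] == column}
        if selected:
            results.append((row_key, dict(sorted(selected.items()))))
            # the limit counts rows, not cells
            if limit is not None and len(results) >= limit:
                break
    return results


student = {
    "0001": {"StuInfo:Name": "Xwk Green", "StuInfo:Age": "18",
             "StuInfo:Sex": "Male", "Grades:BigData": "100",
             "Grades:Math": "99", "Grades:Computer": "98"},
    "0002": {"StuInfo:Name": "Hypothetical Student"},  # invented row
}

print(toy_scan(student, column="StuInfo"))
print(toy_scan(student, column="StuInfo:Name"))
print(toy_scan(student, limit=1))
```

The last call returns one row carrying six cells, which is why the shell output for LIMIT=>1 still spans six lines.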
delete
Try to delete all data in the Grades column family for row key 0001:
hbase(main):062:0> delete 'Student','0001','Grades'
Took 0.4765 seconds
hbase(main):063:0> scan 'Student'
ROW COLUMN+CELL
0001 column=Grades:BigData, timestamp=1635416535656, value=100
0001 column=Grades:Computer, timestamp=1635416577731, value=98
0001 column=Grades:Math, timestamp=1635416552108, value=99
0001 column=StuInfo:Age, timestamp=1635416449956, value=18
0001 column=StuInfo:Name, timestamp=1635417454520, value=Xwk Green
0001 column=StuInfo:Sex, timestamp=1635416468797, value=Male
1 row(s)
Took 0.0453 seconds
hbase(main):064:0>
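Why is the data still there after the delete?
It is not because deletion is deferred. A delete does write a delete marker (tombstone) instead of removing data physically, and the marked data is only physically removed later, during compaction, but cells masked by a tombstone disappear from get and scan results immediately. The more likely explanation is that the shell's delete command targets a single cell at the given table, row, and column coordinates, and the argument 'Grades' names only a column family, so no existing cell (such as Grades:Math) matched. To delete a specific cell, pass the full column name, for example 'Grades:Math'.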
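The tombstone mechanism can be sketched with a toy store. This is plain Python, not HBase code: ToyStore and its methods are hypothetical names, and the model covers only one kind of marker (one that masks every version of a column at or before a timestamp), while HBase has several marker types. The timestamp passed to delete is an arbitrary later value chosen for the example.

```python
class ToyStore:
    """Toy model of delete markers (hypothetical; not the HBase API)."""

    def __init__(self):
        self.cells = []       # (row, column, timestamp, value)
        self.tombstones = []  # (row, column, timestamp)

    def put(self, row, column, timestamp, value):
        self.cells.append((row, column, timestamp, value))

    def delete(self, row, column, timestamp):
        # Nothing is removed here; only a marker is written.
        self.tombstones.append((row, column, timestamp))

    def _masked(self, cell):
        row, column, ts, _ = cell
        return any(r == row and c == column and ts <= t
                   for r, c, t in self.tombstones)

    def scan(self):
        # Reads skip masked cells immediately.
        return [c for c in self.cells if not self._masked(c)]

    def compact(self):
        # Compaction physically drops masked cells and the markers.
        self.cells = self.scan()
        self.tombstones = []


store = ToyStore()
store.put("0001", "Grades:Math", 1635416552108, "99")
store.delete("0001", "Grades:Math", 1635420000000)  # arbitrary later timestamp
print(store.scan())      # the cell is already hidden from reads
print(len(store.cells))  # but it is still stored physically
store.compact()
print(len(store.cells))  # now it is gone
```

The point of the sketch is the split between visibility and storage: the marker hides the cell from reads at once, and compaction reclaims the space afterwards.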
Delete an entire logical row:
hbase(main):065:0> deleteall 'Student','0001'
Took 0.0223 seconds
hbase(main):066:0> scan 'Student'
ROW COLUMN+CELL
0 row(s)
Took 0.0648 seconds
hbase(main):067:0>
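Since the table had only one row, deleting row 0001 leaves it empty.
That's it for this article; I just noticed that it has already reached 11862 characters.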