HBase in Practice: Getting Started with HBase

Reposted

mob6454cc63081f 2023-09-01 11:08:08

Tags: HBase in practice, big data, real-time big data, hbase, java. Category: HBase, databases

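1. HBase Deployment

1.1 Deployment Prerequisites

Hadoop must be deployed: HBase ultimately stores its data on HDFS.
ZooKeeper must be deployed: HBase relies on it for cluster coordination (Master election, RegionServer liveness, and the location of the hbase:meta table).

1.2 Downloading HBase and Locating the Configuration Files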

[root@bigdatatest01 ~]# cd software/
[root@bigdatatest01 software]# wget http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.16.2.tar.gz
[root@bigdatatest01 software]# tar -xzvf hbase-1.2.0-cdh5.16.2.tar.gz -C ~/app/
[root@bigdatatest01 software]# cd ~/app/
[root@bigdatatest01 app]# cd hbase-1.2.0-cdh5.16.2/conf/
[root@bigdatatest01 conf]# ll
total 40
-rw-r--r-- 1 1106 4001 1811 Jun  3  2019 hadoop-metrics2-hbase.properties
-rw-r--r-- 1 1106 4001 4603 Jun  3  2019 hbase-env.cmd
-rw-r--r-- 1 1106 4001 7530 Jun  3  2019 hbase-env.sh
-rw-r--r-- 1 1106 4001 2257 Jun  3  2019 hbase-policy.xml
-rw-r--r-- 1 1106 4001  934 Jun  3  2019 hbase-site.xml
-rw-r--r-- 1 1106 4001 4603 Jun  3  2019 log4j.properties
-rw-r--r-- 1 1106 4001   10 Jun  3  2019 regionservers

1.3 Editing the Configuration Files

[root@bigdatatest01 conf]# vim hbase-env.sh 
export JAVA_HOME=/usr/java/jdk1.8.0_151
export HBASE_MANAGES_ZK=false

[root@bigdatatest01 conf]# vim hbase-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
        <!-- The prefix of hbase.rootdir must match fs.defaultFS in $HADOOP_HOME/conf/core-site.xml -->
        <property>
                <name>hbase.rootdir</name>
                <value>hdfs://bigdatatest02:8020/hbase</value>
        </property>
        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>

                <!-- Temporary directory on the local file system. Point it at a more durable location, because /tmp is cleared on reboot. -->
        <property>
                <name>hbase.tmp.dir</name>
                <value>/root/tmp/hbase</value>
        </property>

                <!-- With a single HMaster, set hbase.master to hostname:port, for example master5:60000 -->
                <!-- With multiple HMasters, give only the port 60000; ZooKeeper takes care of electing the active master -->
        <property>
                <name>hbase.master</name>
                <value>60000</value>
        </property>

                <!-- Where ZooKeeper stores its snapshots. The default is /tmp, which is cleared on reboot. ZooKeeper is installed standalone here, so this path points at the dataDir set in $ZOOKEEPER_HOME/conf/zoo.cfg -->
        <property>
                <name>hbase.zookeeper.property.dataDir</name>
                <value>/root/tmp/zk1</value>
        </property>

        <property>
                <name>hbase.zookeeper.quorum</name>
                <value>bigdatatest01</value>
        </property>
                <!-- Port that clients use to connect to ZooKeeper -->
        <property>
                <name>hbase.zookeeper.property.clientPort</name>
                <value>2181</value>
        </property>
                <!-- ZooKeeper session timeout. HBase passes this value to the ZooKeeper ensemble as the suggested maximum session timeout -->
        <property>
                <name>zookeeper.session.timeout</name>
                <value>120000</value>
        </property>

                <!-- When a RegionServer hits a ZooKeeper session expiry, it restarts instead of aborting -->
        <property>
                <name>hbase.regionserver.restart.on.zk.expire</name>
                <value>true</value>
        </property>
        <property>
                <name>hbase.online.schema.update.enable</name>
                <value>true</value>
        </property>

        <property>
                <name>hbase.coprocessor.abortonerror</name>
                <value>false</value>
        </property>
</configuration>

[root@bigdatatest01 conf]# vim regionservers
bigdatatest01

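HBASE_MANAGES_ZK: whether HBase starts and manages its own bundled ZooKeeper. It is set to false here because ZooKeeper is installed separately.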
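
As a quick sanity check on an hbase-site.xml like the one above, the name/value pairs can be read back with the XML parser that ships with the JDK. This is only a minimal sketch: the class name `HBaseSiteReader` and the inline sample XML are illustrative assumptions, and a real HBase client loads the file through `HBaseConfiguration.create()` instead.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of HBase: reads Hadoop-style property entries.
class HBaseSiteReader {

    /** Parses property name/value pairs into a map that keeps file order. */
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
                props.put(name, value);
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid hbase-site.xml fragment", e);
        }
    }

    public static void main(String[] args) {
        // Sample input copied from the configuration shown in this section.
        String xml = "<configuration>"
                + "<property><name>hbase.rootdir</name>"
                + "<value>hdfs://bigdatatest02:8020/hbase</value></property>"
                + "<property><name>hbase.cluster.distributed</name><value>true</value></property>"
                + "<property><name>hbase.zookeeper.quorum</name><value>bigdatatest01</value></property>"
                + "</configuration>";
        parse(xml).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Printing the parsed map makes it easy to spot a mistyped property name or a host that does not match the rest of the cluster configuration.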

1.4 Starting HBase

[root@bigdatatest01 conf]# cd ../
[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# bin/start-hbase.
start-hbase.cmd  start-hbase.sh   
[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# bin/start-hbase.sh 
starting master, logging to /root/app/hbase-1.2.0-cdh5.16.2/bin/../logs/hbase-root-master-bigdatatest01.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
The authenticity of host 'bigdatatest01 (192.168.20.66)' can't be established.
ECDSA key fingerprint is SHA256:ubGg3eXeUXlLZLkDjezmag/lpWFbRFEl30lMiQ/Is6M.
ECDSA key fingerprint is MD5:eb:fa:c5:d9:2a:2e:d5:18:39:fc:41:18:8c:4a:76:f6.
Are you sure you want to continue connecting (yes/no)? yes
bigdatatest01: Warning: Permanently added 'bigdatatest01' (ECDSA) to the list of known hosts.
bigdatatest01: starting regionserver, logging to /root/app/hbase-1.2.0-cdh5.16.2/bin/../logs/hbase-root-regionserver-bigdatatest01.out
bigdatatest01: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
bigdatatest01: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

The Master log shows the following error:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode="/hbase":hbase:hbase:drwxr-xr-x

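Fix the error by opening up the permissions on /hbase as the hdfs superuser. This is acceptable on a test machine; on a shared cluster, running HBase as the hbase user is the safer choice: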

[root@bigdatatest01 ~]# su - hdfs
Last login: Tue Feb  2 10:31:56 CST 2021 on pts/1
[hdfs@bigdatatest01 ~]$ hadoop fs -chmod 777 /hbase

Restart the service:

[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# bin/stop-hbase.sh 
stopping hbasecat: /tmp/hbase-root-master.pid: No such file or directory

[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# bin/start-hbase.sh 
starting master, logging to /root/app/hbase-1.2.0-cdh5.16.2/bin/../logs/hbase-root-master-bigdatatest01.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
bigdatatest01: regionserver running as process 26211. Stop it first.

The log shows that the permissions are still insufficient:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=ALL, inode="/hbase/.tmp":hbase:hbase:drwxr-xr-x

Grant permissions on /hbase/.tmp:

[hdfs@bigdatatest01 ~]$ hadoop fs -chmod 777 /hbase/.tmp
[hdfs@bigdatatest01 ~]$ hadoop fs -chmod -R  777 /hbase/.tmp
[hdfs@bigdatatest01 ~]$ hadoop fs -chmod -R  777 /hbase

Restart the HBase service, then check the running processes with jps:

[root@bigdatatest01 hbase-1.2.0-cdh5.16.2]# jps
800 QuorumPeerMain
26211 HRegionServer
1060 Jps
525 QuorumPeerMain
18994 DataNode
32374 HMaster
19354 NodeManager
669 QuorumPeerMain

HBase uses a master/worker architecture:
HMaster: the master node
HRegionServer: the worker node

2. HBase Architecture

[Figure: HBase architecture]

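Client:

HBase has one special table: hbase:meta (named .META. in older releases).

hbase:meta records the mapping information for every Region that user tables have been split into. In older releases .META. could itself span multiple Regions; since HBase 0.96, which includes the 1.2 release used here, hbase:meta is kept as a single Region.

Before accessing user data, a client first asks ZooKeeper where the hbase:meta Region is located, then reads hbase:meta to find the Region that holds the requested rows. This takes several network round trips, so the client caches the locations it has already resolved.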
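
The lookup described above can be sketched in plain Java. This is a simulation under stated assumptions, not HBase client code: the class `MetaLookupSketch`, its `Region` type, and the server names are hypothetical, and a sorted map stands in for hbase:meta.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of "find the Region for a row key, then cache it".
class MetaLookupSketch {

    /** A Region covers row keys in [startKey, endKey); an empty endKey means no upper bound. */
    static final class Region {
        final String startKey;
        final String endKey;
        final String server;

        Region(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                    && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    private final TreeMap<String, Region> meta = new TreeMap<>();  // stand-in for hbase:meta
    private final TreeMap<String, Region> cache = new TreeMap<>(); // client-side location cache
    int metaLookups = 0; // counts simulated round trips to hbase:meta

    void addRegion(String startKey, String endKey, String server) {
        meta.put(startKey, new Region(startKey, endKey, server));
    }

    /** Returns the server hosting the row, checking the cache before the simulated meta table. */
    String locate(String row) {
        Map.Entry<String, Region> cached = cache.floorEntry(row);
        if (cached != null && cached.getValue().contains(row)) {
            return cached.getValue().server;
        }
        metaLookups++;
        // The Region with the greatest start key <= row is the only candidate.
        Map.Entry<String, Region> found = meta.floorEntry(row);
        if (found == null || !found.getValue().contains(row)) {
            throw new IllegalStateException("no region for row " + row);
        }
        cache.put(found.getKey(), found.getValue());
        return found.getValue().server;
    }

    public static void main(String[] args) {
        MetaLookupSketch client = new MetaLookupSketch();
        client.addRegion("", "m", "rs1");
        client.addRegion("m", "", "rs2");
        System.out.println(client.locate("apple") + ", meta lookups: " + client.metaLookups);
        System.out.println(client.locate("banana") + ", meta lookups: " + client.metaLookups);
        System.out.println(client.locate("zebra") + ", meta lookups: " + client.metaLookups);
    }
}
```

The second lookup falls into a Region that is already cached, so it does not count as another meta lookup; that is the saving the client-side cache provides.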

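ZooKeeper:

Provides the failover mechanism for HBase: it elects the active Master, which avoids a single point of failure at the Master.
Monitors RegionServer state in real time and notifies the Master when a RegionServer comes online or goes offline.
Tracks table state, such as enabled or disabled. The table schemas themselves (which Tables exist and which Column Families each has) are persisted as table descriptors on HDFS rather than in ZooKeeper.
Stores the location of the hbase:meta table and the address of the Master.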

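Master:

Assigns Regions to RegionServers.
Balances load across RegionServers.
Detects failed RegionServers and reassigns their Regions.
Cleans up HBase garbage files on HDFS.
Handles schema change requests (creating, deleting, and altering tables, adding column families, and so on).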

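RegionServer:

Serves the Regions assigned to it by the Master and handles the I/O requests for those Regions.
Splits Regions that grow too large at runtime and runs compactions.

As shown above, the Master is not involved when a client accesses data: locating data goes through ZooKeeper and the RegionServers, and the reads and writes themselves go to the RegionServers. The Master only maintains the metadata of Tables and Regions, so its load is low.
hbase:meta stores the location of every Region. When a Region splits on a RegionServer, the Master decides which RegionServer the new Regions are assigned to, so only the Master knows where a new Region lives. The Master is therefore responsible for the create, read, update, and delete operations on the entries in hbase:meta.
Taken together, these two points mean that a Master outage can be tolerated for a while, as long as no Region split happens in the meantime.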

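HRegion:

A Table is divided along the row dimension into multiple Regions. The Region is the smallest unit of distributed storage and load balancing in HBase: different Regions can live on different RegionServers, but a single Region is never spread across several servers.
Regions are split by size. Each table starts with a single Region. As rows are inserted the Region grows, and once one of its column families reaches a threshold, the Region splits into two new Regions.
Each Region is identified by <table name, start row key, creation time>.
The catalog table (hbase:meta) records the end row key of the Region.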

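Store:

Each Region consists of one or more Stores. HBase keeps data that is accessed together in the same Store by creating one Store per ColumnFamily, so a table with several column families has the same number of Stores in each Region. A Store consists of one MemStore and zero or more StoreFiles. HBase uses the size of a Store to decide whether the Region needs to be split.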

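MemStore:

The MemStore lives in memory and holds the modified data as KeyValues. When its size reaches a threshold (128 MB by default), the MemStore is flushed to a file: a snapshot of its contents is taken and written out. A dedicated thread in HBase performs MemStore flushes.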
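
The flush behavior can be illustrated with a small simulation. It is a sketch under assumptions, not HBase code: `MemStoreSketch` is a hypothetical class, sizes are counted in characters rather than real heap bytes, and an immutable sorted map stands in for a StoreFile on HDFS.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of one Store: a sorted in-memory buffer plus flushed snapshots.
class MemStoreSketch {
    private final long flushThreshold; // stand-in for the 128 MB default, here in characters
    private TreeMap<String, String> memStore = new TreeMap<>();
    private long memStoreSize = 0;
    // Each flush appends one immutable sorted snapshot (stand-in for a StoreFile).
    final List<NavigableMap<String, String>> storeFiles = new ArrayList<>();

    MemStoreSketch(long flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void put(String key, String value) {
        String old = memStore.put(key, value);
        memStoreSize += key.length() + value.length();
        if (old != null) {
            memStoreSize -= key.length() + old.length(); // overwritten entry no longer counts
        }
        if (memStoreSize >= flushThreshold) {
            flush();
        }
    }

    /** Freezes the current buffer as a new "StoreFile" and starts an empty MemStore. */
    void flush() {
        if (memStore.isEmpty()) {
            return;
        }
        storeFiles.add(Collections.unmodifiableNavigableMap(memStore));
        memStore = new TreeMap<>();
        memStoreSize = 0;
    }

    /** Newest data wins: check the MemStore first, then snapshots from newest to oldest. */
    String get(String key) {
        String value = memStore.get(key);
        for (int i = storeFiles.size() - 1; value == null && i >= 0; i--) {
            value = storeFiles.get(i).get(key);
        }
        return value;
    }

    int memStoreEntries() {
        return memStore.size();
    }

    public static void main(String[] args) {
        MemStoreSketch store = new MemStoreSketch(10);
        store.put("row1", "aaaa"); // 8 characters buffered, below the threshold
        store.put("row2", "bbbb"); // 16 characters buffered, triggers a flush
        System.out.println("store files: " + store.storeFiles.size());
        System.out.println("row1 = " + store.get("row1"));
    }
}
```

Because every snapshot is sorted and immutable, a later write to the same key never touches an existing file; it simply shadows the older value at read time.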

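StoreFile:

When the in-memory MemStore data is written to a file, the result is a StoreFile, which is stored in HFile format. Once the number of StoreFiles grows past a threshold, HBase merges them (minor and major compaction). A major compaction additionally merges versions and purges deleted data, producing a larger StoreFile.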
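
The difference between the two compaction types can be sketched as a merge of sorted files. This is a simplified model, not the HBase implementation: `CompactionSketch` and the `TOMBSTONE` marker value are hypothetical, each file holds one value per key, and the `major` flag only controls whether delete markers are purged.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of merging StoreFiles during compaction.
class CompactionSketch {

    /** Marker value standing in for an HBase delete marker (an assumption of this sketch). */
    static final String TOMBSTONE = "__DELETED__";

    /**
     * Merges files ordered oldest to newest into one sorted file.
     * A minor merge keeps delete markers, because files outside the merge
     * may still hold the deleted cell; a major merge drops them.
     */
    static TreeMap<String, String> compact(List<Map<String, String>> filesOldestFirst, boolean major) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (Map<String, String> file : filesOldestFirst) {
            merged.putAll(file); // newer files overwrite older values for the same key
        }
        if (major) {
            merged.values().removeIf(TOMBSTONE::equals);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> older = new HashMap<>();
        older.put("row1", "a");
        older.put("row2", "b");
        Map<String, String> newer = new HashMap<>();
        newer.put("row1", TOMBSTONE); // row1 was deleted after the older file was written
        newer.put("row3", "c");
        List<Map<String, String>> files = Arrays.asList(older, newer);
        System.out.println("minor: " + compact(files, false));
        System.out.println("major: " + compact(files, true));
    }
}
```

In the minor result the marker for row1 survives, while the major result contains only row2 and row3.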

HFile:

The storage format for KeyValue data in HBase. An HFile is a binary file on Hadoop. A StoreFile is just a lightweight wrapper around an HFile, so underneath every StoreFile is an HFile.

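HLog:

HLog (WAL log): WAL stands for Write-Ahead Log and is used for disaster recovery. The HLog records every change to the data, so when a RegionServer crashes its data can be recovered from the HLog.
In early HBase releases the HLog file is an ordinary Hadoop SequenceFile. The key of each entry is an HLogKey object, which records where the written data belongs: besides the Table and Region names, it carries a sequence number and a timestamp. The timestamp is the write time; the sequence number starts at 0, or at the last sequence number persisted to the file system. The value of each entry is an HBase KeyValue object, corresponding to the KeyValue in an HFile.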
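
The recovery idea behind a write-ahead log can be shown with a few lines of Java. This is an in-memory illustration, not how HBase stores its log: the class `WalSketch` and its methods are hypothetical, and a list stands in for the durable log file on HDFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model: log every edit first, then apply it in memory.
class WalSketch {

    /** One logged edit; the sequence number orders edits within the log. */
    static final class Edit {
        final long sequenceNumber;
        final String row;
        final String value;

        Edit(long sequenceNumber, String row, String value) {
            this.sequenceNumber = sequenceNumber;
            this.row = row;
            this.value = value;
        }
    }

    private final List<Edit> log = new ArrayList<>();            // stand-in for the durable HLog
    private TreeMap<String, String> memStore = new TreeMap<>();  // volatile in-memory state
    private long nextSequenceNumber = 0;

    void put(String row, String value) {
        log.add(new Edit(nextSequenceNumber++, row, value)); // 1. append to the log first
        memStore.put(row, value);                            // 2. then update the MemStore
    }

    /** Simulates a RegionServer crash: everything that was only in memory is lost. */
    void crash() {
        memStore = new TreeMap<>();
    }

    /** Replays the log in sequence order to rebuild the in-memory state. */
    void recover() {
        for (Edit edit : log) {
            memStore.put(edit.row, edit.value);
        }
    }

    String get(String row) {
        return memStore.get(row);
    }

    public static void main(String[] args) {
        WalSketch server = new WalSketch();
        server.put("row1", "1");
        server.put("row1", "2");
        server.crash();
        System.out.println("after crash: " + server.get("row1"));
        server.recover();
        System.out.println("after replay: " + server.get("row1"));
    }
}
```

Replaying in append order matters: the later edit to row1 must be applied last so that the recovered value matches what the client last wrote.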

BlockCache: the read cache.

It lives in the RegionServer; each RegionServer has exactly one BlockCache.
The BlockCache is initialized when the RegionServer starts.
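
A read cache of this kind is commonly built on least-recently-used eviction. The sketch below shows that policy with the JDK's `LinkedHashMap`; it is an assumption-laden illustration (the class `LruBlockCacheSketch` is hypothetical and counts entries rather than bytes), and the cache implementation inside HBase is considerably more elaborate.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: evicts the least recently accessed entry when full.
class LruBlockCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruBlockCacheSketch(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() moves an entry to the newest position
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // called after each put; drops the oldest entry when over capacity
    }

    public static void main(String[] args) {
        LruBlockCacheSketch<String, String> cache = new LruBlockCacheSketch<>(2);
        cache.put("block-a", "data-a");
        cache.put("block-b", "data-b");
        cache.get("block-a");           // block-a is now the most recently used
        cache.put("block-c", "data-c"); // evicts block-b
        System.out.println(cache.keySet());
    }
}
```

Blocks that are read repeatedly stay in memory, while blocks touched once are the first to go when space runs out.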

3. HBase Physical Storage Model

[Figure: HBase physical storage model]

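Each HBase cell carries the following data:

RowKey: the primary key.
Column Family: splits the table vertically. A table can have several column families, and each column family holds multiple columns.
Column: the column (qualifier).
Version Number: a long. It defaults to the system timestamp and can also be set by the user.
Value: the stored value.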

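All rows in a Table are sorted in lexicographic order of RowKey.
A Table is divided along the row dimension into multiple HRegions.
HRegions are split by size. Each table starts with a single HRegion; as rows are inserted the HRegion grows, and once it reaches a threshold it splits into two new HRegions of roughly equal size. As the number of rows keeps growing, so does the number of HRegions.
The HRegion is the smallest unit of distributed storage and load balancing in HBase: different HRegions can be placed on different HRegionServers, but a single HRegion is never spread across several servers.
Although the HRegion is the smallest unit of load balancing, it is not the smallest unit of physical storage. An HRegion consists of one or more Stores, and each Store holds one Column Family. Each Store in turn consists of one MemStore and zero or more StoreFiles.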
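
Lexicographic ordering of row keys has a practical consequence for numeric keys, which the following sketch illustrates. The class `RowKeyOrderSketch` and the sample keys are hypothetical; the comparison treats keys as unsigned bytes from left to right, which is how byte-wise lexicographic order is defined.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo of byte-wise lexicographic row key ordering.
class RowKeyOrderSketch {

    /** Compares two keys byte by byte as unsigned values; a shorter prefix sorts first. */
    static int compareRowKeys(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Returns a copy of the keys sorted the way a table would store them. */
    static List<String> sortKeys(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((x, y) -> compareRowKeys(
                x.getBytes(StandardCharsets.UTF_8), y.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    public static void main(String[] args) {
        // Unpadded numbers do not sort numerically.
        System.out.println(sortKeys(Arrays.asList("row2", "row10", "row1")));
        // prints [row1, row10, row2]

        // Zero-padding to a fixed width restores numeric order.
        System.out.println(sortKeys(Arrays.asList("row02", "row10", "row01")));
        // prints [row01, row02, row10]
    }
}
```

Since range scans follow this order, fixed-width or zero-padded keys are the usual way to keep numerically adjacent rows next to each other.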

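4. HBase Read and Write Paths

A single range query in HBase may touch multiple Regions, multiple MemStores, and even multiple StoreFiles.
Updates and deletes are both implemented as inserts. An update does not modify the existing data in place; it writes a new version distinguished by timestamp. A delete does not remove the original data either; it writes a delete marker, similar to what is usually called a logical delete.
This greatly simplifies updates and deletes but puts extra pressure on reads, which have to filter the stored data by version and by delete marker.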
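
The read-time filtering can be modelled for a single cell as follows. This is a simplified sketch, not HBase code: `VersionedCellSketch` is a hypothetical class, and it models only one kind of delete marker, namely one that masks every version at or below a given timestamp.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of one cell that keeps every written version.
class VersionedCellSketch {
    // timestamp -> value, iterated newest first
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.<Long>reverseOrder());
    // versions with a timestamp at or below this marker are hidden from reads
    private long deleteMarkerTimestamp = Long.MIN_VALUE;

    /** An update is just another put with a newer timestamp. */
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    /** A delete only records a marker; no stored version is removed. */
    void deleteUpTo(long timestamp) {
        deleteMarkerTimestamp = Math.max(deleteMarkerTimestamp, timestamp);
    }

    /** Returns up to maxVersions visible values, newest first, skipping masked versions. */
    List<String> get(int maxVersions) {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<Long, String> entry : versions.entrySet()) {
            if (visible.size() >= maxVersions) {
                break;
            }
            if (entry.getKey() > deleteMarkerTimestamp) {
                visible.add(entry.getValue());
            }
        }
        return visible;
    }

    int storedVersions() {
        return versions.size();
    }

    public static void main(String[] args) {
        VersionedCellSketch cell = new VersionedCellSketch();
        cell.put(1L, "1");
        cell.put(2L, "2");
        System.out.println("latest: " + cell.get(1));
        cell.deleteUpTo(2L);
        System.out.println("after delete: " + cell.get(5) + ", still stored: " + cell.storedVersions());
        cell.put(3L, "3");
        System.out.println("after new put: " + cell.get(5));
    }
}
```

The write side never rewrites anything; all the work of deciding what is visible happens in `get`, which is the extra read cost the text refers to.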

4.1 HBase Read Path

[Figure: HBase read path]

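The client sends a read request.
It first asks ZooKeeper for the node that hosts the Region of the hbase:meta table.
It then looks up the RowKey in hbase:meta to find the target Region and the RegionServer that hosts it.
The read request is packaged and sent to that RegionServer. The RegionServer parses the request, looks up all matching data, and returns it.
On the RegionServer the lookup order is: MemStore --> BlockCache --> HFile.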
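
The lookup order on the RegionServer can be sketched as three tiers that are consulted in turn. This is a deliberately simplified model with hypothetical names (`ReadPathSketch` and its three maps); an actual RegionServer merges results from several sources rather than stopping at the first tier that has a value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical three-tier lookup: MemStore, then BlockCache, then HFile.
class ReadPathSketch {
    final Map<String, String> memStore = new HashMap<>();   // recent writes, in memory
    final Map<String, String> blockCache = new HashMap<>(); // recently read data, in memory
    final Map<String, String> hFile = new HashMap<>();      // stand-in for files on HDFS
    int fileReads = 0; // counts simulated reads that had to go to the file tier

    String get(String row) {
        String value = memStore.get(row);
        if (value != null) {
            return value;
        }
        value = blockCache.get(row);
        if (value != null) {
            return value;
        }
        value = hFile.get(row);
        if (value != null) {
            fileReads++;
            blockCache.put(row, value); // keep it in memory for the next read
        }
        return value;
    }

    public static void main(String[] args) {
        ReadPathSketch server = new ReadPathSketch();
        server.hFile.put("row1", "old");
        server.get("row1"); // first read goes to the file tier
        server.get("row1"); // second read is served from the BlockCache
        System.out.println("file reads: " + server.fileReads);
        server.memStore.put("row1", "new");
        System.out.println("row1 = " + server.get("row1"));
    }
}
```

Checking the MemStore first guarantees that a value written a moment ago is returned even if an older copy is still sitting in the cache or in a file.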

4.2 HBase Write Path

[Figure: HBase write path]

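The client sends a write request.
It first asks ZooKeeper for the node that hosts the Region of the hbase:meta table.
It then looks up the RowKey in hbase:meta to find the target Region and the RegionServer that hosts it.
The write request is sent to that RegionServer. The RegionServer parses it, writes the edit to the HLog first, and then writes it to the MemStore of the matching column family Store in the target Region.
When a flush is triggered, the MemStore contents are written asynchronously to a StoreFile.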

5. HBase Shell

5.1 Viewing the Help

hbase(main):005:0> help

5.2 Creating a Namespace

A namespace in HBase is similar to a schema in Oracle or a database in MySQL.

Group name: namespace
  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

alter_namespace: alter a namespace
create_namespace: create a namespace
describe_namespace: describe a namespace
drop_namespace: drop a namespace
list_namespace: list all namespaces
list_namespace_tables '<namespace name>': list all tables in the given namespace

hbase(main):014:0> create_namespace 'test'
Took 0.7759 seconds
hbase(main):015:0> list_namespace
NAMESPACE
SYSTEM
bigdata
default
hbase
test
8 row(s)
Took 0.0082 seconds
hbase(main):016:0> list_namespace_tables 'test'
TABLE
0 row(s)
Took 0.0060 seconds

5.3 Creating a Table

Group name: ddl
  Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters

View the help for the create command:

hbase(main):020:0> help 'create'
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
  hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'}

Table configuration options can be put at the end.
Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
  hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then

Create the table:

hbase(main):021:0> create 'test:demo', 'o'
Created table test:demo
Took 1.3342 seconds
=> Hbase::Table - test:demo

5.4 CRUD

5.4.1 Inserting Data

hbase(main):028:0> put 'test:demo', 'row1' , 'o:id', '1'
Took 0.0427 seconds

5.4.2 Viewing Data

View all data in the table:

hbase(main):022:0> scan 'test:demo'
ROW                                                         COLUMN+CELL
0 row(s)
Took 0.0342 seconds

View a single row:

hbase(main):029:0> get 'test:demo','row1'
COLUMN                                                      CELL
 o:id                                                       timestamp=1612256670990, value=1
 1 row(s)
Took 0.0230 seconds

5.4.3 Updating Data

hbase(main):030:0> put 'test:demo', 'row1' , 'o:id', '2'
Took 0.0094 seconds
hbase(main):031:0> get 'test:demo','row1'
COLUMN                                                      CELL
 o:id                                                       timestamp=1612256869333, value=2
 1 row(s)
Took 0.0123 seconds

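5.4.4 Deleting Data

The delete command removes only the latest version of a cell. If the cell has several versions, a get issued after the delete shows the previous version of that cell.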

hbase(main):032:0> delete 'test:demo', 'row1', 'o:id'
Took 0.0161 seconds
hbase(main):033:0> get 'test:demo','row1'
COLUMN                                                      CELL
 o:id                                                       timestamp=1612256670990, value=1
 1 row(s)
Took 0.0072 seconds

6. HBase API CRUD

6.1 POM File

<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.0-cdh5.16.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.2.0-cdh5.16.2</version>
</dependency>

6.2 HBaseUtils Code

package com.xk.bigdata.hbase.basic;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class HBaseUtils {

    public static Connection connection;

    /**
     * Opens the shared connection.
     *
     * @param zookeeperQuorum ZooKeeper quorum address
     * @throws Exception if the connection cannot be created
     */
    public static void init(String zookeeperQuorum) throws Exception {
        Configuration hbaseConf = new Configuration();
        hbaseConf.set(HConstants.ZOOKEEPER_QUORUM, zookeeperQuorum);
        Configuration conf = HBaseConfiguration.create(hbaseConf);
        connection = ConnectionFactory.createConnection(conf);
    }

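    /**
     * Closes the shared connection.
     *
     * @throws Exception if closing fails
     */
    public static void close() throws Exception {
        // Guard against close() being called before init()
        if (connection != null && !connection.isClosed()) {
            connection.close();
        }
    }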

    /**
     * Creates a table if it does not exist yet.
     *
     * @param tableName table name
     * @param familys   column family names
     * @throws IOException if the request to HBase fails
     */
    public static void createTable(String tableName, String[] familys) throws IOException {
        // Admin is Closeable; try-with-resources releases it after use
        try (Admin admin = connection.getAdmin()) {
            if (admin.tableExists(TableName.valueOf(tableName))) {
                System.out.println(tableName + " already exists");
            } else {
                HTableDescriptor tableDescriptor = new HTableDescriptor(TableName.valueOf(tableName));
                for (String family : familys) {
                    tableDescriptor.addFamily(new HColumnDescriptor(family));
                }
                admin.createTable(tableDescriptor);
                System.out.println(tableName + " created");
            }
        }
    }

    /**
     * Inserts one cell.
     *
     * @param tableName table name
     * @param rowKey    row key
     * @param family    column family
     * @param qualifier column qualifier
     * @param value     cell value
     * @throws IOException if the write fails
     */
    public static void putRecord(String tableName, String rowKey, String family, String qualifier, String value) throws IOException {
        // Table is Closeable; close it once the write is done
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
            table.put(put);
            System.out.println("Inserted column " + qualifier + " into " + tableName);
        }
    }

    /**
     * Reads one row and prints each of its cells.
     *
     * @param tableName table name
     * @param rowKey    row key
     * @throws IOException if the read fails
     */
    public static void getOneRecord(String tableName, String rowKey) throws IOException {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            Get get = new Get(Bytes.toBytes(rowKey));
            Result result = table.get(get);
            // Print each cell as "row : family:qualifier : value"
            for (Cell cell : result.rawCells()) {
                System.out.println(Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength())
                        + " : " + Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength())
                        + ":" + Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength())
                        + " : " + Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
            }
        }
    }

    /**
     * Scans the whole table and prints every cell.
     *
     * @param tableName table name
     * @throws IOException if the scan fails
     */
    public static void getAllRecord(String tableName) throws IOException {
        Scan scan = new Scan();
        // Both Table and ResultScanner hold resources and are closed here
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner scanner = table.getScanner(scan)) {
            for (Result result : scanner) {
                for (Cell cell : result.rawCells()) {
                    System.out.println(Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength())
                            + " : " + Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength())
                            + ":" + Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength())
                            + " : " + Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
                }
            }
        }
    }

    /**
     * Deletes one row.
     *
     * @param tableName table name
     * @param rowKey    row key
     * @throws IOException if the delete fails
     */
    public static void deleteRecord(String tableName, String rowKey) throws IOException {
        try (Table table = connection.getTable(TableName.valueOf(tableName))) {
            // The Delete must be keyed by the row key, not by the table name
            Delete delete = new Delete(Bytes.toBytes(rowKey));
            table.delete(delete);
            System.out.println("Deleted row " + rowKey + " from " + tableName);
        }
    }
}

6.3 HBaseUtilsTest Code

package com.xk.bigdata.hbase.basic;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;

public class HBaseUtilsTest {

    final String zookeeperQuorum = "bigdatatest02";

    @Before
    public void setUp() throws Exception {
        HBaseUtils.init(zookeeperQuorum);
    }

    @After
    public void cleanUp() throws Exception {
        HBaseUtils.close();
    }

    @Test
    public void testCreateTable() throws Exception {
        HBaseUtils.createTable("test:demo1", new String[]{"o"});
    }

    @Test
    public void testPutRecord() throws Exception {
        HBaseUtils.putRecord("test:demo1", "row2", "o", "id", "2");
    }

    @Test
    public void testGetOneRecord() throws IOException {
        HBaseUtils.getOneRecord("test:demo1", "row1");
    }

    @Test
    public void testGetAllRecord() throws IOException {
        HBaseUtils.getAllRecord("test:demo1");
    }

    @Test
    public void testDeleteRecord() throws IOException {
        HBaseUtils.deleteRecord("test:demo1", "row2");
    }
}

7. HBase API (Multi-Version Control)

7.1 HBaseMultiVersion Code

package com.xk.bigdata.hbase.basic;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;

public class HBaseMultiVersion {

    public static Connection connection;

    /**
     * Opens the shared connection.
     *
     * @param zookeeperQuorum ZooKeeper quorum address
     * @throws Exception if the connection cannot be created
     */
    public static void init(String zookeeperQuorum) throws Exception {
        Configuration hbaseConf = new Configuration();
        hbaseConf.set(HConstants.ZOOKEEPER_QUORUM, zookeeperQuorum);
        Configuration conf = HBaseConfiguration.create(hbaseConf);
        connection = ConnectionFactory.createConnection(conf);
    }

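    /**
     * Closes the shared connection.
     *
     * @throws Exception if closing fails
     */
    public static void close() throws Exception {
        // Guard against close() being called before init()
        if (connection != null && !connection.isClosed()) {
            connection.close();
        }
    }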

    /**
     * Scans the whole table and prints up to the given number of versions of each cell.
     *
     * @param tableName table name
     * @param version   maximum number of versions to return per cell;
     *                  a value of 0 or less is treated here as "all available versions"
     * @throws IOException if the scan fails
     */
    public static void getAllVersionRecord(String tableName, Integer version) throws IOException {
        Scan scan = new Scan();
        if (version <= 0) {
            scan.setMaxVersions();        // no argument: return all available versions
        } else {
            scan.setMaxVersions(version); // cap the number of versions per cell
        }
        try (Table table = connection.getTable(TableName.valueOf(tableName));
             ResultScanner scanner = table.getScanner(scan)) {
            for (Result result : scanner) {
                for (Cell cell : result.rawCells()) {
                    System.out.println(Bytes.toString(cell.getRowArray(), cell.getRowOffset(), cell.getRowLength())
                            + " : " + Bytes.toString(cell.getFamilyArray(), cell.getFamilyOffset(), cell.getFamilyLength())
                            + ":" + Bytes.toString(cell.getQualifierArray(), cell.getQualifierOffset(), cell.getQualifierLength())
                            + " : " + Bytes.toString(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
                }
            }
        }
    }

}

7.2 HBaseMultiVersionTest Code

package com.xk.bigdata.hbase.basic;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.io.IOException;

public class HBaseMultiVersionTest {

    final String zookeeperQuorum = "bigdatatest02";

    @Before
    public void setUp() throws Exception {
        HBaseMultiVersion.init(zookeeperQuorum);
    }

    @After
    public void cleanUp() throws Exception {
        HBaseMultiVersion.close();
    }

    @Test
    public void testGetAllVersionRecord() throws IOException {
        HBaseMultiVersion.getAllVersionRecord("test:demo1", -1);
    }

}
