1. Background

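Business scenario:
Given an IMSI identifier, return a map of the user's historical footprints. A user may have used many SIM cards, so the records can reach the tens of millions. Keeping them in a relational database would put it under heavy load, so the data is stored in HBase and read through the Java API.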

2. Solution steps


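① Calling HBase requires importing a set of dependency packages, much like a MySQL driver. The project needs the client API they provide in order to connect, so first look up the HBase client dependency in the Maven Central repository: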


Adding it to the pom file pulls in the related dependencies.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>cn.sibat</groupId>
    <artifactId>uhuibao-hbase</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>war</packaging>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.glassfish.jersey</groupId>
                <artifactId>jersey-bom</artifactId>
                <version>${jersey.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <dependency>
            <groupId>org.glassfish.jersey.containers</groupId>
            <artifactId>jersey-container-servlet</artifactId>
        </dependency>

        <dependency>
            <groupId>net.sf.json-lib</groupId>
            <artifactId>json-lib</artifactId>
            <version>2.4</version>
            <classifier>jdk15</classifier>
        </dependency>

        <!-- Connect to HBase -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>1.0.3</version>
        </dependency>
        <dependency>
            <groupId>jdk.tools</groupId>
            <artifactId>jdk.tools</artifactId>
            <version>1.7</version>
            <scope>system</scope>
            <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
        </dependency>


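        <!-- The dependencies below pin the Hadoop version that the hbase dependency above relies on to the cluster's Hadoop version. They also cause a Jersey conflict, because the newer Hadoop jars transitively depend on Jersey 1.9, which clashes with Jersey 2.6.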
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-annotations</artifactId>
            <version>${hadoop-version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-auth</artifactId>
            <version>${hadoop-version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop-version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>${hadoop-version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-yarn-api</artifactId>
            <version>${hadoop-version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-yarn-common</artifactId>
            <version>${hadoop-version}</version>
        </dependency>
      -->

    </dependencies>


    <properties>
        <jersey.version>2.6</jersey.version>
        <hadoop-version>2.6.4</hadoop-version>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

</project>

Default dependencies:

[Two screenshots, not preserved]


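This conflict prevents the web project from creating the Jersey instance in the servlet container, so there is no need to keep the web project's Hadoop jar versions identical to the cluster's. Forcing them to match produces the following error: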

SEVERE: Servlet [Jersey Web Application] in web application [/uhuibao-track] threw load() exception
java.lang.ClassNotFoundException: org.glassfish.jersey.servlet.ServletContainer
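
When two Jersey generations end up on the same classpath, it helps to see which jar (if any) each servlet container class is actually loaded from. The following is a minimal sketch using only the JDK; the `ClassLocator` helper is hypothetical and not part of the original project, and the Jersey 1.x class name is included on the assumption that the Hadoop jars bring it in.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of the original project.
class ClassLocator {

    /** Returns where a class is loaded from, or null if it cannot be loaded. */
    static String locate(String className) {
        try {
            ClassLoader cl = Thread.currentThread().getContextClassLoader();
            // initialize=false: we only want to find the class, not run its static code
            Class<?> c = Class.forName(className, false, cl);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes that ship with the JDK have no CodeSource
            return src == null ? "(bootstrap/JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.glassfish.jersey.servlet.ServletContainer",        // Jersey 2.x
            "com.sun.jersey.spi.container.servlet.ServletContainer" // Jersey 1.x (assumed culprit)
        };
        for (String n : names) {
            System.out.println(n + " -> " + locate(n));
        }
    }
}
```

Calling `locate` from inside the web application (for example from a startup listener) reports the jar seen by the webapp class loader, which is the one that matters for the `ClassNotFoundException` above.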


② With the jars in place, a small test case:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class UserTest {
    public static void main(String[] args) {
        // HBaseConfiguration.create() also loads hbase-default.xml and hbase-site.xml,
        // which a plain new Configuration() does not
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum",
                "192.168.2.10,192.168.2.13,192.168.2.6,192.168.2.7");
        // try-with-resources closes the table and the connection even if the read fails
        try (Connection conn = ConnectionFactory.createConnection(conf);
                Table t = conn.getTable(TableName.valueOf("uhuibao"))) {
            Get get = new Get(Bytes.toBytes("row"));
            Result result = t.get(get);
            System.out.println(result);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This test case fails with:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. (On Windows, winutils.exe and several DLLs are required.)
Opening socket connection to server 192.168.2.7/192.168.2.7:2181. Will not attempt to authenticate using SASL (unknown error) (no login configuration could be located)
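
The `null` in `null\bin\winutils.exe` is the unset Hadoop home directory. A common workaround on a Windows development machine is to point the `hadoop.home.dir` system property (or the `HADOOP_HOME` environment variable) at a folder that contains `bin\winutils.exe` before any Hadoop class is used. The sketch below only checks for the file and sets the property; the `HadoopHomeCheck` class and the `C:\hadoop` path are hypothetical.

```java
import java.io.File;

// Hypothetical helper, not part of the original project.
class HadoopHomeCheck {

    /** True if <hadoopHome>/bin/winutils.exe exists as a regular file. */
    static boolean hasWinutils(File hadoopHome) {
        return new File(new File(hadoopHome, "bin"), "winutils.exe").isFile();
    }

    public static void main(String[] args) {
        // Assumed install location; pass the real one as the first argument.
        File home = new File(args.length > 0 ? args[0] : "C:\\hadoop");
        if (hasWinutils(home)) {
            // Must be set before the first Hadoop/HBase class is loaded
            System.setProperty("hadoop.home.dir", home.getAbsolutePath());
            System.out.println("hadoop.home.dir=" + System.getProperty("hadoop.home.dir"));
        } else {
            System.out.println("winutils.exe not found under " + home);
        }
    }
}
```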


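③ The cluster is hosted at another company, and tunneling to each of the four nodes separately is awkward, so the test case is run directly on the server.
Because the controller layer uses a Web Service framework, the HBase access can be split out into its own subproject; the HBase dependencies alone add up to tens of megabytes.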

Data access layer:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseDAO {
    // Shared client configuration, built once when the class is loaded
    private static final Configuration config;
    static {
        config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum",
                "192.168.2.6,192.168.2.7,192.168.2.10,192.168.2.13");
        config.set("hbase.zookeeper.property.clientPort", "2181");
    }

    public static Result getResult(String tableName, String rowKey)
            throws IOException {
        Get get = new Get(Bytes.toBytes(rowKey));
        HTable table = new HTable(config, Bytes.toBytes(tableName));
        try {
            Result result = table.get(get);
            // result.list() returns null for a missing row, so check first
            if (!result.isEmpty()) {
                for (KeyValue kv : result.list()) {
                    System.out.println("family:" + Bytes.toString(kv.getFamily()));
                    System.out.println("qualifier:" + Bytes.toString(kv.getQualifier()));
                    System.out.println("value:" + Bytes.toString(kv.getValue()));
                    System.out.println("Timestamp:" + kv.getTimestamp());
                    System.out.println("-------------------------------------------");
                }
            }
            return result;
        } finally {
            // Release the table even if the read throws
            table.close();
        }
    }
}

Controller layer:

import java.io.IOException;
import javax.ws.rs.DefaultValue;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.log4j.Logger;
import cn.sibat.uhuibao.hbase.dao.HBaseDAO;
import cn.sibat.uhuibao.hbase.util.JsonUtil;
import cn.sibat.uhuibao.hbase.util.Status;

@Path("hbase")
public class UserTrackApi {
    static Logger log = Logger.getLogger(UserTrackApi.class);
    @GET
    @Path("user_track")
    @Produces(MediaType.APPLICATION_JSON)
    public String getUserTrackByImsi() {
        try {
            Result result = HBaseDAO.getResult("uhuibao", "row1");
            log.debug(result);
        } catch (IOException e) {
            e.printStackTrace();
            log.error(e);
        }
        return JsonUtil.getResponse(Status.OK).toString();
    }
}

④ With the Hadoop*.jar dependencies that override HBase's default Hadoop version commented out, the test produces the following output.

[datum@webserver hbase-test]$ java -jar hbase-test.jar
get:{"timeRange":[0,9223372036854775807],"totalColumns":0,"cacheBlocks":true,"families":{},"maxVersions":1,"row":"row1"}
15:14:01,360 WARN [NativeCodeLoader:62] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15:14:01,530 INFO [RecoverableZooKeeper:120] Process identifier=hconnection-0x2e3f9952 connecting to ZooKeeper ensemble=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181
15:14:01,536 INFO [ZooKeeper:100] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
15:14:01,536 INFO [ZooKeeper:100] Client environment:host.name=webserver
15:14:01,536 INFO [ZooKeeper:100] Client environment:java.version=1.7.0_55
15:14:01,536 INFO [ZooKeeper:100] Client environment:java.vendor=Oracle Corporation
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.home=/usr/java/jdk1.7.0_55/jre
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.class.path=hbase-test.jar
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.io.tmpdir=/tmp
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.compiler=<NA>
15:14:01,537 INFO [ZooKeeper:100] Client environment:os.name=Linux
15:14:01,537 INFO [ZooKeeper:100] Client environment:os.arch=amd64
15:14:01,537 INFO [ZooKeeper:100] Client environment:os.version=2.6.32-504.el6.x86_64
15:14:01,537 INFO [ZooKeeper:100] Client environment:user.name=datum
15:14:01,537 INFO [ZooKeeper:100] Client environment:user.home=/home/datum
15:14:01,539 INFO [ZooKeeper:100] Client environment:user.dir=/home/datum/hbase-test
15:14:01,540 INFO [ZooKeeper:438] Initiating client connection, connectString=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181 sessionTimeout=90000 watcher=hconnection-0x2e3f99520x0, quorum=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181, baseZNode=/hbase
15:14:01,557 INFO [ClientCnxn:975] Opening socket connection to server 192.168.2.13/192.168.2.13:2181. Will not attempt to authenticate using SASL (unknown error)
15:14:01,568 INFO [ClientCnxn:852] Socket connection established to 192.168.2.13/192.168.2.13:2181, initiating session
15:14:01,891 INFO [ClientCnxn:1235] Session establishment complete on server 192.168.2.13/192.168.2.13:2181, sessionid = 0x157a723d1610013, negotiated timeout = 40000
table:uhuibao;hconnection-0x2e3f9952

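The ZooKeeper session is established, but no result is printed. The first suspicion was that a port was unreachable.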
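
A quick way to confirm or rule out that suspicion is a plain TCP connect with a timeout against each node. This is a minimal sketch using only the JDK; the `PortProbe` class is hypothetical. Port 2181 is the ZooKeeper client port from the configuration above, while 16000 and 16020 are assumed HBase 1.x defaults for the master and region server and may differ on the actual cluster.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of the original project.
class PortProbe {

    /** True if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or host not resolvable
            return false;
        }
    }

    public static void main(String[] args) {
        String[] hosts = {"192.168.2.6", "192.168.2.7", "192.168.2.10", "192.168.2.13"};
        int[] ports = {2181, 16000, 16020}; // 16000/16020 are assumed defaults
        for (String h : hosts) {
            for (int p : ports) {
                System.out.println(h + ":" + p + " reachable=" + isReachable(h, p, 2000));
            }
        }
    }
}
```

If every port answers, as the established ZooKeeper session in the log already suggests for 2181, the problem lies elsewhere.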


⑤ A blog post eventually pointed to the likely cause: a problem with the hosts file.

[datum@webserver hbase-test]$ su
Password:
[root@webserver hbase-test]# vim /etc/hosts

[Screenshot of /etc/hosts, not preserved]


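The cause was that the hosts file had no mappings for the four nodes of the Hadoop cluster. The HBase client receives region server locations as hostnames, so it must be able to resolve them. After adding the entries and testing again, the read succeeds. The line "Will not attempt to authenticate using SASL (unknown error)" still appears in the log of the successful run; it is an INFO message, not the failure itself.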
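
Whether the web server can resolve the cluster's hostnames can be checked from Java before starting the application. The sketch below is a minimal, JDK-only example; the `HostCheck` class is hypothetical, and the node hostnames are not given in this article, so they are taken from the command line.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper, not part of the original project.
class HostCheck {

    /** Returns the resolved IP address as text, or null if the name cannot be resolved. */
    static String resolve(String hostname) {
        try {
            return InetAddress.getByName(hostname).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Pass the cluster hostnames as arguments, e.g. the names that appear
        // in /etc/hosts on the cluster nodes themselves.
        for (String name : args) {
            String ip = resolve(name);
            System.out.println(name + " -> " + (ip == null ? "UNRESOLVED" : ip));
        }
    }
}
```

Any name reported as `UNRESOLVED` needs an `IP hostname` line in `/etc/hosts` on the machine running the client.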


⑥ Output of the successful test.

[datum@webserver hbase-test]$ java -jar hbase-test.jar 
get:{"timeRange":[0,9223372036854775807],"totalColumns":0,"cacheBlocks":true,"families":{},"maxVersions":1,"row":"row1"}
15:37:38,057 WARN  [NativeCodeLoader:62] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15:37:38,223 INFO  [RecoverableZooKeeper:120] Process identifier=hconnection-0x2e3f9952 connecting to ZooKeeper ensemble=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181
15:37:38,229 INFO  [ZooKeeper:100] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
15:37:38,229 INFO  [ZooKeeper:100] Client environment:host.name=webserver
15:37:38,229 INFO  [ZooKeeper:100] Client environment:java.version=1.7.0_55
15:37:38,229 INFO  [ZooKeeper:100] Client environment:java.vendor=Oracle Corporation
15:37:38,229 INFO  [ZooKeeper:100] Client environment:java.home=/usr/java/jdk1.7.0_55/jre
15:37:38,230 INFO  [ZooKeeper:100] Client environment:java.class.path=hbase-test.jar
15:37:38,230 INFO  [ZooKeeper:100] Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
15:37:38,230 INFO  [ZooKeeper:100] Client environment:java.io.tmpdir=/tmp
15:37:38,230 INFO  [ZooKeeper:100] Client environment:java.compiler=<NA>
15:37:38,230 INFO  [ZooKeeper:100] Client environment:os.name=Linux
15:37:38,230 INFO  [ZooKeeper:100] Client environment:os.arch=amd64
15:37:38,230 INFO  [ZooKeeper:100] Client environment:os.version=2.6.32-504.el6.x86_64
15:37:38,230 INFO  [ZooKeeper:100] Client environment:user.name=datum
15:37:38,230 INFO  [ZooKeeper:100] Client environment:user.home=/home/datum
15:37:38,232 INFO  [ZooKeeper:100] Client environment:user.dir=/home/datum/hbase-test
15:37:38,233 INFO  [ZooKeeper:438] Initiating client connection, connectString=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181 sessionTimeout=90000 watcher=hconnection-0x2e3f99520x0, quorum=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181, baseZNode=/hbase
15:37:38,250 INFO  [ClientCnxn:975] Opening socket connection to server 192.168.2.10/192.168.2.10:2181. Will not attempt to authenticate using SASL (unknown error)
15:37:38,256 INFO  [ClientCnxn:852] Socket connection established to 192.168.2.10/192.168.2.10:2181, initiating session
15:37:38,353 INFO  [ClientCnxn:1235] Session establishment complete on server 192.168.2.10/192.168.2.10:2181, sessionid = 0x57a724d6ad0014, negotiated timeout = 40000
table:uhuibao;hconnection-0x2e3f9952
result:[row1/ODChain:date/1476956451226/Put/vlen=5/seqid=0, row1/ODChain:imsi/1476956451221/Put/vlen=5/seqid=0, row1/ODChain:route/1476956451230/Put/vlen=6/seqid=0]
hellowrld
family:ODChain
qualifier:date
value:date1
Timestamp:1476956451226
-------------------------------------------
hellowrld
family:ODChain
qualifier:imsi
value:imsi1
Timestamp:1476956451221
-------------------------------------------
hellowrld
family:ODChain
qualifier:route
value:route1
Timestamp:1476956451230
-------------------------------------------
keyvalues={row1/ODChain:date/1476956451226/Put/vlen=5/seqid=0, row1/ODChain:imsi/1476956451221/Put/vlen=5/seqid=0, row1/ODChain:route/1476956451230/Put/vlen=6/seqid=0}
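
The `Timestamp` values in this output are cell timestamps in milliseconds since the Unix epoch. A small sketch for turning them into readable UTC times, using `SimpleDateFormat` so that it also runs on the JDK 1.7 shown in the log; the `CellTimestamp` class is hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of the original project.
class CellTimestamp {

    /** Formats epoch milliseconds as "yyyy-MM-dd HH:mm:ss.SSS" in UTC. */
    static String toUtc(long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        // Timestamp of the ODChain:date cell in the output above
        System.out.println(toUtc(1476956451226L)); // prints 2016-10-20 09:40:51.226
    }
}
```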


Author: @nanphonfy