1. Background
Business scenario:
Given an IMSI identifier, return a map of the user's historical track. Because users hold many SIM cards, the records run into the tens of millions; a relational database would clearly struggle under that load, so the data is stored in HBase and queried through the Java API.
2. Solution Steps
① Calling HBase requires importing a set of dependency packages, much like a MySQL driver. For the project to connect, it needs the client API HBase provides, so we first search the Maven Central repository for the HBase client dependency:
Adding it to the pom file pulls in the relevant dependencies.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>cn.sibat</groupId>
<artifactId>uhuibao-hbase</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>war</packaging>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.glassfish.jersey</groupId>
<artifactId>jersey-bom</artifactId>
<version>${jersey.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>org.glassfish.jersey.containers</groupId>
<artifactId>jersey-container-servlet</artifactId>
</dependency>
<dependency>
<groupId>net.sf.json-lib</groupId>
<artifactId>json-lib</artifactId>
<version>2.4</version>
<classifier>jdk15</classifier>
</dependency>
<!-- Connect to HBase -->
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>1.0.3</version>
</dependency>
<dependency>
<groupId>jdk.tools</groupId>
<artifactId>jdk.tools</artifactId>
<version>1.7</version>
<scope>system</scope>
<systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
</dependency>
<!-- The following dependencies pin the Hadoop artifacts pulled in by hbase-client to the same version as the cluster's Hadoop, but they also trigger a Jersey conflict: the transitive dependencies of the newer Hadoop jars bring in Jersey 1.9, which clashes with Jersey 2.6.
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-annotations</artifactId>
<version>${hadoop-version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<version>${hadoop-version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop-version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>${hadoop-version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-api</artifactId>
<version>${hadoop-version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
<version>${hadoop-version}</version>
</dependency>
-->
</dependencies>
<properties>
<jersey.version>2.6</jersey.version>
<hadoop-version>2.6.4</hadoop-version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
</project>
Default dependency tree (screenshot omitted).

Keeping those Hadoop versions in the pom prevents the web project from instantiating Jersey in the servlet container, so there is no need to keep the web project's jars at the same version as the Hadoop cluster. Otherwise the following error appears:
SEVERE: Servlet [Jersey Web Application] in web application [/uhuibao-track] threw load() exception
java.lang.ClassNotFoundException: org.glassfish.jersey.servlet.ServletContainer
② A small test case once the jars are in place:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
public class UserTest {
    public static void main(String[] args) {
        // Point the client at the cluster's ZooKeeper quorum.
        Configuration conf = new Configuration();
        conf.set("hbase.zookeeper.quorum",
                "192.168.2.10,192.168.2.13,192.168.2.6,192.168.2.7");
        Connection conn = null;
        try {
            conn = ConnectionFactory.createConnection(conf);
            TableName tableName = TableName.valueOf("uhuibao");
            Table t = conn.getTable(tableName);
            // Fetch a single row by rowkey and print it.
            Get get = new Get(Bytes.toBytes("row"));
            Result result = t.get(get);
            System.out.println(result);
            t.close();
            conn.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
This case throws errors:
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. (the Windows environment needs winutils.exe and the accompanying DLLs)
Opening socket connection to server 192.168.2.7/192.168.2.7:2181. Will not attempt to authenticate using SASL (unknown error) (unable to locate a login configuration)
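If the test does need to run locally on Windows, a common workaround for the winutils.exe error is to download winutils.exe into a local Hadoop directory and point hadoop.home.dir at it before creating the connection. A minimal sketch, assuming winutils.exe sits at C:\hadoop\bin (the path and class name are illustrative, not part of the project):

public class WindowsTestLauncher {
    public static void main(String[] args) {
        // Illustrative path: winutils.exe must exist at C:\hadoop\bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\hadoop");
        // Then create the HBase connection exactly as in UserTest above.
        UserTest.main(args);
    }
}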
③ Since the cluster belongs to another company and setting up tunnels to each of the four nodes is awkward, the test case can simply be copied to the server and run there.
Because a Web Service framework serves as the controller layer, it is worth extracting this into a separate sub-project: the HBase dependencies alone amount to several tens of megabytes.
Data access layer:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
public class HBaseDAO {
    private static Configuration config = null;

    static {
        // Shared client configuration: ZooKeeper quorum and client port of the cluster.
        config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum",
                "192.168.2.6,192.168.2.7,192.168.2.10,192.168.2.13");
        config.set("hbase.zookeeper.property.clientPort", "2181");
    }

    public static Result getResult(String tableName, String rowKey)
            throws IOException {
        // Fetch a single row by rowkey and print every cell it contains.
        Get get = new Get(Bytes.toBytes(rowKey));
        HTable table = new HTable(config, Bytes.toBytes(tableName));
        Result result = table.get(get);
        for (KeyValue kv : result.list()) {
            System.out.println("family:" + Bytes.toString(kv.getFamily()));
            System.out.println("qualifier:" + Bytes.toString(kv.getQualifier()));
            System.out.println("value:" + Bytes.toString(kv.getValue()));
            System.out.println("Timestamp:" + kv.getTimestamp());
            System.out.println("-------------------------------------------");
        }
        table.close();
        return result;
    }
}
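The DAO above fetches a single row by rowkey. For the business scenario of returning a user's whole track history, the usual pattern is a scan over all rows that share the IMSI as a rowkey prefix. A minimal sketch of that, assuming rowkeys are prefixed with the IMSI (the class name and rowkey layout are illustrative, not part of the project above):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class UserTrackScanner {
    // Scan all rows whose rowkey starts with the given IMSI (hypothetical rowkey layout).
    public static List<Result> scanByImsiPrefix(String tableName, String imsi) throws IOException {
        Configuration config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", "192.168.2.6,192.168.2.7,192.168.2.10,192.168.2.13");
        config.set("hbase.zookeeper.property.clientPort", "2181");

        List<Result> rows = new ArrayList<Result>();
        Connection conn = ConnectionFactory.createConnection(config);
        try {
            Table table = conn.getTable(TableName.valueOf(tableName));
            Scan scan = new Scan(Bytes.toBytes(imsi));              // start at the first IMSI row
            scan.setFilter(new PrefixFilter(Bytes.toBytes(imsi)));  // keep only rows with that prefix
            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result r : scanner) {
                    rows.add(r);
                }
            } finally {
                scanner.close();
                table.close();
            }
        } finally {
            conn.close();
        }
        return rows;
    }
}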
Controller layer:
import java.io.IOException;
import javax.ws.rs.DefaultValue;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.log4j.Logger;
import cn.sibat.uhuibao.hbase.dao.HBaseDAO;
import cn.sibat.uhuibao.hbase.util.JsonUtil;
import cn.sibat.uhuibao.hbase.util.Status;
@Path("hbase")
public class UserTrackApi {
    static Logger log = Logger.getLogger(UserTrackApi.class);

    @GET
    @Path("user_track")
    @Produces(MediaType.APPLICATION_JSON)
    public String getUserTrackByImsi() {
        try {
            // Delegate to the DAO; "row1" is a fixed test rowkey for now.
            Result result = HBaseDAO.getResult("uhuibao", "row1");
            log.debug(result);
        } catch (IOException e) {
            e.printStackTrace();
            log.error(e);
        }
        return JsonUtil.getResponse(Status.OK).toString();
    }
}
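JsonUtil and Status are project-internal helpers not shown in this post. For completeness, here is a hedged sketch of how a Result could be flattened to JSON using the json-lib dependency already declared in the pom (the class and method names are illustrative, not the project's actual JsonUtil):

import java.util.LinkedHashMap;
import java.util.Map;
import net.sf.json.JSONObject;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class ResultJson {
    // Flatten one HBase Result into a {"qualifier": "value", ...} JSON object.
    public static JSONObject toJson(Result result) {
        Map<String, String> columns = new LinkedHashMap<String, String>();
        if (result != null && result.listCells() != null) {
            for (Cell cell : result.listCells()) {
                columns.put(Bytes.toString(CellUtil.cloneQualifier(cell)),
                        Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
        return JSONObject.fromObject(columns);
    }
}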
④ After commenting out the Hadoop*.jar dependencies that override the Hadoop version bundled with hbase-client, the test gives the following result.
[datum@webserver hbase-test]$ java -jar hbase-test.jar
get:{"timeRange":[0,9223372036854775807],"totalColumns":0,"cacheBlocks":true,"families":{},"maxVersions":1,"row":"row1"}
15:14:01,360 WARN [NativeCodeLoader:62] Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
15:14:01,530 INFO [RecoverableZooKeeper:120] Process identifier=hconnection-0x2e3f9952 connecting to ZooKeeper ensemble=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181
15:14:01,536 INFO [ZooKeeper:100] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
15:14:01,536 INFO [ZooKeeper:100] Client environment:host.name=webserver
15:14:01,536 INFO [ZooKeeper:100] Client environment:java.version=1.7.0_55
15:14:01,536 INFO [ZooKeeper:100] Client environment:java.vendor=Oracle Corporation
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.home=/usr/java/jdk1.7.0_55/jre
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.class.path=hbase-test.jar
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.io.tmpdir=/tmp
15:14:01,537 INFO [ZooKeeper:100] Client environment:java.compiler=<NA>
15:14:01,537 INFO [ZooKeeper:100] Client environment:os.name=Linux
15:14:01,537 INFO [ZooKeeper:100] Client environment:os.arch=amd64
15:14:01,537 INFO [ZooKeeper:100] Client environment:os.version=2.6.32-504.el6.x86_64
15:14:01,537 INFO [ZooKeeper:100] Client environment:user.name=datum
15:14:01,537 INFO [ZooKeeper:100] Client environment:user.home=/home/datum
15:14:01,539 INFO [ZooKeeper:100] Client environment:user.dir=/home/datum/hbase-test
15:14:01,540 INFO [ZooKeeper:438] Initiating client connection, connectString=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181 sessionTimeout=90000 watcher=hconnection-0x2e3f99520x0, quorum=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181, baseZNode=/hbase
15:14:01,557 INFO [ClientCnxn:975] Opening socket connection to server 192.168.2.13/192.168.2.13:2181. Will not attempt to authenticate using SASL (unknown error)
15:14:01,568 INFO [ClientCnxn:852] Socket connection established to 192.168.2.13/192.168.2.13:2181, initiating session
15:14:01,891 INFO [ClientCnxn:1235] Session establishment complete on server 192.168.2.13/192.168.2.13:2181, sessionid = 0x157a723d1610013, negotiated timeout = 40000
table:uhuibao;hconnection-0x2e3f9952
The run stalled at this point, so at first we suspected the ZooKeeper port (2181) was unreachable.
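To rule that out, a quick TCP reachability check against port 2181 on each node can be run from the web server. A small sketch using plain JDK sockets, nothing project-specific:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    public static void main(String[] args) throws IOException {
        // The four ZooKeeper quorum nodes used above.
        String[] hosts = {"192.168.2.6", "192.168.2.7", "192.168.2.10", "192.168.2.13"};
        for (String host : hosts) {
            Socket socket = new Socket();
            try {
                socket.connect(new InetSocketAddress(host, 2181), 3000); // 3-second timeout
                System.out.println(host + ":2181 reachable");
            } catch (IOException e) {
                System.out.println(host + ":2181 NOT reachable: " + e.getMessage());
            } finally {
                socket.close();
            }
        }
    }
}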
⑤ A blog post finally provided the clue: the problem might be the hosts file.
[datum@webserver hbase-test]$ su
Password:
[root@webserver hbase-test]# vim /etc/hosts
The cause was that the hosts file had no entries mapping the four Hadoop cluster nodes. After adding the mappings and testing again, the client no longer stalled after the "Will not attempt to authenticate using SASL (unknown error)" message.
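For reference, the added entries look like the following (the hostnames are illustrative; the IPs are the quorum nodes used above):

192.168.2.6    hadoop-node1
192.168.2.7    hadoop-node2
192.168.2.10   hadoop-node3
192.168.2.13   hadoop-node4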
⑥ Successful test result.
[datum@webserver hbase-test]$ java -jar hbase-test.jar
get:{"timeRange":[0,9223372036854775807],"totalColumns":0,"cacheBlocks":true,"families":{},"maxVersions":1,"row":"row1"}
15:37:38,057 WARN [NativeCodeLoader:62] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15:37:38,223 INFO [RecoverableZooKeeper:120] Process identifier=hconnection-0x2e3f9952 connecting to ZooKeeper ensemble=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181
15:37:38,229 INFO [ZooKeeper:100] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
15:37:38,229 INFO [ZooKeeper:100] Client environment:host.name=webserver
15:37:38,229 INFO [ZooKeeper:100] Client environment:java.version=1.7.0_55
15:37:38,229 INFO [ZooKeeper:100] Client environment:java.vendor=Oracle Corporation
15:37:38,229 INFO [ZooKeeper:100] Client environment:java.home=/usr/java/jdk1.7.0_55/jre
15:37:38,230 INFO [ZooKeeper:100] Client environment:java.class.path=hbase-test.jar
15:37:38,230 INFO [ZooKeeper:100] Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
15:37:38,230 INFO [ZooKeeper:100] Client environment:java.io.tmpdir=/tmp
15:37:38,230 INFO [ZooKeeper:100] Client environment:java.compiler=<NA>
15:37:38,230 INFO [ZooKeeper:100] Client environment:os.name=Linux
15:37:38,230 INFO [ZooKeeper:100] Client environment:os.arch=amd64
15:37:38,230 INFO [ZooKeeper:100] Client environment:os.version=2.6.32-504.el6.x86_64
15:37:38,230 INFO [ZooKeeper:100] Client environment:user.name=datum
15:37:38,230 INFO [ZooKeeper:100] Client environment:user.home=/home/datum
15:37:38,232 INFO [ZooKeeper:100] Client environment:user.dir=/home/datum/hbase-test
15:37:38,233 INFO [ZooKeeper:438] Initiating client connection, connectString=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181 sessionTimeout=90000 watcher=hconnection-0x2e3f99520x0, quorum=192.168.2.13:2181,192.168.2.10:2181,192.168.2.7:2181,192.168.2.6:2181, baseZNode=/hbase
15:37:38,250 INFO [ClientCnxn:975] Opening socket connection to server 192.168.2.10/192.168.2.10:2181. Will not attempt to authenticate using SASL (unknown error)
15:37:38,256 INFO [ClientCnxn:852] Socket connection established to 192.168.2.10/192.168.2.10:2181, initiating session
15:37:38,353 INFO [ClientCnxn:1235] Session establishment complete on server 192.168.2.10/192.168.2.10:2181, sessionid = 0x57a724d6ad0014, negotiated timeout = 40000
table:uhuibao;hconnection-0x2e3f9952
result:[row1/ODChain:date/1476956451226/Put/vlen=5/seqid=0, row1/ODChain:imsi/1476956451221/Put/vlen=5/seqid=0, row1/ODChain:route/1476956451230/Put/vlen=6/seqid=0]
hellowrld
family:ODChain
qualifier:date
value:date1
Timestamp:1476956451226
-------------------------------------------
hellowrld
family:ODChain
qualifier:imsi
value:imsi1
Timestamp:1476956451221
-------------------------------------------
hellowrld
family:ODChain
qualifier:route
value:route1
Timestamp:1476956451230
-------------------------------------------
keyvalues={row1/ODChain:date/1476956451226/Put/vlen=5/seqid=0, row1/ODChain:imsi/1476956451221/Put/vlen=5/seqid=0, row1/ODChain:route/1476956451230/Put/vlen=6/seqid=0}

Author: @nanphonfy