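Previous post: Big Data Real-Time Project, Day 7: ES installation notes
1. Basic operations in Kibana
(1) Create the index and its structure
Code: index a document. If the index does not exist yet, ES creates it and infers the field mappings from this document.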
PUT gmall0315_test/_doc/1
{
"name":"zhangsan",
"age":23,
"amout":250.1
}
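(2) View the mapping
Where:
text: the field is analyzed (split into terms)
- Purpose:
full-text matching; uses more space (disk and memory)
keyword: the field is not analyzed
- Purpose:
exact matching and use as an aggregation field; uses less space
Code: query the index mapping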
GET gmall0315_test/_mapping
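To make the text/keyword distinction concrete, here is a minimal Scala sketch. It is a toy model only: the splitting rule and the names AnalyzedVsKeyword, analyzeAsText, analyzeAsKeyword and termMatches are assumptions made for illustration, and real ES analyzers do considerably more. It shows why a single word can hit a text field while a keyword field needs the whole value.

```scala
// Toy model: real analyzers do far more than lowercase and split.
object AnalyzedVsKeyword {
  // "text": the value is broken into lowercase terms, each searchable on its own
  def analyzeAsText(value: String): Set[String] =
    value.toLowerCase.split("[^\\p{L}\\p{N}]+").filter(_.nonEmpty).toSet

  // "keyword": the whole value is kept as a single term
  def analyzeAsKeyword(value: String): Set[String] = Set(value)

  // A term lookup hits only when the query term equals a stored term
  def termMatches(storedTerms: Set[String], queryTerm: String): Boolean =
    storedTerms.contains(queryTerm)

  def main(args: Array[String]): Unit = {
    val title = "Red Cotton Shirt"
    println(termMatches(analyzeAsText(title), "cotton"))              // one word hits the text field
    println(termMatches(analyzeAsKeyword(title), "cotton"))           // but misses the keyword field
    println(termMatches(analyzeAsKeyword(title), "Red Cotton Shirt")) // the exact value hits keyword
  }
}
```

This is also why the queries in this post filter on name.keyword rather than name when they need an exact match.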
(3) Filtered query
Code: filter documents by an exact value of name
GET gmall0315_test/_search
{
"query":{
"bool":{
"filter":{
"term": {
"name.keyword": "zhangsan"
}
}
}
}
}
(4) Filtered query with grouping
Code: filter, then group the matching documents by name with a terms aggregation
GET gmall0315_test/_search
{
"query":{
"bool":{
"filter":{
"term": {
"name.keyword": "zhangsan"
}
}
}
},
"aggs": {
"groupby_name": {
"terms": {
"field": "name.keyword",
"size": 10
}
}
}
}
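For readers who think in collections rather than query DSL, the request above can be mirrored in plain Scala. This is a sketch under assumptions: Doc, FilterThenGroup and buckets are hypothetical names, and the in-memory list stands in for the index. It shows that the aggregation only sees documents that passed the filter.

```scala
object FilterThenGroup {
  // Hypothetical stand-in for a document in gmall0315_test
  final case class Doc(name: String, age: Int)

  // filterName plays the role of the term filter (None = no filter);
  // the grouping plays the role of the terms aggregation on name.keyword.
  def buckets(docs: Seq[Doc], filterName: Option[String], size: Int): Seq[(String, Int)] = {
    val hits = docs.filter(d => filterName.forall(_ == d.name))
    hits.groupBy(_.name)
      .map { case (key, group) => (key, group.size) }
      .toSeq
      .sortBy { case (key, count) => (-count, key) } // largest bucket first
      .take(size)
  }

  def main(args: Array[String]): Unit = {
    val docs = Seq(Doc("zhangsan", 23), Doc("zhangsan", 30), Doc("lisi", 41))
    println(buckets(docs, Some("zhangsan"), 10)) // only the filtered name gets a bucket
    println(buckets(docs, None, 10))             // one bucket per distinct name
  }
}
```

With the filter in place the result has a single bucket, so to count every name the filter has to be dropped from the request.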
(5) Index creation statement
Create an index with an explicit mapping:
PUT gmall0315_test2/
{
"mappings":{
"_doc": {
"properties": {
"age": {
"type": "long"
},
"amout": {
"type": "float"
},
"name": {
"type": "keyword"
},
"phone_num": {
"type": "keyword",
"index": false
}
}
}
}
}
Notes:
Whether the field is indexed (searchable): index: true or false; the default is true
Whether the field is analyzed: text (analyzed) or keyword (not analyzed)
2. Designing the ES index structure
(1) Index creation statement for the project
Code:
PUT gmall0315_dau
{
"mappings": {
"_doc":{
"properties":{
"mid":{
"type":"keyword"
},
"uid":{
"type":"keyword"
},
"area":{
"type":"keyword"
},
"os":{
"type":"keyword"
},
"ch":{
"type":"keyword"
},
"vs":{
"type":"keyword"
},
"logDate":{
"type":"keyword"
},
"logHour":{
"type":"keyword"
},
"logHourMinute":{
"type":"keyword"
},
"ts":{
"type":"long"
}
}
}
}
}
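The mapping stores logDate, logHour and logHourMinute as keyword strings next to the numeric ts. A minimal sketch of how such strings could be derived from a timestamp follows. Everything here is an assumption for illustration: that ts is in epoch milliseconds, the "HH" and "HH:mm" patterns, and the names LogTimeFields and derive. Only the yyyy-MM-dd date format is suggested by the Redis key shown later in this post.

```scala
import java.time.{Instant, ZoneId}
import java.time.format.DateTimeFormatter

object LogTimeFields {
  private val dateFmt = DateTimeFormatter.ofPattern("yyyy-MM-dd")
  private val hourFmt = DateTimeFormatter.ofPattern("HH")
  private val hourMinuteFmt = DateTimeFormatter.ofPattern("HH:mm") // assumed pattern

  // Returns (logDate, logHour, logHourMinute) for an epoch-millisecond timestamp
  def derive(ts: Long, zone: ZoneId): (String, String, String) = {
    val t = Instant.ofEpochMilli(ts).atZone(zone)
    (t.format(dateFmt), t.format(hourFmt), t.format(hourMinuteFmt))
  }

  def main(args: Array[String]): Unit = {
    // The zone is passed explicitly so the result does not depend on the machine's default
    println(derive(1584441000000L, ZoneId.of("UTC")))
  }
}
```

Keeping these values as keyword fields makes them cheap to filter and to group on, for example to count active devices per hour.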
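(2) Choosing how each field is indexed
Indexed and analyzed | Indexed but not analyzed | Neither indexed nor analyzed |
Titles, product names, category names | Type ids, dates, quantities, ages, ids of all kinds | Fields never used in filter conditions, e.g. masked values such as 138****0101 |
type: "text" | type: "keyword" | index: false |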
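The table boils down to two yes/no questions per string field. A small sketch of that decision, with the hypothetical names FieldMappingChooser and stringFieldMapping (nothing here is part of any ES client API), could look like this:

```scala
object FieldMappingChooser {
  // Returns the mapping fragment for a string field, following the table above.
  // searchable: will the field be used in queries or filters?
  // analyzed:   should full-text search on individual words work?
  def stringFieldMapping(searchable: Boolean, analyzed: Boolean): String =
    if (!searchable) """{"type": "keyword", "index": false}"""
    else if (analyzed) """{"type": "text"}"""
    else """{"type": "keyword"}"""

  def main(args: Array[String]): Unit = {
    println("title:     " + stringFieldMapping(searchable = true, analyzed = true))
    println("mid:       " + stringFieldMapping(searchable = true, analyzed = false))
    println("phone_num: " + stringFieldMapping(searchable = false, analyzed = false))
  }
}
```

The last case matches the phone_num field in the gmall0315_test2 mapping earlier in this post.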
(3) Saving to ES
- Code in the common submodule
(1) Add the dependencies to the pom file:
<dependency>
<groupId>io.searchbox</groupId>
<artifactId>jest</artifactId>
<version>5.3.3</version>
</dependency>
<dependency>
<groupId>net.java.dev.jna</groupId>
<artifactId>jna</artifactId>
<version>4.5.2</version>
</dependency>
<dependency>
<groupId>org.codehaus.janino</groupId>
<artifactId>commons-compiler</artifactId>
<version>2.7.8</version>
</dependency>
(2) Code
MyEsUtil.scala
package com.study.gamll0315.common.util
import java.util.Objects
import io.searchbox.client.config.HttpClientConfig
import io.searchbox.client.{JestClient, JestClientFactory}
import io.searchbox.core.Index
object MyEsUtil {
private val ES_HOST = "http://flink102"
private val ES_HTTP_PORT = 9200
private var factory:JestClientFactory = null
/**
* Get a client
*
* @return a JestClient
*/
def getClient: JestClient = {
if (factory == null) build()
factory.getObject
}
/**
* Close the client
*/
def close(client: JestClient): Unit = {
if (!Objects.isNull(client)) try
client.shutdownClient()
catch {
case e: Exception =>
e.printStackTrace()
}
}
/**
* Build the client factory
*/
private def build(): Unit = {
factory = new JestClientFactory
factory.setHttpClientConfig(new HttpClientConfig.Builder(ES_HOST + ":" + ES_HTTP_PORT).multiThreaded(true)
.maxTotalConnection(20) // maximum total connections
.connTimeout(10000).readTimeout(10000).build)
}
def main(args: Array[String]): Unit = {
val client: JestClient = getClient
// Test document as a JSON string
val source="{\n \"name\":\"zhang4\",\n \"age\":23,\n \"amout\":250.1,\n \"phone_num\":\"138*****6541\"\n}"
val index: Index = new Index.Builder(source).index("gmall0315_test").`type`("_doc").build()
client.execute(index)
// Release resources
close(client)
}
}
Run the program.
(3) Check the result in Kibana
Run the following directly:
GET gmall0315_test/_search
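(4) Basic bulk insert into ES
Add the following method to MyEsUtil (it also needs the imports java.util, io.searchbox.core.Bulk and io.searchbox.core.BulkResult):
/**
* Bulk insert into ES
* @param indexName target index
* @param list documents to index
*/
def indexBulk(indexName:String,list: List[Any]): Unit ={
val client: JestClient = getClient
val bulkBuilder = new Bulk.Builder().defaultIndex(indexName).defaultType("_doc")
for (doc<-list){
val index: Index = new Index.Builder(doc).build()
bulkBuilder.addAction(index)
}
// The response holds one item per submitted action
val items: util.List[BulkResult#BulkResultItem] = client.execute(bulkBuilder.build()).getItems
println(s"saved=${items.size()}")
// Release resources
close(client)
}
}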
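It can help to see roughly what a bulk request carries on the wire. The sketch below builds the newline-delimited body that the ES _bulk REST endpoint expects: one action line followed by one source line per document. It is an illustration only, not what Jest produces byte for byte; BulkBodySketch and bulkBody are hypothetical names, and each document is assumed to be single-line JSON.

```scala
object BulkBodySketch {
  // One action line plus one source line per document, each terminated by "\n".
  // jsonDocs must not contain newlines, because newlines delimit the entries.
  def bulkBody(indexName: String, docType: String, jsonDocs: Seq[String]): String =
    jsonDocs.map { doc =>
      s"""{"index":{"_index":"$indexName","_type":"$docType"}}""" + "\n" + doc + "\n"
    }.mkString

  def main(args: Array[String]): Unit = {
    val docs = Seq("""{"mid":"mid_1","uid":"u1"}""", """{"mid":"mid_2","uid":"u2"}""")
    print(bulkBody("gmall0315_dau", "_doc", docs))
  }
}
```

Sending many documents in one such request avoids one HTTP round trip per document, which is the reason for batching here.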
Complete code
MyEsUtil.scala
package com.study.gamll0315.common.util
import java.util
import java.util.Objects
import io.searchbox.client.config.HttpClientConfig
import io.searchbox.client.{JestClient, JestClientFactory}
import io.searchbox.core.{Bulk, BulkResult, Index}
object MyEsUtil {
private val ES_HOST = "http://flink102"
private val ES_HTTP_PORT = 9200
private var factory:JestClientFactory = null
/**
* Get a client
*
* @return a JestClient
*/
def getClient: JestClient = {
if (factory == null) build()
factory.getObject
}
/**
* Close the client
*/
def close(client: JestClient): Unit = {
if (!Objects.isNull(client)) try
client.shutdownClient()
catch {
case e: Exception =>
e.printStackTrace()
}
}
/**
* Build the client factory
*/
private def build(): Unit = {
factory = new JestClientFactory
factory.setHttpClientConfig(new HttpClientConfig.Builder(ES_HOST + ":" + ES_HTTP_PORT).multiThreaded(true)
.maxTotalConnection(20) // maximum total connections
.connTimeout(10000).readTimeout(10000).build)
}
def main(args: Array[String]): Unit = {
val client: JestClient = getClient
// Test document as a JSON string
val source="{\n \"name\":\"zhang4\",\n \"age\":23,\n \"amout\":250.1,\n \"phone_num\":\"138*****6541\"\n}"
val index: Index = new Index.Builder(source).index("gmall0315_test").`type`("_doc").build()
client.execute(index)
// Release resources
close(client)
}
/**
* Bulk insert into ES
* @param indexName target index
* @param list documents to index
*/
def indexBulk(indexName:String,list: List[Any]): Unit ={
val client: JestClient = getClient
val bulkBuilder = new Bulk.Builder().defaultIndex(indexName).defaultType("_doc")
for (doc<-list){
val index: Index = new Index.Builder(doc).build()
bulkBuilder.addAction(index)
}
// The response holds one item per submitted action
val items: util.List[BulkResult#BulkResultItem] = client.execute(bulkBuilder.build()).getItems
println(s"saved=${items.size()}")
// Release resources
close(client)
}
}
In addition, define the index name as a constant (Java):
public static final String ES_INDEX_DAU="gmall0315_dau";
Then call the bulk insert (Scala):
// ES
val list: List[Startuplog] = startuplogItr.toList
for (startuplog <- list) {
val key = "dau:" + startuplog.logDate
val value = startuplog.mid
jedis.sadd(key, value)
println(startuplog)
}
// Save to ES; reuse list, because startuplogItr was already consumed by toList above
MyEsUtil.indexBulk(GmallConstant.ES_INDEX_DAU,list)
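One detail worth keeping in mind with a snippet like this: a Scala Iterator can be traversed only once, so the list handed to indexBulk should be materialized a single time and then reused. The following self-contained sketch (IteratorOnce and drainTwice are hypothetical names) shows what a second toList on the same iterator returns.

```scala
object IteratorOnce {
  // Calls toList twice on the same iterator and returns both results
  def drainTwice[A](itr: Iterator[A]): (List[A], List[A]) = {
    val first = itr.toList  // drains the iterator
    val second = itr.toList // nothing is left to read
    (first, second)
  }

  def main(args: Array[String]): Unit = {
    val (first, second) = drainTwice(Iterator("mid_1", "mid_2", "mid_3"))
    println(first.size)  // 3
    println(second.size) // 0
  }
}
```

If the second, empty list were passed to the bulk insert, no documents would reach ES even though the loop above it processed every record.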
Next, start JsonMocker first and then DauApp to simulate sending data.
Note that the following processes must be running:
[root@flink102 ~]# jps -l
1745 org.apache.zookeeper.server.quorum.QuorumPeerMain
9236 sun.tools.jps.Jps
8582 kafka.Kafka
7064 org.elasticsearch.bootstrap.Elasticsearch
8907 kafka.tools.ConsoleConsumer
9181 gamll0315-logger-0.0.1-SNAPSHOT.jar
One more change is needed, as shown in the figure:
We also need to clear the deduplication list stored in Redis:
// List the keys in Redis
127.0.0.1:6379> keys *
1) "dau:2020-03-17"
// Delete the data (flushall removes every key in every Redis database)
127.0.0.1:6379> flushall
OK
// Check again: no data is left
127.0.0.1:6379> keys *
(empty list or set)
127.0.0.1:6379>
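Why clearing Redis matters can be sketched with an in-memory stand-in. The assumption here, not confirmed by the code shown in this post, is that DauApp drops any mid that is already in the day's Redis set; under that assumption, stale entries would hide devices that were seen in an earlier run. DauDedupSketch, sadd, flushAll and firstSeen are hypothetical names; only the "dau:" + logDate key layout and the use of SADD come from the post.

```scala
import scala.collection.mutable

object DauDedupSketch {
  // In-memory stand-in for Redis sets: key -> members
  private val store = mutable.Map.empty[String, mutable.Set[String]]

  // Same contract as Redis SADD with one member: 1 if newly added, 0 if already present
  def sadd(key: String, member: String): Long = {
    val members = store.getOrElseUpdate(key, mutable.Set.empty[String])
    if (members.add(member)) 1L else 0L
  }

  // Stand-in for FLUSHALL
  def flushAll(): Unit = store.clear()

  // Keeps only the mids that were not yet recorded for the given day
  def firstSeen(logDate: String, mids: Seq[String]): Seq[String] =
    mids.filter(mid => sadd("dau:" + logDate, mid) == 1L)

  def main(args: Array[String]): Unit = {
    println(firstSeen("2020-03-17", Seq("m1", "m2", "m1"))) // m1 and m2 pass once
    println(firstSeen("2020-03-17", Seq("m1", "m2")))       // both already recorded
    flushAll()
    println(firstSeen("2020-03-17", Seq("m1", "m2")))       // pass again after clearing
  }
}
```

In a shared Redis instance, deleting only the dau:* keys would be a gentler alternative to flushall.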
Start the program again:
start JsonMocker first and then DauApp to simulate sending data.
Finally, check the result in Kibana by running:
GET gmall0315_dau/_search