Druid Real-Time OLAP Analysis

  • Development Environment Setup
  • Starting Druid
  • Accessing the Web UI
  • Clickstream Log Metrics
  • Daily PV Analysis
  • Daily UV Analysis
  • Daily IP Analysis
  • Daily Traffic Share by Referrer (Baidu, Zhihu, Sina, homepage, ...)
  • Daily Traffic Share by City
  • Order Metrics
  • Daily Order Count
  • Weekly Order Count
  • Monthly Order Count
  • Today's Order Count by Region (Map)
  • Weekly Order Count Trend
  • Top 8 Regions by Order Count Today
  • Week-over-Week Sales Analysis
  • 24-Hour Sales Analysis
  • Top 4 Regions by Sales Today
  • Daily Number of Paying Buyers
  • Daily Cart-to-Payment Conversion Rate
  • Product Message Metrics
  • Daily Product Count per Merchant
  • Daily Brand Count per Merchant
  • Daily Newly Listed Product Count
  • Cart Message Metrics
  • Daily Add-to-Cart Count
  • Daily Number of Buyers Adding to Cart
  • Daily Number of Products Added to Cart
  • Review Metrics
  • Daily Buyer Review Count
  • Daily Number of Sellers Reviewed by Buyers
  • Daily Buyer Positive-Review Rate


Development Environment Setup

Starting Druid

  • 1. node1: start the overlord and coordinator, using an external ZooKeeper rather than the one bundled with imply
# Start the overlord and coordinator with an external ZooKeeper instead of imply's bundled ZooKeeper
/export/servers/imply-3.0.4/bin/supervise -c /export/servers/imply-3.0.4/conf/supervise/master-no-zk.conf
  • 2. node2: start the historical and middleManager
/export/servers/imply-3.0.4/bin/supervise -c /export/servers/imply-3.0.4/conf/supervise/data.conf
  • 3. node3: start the broker and router (a quick way to verify that each service came up is sketched after these commands)
/export/servers/imply-3.0.4/bin/supervise -c /export/servers/imply-3.0.4/conf/supervise/query.conf
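
Each supervise instance writes per-service logs, which is the simplest way to confirm a service started cleanly. A minimal sketch, assuming the default imply layout used above, where logs land under var/sv/<service>/current (the exact path and service name may differ in your setup):
# Tail the overlord log written by supervise on node1 (path is an assumption based on the default layout)
tail -f /export/servers/imply-3.0.4/var/sv/overlord/current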

Accessing the Web UI
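
With the three nodes above running, the unified Druid web console is served by the router process on node3, by default at http://node3:8888 (the port is an assumption based on Druid's default router configuration and may differ if it was changed).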

Clickstream Log Metrics

  • Steps:
  • 1. Open postman
  • 2. Edit the Kafka real-time ingestion config file
  • Open the config file Druid实时数据分析项目_配置文件\index_kafka_dws_click_log.json
  • Update the Kafka cluster (bootstrap servers) address
  • Update the topic name
  • 3. Copy the JSON spec into postman
  • 4. Send the request to http://node1:8090/druid/indexer/v1/supervisor (a curl equivalent is sketched below)
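
The same spec can be submitted from the command line instead of postman. A minimal sketch, assuming the edited index_kafka_dws_click_log.json from the config directory above has been copied into the current working directory:
# POST the Kafka ingestion spec to the Overlord's supervisor endpoint
curl -X POST -H 'Content-Type: application/json' \
  -d @index_kafka_dws_click_log.json \
  http://node1:8090/druid/indexer/v1/supervisor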

Daily PV Analysis

SELECT
    SUM("count") AS totalPV
FROM "dws_click_log"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2010-09-05'

Daily UV Analysis

SELECT
    COUNT(DISTINCT "uid") AS totalUV
FROM "dws_click_log"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2010-09-05'

Daily IP Analysis

SELECT
    COUNT(DISTINCT "ip") AS totalIP
FROM "dws_click_log"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2010-09-05'

Daily Traffic Share by Referrer (Baidu, Zhihu, Sina, homepage, ...)

SELECT
    referDomain,
    SUM("count") AS total_count
FROM "dws_click_log"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2010-09-05'
GROUP BY 1

Daily Traffic Share by City

SELECT
    province,
    city,
    SUM("count") AS total_count
FROM "dws_click_log"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2010-09-05'
GROUP BY province, city

Order Metrics

  • Steps:
  • 1. Open postman
  • 2. Edit the Kafka real-time ingestion config file
  • Open the config file Druid实时数据分析项目_配置文件\index_kafka_dws_order.json
  • Update the Kafka cluster (bootstrap servers) address
  • Update the topic name
  • 3. Copy the JSON spec into postman
  • 4. Send the request to http://node1:8090/druid/indexer/v1/supervisor (the listing sketched below confirms it was accepted)
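
Once the Overlord has accepted the spec, it shows up in the list of running supervisors. A quick sanity check from the command line, using the same Overlord address as above:
# List the IDs of all running supervisors
curl http://node1:8090/druid/indexer/v1/supervisor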

Daily Order Count

SELECT
    SUM("count") AS total_count
FROM "dws_order"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-01-06'

Weekly Order Count

SELECT
    SUM("count") AS total_count
FROM "dws_order"
WHERE "__time" >= CURRENT_TIMESTAMP - INTERVAL '7' DAY

Monthly Order Count

SELECT
    SUM("count") AS total_count
FROM "dws_order"
WHERE "__time" >= CURRENT_TIMESTAMP - INTERVAL '1' MONTH

Today's Order Count by Region (Map)

SELECT
    areaId,
    SUM("count") AS total_count
FROM "dws_order"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-01-06'
GROUP BY 1
ORDER BY 2 DESC

Weekly Order Count Trend

  • Order count for each day of the past week
SELECT
    TIME_FORMAT("__time", 'yyyy-MM-dd') AS "date",
    SUM("count") AS total_count
FROM "dws_order"
WHERE "__time" >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
GROUP BY 1

Top 8 Regions by Order Count Today

  • Group by region
  • Sort by order count in descending order
  • Take the first 8 rows
SELECT
    areaId,
    SUM("count") AS total_count
FROM "dws_order"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-01-06'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 8

Week-over-Week Sales Analysis

  • Get last week's total sales for each day
  • Get this week's total sales for each day
-- Last week
SELECT
    'last week' AS "week",
    TIME_FORMAT("__time", 'yyyy-MM-dd') AS "date1",
    SUM("totalMoney") AS total_money
FROM "dws_order"
WHERE "__time" BETWEEN (CURRENT_TIMESTAMP - INTERVAL '14' DAY) AND (CURRENT_TIMESTAMP - INTERVAL '7' DAY)
GROUP BY 1, 2
UNION ALL
-- This week
SELECT
    'this week' AS "week",
    TIME_FORMAT("__time", 'yyyy-MM-dd') AS "date1",
    SUM("totalMoney") AS total_money
FROM "dws_order"
WHERE "__time" >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
GROUP BY 1, 2;

24-Hour Sales Analysis

SELECT
    TIME_FORMAT("__time", 'HH') AS "hour",
    SUM("totalMoney") AS "totalMoney"
FROM "dws_order"
WHERE "__time" >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
GROUP BY 1
ORDER BY 2 DESC

Top 4 Regions by Sales Today

SELECT
    areaId,
    SUM("totalMoney") AS "totalMoney"
FROM "dws_order"
WHERE "__time" >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
GROUP BY 1
ORDER BY 2 DESC
LIMIT 4

Daily Number of Paying Buyers

SELECT
    COUNT(DISTINCT "userId") AS "totalCount"
FROM "dws_order"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-01-06' AND isPay = 1

Daily Cart-to-Payment Conversion Rate

The query returns the total number of paid orders and the number of paid orders placed directly (without going through the cart); the conversion rate itself can then be derived downstream, e.g. as (totalCount - cartTotalCount) / totalCount, the share of paid orders that came from the cart (this reading of the metric is an assumption).

SELECT
    SUM("count") AS "totalCount",                                              -- total number of paid orders
    SUM(CASE WHEN isFromCart = 0 THEN "count" ELSE 0 END) AS "cartTotalCount"  -- paid orders placed directly (not via the cart)
FROM "dws_order"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-01-06' AND isPay = 1

Product Message Metrics

  • Steps:
  • 1. Open postman
  • 2. Edit the Kafka real-time ingestion config file
  • Open the config file Druid实时数据分析项目_配置文件\index_kafka_dws_goods.json
  • Update the Kafka cluster (bootstrap servers) address
  • Update the topic name
  • 3. Copy the JSON spec into postman
  • 4. Send the request to http://node1:8090/druid/indexer/v1/supervisor (ingestion health can then be checked as sketched below)
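
After submission, the supervisor's state can be polled to confirm it is reading from Kafka. A sketch, assuming the supervisor id equals the dataSource name dws_goods (the default for Kafka ingestion specs):
# Check the state of the dws_goods Kafka supervisor
curl http://node1:8090/druid/indexer/v1/supervisor/dws_goods/status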

Daily Product Count per Merchant

SELECT
    shopId,
    COUNT(DISTINCT "goodsId") AS total_count
FROM "dws_goods"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-02-23'
GROUP BY shopId

Daily Brand Count per Merchant

SELECT
    shopId,
    COUNT(DISTINCT "brandId") AS total_count
FROM "dws_goods"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-02-23'
GROUP BY shopId

Daily Newly Listed Product Count

SELECT
    shopId,
    COUNT(DISTINCT "goodsId") AS total_count
FROM "dws_goods"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-02-23' AND isSale = 1
GROUP BY shopId

Cart Message Metrics

  • Steps:
  • 1. Open postman
  • 2. Edit the Kafka real-time ingestion config file
  • Open the config file Druid实时数据分析项目_配置文件\index_kafka_dws_cart.json
  • Update the Kafka cluster (bootstrap servers) address
  • Update the topic name
  • 3. Copy the JSON spec into postman
  • 4. Send the request to http://node1:8090/druid/indexer/v1/supervisor (stopping the supervisor again is sketched below)
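
A running supervisor can also be stopped through the Overlord when ingestion for that dataSource is no longer needed. A sketch, assuming the supervisor id equals the dataSource name dws_cart; on older Druid releases the same operation is exposed as /shutdown rather than /terminate:
# Stop the dws_cart Kafka supervisor and its ingestion tasks
curl -X POST http://node1:8090/druid/indexer/v1/supervisor/dws_cart/terminate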

Daily Add-to-Cart Count

SELECT
    SUM("count") AS total_count
FROM "dws_cart"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-12-16'

Daily Number of Buyers Adding to Cart

SELECT
    COUNT(DISTINCT "userId") AS total_count
FROM "dws_cart"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-12-16'


Daily Number of Products Added to Cart

SELECT
    SUM("totalGoods") AS "totalCount"
FROM "dws_cart"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-12-16'

Review Metrics

  • Steps:
  • 1. Open postman
  • 2. Edit the Kafka real-time ingestion config file
  • Open the config file Druid实时数据分析项目_配置文件\index_kafka_dws_comments.json
  • Update the Kafka cluster (bootstrap servers) address
  • Update the topic name
  • 3. Copy the JSON spec into postman
  • 4. Send the request to http://node1:8090/druid/indexer/v1/supervisor (the edited spec can be read back as sketched below)
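
The spec a running supervisor is currently using can be fetched back from the Overlord, which is a convenient way to verify that the edited Kafka and topic settings were picked up. A sketch, again assuming the supervisor id equals the dataSource name:
# Fetch the current ingestion spec of the dws_comments supervisor
curl http://node1:8090/druid/indexer/v1/supervisor/dws_comments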

Daily Buyer Review Count

SELECT
    userId,
    SUM("count") AS totalCount
FROM "dws_comments"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-12-06'
GROUP BY userId

Daily Number of Sellers Reviewed by Buyers

SELECT
    userId,
    shopId,
    SUM("count") AS totalCount
FROM "dws_comments"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-12-06'
GROUP BY 1, 2

Daily Buyer Positive-Review Rate

SELECT
    userId,
    SUM("count") AS totalCount,
    SUM(CASE WHEN starScore > '3' THEN "count" ELSE 0 END) AS goodCount
FROM "dws_comments"
WHERE TIME_FORMAT("__time", 'yyyy-MM-dd') = '2019-12-06'
GROUP BY userId