表T1中有两列数,code和create_time,数据示例如下:
现需将上述数据,按照相同code分组,分组后效果类似:
不同于rank和row_number窗口函数,这是个被称为间隙和岛屿的问题,第一步,求row_id和同类code分组id的差得到grp:
SELECT code, create_time, ( ROW_NUMBER ( ) OVER ( ORDER BY create_time ) - ROW_NUMBER ( ) OVER ( PARTITION BY code ORDER BY create_time ) ) AS grp FROM t1 ORDER BY create_time ASC
基于上一步的结果,可进一步聚合求得每个“岛屿”的边界:
SELECT
code,
MIN ( create_time ) AS start_time,
MAX ( create_time ) AS end_time
FROM
( SELECT code, create_time, ( ROW_NUMBER ( ) OVER ( ORDER BY create_time ) - ROW_NUMBER ( ) OVER ( PARTITION BY code ORDER BY create_time ) ) AS grp FROM t1 ORDER BY create_time ASC ) A
GROUP BY
code,
grp
order by start_time
执行结果: