编写Hive的HQL语句来实现以下结果:表中的1表示选修,表中的0表示未选修 id a b c d e f 1 1 1 1 0 1 0 2 1 0 1 1 0 1 3 1 1 1 0 1 0 表示有id为1,2,3的学生选修了课程a,b,c,d,e,f中其中几门 id course 1,a 1,b 1,c 1,e 2,a 2,c 2,d 2,f 3,a 3,b 3,c 3,e
create table t_stu_course ( id int, course string ) row format delimited fields terminated by ","; load data local inpath "/root/t_stu_course.txt" into table t_stu_course;
select tmp.id ,max(tmp.a) as a ,max(tmp.b) as b ,max(tmp.c) as c ,max(tmp.d) as d ,max(tmp.e) as e ,max(tmp.f) as f from ( select id ,case when course="a" then 1 else 0 end as a ,case when course="b" then 1 else 0 end as b ,case when course="c" then 1 else 0 end as c ,case when course="d" then 1 else 0 end as d ,case when course="e" then 1 else 0 end as e ,case when course="f" then 1 else 0 end as f from t_stu_course ) tmp group by tmp.id;
select collect_set(course) as courses from t_stu_course;
set hive.strict.checks.cartesian.product=false;
select t1.id as id,t1.course as id_courses,t2.course courses from ( select id as id,collect_set(course) as course from t_stu_course group by id ) t1 join (select collect_set(course) as course from t_stu_course) t2;
启用严格模式:hive.mapred.mode = strict // Deprecated hive.strict.checks.large.query = true 该设置会禁用:1. 不指定分页的orderby 2. 对分区表不指定分区进行查询 3. 和数据量无关,只是一个查询模式 hive.strict.checks.type.safety = true 严格类型安全,该属性不允许以下操作:1. bigint和string之间的比较 2. bigint和double之间的比较 hive.strict.checks.cartesian.product = true 该属性不允许笛卡尔积操作