A small Hive query for two-dimensional cross-tab analysis
阿新 • Published: 2018-12-18
1. Work out the column and row dimensions you need
Column dimension: each week
Row dimension: grade + subject + class type
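2. Aggregate the data in ascending week order (that is, along the column dimension) to build a list
Use concat_ws with collect_list (collect_set would deduplicate before aggregating). The order of the collected elements is not guaranteed, so each value is prefixed with a zero-padded lesson_order and the list is passed through sort_array.
sort_array can only sort in ascending order. For descending order, add a helper column in the subquery and sort on that instead.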
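The helper-column idea for descending order could look roughly like the sketch below. It is not from the original query: desc_key and hs_desc are hypothetical names, lpad is swapped in for the case-when padding, and it assumes lesson_order is an integer between 1 and 99.

```sql
-- Sketch only. Assumes tmp has the columns used in this article
-- and that lesson_order is an integer in 1..99.
select
  term,
  kemu,
  course_applicable_user_type,
  regexp_replace(
    concat_ws(',',
      sort_array(
        collect_list(
          -- prefix each rate with the helper key so sort_array orders by it
          concat_ws(':', desc_key, cast(lesson_valid_rate as string))
        )
      )
    ),
    '\\d\\d:', ''          -- drop the "NN:" prefix after sorting
  ) as hs_desc
from (
  select
    term,
    kemu,
    course_applicable_user_type,
    lesson_valid_rate,
    -- hypothetical helper column: larger lesson_order -> smaller key,
    -- so an ascending sort on the key is a descending sort on lesson_order
    lpad(cast(99 - lesson_order as string), 2, '0') as desc_key
  from tmp
) t
group by term, kemu, course_applicable_user_type
```

Because the key is inverted before padding, the ascending sort_array output lists the latest week first.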
3. Take the elements of the list one by one
These are the metric values in ascending week order.
select
  term,
  kemu,
  course_applicable_user_type,
  split(hs, ',')[0] lesson_order1,
  split(hs, ',')[1] lesson_order2,
  split(hs, ',')[2] lesson_order3,
  split(hs, ',')[3] lesson_order4,
  split(hs, ',')[4] lesson_order5
from (
  select
    term,
    kemu,
    course_applicable_user_type,
    -- concat_ws(',', collect_list(cast(lesson_order as string))) as lesson_order_set,
    -- concat_ws(',', collect_list(cast(lesson_valid_rate as string))) as index_amount_set,
    regexp_replace(
      concat_ws(',',
        sort_array(
          collect_list(
            concat_ws(':',
              -- zero-pad single-digit lesson_order so the string sort matches numeric order
              case when length(cast(lesson_order as string)) = 1
                   then concat('0', cast(lesson_order as string))
                   else cast(lesson_order as string)
              end,
              cast(lesson_valid_rate as string)
            )
          )
        )
      ),
      '\\d\\d:', ''   -- strip the "NN:" sort prefix, leaving only the rates
    ) hs
  from (
    select term, kemu, course_applicable_user_type, lesson_order, lesson_valid_rate
    from tmp
  ) t
  group by term, kemu, course_applicable_user_type
) t1
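To try the technique without a real table, a minimal sketch with inline rows might look like this. The sample values are made up for illustration, the rates are stored as strings to sidestep number formatting, and lpad stands in for the case-when padding used above.

```sql
-- Sketch only: three made-up rows for a single group, deliberately out of order.
with tmp as (
  select 'grade1' as term, 'math' as kemu, 'A' as course_applicable_user_type,
         2 as lesson_order, '0.80' as lesson_valid_rate
  union all
  select 'grade1', 'math', 'A', 10, '0.70'
  union all
  select 'grade1', 'math', 'A', 1, '0.90'
)
select
  term,
  kemu,
  course_applicable_user_type,
  split(hs, ',')[0] as lesson_order1,
  split(hs, ',')[1] as lesson_order2,
  split(hs, ',')[2] as lesson_order3
from (
  select
    term,
    kemu,
    course_applicable_user_type,
    regexp_replace(
      concat_ws(',',
        sort_array(
          collect_list(
            -- keys become '01', '02', '10', so the string sort follows lesson order
            concat_ws(':', lpad(cast(lesson_order as string), 2, '0'), lesson_valid_rate)
          )
        )
      ),
      '\\d\\d:', ''
    ) as hs
  from tmp
  group by term, kemu, course_applicable_user_type
) t1
```

With these rows the single output row should carry 0.90, 0.80 and 0.70 in lesson_order1 to lesson_order3. Note that split(hs, ',')[n] is purely positional: if a group has no row for some lesson_order, the later values shift one column to the left, so the approach works best when every group has the same set of weeks.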