Hive Window Functions in Detail, Part 2: rank(), dense_rank(), row_number()
阿新 • Published: 2020-07-27
Hive provides three window functions, rank(), dense_rank(), and row_number(), that rank rows within a window. This post explains how the three differ.
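1. rank(): returns the rank of each row within its partition. Tied rows share a rank, and the ranks that follow skip ahead, leaving gaps.
2. dense_rank(): returns the rank of each row within its partition. Tied rows share a rank, and the ranks that follow continue without gaps.
3. row_number(): numbers the rows in each partition consecutively, starting from 1, in the specified order. Tied rows still get distinct numbers.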
The example below shows the difference between them.
create table if not exists buy_info (
  name string,
  buy_date string,
  buy_num int
)
row format delimited fields terminated by '|';

select * from buy_info;
liulei | 2015-04-11 | 5 |
liulei | 2015-04-12 | 7 |
liulei | 2015-04-13 | 3 |
liulei | 2015-04-14 | 2 |
liulei | 2015-04-15 | 4 |
liulei | 2015-04-16 | 4 |
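The post does not show how the rows above got into the table. One way to do it, sketched here under assumptions, is to load a '|'-delimited text file that matches the table definition. The path /tmp/buy_info.txt is hypothetical, and the file is assumed to contain lines such as liulei|2015-04-11|5 with no padding spaces.

```sql
-- Hypothetical local file; each line looks like: liulei|2015-04-11|5
-- OVERWRITE replaces any rows already in buy_info.
LOAD DATA LOCAL INPATH '/tmp/buy_info.txt'
OVERWRITE INTO TABLE buy_info;

-- Quick check that the rows were parsed into three columns.
SELECT name, buy_date, buy_num FROM buy_info;
```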
select name, buy_date, buy_num,
  rank() over (partition by name order by buy_num desc) as rank1,
  dense_rank() over (partition by name order by buy_num desc) as rank2,
  row_number() over (partition by name order by buy_num desc) as rank3
from buy_info;
name | buy_date | buy_num | rank1 | rank2 | rank3 |
liulei | 2015-04-12 | 7 | 1 | 1 | 1 |
liulei | 2015-04-11 | 5 | 2 | 2 | 2 |
liulei | 2015-04-15 | 4 | 3 | 3 | 3 |
liulei | 2015-04-16 | 4 | 3 | 3 | 4 |
liulei | 2015-04-13 | 3 | 5 | 4 | 5 |
liulei | 2015-04-14 | 2 | 6 | 5 | 6 |
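A common use of these functions is keeping the top N rows per partition. A window function cannot appear directly in a WHERE clause, so the ranking is computed in a subquery and filtered outside it. The following is a minimal sketch against the buy_info table above. The alias rn, the cut-off of 3, and the use of buy_date as a tie-breaker are illustrative choices, not part of the original example.

```sql
-- Top 3 purchase days per name, by buy_num.
-- buy_date is added as a second sort key so that row_number()
-- orders tied buy_num values deterministically.
SELECT name, buy_date, buy_num
FROM (
  SELECT name, buy_date, buy_num,
         row_number() OVER (PARTITION BY name
                            ORDER BY buy_num DESC, buy_date) AS rn
  FROM buy_info
) t
WHERE rn <= 3;
```

With the sample data this keeps the rows for 2015-04-12, 2015-04-11, and 2015-04-15. Replacing row_number() with rank() would also keep 2015-04-16, because both rows with buy_num 4 receive rank 3.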