Hive Window Functions in Detail, Part 2: rank(), dense_rank(), row_number()

Hive provides three window functions, rank(), dense_rank(), and row_number(), that rank rows within a window. This article explains how the three differ.

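1. rank(): returns each row's rank within its partition. Tied rows share the same rank, and the ranks that follow skip ahead, leaving gaps in the sequence.

2. dense_rank(): returns each row's rank within its partition. Tied rows share the same rank, and no gaps are left in the sequence.

3. row_number(): numbers the rows of the partition sequentially from 1, following the sort order, so every row gets a distinct number even when values tie.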

The following example shows the difference between them.

create table if not exists buy_info (
    name     string,
    buy_date string,
    buy_num  int
) row format delimited fields terminated by '|';

select * from buy_info;
liulei 2015-04-11 5
liulei 2015-04-12 7
liulei 2015-04-13 3
liulei 2015-04-14 2
liulei 2015-04-15 4
liulei 2015-04-16 4
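
To reproduce the sample rows above, one option is to insert them directly. This is only a sketch and assumes Hive 0.14 or later, where INSERT ... VALUES is available; on older versions the rows would have to be loaded from a '|'-delimited file instead.

```sql
-- Assumes Hive 0.14+ (INSERT ... VALUES) and the buy_info table defined above.
insert into table buy_info values
    ('liulei', '2015-04-11', 5),
    ('liulei', '2015-04-12', 7),
    ('liulei', '2015-04-13', 3),
    ('liulei', '2015-04-14', 2),
    ('liulei', '2015-04-15', 4),
    ('liulei', '2015-04-16', 4);
```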

select name, buy_date, buy_num,
       rank()       over (partition by name order by buy_num desc) as rank1,
       dense_rank() over (partition by name order by buy_num desc) as rank2,
       row_number() over (partition by name order by buy_num desc) as rank3
from buy_info;
name buy_date buy_num rank1 rank2 rank3
liulei 2015-04-12 7 1 1 1
liulei 2015-04-11 5 2 2 2
liulei 2015-04-15 4 3 3 3
liulei 2015-04-16 4 3 3 4
liulei 2015-04-13 3 5 4 5
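liulei 2015-04-14 2 6 5 6

The two rows with buy_num 4 tie for third place. rank() gives both of them 3 and then skips 4, so the next row is ranked 5. dense_rank() gives both of them 3 and continues with 4. row_number() ignores the tie and numbers them 3 and 4. Because the two rows are equal under the sort key, which of them receives row number 3 is not guaranteed.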
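
A common use of row_number() is keeping the top N rows per group. The query below is a minimal sketch of that pattern against the same buy_info table. The secondary sort key buy_date, the alias rn, and the cutoff of 3 are assumptions chosen for illustration; the extra sort key is there only to make the numbering of tied rows repeatable.

```sql
-- Keep the three largest purchases per name.
-- buy_date is an assumed tie-breaker so that rows with equal buy_num
-- are always numbered in the same order.
select name, buy_date, buy_num
from (
    select name, buy_date, buy_num,
           row_number() over (partition by name
                              order by buy_num desc, buy_date asc) as rn
    from buy_info
) t
where rn <= 3;
```

With the sample rows this keeps the purchases of 7 and 5 and the 2015-04-15 purchase of 4. Replacing row_number() with rank() or dense_rank() in the subquery would instead keep both rows that tie at 4.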