Hive: randomly sampling a fixed number of rows per group
阿新 • Published: 2019-02-11
Requirement: group employees by rank, then randomly pick 2 rows from each rank.
Create the table:
create table temp.a
(
    id   string,
    name string,
    age  string,
    rank string
)
row format delimited fields terminated by ',';
load data local inpath 'a.txt' into table temp.a;
select * from temp.a;
id | name | age | rank |
1 | a | 10 | p1 |
2 | b | 34 | p1 |
3 | c | 23 | p2 |
4 | d | 33 | p2 |
5 | e | 23 | p2 |
6 | f | 67 | p3 |
7 | g | 34 | p3 |
8 | h | 12 | p4 |
9 | i | 54 | p5 |
SQL:
select id,
       name,
       age,
       rank
from (
    -- number the rows within each rank in random order
    select id,
           name,
           age,
           rank,
           row_number() over (partition by rank order by rand()) as rn
    from temp.a
) t
where t.rn <= 2;  -- keep at most 2 rows per rank
Result 1:
Result 2:
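Because the sampled rows change from run to run, one simple way to check the query is to count how many rows come back per rank instead of comparing the rows themselves. The sketch below assumes the table temp.a created above; the alias cnt is only illustrative.

```sql
-- Sanity check: count the sampled rows per rank.
-- Assumes the table temp.a from above; cnt is an illustrative alias.
select t.rank,
       count(*) as cnt
from (
    select rank,
           row_number() over (partition by rank order by rand()) as rn
    from temp.a
) t
where t.rn <= 2
group by t.rank;
```

With the nine sample rows shown earlier, p1, p2 and p3 should each report 2, while p4 and p5 report 1, since those ranks contain only a single row.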
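Note: with a fixed seed, e.g. order by rand(1), the generated random sequence is the same on every run, so the ordering, and therefore the sampled rows, come out the same each time.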
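A minimal sketch of the seeded variant, again assuming the table temp.a from above. The seed value 1 is arbitrary. Repeatability also likely depends on the rows being read in the same order on each run, which is an assumption that may not hold once the data or the execution plan changes.

```sql
-- Seeded variant: rand(1) produces the same random sequence on every run,
-- so the same rows are picked as long as the input is read in the same order.
select id,
       name,
       age,
       rank
from (
    select id,
           name,
           age,
           rank,
           row_number() over (partition by rank order by rand(1)) as rn
    from temp.a
) t
where t.rn <= 2;
```

A fixed seed is handy when a sample has to be reproduced, for example for debugging; for a fresh sample on every run, use rand() without an argument.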