Hive: randomly sampling a fixed number of rows per group

The requirement: group employees by job rank, then randomly pick 2 rows from each rank.

Create the table:

create table temp.a
(id    string,
 name  string,
 age   string,
 rank  string
)
ROW format delimited FIELDS TERMINATED BY ',' ;

load data local inpath 'a.txt' into table temp.a;
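If a.txt is not at hand, the same nine rows can be entered directly. This is only a sketch: it assumes a Hive version that supports INSERT ... VALUES (0.14 or later), and the values are copied from the query output shown below. Use it instead of the load statement, not in addition to it, or every row will appear twice.

```sql
-- Alternative to "load data": insert the sample rows by hand.
-- Assumes Hive 0.14+ (INSERT ... VALUES); values mirror the output listed below.
insert into table temp.a values
  ('1', 'a', '10', 'p1'),
  ('2', 'b', '34', 'p1'),
  ('3', 'c', '23', 'p2'),
  ('4', 'd', '33', 'p2'),
  ('5', 'e', '23', 'p2'),
  ('6', 'f', '67', 'p3'),
  ('7', 'g', '34', 'p3'),
  ('8', 'h', '12', 'p4'),
  ('9', 'i', '54', 'p5');
```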

select  * from temp.a;


id name age rank
1 a 10 p1
2 b 34 p1
3 c 23 p2
4 d 33 p2
5 e 23 p2
6 f 67 p3
7 g 34 p3
8 h 12 p4
9 i 54 p5

SQL:

select id,
       name,
       age,
       rank
from (
      -- number the rows of each rank in random order
      select id,
             name,
             age,
             rank,
             row_number() over (partition by rank order by rand()) as rn
      from temp.a
     ) t
-- keep at most 2 rows per rank
where t.rn <= 2;
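A quick sanity check, sketched here as an optional extra rather than part of the original solution, counts how many rows each rank contributes to the sample. A rank with fewer than 2 rows simply returns everything it has, so with the sample data above p1, p2 and p3 contribute 2 rows each while p4 and p5 contribute 1 each.

```sql
-- Count the sampled rows per rank (same sampling logic as the main query).
select rank,
       count(*) as sampled_rows
from (
      select rank,
             row_number() over (partition by rank order by rand()) as rn
      from temp.a
     ) t
where t.rn <= 2
group by rank;
```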

Result 1: [image not available]

Result 2: [image not available]


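Note: with order by rand(1), that is, with a fixed seed, the rows are ordered the same way on every run, so the query returns the same result each time.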
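A seeded variant of the query might look like the sketch below. The seed value 1 is arbitrary. Repeatability here assumes the rows reach rand() in the same order on each run, which is likely for a small single-file table like this one but may not hold for larger tables split across several files or tasks.

```sql
-- Same sampling query, but with a fixed seed so repeated runs pick the same rows.
-- Assumption: the input rows are read in the same order on every run.
select id,
       name,
       age,
       rank
from (
      select id,
             name,
             age,
             rank,
             row_number() over (partition by rank order by rand(1)) as rn
      from temp.a
     ) t
where t.rn <= 2;
```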