Oracle SQL Practice, Part 8: The OVER() Window Function
阿新 • Published: 2020-12-08
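The OVER() window clause is most commonly paired with the following functions:
- rank(), dense_rank(), row_number() + over(partition by … order by …): ranking
- sum(), avg(), count() aggregate functions + over(partition by … order by …)
- max(), min() + over(partition by … order by …): maximum and minimum
- first_value(), last_value() + over(partition by … order by …): first and last record
- lag(), lead() + over(partition by … order by …): offset rows
Here partition by groups the rows and order by sorts them. This grouping is not the same as group by: the most visible difference is that group by collapses each group into a single row and so changes the number of rows returned, whereas partition by keeps every input row.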
The table used in the examples is created as follows:
create table LX_05_SALARY
(
id NUMBER,
department_name VARCHAR2(100),
sal NUMBER,
pay_date DATE
);
Insert statements:
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (1, 'A部門', 80000, to_date('10-01-2020', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (2, 'B部門', 60000, to_date('10-01-2020', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (3, 'C部門', 100000, to_date('10-01-2020', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (4, 'A部門', 70000, to_date('10-12-2019', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (5, 'B部門', 60000, to_date('10-12-2019', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (6, 'C部門', 80000, to_date('10-12-2019', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (7, 'C部門', 48000, to_date('10-12-2019', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (8, 'C部門', 92000, to_date('10-12-2019', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (9, 'B部門', 90000, to_date('10-12-2019', 'dd-mm-yyyy'));
insert into lx_05_salary (ID, DEPARTMENT_NAME, SAL, PAY_DATE) values (10, 'B部門', 50000, to_date('10-12-2019', 'dd-mm-yyyy'));
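To make the earlier distinction between group by and partition by concrete, the two queries below can be run against the sample data once it is loaded. This is only a small sketch; the alias dept_total is a name chosen for illustration and does not appear elsewhere in the post.

```sql
-- group by collapses each department into a single row: 3 rows come back
-- (A部門 150000, B部門 260000, C部門 320000).
select a.department_name,
       sum(a.sal) as dept_total
  from lx_05_salary a
 group by a.department_name;

-- partition by keeps every input row: all 10 rows come back,
-- each one carrying the total of its own department.
select a.department_name,
       a.id,
       a.sal,
       sum(a.sal) over(partition by a.department_name) as dept_total
  from lx_05_salary a;
```

The second form is what makes per-row calculations such as "this salary as a share of its department" possible without a self-join.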
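Usage in detail:
- rank(), dense_rank(), row_number() + over(partition by … order by …): ranking
select a.department_name,
a.id,
a.sal,
rank() over(partition by a.department_name order by a.sal desc) as rank排名,
dense_rank() over(partition by a.department_name order by a.sal desc) as dense_rank排名,
row_number() over(partition by a.department_name order by a.sal desc) as row_number排序
from lx_05_salary a;
The query results show the following:
rank(): when ties occur, the following rank skips over the tied positions. If two rows tie for second place, the next row is ranked fourth.
dense_rank(): unlike rank(), ties do not leave a gap, so the ranking continues consecutively. If two rows tie for second place, the next row is ranked third.
row_number(): never produces ties; numbering simply continues in sequence. Which of two tied rows gets the lower number is not guaranteed unless the order by is made unique.
Once the differences among the three are clear, pick whichever fits the actual use case.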
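One common case where the choice matters is a "top N per group" filter. The sketch below is an assumed follow-up query, not part of the original post, and rk is a hypothetical alias. It keeps the two highest salary levels in each department. Because analytic functions cannot be referenced directly in a where clause, the ranking is computed in an inline view first.

```sql
-- Keep the top two salary levels per department.
-- dense_rank() keeps both tied rows in B部門 (60000 twice), so 7 rows come back.
select *
  from (select a.department_name,
               a.id,
               a.sal,
               dense_rank() over(partition by a.department_name
                                 order by a.sal desc) as rk
          from lx_05_salary a)
 where rk <= 2
 order by department_name, rk;
```

Swapping in row_number() would return exactly two rows per department (6 in total) and drop one of the tied B部門 rows arbitrarily, which is usually the deciding factor between the two.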
- sum(), avg(), count() aggregate functions + over(partition by … order by …)
select a.department_name,a.id,a.sal,
sum(a.sal)over(partition by a.department_name order by a.id ) as 部門內連續求和 ,
sum(a.sal)over(partition by a.department_name) as 部門求和,
round(a.sal/sum(a.sal)over(partition by a.department_name),4)*100 as 每人佔部門份額 ,
sum(a.sal)over(order by a.department_name) as 部門連續求和,
sum(a.sal)over(order by a.id) as 人員連續求和,
sum(a.sal) over() as 總計,
round(a.sal/sum(a.sal) over(),4)*100 as 人員份額,
round(sum(a.sal) over(partition by a.department_name)/sum(a.sal) over(),4)*100 as 部門份額
from lx_05_salary a
order by a.department_name;
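With order by inside over(), the sum accumulates row by row within the partition (a running total), because the default window then ends at the current row; rows that tie on the order by key are added in at the same step. Without order by, every row simply gets the total for its whole partition. avg() and count() behave the same way.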
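The running-total behavior comes from the window frame that applies when order by is present, which in Oracle defaults to range between unbounded preceding and current row. The sketch below spells that frame out and contrasts it with a two-row sliding frame; the aliases running_default, running_explicit, and sum_last_two are illustrative names only.

```sql
select a.department_name,
       a.id,
       a.sal,
       -- default frame once order by is present
       sum(a.sal) over(partition by a.department_name order by a.id) as running_default,
       -- the same frame written out explicitly
       sum(a.sal) over(partition by a.department_name order by a.id
                       range between unbounded preceding and current row) as running_explicit,
       -- a sliding frame: the current row plus the one before it
       sum(a.sal) over(partition by a.department_name order by a.id
                       rows between 1 preceding and current row) as sum_last_two
  from lx_05_salary a
 order by a.department_name, a.id;
-- For B部門 (ids 2, 5, 9, 10 with salaries 60000, 60000, 90000, 50000):
--   running_default / running_explicit: 60000, 120000, 210000, 260000
--   sum_last_two:                       60000, 120000, 150000, 140000
```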
- max(), min() + over(partition by … order by …): maximum and minimum
select a.department_name,
a.id,
a.sal,
max(a.sal) over(partition by a.department_name order by a.sal desc) as max_desc,
min(a.sal) over(partition by a.department_name order by a.sal desc) as min_desc_失效,
min(a.sal) over(partition by a.department_name order by a.sal desc rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as min_desc,
max(a.sal) over(partition by a.department_name order by a.sal asc) as max_asc_失效,
max(a.sal) over(partition by a.department_name order by a.sal asc rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as max_asc,
min(a.sal) over(partition by a.department_name order by a.sal asc) as min_asc
from lx_05_salary a;
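When min() and max() are combined with order by, they can appear not to work: the default window stops at the current row, so min() over a descending sort (or max() over an ascending sort) returns a running value rather than the partition-wide one. Either drop the order by, or give the frame explicitly with rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, as in the example above. Another author's article, referenced in the original post, covers this in more detail.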
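- first_value(), last_value() + over(partition by … order by …): first and last record
first_value() and last_value() are subject to the same default window as max() and min(). In practice it is last_value() that gives the unexpected result: without an explicit frame it returns the current row's value rather than the last value in the partition.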
select a.department_name,
a.id,
a.sal,
first_value(a.sal) over(partition by a.department_name order by a.sal desc) as first_value,
last_value(a.sal) over(partition by a.department_name order by a.sal desc rows BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_value,
last_value(a.sal) over(partition by a.department_name order by a.sal asc) as last_value_無效
from lx_05_salary a;
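An alternative to widening the frame is to reverse the sort and use first_value(), since the first row of the partition is always inside the default window. This is a sketch of that workaround rather than something shown in the original post; lowest_sal is a hypothetical alias.

```sql
-- The lowest salary per department, without an explicit frame clause:
-- sort ascending and take the first value instead of sorting
-- descending and taking the last.
select a.department_name,
       a.id,
       a.sal,
       first_value(a.sal) over(partition by a.department_name
                               order by a.sal asc) as lowest_sal
  from lx_05_salary a;
```

The explicit-frame version is still the clearer choice when a query needs both the first and the last value under the same sort order.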
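- lag(), lead() + over(partition by … order by …): offset rows
Syntax:
lag(target_column, offset, default_value) + over(partition by … order by …)
lead(target_column, offset, default_value) + over(partition by … order by …)
Examples:
Without a default value, rows that have no previous or next row get NULL: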
select a.*,
lag(a.id, 1 ) over(partition by a.department_name order by a.id) as 同部門_上一id,
lead(a.id, 1 ) over(partition by a.department_name order by a.id) as 同部門_下一id,
lag(a.id,1) over(order by a.id) as 全部_上一id ,
lead(a.id,1) over(order by a.id) as 全部_下一id
from lx_05_salary a
order by a.department_name,a.id;
If the offset is omitted it defaults to 1, as in the following example:
select a.*,
lag(a.id ) over(partition by a.department_name order by a.id) as 同部門_上一id,
lead(a.id ) over(partition by a.department_name order by a.id) as 同部門_下一id,
lag(a.id ) over(order by a.id) as 全部_上一id ,
lead(a.id ) over(order by a.id) as 全部_下一id
from lx_05_salary a
order by a.department_name,a.id;
Specifying a default value:
select a.*,
lag(a.id, 1,999999 ) over(partition by a.department_name order by a.id) as 同部門_上一id,
lead(a.id, 1,999999 ) over(partition by a.department_name order by a.id) as 同部門_下一id,
lag(a.id,1,999999 ) over(order by a.id) as 全部_上一id ,
lead(a.id,1,999999 ) over(order by a.id) as 全部_下一id
from lx_05_salary a
order by a.department_name,a.id;
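The examples above only shift the id column, but lag() and lead() are most often used to compare a row with its neighbor. As a hedged illustration (the query and the alias gap_to_higher are assumptions, not from the original post), the following computes how far each salary sits below the next higher one in the same department.

```sql
select a.department_name,
       a.id,
       a.sal,
       -- previous row in descending salary order = next higher salary
       lag(a.sal) over(partition by a.department_name
                       order by a.sal desc) - a.sal as gap_to_higher
  from lx_05_salary a
 order by a.department_name, a.sal desc;
-- For B部門 (90000, 60000, 60000, 50000) the gaps are NULL, 30000, 0, 10000.
```

The top earner in each department has no previous row, so the gap is NULL; passing a third argument to lag(), as in the default-value example, would replace it.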