MySQL8.0視窗函式入門實踐及總結
阿新 • • 發佈:2020-07-10
前言
MySQL8.0之前,做資料排名統計等相當痛苦,因為沒有像Oracle、SQL SERVER 、PostgreSQL等其他資料庫那樣的視窗函式。但隨著MySQL8.0中新增了視窗函式之後,針對這類統計就再也不是事了,本文就以常用的排序例項介紹MySQL的視窗函式。
1、準備工作
建立表及測試資料
mysql> use testdb; Database changed /* 建立表 */ mysql> create table tb_score(id int primary key auto_increment,stu_no varchar(10),course varchar(50),score decimal(4,1),key idx_stuNo_course(stu_no,course)); Query OK,0 rows affected (0.03 sec) mysql> show tables; +------------------+ | Tables_in_testdb | +------------------+ | tb_score | +------------------+ /* 新增一批測試資料 */ mysql> insert into tb_score(stu_no,course,score)values('2020001','mysql',90),('2020001','C++',85),('2020003','English',100),('2020002',50),70),99); Query OK,6 rows affected (0.00 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql> insert into tb_score(stu_no,score)values('2020003',78),81),80),('2020004',60),100); Query OK,6 rows affected (0.01 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql> insert into tb_score(stu_no,score)values('2020005',98),('2020005',96),('2020006',70); Query OK,score)values('2020007',('2020007',66),76),('2020008',69),86); Query OK,score)values('2020009',('2020009',86),('2020010',75),81); Query OK,score)values('2020011',('2020012',('2020011',84),('2020013',88); Query OK,6 rows affected (0.01 sec) Records: 6 Duplicates: 0 Warnings: 0
2、統計每門課程分數的排名
根據每門課程的分數從高到低進行排名,此時,會出現分數相同時怎麼處理的問題,下面就根據不同的視窗函式來處理不同場景的需求
ROW_NUMBER
由結果可以看出,分數相同時按照學號順序進行排名
mysql> select stu_no,score,row_number()over(partition by course order by score desc ) rn -> from tb_score; +---------+---------+-------+----+ | stu_no | course | score | rn | +---------+---------+-------+----+ | 2020005 | C++ | 96.0 | 1 | | 2020013 | C++ | 96.0 | 2 | | 2020006 | C++ | 90.0 | 3 | | 2020001 | C++ | 85.0 | 4 | | 2020012 | C++ | 85.0 | 5 | | 2020003 | C++ | 81.0 | 6 | | 2020010 | C++ | 76.0 | 7 | | 2020002 | C++ | 70.0 | 8 | | 2020008 | C++ | 69.0 | 9 | | 2020007 | C++ | 66.0 | 10 | | 2020009 | C++ | 66.0 | 11 | | 2020004 | C++ | 60.0 | 12 | | 2020003 | English | 100.0 | 1 | | 2020004 | English | 100.0 | 2 | | 2020002 | English | 99.0 | 3 | | 2020013 | English | 88.0 | 4 | | 2020008 | English | 86.0 | 5 | | 2020009 | English | 86.0 | 6 | | 2020011 | English | 84.0 | 7 | | 2020010 | English | 81.0 | 8 | | 2020003 | English | 80.0 | 9 | | 2020007 | English | 76.0 | 10 | | 2020012 | English | 75.0 | 11 | | 2020005 | English | 70.0 | 12 | | 2020006 | English | 70.0 | 13 | | 2020005 | mysql | 98.0 | 1 | | 2020001 | mysql | 90.0 | 2 | | 2020008 | mysql | 90.0 | 3 | | 2020011 | mysql | 90.0 | 4 | | 2020004 | mysql | 80.0 | 5 | | 2020003 | mysql | 78.0 | 6 | | 2020010 | mysql | 75.0 | 7 | | 2020009 | mysql | 70.0 | 8 | | 2020006 | mysql | 60.0 | 9 | | 2020002 | mysql | 50.0 | 10 | | 2020007 | mysql | 50.0 | 11 | +---------+---------+-------+----+ 36 rows in set (0.00 sec)mysql> select stu_no,row_number()over(partition by course order by score desc ) rn -> from tb_score; +---------+---------+-------+----+ | stu_no | course | score | rn | +---------+---------+-------+----+ | 2020005 | C++ | 96.0 | 1 | | 2020013 | C++ | 96.0 | 2 | | 2020006 | C++ | 90.0 | 3 | | 2020001 | C++ | 85.0 | 4 | | 2020012 | C++ | 85.0 | 5 | | 2020003 | C++ | 81.0 | 6 | | 2020010 | C++ | 76.0 | 7 | | 2020002 | C++ | 70.0 | 8 | | 2020008 | C++ | 69.0 | 9 | | 2020007 | C++ | 66.0 | 10 | | 2020009 | C++ | 66.0 | 11 | | 2020004 | C++ | 60.0 | 12 | | 2020003 | English | 100.0 | 1 | | 2020004 | English | 100.0 | 2 | | 2020002 | English | 99.0 | 3 | | 2020013 | English | 88.0 | 4 | | 2020008 | English | 86.0 | 5 | | 2020009 | English | 86.0 | 6 | | 2020011 | English | 84.0 | 7 | | 2020010 | English | 81.0 | 8 | | 2020003 | English | 80.0 | 9 | | 2020007 | English | 76.0 | 10 | | 2020012 | English | 75.0 | 11 | | 2020005 | English | 70.0 | 12 | | 2020006 | English | 70.0 | 13 | | 2020005 | mysql | 98.0 | 1 | | 2020001 | mysql | 90.0 | 2 | | 2020008 | mysql | 90.0 | 3 | | 2020011 | mysql | 90.0 | 4 | | 2020004 | mysql | 80.0 | 5 | | 2020003 | mysql | 78.0 | 6 | | 2020010 | mysql | 75.0 | 7 | | 2020009 | mysql | 70.0 | 8 | | 2020006 | mysql | 60.0 | 9 | | 2020002 | mysql | 50.0 | 10 | | 2020007 | mysql | 50.0 | 11 | +---------+---------+-------+----+ 36 rows in set (0.00 sec)
DENSE_RANK
為了讓分數相同時排名也相同,則可以使用DENSE_RANK函式,結果如下:
mysql> select stu_no,DENSE_RANK()over(partition by course order by score desc ) rn -> from tb_score ; +---------+---------+-------+----+ | stu_no | course | score | rn | +---------+---------+-------+----+ | 2020005 | C++ | 96.0 | 1 | | 2020013 | C++ | 96.0 | 1 | | 2020006 | C++ | 90.0 | 2 | | 2020001 | C++ | 85.0 | 3 | | 2020012 | C++ | 85.0 | 3 | | 2020003 | C++ | 81.0 | 4 | | 2020010 | C++ | 76.0 | 5 | | 2020002 | C++ | 70.0 | 6 | | 2020008 | C++ | 69.0 | 7 | | 2020007 | C++ | 66.0 | 8 | | 2020009 | C++ | 66.0 | 8 | | 2020004 | C++ | 60.0 | 9 | | 2020003 | English | 100.0 | 1 | | 2020004 | English | 100.0 | 1 | | 2020002 | English | 99.0 | 2 | | 2020013 | English | 88.0 | 3 | | 2020008 | English | 86.0 | 4 | | 2020009 | English | 86.0 | 4 | | 2020011 | English | 84.0 | 5 | | 2020010 | English | 81.0 | 6 | | 2020003 | English | 80.0 | 7 | | 2020007 | English | 76.0 | 8 | | 2020012 | English | 75.0 | 9 | | 2020005 | English | 70.0 | 10 | | 2020006 | English | 70.0 | 10 | | 2020005 | mysql | 98.0 | 1 | | 2020001 | mysql | 90.0 | 2 | | 2020008 | mysql | 90.0 | 2 | | 2020011 | mysql | 90.0 | 2 | | 2020004 | mysql | 80.0 | 3 | | 2020003 | mysql | 78.0 | 4 | | 2020010 | mysql | 75.0 | 5 | | 2020009 | mysql | 70.0 | 6 | | 2020006 | mysql | 60.0 | 7 | | 2020002 | mysql | 50.0 | 8 | | 2020007 | mysql | 50.0 | 8 | +---------+---------+-------+----+ 36 rows in set (0.00 sec)
RANK
DENSE_RANK的結果是分數相同時排名相同了,但是下一個名次是緊接著上一個名次的,如果2個並列的第1之後,下一個我想是第3名,則可以使用RANK函式實現
mysql> select stu_no,rank()over(partition by course order by score desc ) rn -> from tb_score; +---------+---------+-------+----+ | stu_no | course | score | rn | +---------+---------+-------+----+ | 2020005 | C++ | 96.0 | 1 | | 2020013 | C++ | 96.0 | 1 | | 2020006 | C++ | 90.0 | 3 | | 2020001 | C++ | 85.0 | 4 | | 2020012 | C++ | 85.0 | 4 | | 2020003 | C++ | 81.0 | 6 | | 2020010 | C++ | 76.0 | 7 | | 2020002 | C++ | 70.0 | 8 | | 2020008 | C++ | 69.0 | 9 | | 2020007 | C++ | 66.0 | 10 | | 2020009 | C++ | 66.0 | 10 | | 2020004 | C++ | 60.0 | 12 | | 2020003 | English | 100.0 | 1 | | 2020004 | English | 100.0 | 1 | | 2020002 | English | 99.0 | 3 | | 2020013 | English | 88.0 | 4 | | 2020008 | English | 86.0 | 5 | | 2020009 | English | 86.0 | 5 | | 2020011 | English | 84.0 | 7 | | 2020010 | English | 81.0 | 8 | | 2020003 | English | 80.0 | 9 | | 2020007 | English | 76.0 | 10 | | 2020012 | English | 75.0 | 11 | | 2020005 | English | 70.0 | 12 | | 2020006 | English | 70.0 | 12 | | 2020005 | mysql | 98.0 | 1 | | 2020001 | mysql | 90.0 | 2 | | 2020008 | mysql | 90.0 | 2 | | 2020011 | mysql | 90.0 | 2 | | 2020004 | mysql | 80.0 | 5 | | 2020003 | mysql | 78.0 | 6 | | 2020010 | mysql | 75.0 | 7 | | 2020009 | mysql | 70.0 | 8 | | 2020006 | mysql | 60.0 | 9 | | 2020002 | mysql | 50.0 | 10 | | 2020007 | mysql | 50.0 | 10 | +---------+---------+-------+----+ 36 rows in set (0.01 sec)
這樣就實現了各種排序需求。
NTILE
NTILE函式的作用是對每個分組排名後,再將對應分組分成N個小組,例如
mysql> select stu_no,rank()over(partition by course order by score desc )rn,NTILE(2)over(partition by course order by score desc ) rn_group from tb_score; +---------+---------+-------+----+----------+ | stu_no | course | score | rn | rn_group | +---------+---------+-------+----+----------+ | 2020005 | C++ | 96.0 | 1 | 1 | | 2020013 | C++ | 96.0 | 1 | 1 | | 2020006 | C++ | 90.0 | 3 | 1 | | 2020001 | C++ | 85.0 | 4 | 1 | | 2020012 | C++ | 85.0 | 4 | 1 | | 2020003 | C++ | 81.0 | 6 | 1 | | 2020010 | C++ | 76.0 | 7 | 2 | | 2020002 | C++ | 70.0 | 8 | 2 | | 2020008 | C++ | 69.0 | 9 | 2 | | 2020007 | C++ | 66.0 | 10 | 2 | | 2020009 | C++ | 66.0 | 10 | 2 | | 2020004 | C++ | 60.0 | 12 | 2 | | 2020003 | English | 100.0 | 1 | 1 | | 2020004 | English | 100.0 | 1 | 1 | | 2020002 | English | 99.0 | 3 | 1 | | 2020013 | English | 88.0 | 4 | 1 | | 2020008 | English | 86.0 | 5 | 1 | | 2020009 | English | 86.0 | 5 | 1 | | 2020011 | English | 84.0 | 7 | 1 | | 2020010 | English | 81.0 | 8 | 2 | | 2020003 | English | 80.0 | 9 | 2 | | 2020007 | English | 76.0 | 10 | 2 | | 2020012 | English | 75.0 | 11 | 2 | | 2020005 | English | 70.0 | 12 | 2 | | 2020006 | English | 70.0 | 12 | 2 | | 2020005 | mysql | 98.0 | 1 | 1 | | 2020001 | mysql | 90.0 | 2 | 1 | | 2020008 | mysql | 90.0 | 2 | 1 | | 2020011 | mysql | 90.0 | 2 | 1 | | 2020004 | mysql | 80.0 | 5 | 1 | | 2020003 | mysql | 78.0 | 6 | 1 | | 2020010 | mysql | 75.0 | 7 | 2 | | 2020009 | mysql | 70.0 | 8 | 2 | | 2020006 | mysql | 60.0 | 9 | 2 | | 2020002 | mysql | 50.0 | 10 | 2 | | 2020007 | mysql | 50.0 | 10 | 2 | +---------+---------+-------+----+----------+ 36 rows in set (0.01 sec)
3、視窗函式小結
MySQL中還有許多其他的視窗函式,本文列舉一些,大家可以自行測試
類別 | 函式 | 說明 |
排序 | ROW_NUMBER | 為表中的每一行分配一個序號,可以指定分組(也可以不指定)及排序欄位 |
DENSE_RANK | 根據排序欄位為每個分組中的每一行分配一個序號。 排名值相同時,序號相同,序號中沒有間隙(1,1,2,3這種) | |
RANK | 根據排序欄位為每個分組中的每一行分配一個序號。 排名值相同時,序號相同,但序號中存在間隙(1,3,4這種) | |
NTILE | 根據排序欄位為每個分組中根據指定欄位的排序再分成對應的組 | |
分佈 | PERCENT_RANK | 計算各分組或結果集中行的百分數等級 |
CUME_DIST | 計算某個值在一組有序的資料中累計的分佈 | |
前後 | LEAD | 返回分組中當前行之後的第N行的值。如果不存在對應行,則返回NULL。比如N=1時,第一名對應的值是第二名的,最後一名結果是NULL |
LAG | 返回分組中當前行之前的第N行的值。如果不存在對應行,則返回NULL。比如N=1時,第一名對應的值是是NUL,最後一名結果是倒數第2的值 | |
首尾中 | FIRST_VALUE | 返回每個分組中第一名對應的欄位(或表示式)的值,例如本文中可以是第一名的分數、學號等任意欄位的值 |
LAST_VALUE | 返回每個分組中最後一名對應的欄位(或表示式)的值,例如本文中可以是最後一名的分數、學號等任意欄位的值 | |
NTH_VALUE |
返回每個分組中排名第N的對應欄位(或表示式)的值,但小於N的行對應的值是NULL |
MySQL中主要的視窗函式先總結這麼多,建議還是得動手實踐一番。另外,MySQL5.7及之前版本的排序方式的實現很多人已總結,也建議實操一番。
總結
到此這篇關於MySQL8.0視窗函式入門實踐及總結的文章就介紹到這了,更多相關MySQL8.0視窗函式實踐內容請搜尋我們以前的文章或繼續瀏覽下面的相關文章希望大家以後多多支援我們!