
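MySQL: techniques for optimizing SQL with join, explained in detail

0. Preparing the tables used in the tests below

For the table creation statements, see: https://github.com/YangBaohust/my_sql

user1 table: the pilgrimage group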
+----+-----------+-----------------+---------------------------------+
| id | user_name | comment   | mobile       |
+----+-----------+-----------------+---------------------------------+
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349   |
| 2 | 孫悟空 | 鬥戰勝佛  | 159384292,022-483432,+86-392432 |
| 3 | 豬八戒 | 淨壇使者  | 183208243,055-8234234   |
| 4 | 沙僧  | 金身羅漢  | 293842295,098-2383429   |
| 5 | NULL  | 白龍馬   | 993267899      |
+----+-----------+-----------------+---------------------------------+

user2 table: Wukong's (孫悟空) circle of friends
+----+--------------+-----------+
| id | user_name | comment |
+----+--------------+-----------+
| 1 | 孫悟空  | 美猴王 |
| 2 | 牛魔王  | 牛哥  |
| 3 | 鐵扇公主  | 牛夫人 |
| 4 | 菩提老祖  | 葡萄  |
| 5 | NULL   | 晶晶  |
+----+--------------+-----------+

user1_kills table: number of demons killed on the pilgrimage
+----+-----------+---------------------+-------+
| id | user_name | timestr    | kills |
+----+-----------+---------------------+-------+
| 1 | 孫悟空 | 2013-01-10 00:00:00 | 10 |
| 2 | 孫悟空 | 2013-02-01 00:00:00 |  2 |
| 3 | 孫悟空 | 2013-02-05 00:00:00 | 12 |
| 4 | 孫悟空 | 2013-02-12 00:00:00 | 22 |
| 5 | 豬八戒 | 2013-01-11 00:00:00 | 20 |
| 6 | 豬八戒 | 2013-02-07 00:00:00 | 17 |
| 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 |
| 8 | 沙僧  | 2013-01-10 00:00:00 |  3 |
| 9 | 沙僧  | 2013-01-22 00:00:00 |  9 |
| 10 | 沙僧  | 2013-02-11 00:00:00 |  5 |
+----+-----------+---------------------+-------+

user1_equipment table: the pilgrimage group's equipment
+----+-----------+--------------+-----------------+-----------------+
| id | user_name | arms   | clothing  | shoe   |
+----+-----------+--------------+-----------------+-----------------+
| 1 | 唐僧  | 九環錫杖  | 錦斕袈裟  | 僧鞋   |
| 2 | 孫悟空 | 金箍棒  | 梭子黃金甲  | 藕絲步雲履  |
| 3 | 豬八戒 | 九齒釘耙  | 僧衣   | 僧鞋   |
| 4 | 沙僧  | 降妖寶杖  | 僧衣   | 僧鞋   |
+----+-----------+--------------+-----------------+-----------------+

1. Using left join to optimize a not in clause

Example: find the members of the pilgrimage group who are not in Wukong's circle of friends

+----+-----------+-----------------+-----------------------+
| id | user_name | comment   | mobile    |
+----+-----------+-----------------+-----------------------+
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349 |
| 3 | 豬八戒 | 淨壇使者  | 183208243,055-8234234 |
| 4 | 沙僧  | 金身羅漢  | 293842295,098-2383429 |
+----+-----------+-----------------+-----------------------+

The not in version:

select * from user1 a where a.user_name not in (select user_name from user2 where user_name is not null);
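
The "where user_name is not null" filter inside the subquery matters: user2 contains a row whose user_name is NULL, and a not in comparison against a list that contains NULL never evaluates to true. A minimal sketch of that behavior, using literal values for the first two statements and the tables above for the third:

```sql
-- not in against a list without NULL: evaluates to 1 (true)
select 1 not in (2, 3);

-- not in against a list containing NULL: evaluates to NULL, never true
select 1 not in (2, null);

-- so without the "is not null" filter, the NULL row of user2 ends up in
-- the list and this query returns no rows at all
select * from user1 a where a.user_name not in (select user_name from user2);
```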

The left join version:

First, look at the outer-join result set produced by joining on user_name:

select a.*,b.* from user1 a left join user2 b on (a.user_name = b.user_name);
+----+-----------+-----------------+---------------------------------+------+-----------+-----------+
| id | user_name | comment   | mobile       | id | user_name | comment |
+----+-----------+-----------------+---------------------------------+------+-----------+-----------+
| 2 | 孫悟空 | 鬥戰勝佛  | 159384292,022-483432,+86-392432 | 1 | 孫悟空 | 美猴王 |
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349   | NULL | NULL  | NULL  |
| 3 | 豬八戒 | 淨壇使者  | 183208243,055-8234234   | NULL | NULL  | NULL  |
| 4 | 沙僧  | 金身羅漢  | 293842295,098-2383429   | NULL | NULL  | NULL  |
| 5 | NULL  | 白龍馬   | 993267899      | NULL | NULL  | NULL  |
+----+-----------+-----------------+---------------------------------+------+-----------+-----------+

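As shown, every row of table a appears, while the columns of table b are populated only where b.user_name equals a.user_name and are otherwise filled with NULL. To find the pilgrimage group members who are not in Wukong's circle of friends, we only need to add the filter b.user_name is null.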

select a.* from user1 a left join user2 b on (a.user_name = b.user_name) where b.user_name is null;
+----+-----------+-----------------+-----------------------+
| id | user_name | comment   | mobile    |
+----+-----------+-----------------+-----------------------+
| 1 | 唐僧  | 旃檀功德佛  | 138245623,021-382349 |
| 3 | 豬八戒 | 淨壇使者  | 183208243,055-8234234 |
| 4 | 沙僧  | 金身羅漢  | 293842295,098-2383429 |
| 5 | NULL  | 白龍馬   | 993267899    |
+----+-----------+-----------------+-----------------------+

The result set now contains one extra row, 白龍馬, whose user_name is NULL. Adding one more filter, a.user_name is not null, removes it.

select a.* from user1 a left join user2 b on (a.user_name = b.user_name) where b.user_name is null and a.user_name is not null;
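
For comparison, the same anti-join can also be written with not exists. This form is not part of the original examples; it is a sketch of an alternative that should return the same three rows on the sample data. The explicit a.user_name is not null filter is still required, because for a NULL user_name the correlated subquery finds no match and not exists would otherwise keep the row.

```sql
-- assumed alternative, not from the original article: anti-join via not exists
select a.*
from user1 a
where a.user_name is not null
  and not exists (select 1 from user2 b where b.user_name = a.user_name);
```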

2. Using left join to optimize a scalar subquery

Example: look up each pilgrimage group member's nickname in Wukong's circle of friends

+-----------+-----------------+-----------+
| user_name | comment   | comment2 |
+-----------+-----------------+-----------+
| 唐僧  | 旃檀功德佛  | NULL  |
| 孫悟空 | 鬥戰勝佛  | 美猴王 |
| 豬八戒 | 淨壇使者  | NULL  |
| 沙僧  | 金身羅漢  | NULL  |
| NULL  | 白龍馬   | NULL  |
+-----------+-----------------+-----------+

The subquery version:

select a.user_name,a.comment,(select comment from user2 b where b.user_name = a.user_name) comment2 from user1 a;

The left join version:

select a.user_name,a.comment,b.comment comment2 from user1 a left join user2 b on (a.user_name = b.user_name);
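
The two forms are interchangeable only while user_name is unique in user2. If a name appears more than once there (as it does after the inserts in example 8), the scalar subquery fails with ERROR 1242 (Subquery returns more than 1 row), whereas the left join returns one output row per match. A quick check for that situation might look like this sketch:

```sql
-- list user_name values that occur more than once in user2;
-- an empty result means the left join cannot multiply rows of user1
select user_name, count(*) as cnt
from user2
where user_name is not null
group by user_name
having count(*) > 1;
```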

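3. Using join to optimize an aggregate subquery

Example: for each member of the pilgrimage group, find the date on which they killed the most demons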

+----+-----------+---------------------+-------+
| id | user_name | timestr    | kills |
+----+-----------+---------------------+-------+
| 4 | 孫悟空 | 2013-02-12 00:00:00 | 22 |
| 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 |
| 9 | 沙僧  | 2013-01-22 00:00:00 |  9 |
+----+-----------+---------------------+-------+

The aggregate subquery version:

select * from user1_kills a where a.kills = (select max(b.kills) from user1_kills b where b.user_name = a.user_name);

The join version:

First, look at the result set of the self-join. To save space, only the rows for 豬八戒 are shown:

select a.*,b.* from user1_kills a join user1_kills b on (a.user_name = b.user_name) order by 1;
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| id | user_name | timestr    | kills | id | user_name | timestr    | kills |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| 5 | 豬八戒 | 2013-01-11 00:00:00 | 20 | 5 | 豬八戒 | 2013-01-11 00:00:00 | 20 |
| 5 | 豬八戒 | 2013-01-11 00:00:00 | 20 | 6 | 豬八戒 | 2013-02-07 00:00:00 | 17 |
| 5 | 豬八戒 | 2013-01-11 00:00:00 | 20 | 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 |
| 6 | 豬八戒 | 2013-02-07 00:00:00 | 17 | 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 |
| 6 | 豬八戒 | 2013-02-07 00:00:00 | 17 | 5 | 豬八戒 | 2013-01-11 00:00:00 | 20 |
| 6 | 豬八戒 | 2013-02-07 00:00:00 | 17 | 6 | 豬八戒 | 2013-02-07 00:00:00 | 17 |
| 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 | 5 | 豬八戒 | 2013-01-11 00:00:00 | 20 |
| 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 | 6 | 豬八戒 | 2013-02-07 00:00:00 | 17 |
| 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 | 7 | 豬八戒 | 2013-02-08 00:00:00 | 35 |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+

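As shown, once the table is self-joined on user_name, we only need to group by all columns of table a and take max(kills) from table b. Any row where a.kills = max(b.kills) meets the requirement. The SQL is as follows: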

select a.* from user1_kills a join user1_kills b on (a.user_name = b.user_name) group by a.id,a.user_name,a.timestr,a.kills having a.kills = max(b.kills);
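
Another way to express the same idea, shown here only as a sketch and not taken from the original examples, is to compute each user's maximum once in a derived table and join back to it. The alias m and the column name max_kills are made up for illustration.

```sql
-- assumed alternative: compute each user's maximum once in a derived table,
-- then join back to pick the rows that reach that maximum
select a.*
from user1_kills a
join (select user_name, max(kills) as max_kills
      from user1_kills
      group by user_name) m
  on (a.user_name = m.user_name and a.kills = m.max_kills);
```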

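4. Using join for per-group selection

Example: extending example 3, find the two dates on which each member of the pilgrimage group killed the most demons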

+----+-----------+---------------------+-------+
| id | user_name | timestr       | kills |
+----+-----------+---------------------+-------+
| 3 | 孫悟空  | 2013-02-05 00:00:00 |  12 |
| 4 | 孫悟空  | 2013-02-12 00:00:00 |  22 |
| 5 | 豬八戒  | 2013-01-11 00:00:00 |  20 |
| 7 | 豬八戒  | 2013-02-08 00:00:00 |  35 |
| 9 | 沙僧   | 2013-01-22 00:00:00 |   9 |
| 10 | 沙僧   | 2013-02-11 00:00:00 |   5 |
+----+-----------+---------------------+-------+

In Oracle, this can be done with an analytic function:

select b.* from (select a.*,row_number() over(partition by user_name order by kills desc) cnt from user1_kills a) b where b.cnt <= 2;

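Unfortunately, on MySQL versions before 8.0 the SQL above fails with ERROR 1064 (42000): You have an error in your SQL syntax; because those versions do not support analytic (window) functions. It can, however, be implemented in the following way.

First self-join the table. To save space, only the rows for 孫悟空 are shown: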

select a.*,b.* from user1_kills a join user1_kills b on (a.user_name=b.user_name and a.kills<=b.kills) order by a.user_name,a.kills desc;
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| id | user_name | timestr       | kills | id | user_name | timestr       | kills |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+
| 4 | 孫悟空  | 2013-02-12 00:00:00 |  22 | 4 | 孫悟空  | 2013-02-12 00:00:00 |  22 |
| 3 | 孫悟空  | 2013-02-05 00:00:00 |  12 | 3 | 孫悟空  | 2013-02-05 00:00:00 |  12 |
| 3 | 孫悟空  | 2013-02-05 00:00:00 |  12 | 4 | 孫悟空  | 2013-02-12 00:00:00 |  22 |
| 1 | 孫悟空  | 2013-01-10 00:00:00 |  10 | 1 | 孫悟空  | 2013-01-10 00:00:00 |  10 |
| 1 | 孫悟空  | 2013-01-10 00:00:00 |  10 | 3 | 孫悟空  | 2013-02-05 00:00:00 |  12 |
| 1 | 孫悟空  | 2013-01-10 00:00:00 |  10 | 4 | 孫悟空  | 2013-02-12 00:00:00 |  22 |
| 2 | 孫悟空  | 2013-02-01 00:00:00 |   2 | 1 | 孫悟空  | 2013-01-10 00:00:00 |  10 |
| 2 | 孫悟空  | 2013-02-01 00:00:00 |   2 | 3 | 孫悟空  | 2013-02-05 00:00:00 |  12 |
| 2 | 孫悟空  | 2013-02-01 00:00:00 |   2 | 4 | 孫悟空  | 2013-02-12 00:00:00 |  22 |
| 2 | 孫悟空  | 2013-02-01 00:00:00 |   2 | 2 | 孫悟空  | 2013-02-01 00:00:00 |   2 |
+----+-----------+---------------------+-------+----+-----------+---------------------+-------+

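From the table above we can see that 孫悟空's top two kill counts are 22 and 12. Because the join keeps only the b rows whose kills are greater than or equal to a.kills, the number of b rows matched to each a row is that row's rank within the user. So we only need to group by all columns of table a and count b.id; rows whose count is less than or equal to 2 meet the requirement. The SQL is rewritten as follows:

select a.* from user1_kills a join user1_kills b on (a.user_name=b.user_name and a.kills<=b.kills) group by a.id,a.user_name,a.timestr,a.kills having count(b.id) <= 2;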
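
Note that count(b.id) counts rows, so ties change the outcome: if three rows of one user shared the top kills value, each would get a count of 3 and none of them would be returned. The sample data has no ties within a user, so the issue does not show up here. One possible variant, assuming tied rows should share a rank, counts distinct kills values instead:

```sql
-- assumed variant: rank by distinct kills values so that ties share a rank
select a.*
from user1_kills a
join user1_kills b
  on (a.user_name = b.user_name and a.kills <= b.kills)
group by a.id, a.user_name, a.timestr, a.kills
having count(distinct b.kills) <= 2
order by a.user_name, a.kills desc;
```

With ties, this variant can return more than two rows per user, which may or may not be what is wanted.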

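5. Using a Cartesian product join to turn one column into multiple rows

Example: put each phone number in the pilgrimage group on its own row

Original data: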

+-----------+---------------------------------+
| user_name | mobile             |
+-----------+---------------------------------+
| 唐僧   | 138245623,021-382349      |
| 孫悟空  | 159384292,022-483432,+86-392432 |
| 豬八戒  | 183208243,055-8234234      |
| 沙僧   | 293842295,098-2383429      |
| NULL   | 993267899            |
+-----------+---------------------------------+

Desired data:

+-----------+-------------+
| user_name | mobile   |
+-----------+-------------+
| 唐僧   | 138245623  |
| 唐僧   | 021-382349 |
| 孫悟空  | 159384292  |
| 孫悟空  | 022-483432 |
| 孫悟空  | +86-392432 |
| 豬八戒  | 183208243  |
| 豬八戒  | 055-8234234 |
| 沙僧   | 293842295  |
| 沙僧   | 098-2383429 |
| NULL   | 993267899  |
+-----------+-------------+

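We can see that 唐僧 has two phone numbers, so he needs two rows. We can first compute how many phone numbers each person has (the number of commas plus one), then do a Cartesian product join with a sequence table. To save space, only the rows for 唐僧 are shown:

select a.id,b.* from tb_sequence a cross join (select user_name,mobile,length(mobile)-length(replace(mobile,',',''))+1 size from user1) b order by 2,1;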
+----+-----------+---------------------------------+------+
| id | user_name | mobile             | size |
+----+-----------+---------------------------------+------+
| 1 | 唐僧   | 138245623,021-382349      |  2 |
| 2 | 唐僧   | 138245623,021-382349      |  2 |
| 3 | 唐僧   | 138245623,021-382349      |  2 |
| 4 | 唐僧   | 138245623,021-382349      |  2 |
| 5 | 唐僧   | 138245623,021-382349      |  2 |
| 6 | 唐僧   | 138245623,021-382349      |  2 |
| 7 | 唐僧   | 138245623,021-382349      |  2 |
| 8 | 唐僧   | 138245623,021-382349      |  2 |
| 9 | 唐僧   | 138245623,021-382349      |  2 |
| 10 | 唐僧   | 138245623,021-382349      |  2 |
+----+-----------+---------------------------------+------+
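
The sequence table tb_sequence is used here without its definition being shown in this article. Judging from the output above, it has an integer id column holding the values 1 to 10. A minimal sketch under that assumption (the exact column type and row count are guesses):

```sql
-- hypothetical definition of the sequence table, inferred from the output above
create table tb_sequence (
  id int not null primary key
);

-- ids 1..10; this must cover at least the largest number of values per row
insert into tb_sequence (id) values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
```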

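a.id corresponds to the position of the phone number (first, second, and so on), and size is the total number of phone numbers, so we can add the join condition (a.id <= b.size) and adjust the SQL above further:

select b.user_name,replace(substring(substring_index(b.mobile,',',a.id),char_length(substring_index(mobile,',',a.id-1)) + 1),',','') as mobile from tb_sequence a cross join (select user_name,concat(mobile,',') as mobile,length(mobile)-length(replace(mobile,',',''))+1 size from user1) b on (a.id <= b.size);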
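
The nested string functions are easier to follow when evaluated step by step for a single value. The sketch below walks through the second phone number (a.id = 2) of 唐僧; the user variable @m is introduced only for illustration and holds the mobile value after concat(mobile,',').

```sql
-- the mobile value for 唐僧 with the trailing comma appended
set @m = '138245623,021-382349,';

-- everything before the 2nd comma
select substring_index(@m, ',', 2);                    -- 138245623,021-382349

-- length of everything before the 1st comma
select char_length(substring_index(@m, ',', 1));       -- 9

-- skip those 9 characters, keeping the separator and the 2nd number
select substring(substring_index(@m, ',', 2), 9 + 1);  -- ,021-382349

-- strip the leftover comma
select replace(substring(substring_index(@m, ',', 2), 9 + 1), ',', '');  -- 021-382349
```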

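6. Using a Cartesian product join to turn multiple columns into multiple rows

Example: put each piece of equipment in the pilgrimage group on its own row

Original data: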

+----+-----------+--------------+-----------------+-----------------+
| id | user_name | arms     | clothing    | shoe      |
+----+-----------+--------------+-----------------+-----------------+
| 1 | 唐僧   | 九環錫杖   | 錦斕袈裟    | 僧鞋      |
| 2 | 孫悟空  | 金箍棒    | 梭子黃金甲   | 藕絲步雲履   |
| 3 | 豬八戒  | 九齒釘耙   | 僧衣      | 僧鞋      |
| 4 | 沙僧   | 降妖寶杖   | 僧衣      | 僧鞋      |
+----+-----------+--------------+-----------------+-----------------+

Desired data:

+-----------+-----------+-----------------+
| user_name | equipment | equip_mame   |
+-----------+-----------+-----------------+
| 唐僧   | arms   | 九環錫杖    |
| 唐僧   | clothing | 錦斕袈裟    |
| 唐僧   | shoe   | 僧鞋      |
| 孫悟空  | arms   | 金箍棒     |
| 孫悟空  | clothing | 梭子黃金甲   |
| 孫悟空  | shoe   | 藕絲步雲履   |
| 沙僧   | arms   | 降妖寶杖    |
| 沙僧   | clothing | 僧衣      |
| 沙僧   | shoe   | 僧鞋      |
| 豬八戒  | arms   | 九齒釘耙    |
| 豬八戒  | clothing | 僧衣      |
| 豬八戒  | shoe   | 僧鞋      |
+-----------+-----------+-----------------+

The union version:

select user_name,'arms' as equipment,arms equip_mame from user1_equipment
union all
select user_name,'clothing' as equipment,clothing equip_mame from user1_equipment
union all
select user_name,'shoe' as equipment,shoe equip_mame from user1_equipment
order by 1,2;

The join version:

First look at the Cartesian product, using 唐僧 as the example:

select a.*,b.* from user1_equipment a cross join tb_sequence b where b.id <= 3;
+----+-----------+--------------+-----------------+-----------------+----+
| id | user_name | arms     | clothing    | shoe      | id |
+----+-----------+--------------+-----------------+-----------------+----+
| 1 | 唐僧   | 九環錫杖   | 錦斕袈裟    | 僧鞋      | 1 |
| 1 | 唐僧   | 九環錫杖   | 錦斕袈裟    | 僧鞋      | 2 |
| 1 | 唐僧   | 九環錫杖   | 錦斕袈裟    | 僧鞋      | 3 |
+----+-----------+--------------+-----------------+-----------------+----+

Process the result above with case:

select user_name,case when b.id = 1 then 'arms' 
when b.id = 2 then 'clothing'
when b.id = 3 then 'shoe' end as equipment,case when b.id = 1 then arms end arms,case when b.id = 2 then clothing end clothing,case when b.id = 3 then shoe end shoe
from user1_equipment a cross join tb_sequence b where b.id <=3;
+-----------+-----------+--------------+-----------------+-----------------+
| user_name | equipment | arms     | clothing    | shoe      |
+-----------+-----------+--------------+-----------------+-----------------+
| 唐僧   | arms   | 九環錫杖   | NULL      | NULL      |
| 唐僧   | clothing | NULL     | 錦斕袈裟    | NULL      |
| 唐僧   | shoe   | NULL     | NULL      | 僧鞋      |
+-----------+-----------+--------------+-----------------+-----------------+

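Use the coalesce function to merge the columns into one:

select user_name,case when b.id = 1 then 'arms' when b.id = 2 then 'clothing' when b.id = 3 then 'shoe' end as equipment,coalesce(case when b.id = 1 then arms end,case when b.id = 2 then clothing end,case when b.id = 3 then shoe end) equip_mame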
from user1_equipment a cross join tb_sequence b where b.id <=3 order by 1,2;

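7. Using join to update a table that also appears in the filter condition

Example: for people who exist in both the pilgrimage group and Wukong's circle of friends, update the comment field in the pilgrimage group to "此人在悟空的朋友圈" ("this person is in Wukong's circle of friends")

The natural approach is to first find the people whose user_name exists in both user1 and user2, then update user1. The SQL is as follows: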

update user1 set comment = '此人在悟空的朋友圈' where user_name in (select a.user_name from user1 a join user2 b on (a.user_name = b.user_name));

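Unfortunately, the SQL above fails in MySQL with: ERROR 1093 (HY000): You can't specify target table 'user1' for update in FROM clause. In other words, the table being updated cannot also appear in the from clause of the subquery.

Is there another way? We can convert the in form into a join: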

select c.*,d.* from user1 c join (select a.user_name from user1 a join user2 b on (a.user_name = b.user_name)) d on (c.user_name = d.user_name);
+----+-----------+--------------+---------------------------------+-----------+
| id | user_name | comment | mobile | user_name |
+----+-----------+--------------+---------------------------------+-----------+
| 2 | 孫悟空 | 鬥戰勝佛 | 159384292,022-483432,+86-392432 | 孫悟空 |
+----+-----------+--------------+---------------------------------+-----------+

Then simply update the joined result set:

update user1 c join (select a.user_name from user1 a join user2 b on (a.user_name = b.user_name)) d on (c.user_name = d.user_name) set c.comment = '此人在悟空的朋友圈';
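
Since the derived table d only contains the names that appear in both tables, the same update can probably be expressed by joining user1 directly to user2. This shorter form is a sketch, not part of the original examples:

```sql
-- assumed simplification: join the target table straight to user2
update user1 c
join user2 b on (c.user_name = b.user_name)
set c.comment = '此人在悟空的朋友圈';
```

Rows with a NULL user_name are untouched in both forms, because the equality in the join condition never matches NULL.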

Check user1 again; the update has been applied:

select * from user1;
+----+-----------+-----------------------------+---------------------------------+
| id | user_name | comment           | mobile             |
+----+-----------+-----------------------------+---------------------------------+
| 1 | 唐僧   | 旃檀功德佛         | 138245623,021-382349      |
| 2 | 孫悟空  | 此人在悟空的朋友圈     | 159384292,022-483432,+86-392432 |
| 3 | 豬八戒  | 淨壇使者          | 183208243,055-8234234      |
| 4 | 沙僧   | 金身羅漢          | 293842295,098-2383429      |
| 5 | NULL   | 白龍馬           | 993267899            |
+----+-----------+-----------------------------+---------------------------------+

8. Using join to delete duplicate rows

First insert two rows into user2:

insert into user2(user_name,comment) values ('孫悟空','美猴王');
insert into user2(user_name,comment) values ('牛魔王','牛哥');

Example: delete the duplicate rows in user2, keeping only the row with the larger id

+----+--------------+-----------+
| id | user_name  | comment  |
+----+--------------+-----------+
| 1 | 孫悟空    | 美猴王  |
| 2 | 牛魔王    | 牛哥   |
| 3 | 鐵扇公主   | 牛夫人  |
| 4 | 菩提老祖   | 葡萄   |
| 5 | NULL     | 晶晶   |
| 6 | 孫悟空    | 美猴王  |
| 7 | 牛魔王    | 牛哥   |
+----+--------------+-----------+

First look at the duplicate records:

select a.*,b.* from user2 a join (select user_name,comment,max(id) id from user2 group by user_name,comment having count(*) > 1) b on (a.user_name=b.user_name and a.comment=b.comment) order by 2;
+----+-----------+-----------+-----------+-----------+------+
| id | user_name | comment  | user_name | comment  | id  |
+----+-----------+-----------+-----------+-----------+------+
| 1 | 孫悟空  | 美猴王  | 孫悟空  | 美猴王  |  6 |
| 6 | 孫悟空  | 美猴王  | 孫悟空  | 美猴王  |  6 |
| 2 | 牛魔王  | 牛哥   | 牛魔王  | 牛哥   |  7 |
| 7 | 牛魔王  | 牛哥   | 牛魔王  | 牛哥   |  7 |
+----+-----------+-----------+-----------+-----------+------+

Then simply delete the rows where a.id < b.id:

delete a from user2 a join (select user_name,comment,max(id) id from user2 group by user_name,comment having count(*) > 1) b on (a.user_name=b.user_name and a.comment=b.comment) where a.id < b.id;
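
The same cleanup can also be sketched as a plain self-join without the grouped derived table: a row is removed whenever another row with the same user_name and comment has a larger id. This alternative is not from the original examples, and running the select first is a cautious way to preview what the delete would remove.

```sql
-- preview the rows that would be removed
select a.*
from user2 a
join user2 b
  on (a.user_name = b.user_name and a.comment = b.comment and a.id < b.id);

-- assumed alternative: delete every row that has a duplicate with a larger id
delete a
from user2 a
join user2 b
  on (a.user_name = b.user_name and a.comment = b.comment and a.id < b.id);
```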

Check user2; the duplicate rows have been deleted:

select * from user2;
+----+--------------+-----------+
| id | user_name  | comment  |
+----+--------------+-----------+
| 3 | 鐵扇公主   | 牛夫人  |
| 4 | 菩提老祖   | 葡萄   |
| 5 | NULL     | 晶晶   |
| 6 | 孫悟空    | 美猴王  |
| 7 | 牛魔王    | 牛哥   |
+----+--------------+-----------+

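Summary:

That is all for this introduction. If you are interested, you can generate more data and compare the execution times of the different SQL forms. The examples in this article are taken from 《sql開發技巧》 ("SQL Development Techniques"), found online.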
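
Besides timing the queries, the execution plans can be compared with explain. The sketch below uses the two forms from example 1; the plans you see will depend on the MySQL version, the available indexes and the data volume, so no particular output is claimed here.

```sql
-- execution plan of the not in form from example 1
explain select * from user1 a where a.user_name not in (select user_name from user2 where user_name is not null);

-- execution plan of the left join form from example 1
explain select a.* from user1 a left join user2 b on (a.user_name = b.user_name) where b.user_name is null and a.user_name is not null;
```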
