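MySQL Crash Course #06# Chapter 13, 14: GROUP BY and Subqueries
Index
- Understanding GROUP BY
- Filtering rows vs. filtering groups
- The unwritten rule about GROUP BY and ORDER BY
- Subqueries vs. joins
- Correlated and uncorrelated subqueries; building complex queries incrementally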
- Always More Than One Solution: As explained earlier in this chapter, although the sample code shown here works, it is often not the most efficient way to perform this type of data retrieval. You will revisit this example in a later chapter.
Understanding Data Grouping
mysql> SELECT COUNT(*) AS num_prods
    -> FROM products
    -> WHERE vend_id=1003;
+-----------+
| num_prods |
+-----------+
|         7 |
+-----------+
1 row in set (0.00 sec)
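We could get each vendor's product count by changing the value that vend_id is compared with in the WHERE clause (1003, 1004, 1005, ...), but that cannot list all of them in a single result. GROUP BY solves exactly this problem: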
mysql> SELECT vend_id, COUNT(*) AS num_prods
    -> FROM products
    -> GROUP BY vend_id;
+---------+-----------+
| vend_id | num_prods |
+---------+-----------+
|    1001 |         3 |
|    1002 |         2 |
|    1003 |         7 |
|    1005 |         2 |
+---------+-----------+
4 rows in set (0.00 sec)
Grouping divides the data into logical sets so that an aggregate calculation can be performed on each group.
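To make the contrast concrete, here is a minimal sketch that runs without a MySQL server. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the rows are placeholders invented for illustration: only the per-vendor counts (3, 2, 7, 2) mirror the output above. It compares one WHERE query per vendor against a single GROUP BY query.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the MySQL `products` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (prod_id TEXT PRIMARY KEY, vend_id INTEGER)")

# Placeholder rows: only the per-vendor counts mirror the output above.
vendor_counts = {1001: 3, 1002: 2, 1003: 7, 1005: 2}
rows = [(f"{vend}-{i}", vend) for vend, n in vendor_counts.items() for i in range(n)]
conn.executemany("INSERT INTO products VALUES (?, ?)", rows)

# WHERE approach: a separate statement is needed for every vendor.
per_vendor = {
    vend: conn.execute(
        "SELECT COUNT(*) FROM products WHERE vend_id = ?", (vend,)
    ).fetchone()[0]
    for vend in vendor_counts
}

# GROUP BY approach: one statement, one aggregate per group.
grouped = conn.execute(
    "SELECT vend_id, COUNT(*) AS num_prods "
    "FROM products GROUP BY vend_id ORDER BY vend_id"
).fetchall()

print(per_vendor)
print(grouped)
```

Both approaches give the same counts; the GROUP BY version simply returns them in one result set.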
mysql> SELECT vend_id, COUNT(*) AS num_prods
    -> FROM products
    -> GROUP BY vend_id WITH ROLLUP;
+---------+-----------+
| vend_id | num_prods |
+---------+-----------+
|    1001 |         3 |
|    1002 |         2 |
|    1003 |         7 |
|    1005 |         2 |
|    NULL |        14 |
+---------+-----------+
5 rows in set (0.00 sec)
↑ With the WITH ROLLUP keyword you also get the grand total in the same result (the row whose vend_id is NULL).
The GROUP BY clause must come after any WHERE clause and before any ORDER BY clause.
Filtering Groups
mysql> SELECT cust_id, COUNT(*) AS orders
    -> FROM orders
    -> GROUP BY cust_id
    -> HAVING COUNT(*) >= 2;
+---------+--------+
| cust_id | orders |
+---------+--------+
|   10001 |      2 |
+---------+--------+
1 row in set (0.00 sec)
If there is a WHERE clause, it must come before GROUP BY.
WHERE filters before data is grouped, and HAVING filters after data is grouped.
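The difference between the two filters can be seen in a small sketch. It assumes Python's built-in sqlite3 as a stand-in for MySQL, and the rows are illustrative: customer 10001 has two orders, every other customer has one. The column alias num_orders is my own choice, not the book's.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the MySQL `orders` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER)")
# Illustrative rows: customer 10001 has two orders, the others have one each.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(20005, 10001), (20006, 10003), (20007, 10004), (20008, 10005), (20009, 10001)],
)

# HAVING only: groups are built from all rows, then filtered by their count.
having_only = conn.execute(
    "SELECT cust_id, COUNT(*) AS num_orders FROM orders "
    "GROUP BY cust_id HAVING COUNT(*) >= 2"
).fetchall()

# WHERE + HAVING: order 20005 is removed before grouping, so customer 10001
# is left with a single row and no group reaches the HAVING threshold.
where_then_having = conn.execute(
    "SELECT cust_id, COUNT(*) AS num_orders FROM orders "
    "WHERE order_num > 20005 "
    "GROUP BY cust_id HAVING COUNT(*) >= 2"
).fetchall()

print(having_only)
print(where_then_having)
```

Rows excluded by WHERE never reach a group, which is why the second query comes back empty.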
Grouping and Sorting
mysql> SELECT order_num, SUM(quantity*item_price) AS ordertotal
    -> FROM orderitems
    -> GROUP BY order_num
    -> HAVING SUM(quantity*item_price) >= 50
    -> ORDER BY ordertotal; # Finally, the output is sorted using the ORDER BY clause.
+-----------+------------+
| order_num | ordertotal |
+-----------+------------+
|     20006 |      55.00 |
|     20008 |     125.00 |
|     20005 |     149.87 |
|     20007 |    1000.00 |
+-----------+------------+
4 rows in set (0.00 sec)
Don't Forget ORDER BY: As a rule, anytime you use a GROUP BY clause, you should also specify an ORDER BY clause. That is the only way to ensure that data is sorted properly. Never rely on GROUP BY to sort your data.
In short, whenever you use GROUP BY it is best to add an ORDER BY as well, unless you do not care about the order at all.
Understanding Subqueries
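Database design follows normal forms, and the basic means of normalizing is to split tables, so having data spread across several tables is unavoidable. In many cases a subquery makes things simpler. Here is a simple example:
The tables used below ↓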
mysql> SELECT *
    -> FROM orders
    -> LIMIT 3;
+-----------+---------------------+---------+
| order_num | order_date          | cust_id |
+-----------+---------------------+---------+
|     20005 | 2005-09-01 00:00:00 |   10001 |
|     20006 | 2005-09-12 00:00:00 |   10003 |
|     20007 | 2005-09-30 00:00:00 |   10004 |
+-----------+---------------------+---------+
3 rows in set (0.00 sec)

mysql> SELECT *
    -> FROM orderitems
    -> LIMIT 3;
+-----------+------------+---------+----------+------------+
| order_num | order_item | prod_id | quantity | item_price |
+-----------+------------+---------+----------+------------+
|     20005 |          1 | ANV01   |       10 |       5.99 |
|     20005 |          2 | ANV02   |        3 |       9.99 |
|     20005 |          3 | TNT2    |        5 |      10.00 |
+-----------+------------+---------+----------+------------+
3 rows in set (0.00 sec)

mysql> SELECT cust_id, cust_name
    -> FROM customers
    -> LIMIT 3;
+---------+-------------+
| cust_id | cust_name   |
+---------+-------------+
|   10001 | Coyote Inc. |
|   10002 | Mouse House |
|   10003 | Wascals     |
+---------+-------------+
3 rows in set (0.00 sec)
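Suppose you want a list of the customers who bought TNT2. The task can be broken into the following queries:
First find every order number that involves 'TNT2', then use those order numbers to find the matching customer IDs, and finally use the customer IDs to look up the customer information: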
mysql> SELECT order_num
    -> FROM orderitems
    -> WHERE prod_id = 'TNT2';
+-----------+
| order_num |
+-----------+
|     20005 |
|     20007 |
+-----------+
2 rows in set (0.00 sec)
mysql> SELECT cust_id
    -> FROM orders
    -> WHERE order_num IN (20005,20007);
+---------+
| cust_id |
+---------+
|   10001 |
|   10004 |
+---------+
2 rows in set (0.00 sec)
... / These separate queries can be written as a single statement:
mysql> SELECT cust_name, cust_contact
    -> FROM customers
    -> WHERE cust_id IN (SELECT cust_id
    ->                   FROM orders
    ->                   WHERE order_num IN (SELECT order_num
    ->                                       FROM orderitems
    ->                                       WHERE prod_id = 'TNT2'));
+----------------+--------------+
| cust_name      | cust_contact |
+----------------+--------------+
| Coyote Inc.    | Y Lee        |
| Yosemite Place | Y Sam        |
+----------------+--------------+
2 rows in set (0.00 sec)
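Reading the statement above purely in terms of fetching data: (x) the goal is cust_name and cust_contact, so start with SELECT cust_name, cust_contact. Fetched from where? FROM customers. Under what constraint? cust_id must be in some set. To build that set, go back to (x) and repeat, writing the query one level at a time ...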
On efficiency. Subqueries and Performance: The code shown here works, and it achieves the desired result. However, using subqueries is not always the most efficient way to perform this type of data retrieval, although it might be. More on this is in Chapter 15, "Joining Tables," where you will revisit this same example.
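As a rough illustration of the "subqueries vs. joins" item in the index, the sketch below runs the nested subquery next to one possible join formulation and checks that they return the same customers. It assumes Python's built-in sqlite3 as a stand-in for MySQL. The join is my own sketch, not the book's Chapter 15 code, and the rows are a reduced illustrative data set: contacts not shown in the output above are left as NULL, and the item in order 20006 is a placeholder.

```python
import sqlite3

# Assumption: in-memory SQLite tables stand in for the MySQL sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT, cust_contact TEXT);
    CREATE TABLE orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
    CREATE TABLE orderitems (order_num INTEGER, order_item INTEGER, prod_id TEXT);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (10001, "Coyote Inc.", "Y Lee"),
    (10002, "Mouse House", None),   # contact not shown in the source
    (10003, "Wascals", None),       # contact not shown in the source
    (10004, "Yosemite Place", "Y Sam"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(20005, 10001), (20006, 10003), (20007, 10004)])
conn.executemany("INSERT INTO orderitems VALUES (?, ?, ?)", [
    (20005, 1, "ANV01"), (20005, 2, "ANV02"), (20005, 3, "TNT2"),
    (20006, 1, "PLACEHOLDER"),      # hypothetical item, not from the source
    (20007, 1, "TNT2"),
])

# Nested subqueries, as in the statement above (ORDER BY added for a stable result).
nested = conn.execute("""
    SELECT cust_name, cust_contact
    FROM customers
    WHERE cust_id IN (SELECT cust_id
                      FROM orders
                      WHERE order_num IN (SELECT order_num
                                          FROM orderitems
                                          WHERE prod_id = 'TNT2'))
    ORDER BY cust_name
""").fetchall()

# One possible join formulation; DISTINCT guards against a customer
# appearing once per matching order line.
joined = conn.execute("""
    SELECT DISTINCT c.cust_name, c.cust_contact
    FROM customers AS c
    JOIN orders AS o ON o.cust_id = c.cust_id
    JOIN orderitems AS oi ON oi.order_num = o.order_num
    WHERE oi.prod_id = 'TNT2'
    ORDER BY c.cust_name
""").fetchall()

print(nested)
print(joined)
```

Which form is faster depends on the engine, the indexes and the data, so this only shows that the two are equivalent in result, not which one to prefer.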
Using Subqueries As Calculated Fields
A correlated subquery is like a nested for loop ...
SELECT cust_name,
       cust_state,
       (SELECT COUNT(*)
        FROM orders
        WHERE orders.cust_id = customers.cust_id) AS orders
FROM customers
ORDER BY cust_name;
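Fully qualified column names are required here. Otherwise MySQL reads cust_id = cust_id as the inner table's column compared with itself, so every customer would be shown the total number of orders.
Conceptually, the subquery is executed once for each row the outer query retrieves, much like an expression such as SELECT x - 2017, which is evaluated once for every row found.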
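The nested-loop analogy can be written out directly. The sketch below assumes Python's built-in sqlite3 as a stand-in for MySQL, uses illustrative rows, and leaves out cust_state because its values are not shown in these notes. It runs the correlated subquery, the equivalent explicit loop, and the unqualified variant, which resolves cust_id the same way in SQLite and therefore counts every order for every customer.

```python
import sqlite3

# Assumption: in-memory SQLite tables stand in for the MySQL sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [
    (10001, "Coyote Inc."), (10002, "Mouse House"),
    (10003, "Wascals"), (10004, "Yosemite Place"),
])
# Illustrative orders; customer 10005 has no row in this reduced customers table.
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    (20005, 10001), (20006, 10003), (20007, 10004), (20008, 10005), (20009, 10001),
])

# Correlated subquery: the inner query refers to the current outer row.
correlated = conn.execute("""
    SELECT cust_name,
           (SELECT COUNT(*)
            FROM orders
            WHERE orders.cust_id = customers.cust_id) AS num_orders
    FROM customers
    ORDER BY cust_name
""").fetchall()

# The same logic as an explicit nested loop: one inner query per outer row.
looped = []
outer_rows = conn.execute(
    "SELECT cust_id, cust_name FROM customers ORDER BY cust_name"
).fetchall()
for cust_id, cust_name in outer_rows:
    count = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE cust_id = ?", (cust_id,)
    ).fetchone()[0]
    looped.append((cust_name, count))

# Unqualified names: both sides resolve to orders.cust_id, so the condition
# holds for every order and each customer gets the total order count.
unqualified = conn.execute("""
    SELECT cust_name,
           (SELECT COUNT(*) FROM orders WHERE cust_id = cust_id) AS num_orders
    FROM customers
    ORDER BY cust_name
""").fetchall()

print(correlated)
print(looped)
print(unqualified)
```

The loop describes the logical meaning only; a real optimizer is free to evaluate the correlated subquery some other way.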
Build Queries with Subqueries Incrementally: Testing and debugging queries with subqueries can be tricky, particularly as these statements grow in complexity. The safest way to build (and test) queries with subqueries is to do so incrementally, in much the same way as MySQL processes them. Build and test the innermost query first. Then build and test the outer query with hard-coded data, and only after you have verified that it is working embed the subquery. Then test it again. And keep repeating these steps for each additional query. This will take just a little longer to construct your queries, but doing so saves you lots of time later (when you try to figure out why queries are not working) and significantly increases the likelihood of them working the first time.