
Five Tips to Optimize Your SQL Queries

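Translated from: http://www.vertabelo.com/blog/technical-articles/5-tips-to-optimize-your-sql-queries. This is only for my own study and reference; please forgive any shortcomings in the translation.

The SQL language seems easy to learn: its commands follow a simple syntax and do not describe the specific algorithm used to retrieve the data. That simplicity can be deceptive, however. Not all database functions operate with the same efficiency, and two very similar queries can differ greatly in execution time. This article presents some best practices that can greatly improve your SQL queries.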

1. Learn How to Create Indexes Properly

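Learning how to index properly is the best thing you can do to improve the performance of your SQL queries. In typical situations, indexes allow quicker access to the data. Database novices often find indexes mysterious or difficult: they either index nothing or try to index everything. Neither approach is right. With no indexes at all, your queries are likely to be slow. If you index everything, your UPDATE and INSERT operations become inefficient, because every index has to be maintained on each write.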
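As a rough illustration of the effect of an index, here is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory database. The `users` table, its `city` column, and the index name `idx_users_city` are hypothetical and chosen only for this example; the plan text is SQLite's and other databases report plans differently.

```python
import sqlite3

# Hypothetical table for illustration only; not taken from the original article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, nickname TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (nickname, city) VALUES (?, ?)",
    [(f"user{i}", f"city{i % 100}") for i in range(1000)],
)


def query_plan(sql):
    """Return the plan steps SQLite reports for a statement (4th column of each row)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT nickname FROM users WHERE city = 'city7'"

# Without an index on city, SQLite has to scan the whole table.
plan_without_index = query_plan(query)

# After indexing the filtered column, the same query can seek through the index.
conn.execute("CREATE INDEX idx_users_city ON users (city)")
plan_with_index = query_plan(query)

print(plan_without_index)
print(plan_with_index)
```

The first plan should report a scan of `users`, and the second a search that names `idx_users_city`. The trade-off described above still applies: each extra index adds work to every write.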

2. Only Retrieve the Data You Really Need

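A common way of retrieving the desired columns is to use the * symbol, even though not all of the columns are really needed. If the table is small, retrieving the additional columns won't make much of a difference. For larger datasets, however, naming the columns explicitly may save a lot of query time. Keep in mind, though, that many popular ORMs do not make it easy to write a query that selects only a subset of a table's columns.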
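The difference between the two styles can be sketched as follows, again with Python's sqlite3 module. The `products` table and its columns are assumptions made for this example, with a deliberately large `description` column standing in for data the caller does not need.

```python
import sqlite3

# Hypothetical table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL, description TEXT)"
)
conn.execute(
    "INSERT INTO products (name, price, description) VALUES (?, ?, ?)",
    ("widget", 9.99, "x" * 10000),  # large column the caller never uses
)

# SELECT * drags every column along, including the 10,000-character description.
all_columns = conn.execute("SELECT * FROM products").fetchone()

# Naming the columns transfers only what is actually needed.
needed_columns = conn.execute("SELECT name, price FROM products").fetchone()

print(len(all_columns), len(needed_columns))
```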

Similarly, if you only need a limited number of rows, you should use the LIMIT clause (or your database's equivalent).

For instance, if you only want to display the first 10 records out of 50,000 on your website, it is advisable to tell the database so. This way, the database can stop the search after finding 10 rows rather than scanning the whole table.

The LIMIT clause is available in MySQL and PostgreSQL, but other databases have ways to achieve a similar effect.
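A minimal sketch of the 10-out-of-50,000 case, using SQLite through Python's sqlite3 module (SQLite also supports LIMIT). The `products` table and the generated names are hypothetical.

```python
import sqlite3

# Hypothetical table with 50,000 rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [(f"product{i}",) for i in range(50000)],
)

# Ask for only the first 10 rows; ordering by the primary key lets the
# database read rows in key order and stop early.
first_page = conn.execute(
    "SELECT name FROM products ORDER BY id LIMIT 10"
).fetchall()

print(len(first_page))
```

Ordering by an indexed column matters here: if the sort column has no index, the database may still have to read and sort every row before it can apply the limit.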

The examples above illustrate the general idea: you should always consider whether you need all the rows returned by an SQL statement. If you don't, there is room for improvement.

3. Avoid Functions on the Left-Hand Side of the Operator

Functions are a handy way to perform complex tasks, and they can be used both in the SELECT clause and in the WHERE clause. Nevertheless, their application in WHERE clauses may result in major performance issues. Consider a query on the table users whose WHERE clause applies the DATEDIFF function to the appointment_date column.

Even if there is an index on the appointment_date column in the table users, the query will still need to perform a full table scan. This is because we use the DATEDIFF function on the column appointment_date. The output of the function is evaluated at run time, so the server has to visit all the rows in the table to retrieve the necessary data. To enhance performance, the condition can be rewritten so that the bare column is compared directly against a date value.

With that rewrite, no function is applied to the column in the WHERE clause, so the system can use an index to seek the data more efficiently.
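The following sketch reproduces the idea in SQLite through Python's sqlite3 module. SQLite has no DATEDIFF, so `julianday()` is swapped in to play the same role; the cutoff date, the sample rows, and the index name `idx_users_appointment` are assumptions made for this example. Dates are stored as ISO-8601 text, so plain string comparison matches chronological order.

```python
import sqlite3
from datetime import date, timedelta

# Hypothetical data: 60 users with appointments on consecutive days from 2015-04-01.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, nickname TEXT, appointment_date TEXT)"
)
start = date(2015, 4, 1)
conn.executemany(
    "INSERT INTO users (nickname, appointment_date) VALUES (?, ?)",
    [(f"user{i}", (start + timedelta(days=i)).isoformat()) for i in range(60)],
)
conn.execute("CREATE INDEX idx_users_appointment ON users (appointment_date)")

# Function wrapped around the indexed column (julianday stands in for DATEDIFF).
with_function = (
    "SELECT nickname FROM users "
    "WHERE julianday(appointment_date) - julianday('2015-04-30') > 0"
)
# Equivalent condition with the bare column on the left-hand side.
without_function = "SELECT nickname FROM users WHERE appointment_date > '2015-04-30'"


def query_plan(sql):
    """Return the plan steps SQLite reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_with_function = query_plan(with_function)
plan_without_function = query_plan(without_function)

# Both forms select the same rows; only the access path differs.
rows_with_function = sorted(conn.execute(with_function).fetchall())
rows_without_function = sorted(conn.execute(without_function).fetchall())

print(plan_with_function)
print(plan_without_function)
```

Some databases offer expression or function-based indexes as another way around this problem, but rewriting the condition is usually the simpler option when it is possible.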

4. Consider Getting Rid of Correlated Subqueries

A correlated subquery is a subquery which depends on the outer query. It uses the data obtained from the outer query in its WHERE clause. Suppose you want to list all users who have made a donation. You could retrieve the data with a correlated subquery that checks, for each user, whether a matching donation exists.

In that case, the subquery is evaluated once for each row of the main query, which can be inefficient. Instead, we can join the users to their donations.

If there are millions of users in the database, the statement with the correlated subquery will most likely be less efficient than the INNER JOIN because it needs to run millions of times. But if you were to look for donations made by a single user, the correlated subquery might not be a bad idea. As a rule of thumb, if you look for many or most of the rows, try to avoid using correlated subqueries. Keep in mind, however, that using correlated subqueries might be inevitable in some cases.
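Both formulations can be sketched side by side with Python's sqlite3 module. The `users` and `donations` tables, their columns, and the sample rows are hypothetical. Note the DISTINCT in the join version: without it, a user with several donations would appear once per donation.

```python
import sqlite3

# Hypothetical schema and data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE donations (
        donation_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        amount REAL
    );
    INSERT INTO users VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO donations (user_id, amount) VALUES (1, 10.0), (1, 25.0), (3, 5.0);
""")

# Correlated subquery: the inner query refers to u.user_id from the outer query.
correlated = conn.execute("""
    SELECT u.user_id, u.last_name
    FROM users AS u
    WHERE EXISTS (SELECT 1 FROM donations AS d WHERE d.user_id = u.user_id)
    ORDER BY u.user_id
""").fetchall()

# Join: match users to donations in one pass, then remove duplicate users.
joined = conn.execute("""
    SELECT DISTINCT u.user_id, u.last_name
    FROM users AS u
    INNER JOIN donations AS d ON d.user_id = u.user_id
    ORDER BY u.user_id
""").fetchall()

print(correlated)
print(joined)
```

Some optimizers may execute both forms in a similar way, so it is worth comparing the execution plans on your own database before rewriting.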

5. Avoid Wildcard Characters at the Beginning of a LIKE Pattern

Whenever possible, avoid LIKE patterns that begin with a wildcard character.

The use of the % wildcard at the beginning of the LIKE pattern will prevent the database from using a suitable index if one exists. Since the system doesn't know how the matching values in the name column begin, it cannot seek into the index and has to perform a full table scan anyway. In many cases, this may slow the query execution.

If the query can be rewritten so that the pattern starts with a fixed prefix and the wildcard appears only at the end, performance may improve. You should always consider whether a wildcard character at the beginning is really essential.
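Here is a minimal sketch of the two pattern shapes in SQLite via Python's sqlite3 module. The `users` table, the pattern text `bar`, and the index name are hypothetical. One SQLite-specific assumption is labeled in the code: SQLite's LIKE is case-insensitive by default, so the prefix optimization only applies when the indexed column uses the NOCASE collation. Other databases have their own conditions.

```python
import sqlite3

# Hypothetical table for illustration only.
# SQLite-specific assumption: the name column is declared COLLATE NOCASE so that
# the default case-insensitive LIKE can use the index for prefix patterns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT COLLATE NOCASE, email TEXT)"
)
conn.execute("CREATE INDEX idx_users_name ON users (name)")


def query_plan(sql):
    """Return the plan steps SQLite reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Leading wildcard: the start of the value is unknown, so the index cannot be used to seek.
plan_leading_wildcard = query_plan("SELECT email FROM users WHERE name LIKE '%bar%'")

# Fixed prefix: the database can seek to the range of names that start with 'bar'.
plan_fixed_prefix = query_plan("SELECT email FROM users WHERE name LIKE 'bar%'")

print(plan_leading_wildcard)
print(plan_fixed_prefix)
```

If a leading wildcard is truly required, a full-text search feature of the database may be a better fit than LIKE.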

Tip: Read the Execution Plan

The performance of your SQL queries depends on multiple factors, including your database model, the indexes available and the kind of information you wish to retrieve. The best way to keep track of what's happening with your queries is to analyse the execution plan produced by the optimizer. You can use it to experiment and find the best solution for your statements.
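One way to experiment is to collect the plans of several candidate statements and compare them. The sketch below does this for SQLite with its `EXPLAIN QUERY PLAN` statement; the helper `compare_plans`, the `users` table, and the two candidate queries are hypothetical. MySQL and PostgreSQL expose plans through their own `EXPLAIN` statement, with different output formats.

```python
import sqlite3

# Hypothetical table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, nickname TEXT)")


def compare_plans(connection, candidates):
    """Map each label to the list of plan steps SQLite reports for its statement."""
    return {
        label: [row[3] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]
        for label, sql in candidates.items()
    }


plans = compare_plans(
    conn,
    {
        # Lookup by primary key: SQLite can seek directly to the row.
        "by_id": "SELECT nickname FROM users WHERE id = 42",
        # Lookup by an unindexed column: SQLite has to scan the table.
        "by_nickname": "SELECT nickname FROM users WHERE nickname = 'user42'",
    },
)

for label, steps in plans.items():
    print(label, "->", steps)
```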