in,exists和not exists ,not in與null的一些關係記載
In和exist,not in和not exists,其中可能還有null,一直是個很糾結的問題,直到現在自己也不能完全說出兩種語句的優劣,使用情況等。
先從查詢結果上來講,上個星期五pub的一篇帖子說到了not in和not exists查詢結果不同,記得曾經碰到過這種問題,null的影響,因為not in和not exists兩種從根本上”演算法”是不同的。
下面自己做個了測試的例子可以很清晰的看到結果不同
SQL> select * from jj_one;
ID NAME
------ ----------------------------------------
1 as
2 ad
kj
1 as
SQL> select * from jj_two;
COL1 COL2
---------- ----------
1 as
3 df
SQL> select * from jj_one where id not in (select col1 from jj_two);
ID NAME
------ ----------------------------------------
2 ad
SQL> select * from jj_one a where not exists (select 1 from jj_two b where a.id=b.col1);
ID NAME
------ ----------------------------------------
kj
2 ad
Null的運算只能是無法用於等值判斷的,null和任何值比較包括null都是false,一般null的判斷是is null或is not null。需要明確下in於null值判斷是返回false的。
in與not in也就是跟一系列範圍的值做等值判斷,所以會略去相應的null值的一行,而not exists是一種邏輯判斷是否存在,其實這個not exists可以理解為:JJ_one和JJ_two等值連線然後由於存在null不符合要求不會返回結果集,而外部存在一個
還有一個測試例子:
SQL> select * from jj_one;
ID NAME
------ ----------------------------------------
1 as
2 ad
kj
1 as
SQL> select * from jj_two;
COL1 COL2
---------- ----------
1 as
3 df
ok
SQL> select * from jj_one where id in (select col1 from jj_two);
ID NAME
------ ----------------------------------------
1 as
1 as
SQL> select * from jj_one a where exists (select 1 from jj_two b where a.id=b.col1);
ID NAME
------ ----------------------------------------
1 as
1 as
SQL> select * from jj_one a where not exists (select 1 from jj_two b where a.id=b.col1);
ID NAME
------ ----------------------------------------
kj
2 ad
In和exists效能上也有一定區別,但是並不能簡單的認為exists一定優於exists,還是要具體情況具體分析。下面有個not in和not exists的執行計劃。
SQL> select id from jj_one where not exists(select 1 from jj_two where id=col1);
執行計劃
----------------------------------------------------------
Plan hash value: 2332573458
-----------------------------------------------------------------------------
| Id| Operation| Name | Rows| Bytes | Cost (%CPU)| Time|
-----------------------------------------------------------------------------
|0 | SELECT STATEMENT| |3 |48 |6(17)| 00:00:01 |
|*1 |HASH JOIN ANTI ||3 |48 |6(17)| 00:00:01 |
|2 |TABLE ACCESS FULL| JJ_ONE |4 |12 |3(0)| 00:00:01 |
|3 |TABLE ACCESS FULL| JJ_TWO |1 |13 |2(0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ID"="COL1")
統計資訊
----------------------------------------------------------
1recursive calls
0db block gets
14consistent gets
0physical reads
0redo size
439bytes sent via SQL*Net to client
385bytes received via SQL*Net from client
2SQL*Net roundtrips to/from client
0sorts (memory)
0sorts (disk)
2rows processed
SQL> select id from jj_one where id not in (select col1 from jj_two);
未選定行
執行計劃
----------------------------------------------------------
Plan hash value: 2272190419
-----------------------------------------------------------------------------
| Id| Operation| Name | Rows| Bytes | Cost (%CPU)| Time|
-----------------------------------------------------------------------------
|0 | SELECT STATEMENT| |2 |6 |7(0)| 00:00:01 |
|*1 |FILTER||||||
|2 |TABLE ACCESS FULL| JJ_ONE |4 |12 |3(0)| 00:00:01 |
|*3 |TABLE ACCESS FULL| JJ_TWO |1 |13 |2(0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "JJ_TWO" "JJ_TWO" WHERE
LNNVL("COL1"<>:B1)))
3 - filter(LNNVL("COL1"<>:B1))
統計資訊
----------------------------------------------------------
1recursive calls
0db block gets
25consistent gets
0physical reads
0redo size
268bytes sent via SQL*Net to client
374bytes received via SQL*Net from client
1SQL*Net roundtrips to/from client
0sorts (memory)
0sorts (disk)
0rows processed
可以看出來無論是邏輯讀還是cost至少在not exists上是優於not in的,cbo模式下not in 採用的filter對外層表做全表掃描再去filter內層查詢結果集,not exists採用記憶體查詢結果集作hash連線,而hash連線顯而易見會取得更好的結果!
這裡自己也想起來以前的工作的一個拉資料的例子,兩張表inf_apply的表大概有20w以上資料主鍵no,inf_apply_test大概有3w資料,想實現把inf_apply表中更新no不在inf_apply_test表中no記錄的inf_apply表中的相應資料,剛開始同事一個not in十幾分鍾還沒有反應。最好改成not exists只要一分鐘左右的樣子就可以了!