Oracle Implicit Data Type Conversion

 

Introduction to Implicit Type Conversion

 

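Oracle Database supports two kinds of data type conversion: explicit conversion (Explicit Datatype Conversion) and implicit conversion (Implicit Datatype Conversion). When two values that are being compared or combined in an operation have different data types (the source type differs from the target type) and no conversion function has been specified, Oracle must convert one of the values so that the operation can proceed. This is implicit type conversion. It happens automatically, but only when the conversion makes sense.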

 

Data Conversion

Generally an expression cannot contain values of different datatypes. For example, an expression cannot multiply 5 by 10 and then add 'JAMES'. However, Oracle supports both implicit and explicit conversion of values from one datatype to another.

 

 

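For details on implicit type conversion, see the Data Type Comparison Rules chapter of the official documentation. The implicit type conversion matrix in that chapter shows at a glance which data types can be converted to which.

[Figure: implicit type conversion matrix from the Oracle documentation]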

 

 

 

 

Rules for implicit conversion:

 

 

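Implicit type conversion actually happens in many places; we often just do not notice it. Rather than giving an example for every case, here are some of the common rules, excerpted from the official documentation: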

 

The following rules govern implicit data type conversions:

  • During INSERT and UPDATE operations, Oracle converts the value to the data type of the affected column.
  • During SELECT FROM operations, Oracle converts the data from the column to the type of the target variable.
  • When manipulating numeric values, Oracle usually adjusts precision and scale to allow for maximum capacity. In such cases, the numeric data type resulting from such operations can differ from the numeric data type found in the underlying tables.
  • When comparing a character value with a numeric value, Oracle converts the character data to a numeric value.
  • Conversions between character values or NUMBER values and floating-point number values can be inexact, because the character types and NUMBER use decimal precision to represent the numeric value, and the floating-point numbers use binary precision.
  • When converting a CLOB value into a character data type such as VARCHAR2, or converting BLOB to RAW data, if the data to be converted is larger than the target data type, then the database returns an error.
  • During conversion from a timestamp value to a DATE value, the fractional seconds portion of the timestamp value is truncated. This behavior differs from earlier releases of Oracle Database, when the fractional seconds portion of the timestamp value was rounded.
  • Conversions from BINARY_FLOAT to BINARY_DOUBLE are exact.
  • Conversions from BINARY_DOUBLE to BINARY_FLOAT are inexact if the BINARY_DOUBLE value uses more bits of precision than supported by the BINARY_FLOAT.
  • When comparing a character value with a DATE value, Oracle converts the character data to DATE.
  • When you use a SQL function or operator with an argument of a data type other than the one it accepts, Oracle converts the argument to the accepted data type.
  • When making assignments, Oracle converts the value on the right side of the equal sign (=) to the data type of the target of the assignment on the left side.
  • During concatenation operations, Oracle converts from noncharacter data types to CHAR or NCHAR.
  • During arithmetic operations on and comparisons between character and noncharacter data types, Oracle converts from any character data type to a numeric, date, or rowid, as appropriate. In arithmetic operations between CHAR/VARCHAR2 and NCHAR/NVARCHAR2, Oracle converts to a NUMBER.
  • Most SQL character functions are enabled to accept CLOBs as parameters, and Oracle performs implicit conversions between CLOB and character types. Therefore, functions that are not yet enabled for CLOBs can accept CLOBs through implicit conversion. In such cases, Oracle converts the CLOBs to CHAR or VARCHAR2 before the function is invoked. If the CLOB is larger than 4000 bytes, then Oracle converts only the first 4000 bytes to CHAR.
  • When converting RAW or LONG RAW data to or from character data, the binary data is represented in hexadecimal form, with one hexadecimal character representing every four bits of RAW data. Refer to "RAW and LONG RAW Data Types" for more information.
  • Comparisons between CHAR and VARCHAR2 and between NCHAR and NVARCHAR2 types may entail different character sets. The default direction of conversion in such cases is from the database character set to the national character set. Table 2-9 shows the direction of implicit conversions between different character types.

 

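The rules above can be paraphrased as follows (corrections are welcome if anything is wrong or imprecise):

1.  During INSERT and UPDATE operations, Oracle implicitly converts the inserted or updated value to the data type of the target column.

2.  During SELECT operations, Oracle implicitly converts the column data to the data type of the target variable.

3.  When manipulating numeric values, Oracle usually adjusts precision and scale to allow for maximum capacity. The numeric data type that results can therefore differ from the numeric data type in the underlying table.

4.  When a character value is compared with a numeric value, Oracle implicitly converts the character value to a numeric value.

5.  Conversions between character or NUMBER values and floating-point values can be inexact, because the character types and NUMBER represent numbers with decimal precision, whereas floating-point numbers use binary precision.

6.  When a CLOB value is converted to a character data type such as VARCHAR2, or a BLOB is converted to RAW, the database returns an error if the data is larger than the target data type.

7.  When a TIMESTAMP value is converted to DATE, the fractional seconds are truncated. Earlier releases rounded them instead. (In practice this conversion arises mainly in operations such as an INSERT into a DATE column.)

8.  Conversions from BINARY_FLOAT to BINARY_DOUBLE are exact.

9.  Conversions from BINARY_DOUBLE to BINARY_FLOAT are inexact if the BINARY_DOUBLE value uses more bits of precision than BINARY_FLOAT supports.

10. When a character value is compared with a DATE value, Oracle converts the character value to DATE.

11. When a SQL function or operator is called with an argument whose data type is not the one it accepts, Oracle converts the argument to the accepted data type.

12. In an assignment, the value on the right side of the equal sign is converted to the data type of the target on the left side.

13. During concatenation (usually the || operator), Oracle implicitly converts noncharacter values to a character type.

14. During arithmetic operations on, and comparisons between, character and noncharacter data (such as number, date, or rowid), Oracle converts the character data to a numeric, date, or rowid value, as appropriate.

    In arithmetic operations between CHAR/VARCHAR2 and NCHAR/NVARCHAR2, Oracle converts both operands to NUMBER.

15. When CHAR/VARCHAR2 is compared with NCHAR/NVARCHAR2 and the character sets differ, the default direction of conversion is from the database character set to the national character set.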
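A few of the numbered rules above (4, 13, and 14) can be observed directly with queries against DUAL. This is a minimal sketch for illustration only; the literal values and column aliases are assumptions chosen for the example, and the last query shows the explicit equivalents.

```sql
-- Rule 13: the number 42 is implicitly converted to a character value
-- before it is concatenated.
SELECT 'ID-' || 42 AS concatenated FROM dual;

-- Rule 14: the character value '10' is implicitly converted to a number
-- before the addition.
SELECT '10' + 5 AS summed FROM dual;

-- Rule 4: the character value '20' is implicitly converted to a number
-- before the comparison, so the predicate is true and one row is returned.
SELECT 'equal' AS result FROM dual WHERE '20' = 20;

-- The same operations with the conversions written out explicitly.
SELECT 'ID-' || TO_CHAR(42)  AS concatenated,
       TO_NUMBER('10') + 5   AS summed
FROM   dual;
```

The first two queries return ID-42 and 15 respectively, and the explicit versions return the same values while making the conversion visible to the reader.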

 

 

The following two examples show situations in which implicit conversion occurs.

Example:

 

SQL> create table test(object_id varchar2(12), object_name varchar2(64));
 
Table created.
 
SQL> insert into test
  2  select object_id, object_name from dba_objects;
 
63426 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> create index ix_test_n1 on test(object_id);
 
Index created.
 
SQL> select count(*) from test where object_id=20;
 
  COUNT(*)
----------
         1
 
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
SQL_ID  4bh7yzj5ma0ks, child number 0
-------------------------------------
select count(*) from test where object_id=20
 
Plan hash value: 1950795681
 
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    45 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
|*  2 |   TABLE ACCESS FULL| TEST |     3 |    24 |    45  (20)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(TO_NUMBER("OBJECT_ID")=20)
 
Note
-----
   - dynamic sampling used for this statement
 
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
 
23 rows selected.

 

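As shown above, implicit conversion happens here because of this rule: when a character value is compared with a numeric value, Oracle converts the character value to a numeric value. (The SELECT rule about converting column data to the type of the target variable concerns fetched results rather than predicates, so it is not the one at work here.) Because the conversion is applied to the OBJECT_ID column itself (TO_NUMBER("OBJECT_ID")), the index on that column cannot be used and the execution plan falls back to a full table scan. If we change the SQL slightly, the execution plan uses an INDEX RANGE SCAN instead, as shown below: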

 

SQL>  select count(*) from test where object_id='20';
 
  COUNT(*)
----------
         1
 
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  7800f6da7c909, child number 0
-------------------------------------
 select count(*) from test where object_id='20'
 
Plan hash value: 4037411162
 
--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |            |     1 |     6 |            |          |
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|*  2 |   INDEX RANGE SCAN| IX_TEST_N1 |     1 |     6 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"='20')
 
 
19 rows selected.

 

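The next case illustrates the rule that when a character value is compared with a DATE value, Oracle converts the character value to DATE. This conversion works as expected most of the time, but it can be a hidden logic bomb: if the NLS_DATE_FORMAT setting changes, the statement may raise an error or silently return wrong results.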

 

SQL> SELECT *
  2  FROM scott.emp
  3  WHERE hiredate between '01-JAN-1981' and '01-APR-1981';
 
     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30
 
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------
SQL_ID  czyc76busj56d, child number 0
-------------------------------------
SELECT * FROM scott.emp WHERE hiredate between '01-JAN-1981' and
'01-APR-1981'
 
Plan hash value: 3956160932
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
|*  1 |  TABLE ACCESS FULL| EMP  |     2 |    74 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("HIREDATE"<=TO_DATE(' 1981-04-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "HIREDATE">=TO_DATE(' 1981-01-01 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
 
 
21 rows selected.
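The dependence on NLS_DATE_FORMAT can be removed by making the conversion explicit. The following is a minimal sketch of the same query, assuming the scott.emp table used above; the format mask 'DD-MON-YYYY' and the NLS_DATE_LANGUAGE argument are choices made for this illustration, not part of the original query.

```sql
-- Option 1: explicit TO_DATE with a format mask, so the session's
-- NLS_DATE_FORMAT no longer decides how the strings are parsed.
-- The third argument pins the language used for month names such as 'JAN'.
SELECT *
FROM   scott.emp
WHERE  hiredate BETWEEN TO_DATE('01-JAN-1981', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE = AMERICAN')
                    AND TO_DATE('01-APR-1981', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE = AMERICAN');

-- Option 2: ANSI date literals, which always use the YYYY-MM-DD form.
SELECT *
FROM   scott.emp
WHERE  hiredate BETWEEN DATE '1981-01-01' AND DATE '1981-04-01';
```

Either form should select the same rows regardless of the session's NLS_DATE_FORMAT, whereas the original query can fail or behave differently after, for example, ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'.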

 

 

 

 

Problems with Implicit Type Conversion

 

 

Implicit and Explicit Data Conversion

 

Oracle recommends that you specify explicit conversions, rather than rely on implicit or automatic conversions, for these reasons:

 

  • SQL statements are easier to understand when you use explicit datatype conversion functions.
  • Implicit datatype conversion can have a negative impact on performance, especially if the datatype of a column value is converted to that of a constant rather than the other way around.
  • Implicit conversion depends on the context in which it occurs and may not work the same way in every case. For example, implicit conversion from a datetime value to a VARCHAR2 value may return an unexpected year depending on the value of the NLS_DATE_FORMAT parameter.
  • Algorithms for implicit conversion are subject to change across software releases and among Oracle products. Behavior of explicit conversions is more predictable.

 

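In other words, although implicit conversion happens automatically in many places, relying on it is not recommended. Explicit conversion functions make SQL easier to read. Implicit conversion can hurt performance, especially when a column value is converted to the type of a constant rather than the reverse. Its result depends on context: for example, converting a datetime value to VARCHAR2 can yield an unexpected year depending on NLS_DATE_FORMAT. Its algorithms can also change between releases and between Oracle products, which can leave a hidden trap for later.

In addition, if implicit conversion occurs in an index expression, Oracle Database might not use the index, because the index is defined for the pre-conversion data type. This can have a negative impact on performance.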

 

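Tom Kyte's article on implicit conversions (listed in the references) also summarizes some of the problems that implicit data type conversion can cause: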

 

 

  •  The resulting code typically has logic bombs in it. The code appears to work in certain circumstances but will not work in others.
  •  The resulting code relies on default settings. If someone changes the default settings, the way the code works will be subject to change. (A DBA changing a setting can make your code work entirely differently from the way it does today.)
  •  The resulting code can very well be subject to SQL injection bugs.
  •  The resulting code may end up performing numerous unnecessary repeated conversions (negatively affecting performance and consuming many more resources than necessary).
  •  The implicit conversion may be precluding certain access paths from being available to the optimizer, resulting in suboptimal query plans. (In fact, this is exactly what is happening to you!)
  •  Implicit conversions may prevent partition elimination.

 

 

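Examples of this have already been shown above. The next example shows that implicit type conversion does not necessarily stop the execution plan from using an index. A serious performance problem arises only when the implicit conversion function is applied to the indexed column in the predicate, that is, when the column on the left-hand side is converted to the type of the value on the right-hand side.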

 

SQL> drop table test;
 
Table dropped.
 
SQL> create table test
  2  as
  3  select * from dba_objects;
 
Table created.
 
SQL> create index ix_test_n1 on test(object_id);
 
Index created.
 
SQL> select count(*) from test where object_id='20';
 
  COUNT(*)
----------
         1
 
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  29jmhh43kkrv4, child number 0
-------------------------------------
select count(*) from test where object_id='20'
 
Plan hash value: 4037411162
 
--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |            |     1 |    13 |            |          |
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|*  2 |   INDEX RANGE SCAN| IX_TEST_N1 |    10 |   130 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID"=20)
 
Note
-----
   - dynamic sampling used for this statement
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
 
23 rows selected.
 
SQL> 

 


 

 

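The SQL statement did undergo an implicit conversion, but it was applied to the string '20', which was converted to the number 20, and not to the OBJECT_ID column. Because the conversion did not touch the column on the left-hand side, the access path through IX_TEST_N1 was not affected.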

 

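So when you run into questions such as "Does implicit conversion always bypass the index?" or "Does implicit type conversion always make an index unusable?", analyze the specific case rather than generalizing.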

 

 

 

The following example shows implicit type conversion triggered by a bind variable:

 

 

 

 

 

SQL> create table test
  2  as
  3  select * from dba_objects;             
 
Table created.
 
SQL> commit;
 
Commit complete.
 
SQL> create index ix_test_object_name on test(object_name);
 
Index created.
 
SQL> variable v_object_name nvarchar2(30);
SQL> exec :v_object_name :='I_OBJ1';
 
PL/SQL procedure successfully completed.
 
SQL> select count(*) from test where object_name=:v_object_name;
 
  COUNT(*)
----------
         1
 
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  ft05prnxtpk9u, child number 0
-------------------------------------
select count(*) from test where object_name=:v_object_name
 
Plan hash value: 1950795681
 
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   113 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    66 |            |          |
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|*  2 |   TABLE ACCESS FULL| TEST |    10 |   660 |   113  (11)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SYS_OP_C2C("OBJECT_NAME")=:V_OBJECT_NAME)
 
Note
-----
   - dynamic sampling used for this statement
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
 
23 rows selected.

 


 

 

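Implicit conversion happens here because of this rule: when CHAR/VARCHAR2 is compared with NCHAR/NVARCHAR2 and the character sets differ, the default direction of conversion is from the database character set to the national character set. In this case the conversion is performed by the internal function SYS_OP_C2C. Because the function is applied to the OBJECT_NAME column, the index IX_TEST_OBJECT_NAME is not used and the plan shows a full table scan.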

 

 

SYS_OP_C2C is an internal function which does an implicit conversion of varchar2 to national character set using TO_NCHAR function. Thus, the filter completely changes as compared to the filter using normal comparison.
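One way to avoid the SYS_OP_C2C conversion is to make the bind variable's type match the column. The following is a minimal SQL*Plus sketch, assuming the test table and the ix_test_object_name index created above; v_nname is a hypothetical variable name introduced for the illustration. Whether the optimizer then chooses the index should be confirmed with DBMS_XPLAN.DISPLAY_CURSOR on your own system.

```sql
-- Declare the bind variable as VARCHAR2, matching the OBJECT_NAME column,
-- instead of NVARCHAR2 as in the example above.
variable v_object_name varchar2(30)
exec :v_object_name := 'I_OBJ1';

select count(*) from test where object_name = :v_object_name;

-- If the bind type cannot be changed (for example, it is fixed by the client
-- driver), convert the bind value explicitly so that the column is left
-- untouched. Assumption: the value can be represented in the database
-- character set.
variable v_nname nvarchar2(30)
exec :v_nname := 'I_OBJ1';

select count(*) from test where object_name = to_char(:v_nname);
```

In both variants the conversion, if any, is applied to the bind value rather than to the indexed column, which is the same principle as in the '20' versus 20 example earlier.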

 

 

How to Find SQL with Implicit Conversions

 

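Some companies audit every SQL statement before release and can stop most statements with implicit type conversion at the source, but most companies lack the capacity or resources to do so. For them, what matters most is finding the SQL statements in the database that involve implicit conversion. The following two queries are commonly used for this: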

 

 

SELECT SQL_ID,
       PLAN_HASH_VALUE
FROM   V$SQL_PLAN X
WHERE  X.FILTER_PREDICATES LIKE '%INTERNAL_FUNCTION%'
GROUP  BY SQL_ID,
          PLAN_HASH_VALUE;

SELECT SQL_ID,
       PLAN_HASH_VALUE
FROM   V$SQL_PLAN X
WHERE  X.FILTER_PREDICATES LIKE '%SYS_OP_C2C%'
GROUP  BY SQL_ID,
          PLAN_HASH_VALUE;

 

 

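Note, however, that the presence of INTERNAL_FUNCTION in an execution plan does not necessarily mean that the SQL statement involves an implicit data type conversion (I discuss this in a separate blog post). The statements found this way must therefore still be examined carefully, one by one.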
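To make that review easier, it can help to see the SQL text and the predicates side by side. The following is a sketch only: it joins V$SQL_PLAN to V$SQL and also looks at ACCESS_PREDICATES. The list of patterns is an assumption for illustration and is not exhaustive, and a pattern such as TO_NUMBER( also matches conversions that the developer wrote explicitly.

```sql
SELECT p.sql_id,
       p.child_number,
       p.plan_hash_value,
       p.id                AS plan_line_id,
       p.object_name,
       p.access_predicates,
       p.filter_predicates,
       s.sql_text
FROM   v$sql_plan p
       JOIN v$sql s
         ON  s.sql_id       = p.sql_id
         AND s.child_number = p.child_number
WHERE  p.filter_predicates LIKE '%SYS_OP_C2C%'
   OR  p.filter_predicates LIKE '%INTERNAL_FUNCTION%'
   OR  p.filter_predicates LIKE '%TO_NUMBER(%'
   OR  p.access_predicates LIKE '%SYS_OP_C2C%'
   OR  p.access_predicates LIKE '%INTERNAL_FUNCTION%'
ORDER  BY p.sql_id, p.child_number, p.id;
```

Because these views only cover cursors that are currently in the shared pool, the result is a sample of recent activity rather than a complete inventory.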

 

 

Another blog post on the topic is also worth reading for anyone who is not yet familiar with implicit type conversion.

 

 

 

 

 

How to Avoid Implicit Type Conversion

 

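1: During database design and when writing SQL, follow consistent conventions and avoid unnecessary data type conversions.

   When modeling, keep column data types uniform. In particular, columns that are used to join with other tables must have the same data type, which avoids unnecessary implicit conversions.

   In queries, keep the type of each condition value consistent with the column type, and make sure that the data type of each bind variable matches the data type of the corresponding column.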

 

 

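2: Use conversion functions to perform explicit type conversion.

Common conversion functions include:

  • TO_CHAR: converts a DATE or NUMBER to a character string.
  • TO_DATE: converts a NUMBER, CHAR, or VARCHAR2 to a DATE. For timestamps, use TO_TIMESTAMP or TO_TIMESTAMP_TZ.
  • TO_NUMBER: converts a CHAR or VARCHAR2 to a NUMBER.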
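A short sketch of these functions in use, with explicit format masks so that the result does not depend on session NLS settings. The literal values, format masks, and column aliases are assumptions chosen for the illustration.

```sql
SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD HH24:MI:SS')       AS date_as_text,
       TO_CHAR(1234.5, 'FM9999.00')                    AS number_as_text,
       TO_DATE('1981-02-20', 'YYYY-MM-DD')             AS text_as_date,
       TO_TIMESTAMP('1981-02-20 08:30:00.123',
                    'YYYY-MM-DD HH24:MI:SS.FF3')       AS text_as_timestamp,
       TO_NUMBER('20')                                 AS text_as_number
FROM   dual;
```

In a predicate, the conversion is best applied to the literal or bind value rather than to the column, so that an index on the column remains usable.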

 

 

3: Create a function-based index that includes SYS_OP_C2C.

This approach is rarely used, but it is a valid optimization in special situations.
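As a rough sketch of the idea, applied to the bind variable example above: the index is built on the same expression that appears in the plan's filter predicate. SYS_OP_C2C is an undocumented internal function, so this assumes that your release accepts it in an index expression, and the index name is hypothetical. Verify the resulting plan before relying on it.

```sql
-- Function-based index on the converted column expression, intended to match
-- the predicate SYS_OP_C2C("OBJECT_NAME") = :V_OBJECT_NAME seen in the plan.
-- ix_test_object_name_c2c is a hypothetical index name.
create index ix_test_object_name_c2c on test (sys_op_c2c(object_name));
```

Where the application can be changed, aligning the bind variable's type with the column is usually the simpler fix; an index like this is mainly of interest when the SQL cannot be modified.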

 

 

 

 

References:

 

https://blogs.oracle.com/oraclemagazine/on-implicit-conversions-and-more

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Data-Type-Comparison-Rules.html#GUID-98BE3A78-6E33-4181-B5CB-D96FD9DC1694