Oracle Row-to-Column Conversion Functions: Pivot and Unpivot

https://www.oracle.com/cn/database/articles/technology/pivot-and-unpivot.html

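Pivot and Unpivot
Present information from any relational table as a spreadsheet-style crosstab report using simple SQL, and store the data from a crosstab back into a relational table.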

Pivot
As you know, relational tables are tabular, that is, they are presented as column-value pairs. Consider a table named CUSTOMERS.

SQL> desc customers
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 CUST_ID                                            NUMBER(10)
 CUST_NAME                                          VARCHAR2(20)
 STATE_CODE                                         VARCHAR2(2)
 TIMES_PURCHASED                                    NUMBER(3)

When this table is selected:

select cust_id, state_code, times_purchased
from customers
order by cust_id;

the output is:

CUST_ID STATE_CODE TIMES_PURCHASED
------- ---------- ---------------
      1 CT                       1
      2 NY                      10
      3 NJ                       2
      4 NY                       4
... and so on ...
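Notice how the data is presented as rows of values: for each customer, the record shows the customer's home state and how many times the customer has purchased something from the store. As the customer buys more items from the store, the column times_purchased is updated.

Now suppose you want a report of purchase frequency by state, that is, how many customers in each state bought something only once, twice, three times, and so on. In regular SQL, you can issue the following statement: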

select state_code, times_purchased, count(1) cnt
from customers
group by state_code, times_purchased;

Here is the output:

ST TIMES_PURCHASED        CNT
-- --------------- ----------
CT               0         90
CT               1        165
CT               2        179
CT               3        173
CT               4        173
CT               5        152
... and so on ...
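This is the information you want, but it is a little hard to read. A crosstab report may present the same data better: the purchase counts run vertically and the states run horizontally, just like a spreadsheet: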

Times_purchased
                  CT      NY      NJ   ... and so on ...
              1    0       1       0   ...
              2   23     119      37   ...
              3   17      45       1   ...
... and so on ...
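Prior to Oracle Database 11g, you would have done this with a decode function for each value, writing each distinct value as a separate column. That technique is far from intuitive, however.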
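
For comparison, the older decode-based approach might look roughly like the sketch below. It assumes the CUSTOMERS table described above and the same five state codes used in the pivot examples; the column aliases (ny, ct, and so on) are illustrative names, not part of the original article.

```sql
-- Pre-11g crosstab sketch: one hand-written DECODE expression per state code.
-- DECODE returns NULL when state_code does not match (no default is given),
-- and COUNT ignores NULLs, so each column counts only that state's customers.
select times_purchased,
       count(decode(state_code, 'NY', 1)) as ny,
       count(decode(state_code, 'CT', 1)) as ct,
       count(decode(state_code, 'NJ', 1)) as nj,
       count(decode(state_code, 'FL', 1)) as fl,
       count(decode(state_code, 'MO', 1)) as mo
from customers
group by times_purchased
order by times_purchased;
```

Every additional state means another hand-written expression in the select list, which is why this form becomes hard to read and maintain.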

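Fortunately, you now have a great new feature called PIVOT for presenting any query in crosstab format, using a new operator appropriately named pivot. Here is how you write the query: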

select * from (
   select times_purchased, state_code
   from customers t
)
pivot
(
   count(state_code)
   for state_code in ('NY','CT','NJ','FL','MO')
)
order by times_purchased
/
Here is the output:

TIMES_PURCHASED       'NY'       'CT'       'NJ'       'FL'       'MO'
--------------- ---------- ---------- ---------- ---------- ----------
              0      16601         90          0          0          0
              1      33048        165          0          0          0
              2      33151        179          0          0          0
              3      32978        173          0          0          0
              4      33109        173          0          1          0
... and so on ...
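This shows the power of the pivot operator. The state codes are presented on the header row instead of in a column. Here is a pictorial representation of the traditional tabular format:

Figure 1 Traditional tabular representation

In a crosstab report, you want to transpose the Times Purchased column to the header row, as shown in Figure 2. The column becomes the row, as if the column were rotated 90 degrees counterclockwise to become the header row. This figurative rotation needs a pivot point, and in this case the pivot point is the count(state_code) expression.

Figure 2 Pivoted representation

This expression has to be written in the following query syntax: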

...
pivot
(
   count(state_code)
   for state_code in ('NY','CT','NJ','FL','MO')
)
...
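The second line, "for state_code ...", limits the query to only those values. This line is mandatory, so unfortunately you have to know the possible values beforehand. This restriction is relaxed in the XML format of the query, described later in this article.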
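
Because the IN list has to be written out by hand, it can help to check which values actually occur in the data first. A minimal sketch, assuming the same CUSTOMERS table (the alias customers_in_state is illustrative):

```sql
-- List every state code present in CUSTOMERS with its row count,
-- so the values for the PIVOT ... IN (...) list can be chosen deliberately.
select state_code, count(*) as customers_in_state
from customers
group by state_code
order by state_code;
```

Any state code that appears here but is left out of the IN list simply does not show up as a column in the pivoted result.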

Notice the header row in the output:

TIMES_PURCHASED       'NY'       'CT'       'NJ'       'FL'       'MO'
--------------- ---------- ---------- ---------- ---------- ----------

The column headings are data from the table itself: the state codes. The abbreviations may be self-explanatory, but suppose you want to display the state names instead of the abbreviations ("Connecticut" instead of "CT")? In that case, you make a small adjustment in the FOR clause of the query, as shown below:

select * from (
   select times_purchased as "Puchase Frequency", state_code
   from customers t
)
pivot
(
   count(state_code)
   for state_code in ('NY' as "New York",'CT' "Connecticut",
                      'NJ' "New Jersey",'FL' "Florida",'MO' as "Missouri")
)
order by 1
/

Puchase Frequency   New York Connecticut New Jersey    Florida   Missouri
----------------- ---------- ----------- ---------- ---------- ----------
                0      16601          90          0          0          0
                1      33048         165          0          0          0
                2      33151         179          0          0          0
                3      32978         173          0          0          0
                4      33109         173          0          1          0
... and so on ...
The FOR clause can supply aliases for the values listed in it; these aliases become the column headings.
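
As a small sketch of why aliases are useful beyond display, unquoted aliases give the pivoted columns ordinary identifiers that the outer select list can reference. The aliases ny and ct and the derived column ny_and_ct are illustrative names, and the query assumes the same CUSTOMERS table:

```sql
-- Unquoted aliases in the IN list become ordinary column names (NY, CT),
-- so they can be selected individually or combined in expressions.
select times_purchased, ny, ct, ny + ct as ny_and_ct
from (
   select times_purchased, state_code
   from customers t
)
pivot
(
   count(state_code)
   for state_code in ('NY' as ny, 'CT' as ct)
)
order by times_purchased;
```

Quoted aliases such as "New York" work the same way but have to be written with the double quotes, in the same case, wherever they are referenced.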

Unpivot
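Just as there is antimatter for matter, there should be an "unpivot" for pivot, right?

Humor aside, there is a genuine need for the reverse of the pivot operation. Suppose you have a spreadsheet that shows a crosstab report like this: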

Purchase Frequency   New York   Connecticut   New Jersey   Florida   Missouri
                 0         12            11            1         0          0
                 1        900            14           22        98         78
                 2        866            78           13         3          9
               ...

Now you want to load this data into a relational table called CUSTOMERS:

SQL> desc customers
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 CUST_ID                                            NUMBER(10)
 CUST_NAME                                          VARCHAR2(20)
 STATE_CODE                                         VARCHAR2(2)
 TIMES_PURCHASED                                    NUMBER(3)

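The spreadsheet data must be restructured into relational (row-oriented) format before it can be stored. Of course, you could write a complex SQL*Loader or SQL script using DECODE to load the data into the CUSTOMERS table. Alternatively, you can use UNPIVOT, the reverse operation of pivot, to break the columns up into rows, which is possible in Oracle Database 11g.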
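
To give a sense of the hand-written alternative, one common way to turn columns into rows without UNPIVOT is to stack one query per column with UNION ALL. The sketch below is an assumption for illustration, not the article's method; it presumes the crosstab already sits in a table shaped like the cust_matrix table built in the example that follows.

```sql
-- Manual unpivot sketch: one SELECT per state column, stacked with UNION ALL.
-- The column aliases of the first SELECT name the columns of the result.
select "Puchase Frequency", 'New York' as state_code, "New York" as state_counts
from cust_matrix
union all
select "Puchase Frequency", 'Conn', "Conn" from cust_matrix
union all
select "Puchase Frequency", 'New Jersey', "New Jersey" from cust_matrix
union all
select "Puchase Frequency", 'Florida', "Florida" from cust_matrix
union all
select "Puchase Frequency", 'Missouri', "Missouri" from cust_matrix
order by 1, 2;
```

Each extra column needs another branch, and the table is read once per branch, which is the kind of repetition the UNPIVOT operator avoids.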

This may be easier to demonstrate with an example. Let's first create a crosstab table using the pivot operation:

create table cust_matrix
as
select * from (
   select times_purchased as "Puchase Frequency", state_code
   from customers t
)
pivot
(
   count(state_code)
   for state_code in ('NY' as "New York",'CT' "Conn",
                      'NJ' "New Jersey",'FL' "Florida",
                      'MO' as "Missouri")
)
order by 1
/
You can check how the data is stored in the table:

SQL> select * from cust_matrix
  2  /

Puchase Frequency   New York       Conn New Jersey    Florida   Missouri
----------------- ---------- ---------- ---------- ---------- ----------
                1      33048        165          0          0          0
                2      33151        179          0          0          0
                3      32978        173          0          0          0
                4      33109        173          0          1          0
... and so on ...

This is how the data is stored in a spreadsheet: each state is a column in the table ("New York", "Conn", and so on).

SQL> desc cust_matrix
 Name                                      Null?    Type
 ----------------------------------------- -------- ---------------
 Puchase Frequency                                  NUMBER(3)
 New York                                           NUMBER
 Conn                                               NUMBER
 New Jersey                                         NUMBER
 Florida                                            NUMBER
 Missouri                                           NUMBER

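You need to break this table up so that each row shows only the state code and the number of purchasers for that state. The unpivot operation does this, as shown below: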

select *
from cust_matrix
unpivot
(
   state_counts
   for state_code in ("New York","Conn","New Jersey","Florida","Missouri")
)
order by "Puchase Frequency", state_code
/
Here is the output:

Puchase Frequency STATE_CODE STATE_COUNTS
----------------- ---------- ------------
                1 Conn                165
                1 Florida               0
                1 Missouri              0
                1 New Jersey            0
                1 New York          33048
                2 Conn                179
                2 Florida               0
                2 Missouri              0
... and so on ...

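Notice how each column name has become a value in the STATE_CODE column. How did Oracle know that state_code is a column name? It knew that from the following clause in the query: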

for state_code in ("New York","Conn","New Jersey","Florida","Missouri")

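Here you specified that the values "New York", "Conn", and so on are to become the values of a new column, state_code, on which the data is unpivoted. Let's look at part of the original data: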