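Concatenating values per group with sys_connect_by_path and GROUP BY
Recently I ran into a requirement: for each approval_code value, take the multiple FIRST_NAME values that belong to it, sort them by line_no ascending, and merge them into a single, longest possible field. The source data is as follows:
The SQL statement that produces that data is: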
SELECT DISTINCT t1.FIRST_NAME, t2.approval_code, t2.line_no
FROM K2_ACCESS_USER@k2global t1
INNER JOIN k2_approval_path t2 ON t1.DOMAIN_NAME = t2.USER_ID
RIGHT JOIN k2_credit_limit_hist t4
  ON t2.approval_code = t4.approval_code
  AND t4.expired_date >= to_date('2012-01-01','yyyy-mm-dd')
ORDER BY t2.APPROVAL_CODE, t2.line_no
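At first I planned to get the merged FIRST_NAME value for an approval_code like this (at the time I did not yet know that GROUP BY could be used directly to get the longest merged FIRST_NAME value per group):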
-- start to combine the approver's name
SELECT max(substr(sys_connect_by_path(FIRST_NAME,','),2)) FIRST_NAME
FROM (
  SELECT ltrim(APPROVAL_CODE,'RQP') + row_number() over(ORDER BY APPROVAL_CODE) ROW_NUM,
         FIRST_NAME,
         APPROVAL_CODE
  FROM (
    SELECT DISTINCT FIRST_NAME, approval_code
    FROM (
      SELECT DISTINCT t1.FIRST_NAME, t2.approval_code, t2.line_no
      FROM K2_ACCESS_USER@k2global t1
      INNER JOIN k2_approval_path t2 ON t1.DOMAIN_NAME = t2.USER_ID
      RIGHT JOIN k2_credit_limit_hist t4
        ON t2.approval_code = t4.approval_code
        AND t4.expired_date >= to_date('2012-01-01','yyyy-mm-dd')
      ORDER BY t2.APPROVAL_CODE, t2.line_no
    )
  )
) t3
START WITH t3.APPROVAL_CODE = 'RQP0001105' -- RQP0001105 is used as a test value
CONNECT BY t3.ROW_NUM - 1 = prior t3.ROW_NUM
-- end to combine the approver's name
But I quickly found that the result was not what I wanted:
I removed the max() around the sys_connect_by_path call and added ROW_NUM to the select list.
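The result was as follows:
You can see that the sys_connect_by_path call did in fact produce the value we want, 'Kenneth,Lawrence', but for some reason the final value ended up as Lawrence instead.
I looked at the condition part at the end of the earlier sys_connect_by_path query: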
START WITH
t3.APPROVAL_CODE='RQP0001105'
CONNECT BY t3.ROW_NUM -1 = prior t3.ROW_NUM
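My START WITH condition was t3.APPROVAL_CODE='RQP0001105' ('RQP0001105' being the test value I substituted in), but the table actually contains two records with APPROVAL_CODE 'RQP0001105', with ROW_NUM 1106 and 1107 respectively. So the hierarchical query effectively runs in two passes: it starts once from the record with ROW_NUM=1106 and once from the record with ROW_NUM=1107. In other words, it behaves like this: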
start with t3.APPROVAL_CODE='RQP0001105' and t3.ROW_NUM='1106'
CONNECT BY t3.ROW_NUM -1 = prior t3.ROW_NUM
Result:
and
start with t3.APPROVAL_CODE='RQP0001105' and t3.ROW_NUM='1107'
CONNECT BY t3.ROW_NUM -1 = prior t3.ROW_NUM
Result:
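From this we can tell that the extra pass starting at t3.APPROVAL_CODE='RQP0001105' and t3.ROW_NUM='1107' is what goes wrong: it adds a path containing only 'Lawrence', and max(), which compares strings alphabetically rather than by length, returns 'Lawrence' instead of the 'Kenneth,Lawrence' produced by the first pass.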
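To see the two passes side by side, here is a minimal, self-contained sketch for Oracle. The two rows are inlined from dual as a stand-in for the real t3 subquery; that inlining and the column aliases lvl, root_row_num and name_path are my assumptions for illustration. The ROW_NUM values and names are the ones from the test case above.

```sql
-- Hypothetical stand-in for the t3 subquery: two rows for the same approval code.
WITH t3 AS (
  SELECT 1106 AS row_num, 'RQP0001105' AS approval_code, 'Kenneth' AS first_name FROM dual
  UNION ALL
  SELECT 1107, 'RQP0001105', 'Lawrence' FROM dual
)
SELECT LEVEL                                           AS lvl,
       CONNECT_BY_ROOT row_num                         AS root_row_num, -- which start row this path came from
       row_num,
       SUBSTR(SYS_CONNECT_BY_PATH(first_name, ','), 2) AS name_path
FROM t3
-- Both rows satisfy START WITH, so each one becomes the root of its own hierarchy.
START WITH approval_code = 'RQP0001105'
CONNECT BY row_num - 1 = PRIOR row_num;
```

With this data the query should list three paths: 'Kenneth' and 'Kenneth,Lawrence' from the hierarchy rooted at 1106, and 'Lawrence' from the one rooted at 1107. Wrapping name_path in max() then yields 'Lawrence', because 'L' sorts after 'K'.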
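So the problem was clear. The fix is to add one more condition to START WITH so that the query starts only from the first record of the group. My approach was to add a rank column whose value equals ROW_NUM only on the first of the multiple records: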
ltrim(APPROVAL_CODE,'RQP')+row_number() over(ORDER BY APPROVAL_CODE ) ROW_NUM,
ltrim(APPROVAL_CODE,'RQP')+RANK() over(ORDER BY APPROVAL_CODE ) RANK_NUM,
and, at the same time, to change the condition below to:
START WITH t3.APPROVAL_CODE=hist.approval_code and t3.ROW_NUM=t3.RANK_NUM
CONNECT BY t3.ROW_NUM -1 = prior t3.ROW_NUM
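The effect of the two expressions can be checked on a small inline data set. This is only a sketch: the names CTE, the third row ('RQP0001106', 'Alice') and the first_name tiebreaker in the ROW_NUMBER ordering are assumptions added for illustration and are not part of the original query.

```sql
-- Hypothetical sample data: two approvers for one code, one approver for the next.
WITH names AS (
  SELECT 'RQP0001105' AS approval_code, 'Kenneth' AS first_name FROM dual
  UNION ALL
  SELECT 'RQP0001105', 'Lawrence' FROM dual
  UNION ALL
  SELECT 'RQP0001106', 'Alice' FROM dual
)
SELECT approval_code,
       first_name,
       -- numeric part of the code + global row number:
       -- consecutive inside a code, with a gap of at least 2 between codes
       LTRIM(approval_code, 'RQP')
         + ROW_NUMBER() OVER (ORDER BY approval_code, first_name) AS row_num,
       -- RANK() repeats the position of the first row of each code,
       -- so rank_num equals row_num only on that first row
       LTRIM(approval_code, 'RQP')
         + RANK() OVER (ORDER BY approval_code)                   AS rank_num
FROM names
ORDER BY row_num;
-- Kenneth  : row_num 1106, rank_num 1106  (first row of its code)
-- Lawrence : row_num 1107, rank_num 1106
-- Alice    : row_num 1109, rank_num 1109  (first row of its code)
```

Because row_num jumps by at least 2 whenever the approval code changes, the CONNECT BY condition row_num - 1 = PRIOR row_num never links a row of one code to a row of the next.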
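My team leader suggested a different approach: add a rank_num column to the original query, computed by numbering the rows within each APPROVAL_CODE partition, as follows: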
SELECT ltrim(APPROVAL_CODE,'RQP') + row_number() over(ORDER BY APPROVAL_CODE) ROW_NUM,
       row_number() over(partition by APPROVAL_CODE ORDER BY APPROVAL_CODE) RANK_NUM,
       FIRST_NAME,
       APPROVAL_CODE
FROM (
  SELECT DISTINCT FIRST_NAME, approval_code
  FROM (
    SELECT DISTINCT t1.FIRST_NAME, t2.approval_code, t2.line_no
    FROM K2_ACCESS_USER@k2global t1
    INNER JOIN k2_approval_path t2 ON t1.DOMAIN_NAME = t2.USER_ID
    RIGHT JOIN k2_credit_limit_hist t4
      ON t2.approval_code = t4.approval_code
      AND t4.expired_date >= to_date('2012-01-01','yyyy-mm-dd')
    ORDER BY t2.APPROVAL_CODE, t2.line_no
  )
)
After running it, the data looks like this:
We then change the original
START WITH t3.APPROVAL_CODE='RQP0001105'
CONNECT BY t3.ROW_NUM -1 = prior t3.ROW_NUM
to
START WITH t3.APPROVAL_CODE='RQP0001105' and t3.RANK_NUM=1
CONNECT BY t3.ROW_NUM -1 = prior t3.ROW_NUM
and that is all.
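As a rough, self-contained illustration of this fix, the sketch below replaces the real tables with a few rows inlined from dual. The names CTE, the 'RQP0001106' / 'Alice' row, and the use of line_no as the ordering inside each partition are assumptions of mine; the original query orders the partition by APPROVAL_CODE.

```sql
-- Hypothetical sample data standing in for the joined tables.
WITH names AS (
  SELECT 'RQP0001105' AS approval_code, 1 AS line_no, 'Kenneth' AS first_name FROM dual
  UNION ALL
  SELECT 'RQP0001105', 2, 'Lawrence' FROM dual
  UNION ALL
  SELECT 'RQP0001106', 1, 'Alice' FROM dual
),
t3 AS (
  SELECT LTRIM(approval_code, 'RQP')
           + ROW_NUMBER() OVER (ORDER BY approval_code, line_no)            AS row_num,
         -- 1 for the first row of each approval code, 2 for the next, ...
         ROW_NUMBER() OVER (PARTITION BY approval_code ORDER BY line_no)    AS rank_num,
         first_name,
         approval_code
  FROM names
)
SELECT MAX(SUBSTR(SYS_CONNECT_BY_PATH(first_name, ','), 2)) AS first_name
FROM t3
-- Only the first row of the code can be a root, so there is a single hierarchy.
START WITH approval_code = 'RQP0001105' AND rank_num = 1
CONNECT BY row_num - 1 = PRIOR row_num;
```

With a single root the only paths are 'Kenneth' and 'Kenneth,Lawrence'. Every path now extends the previous one, so max() picks the longest, which is the full list.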
SELECT max(substr(sys_connect_by_path(FIRST_NAME,','),2)) FIRST_NAME
-- , length(FIRST_NAME),t3.ROW_NUM,t3.APPROVAL_CODE
FROM (
  SELECT ltrim(APPROVAL_CODE,'RQP') + row_number() over(ORDER BY APPROVAL_CODE) ROW_NUM,
         -- ltrim(APPROVAL_CODE,'RQP')+RANK() over(ORDER BY APPROVAL_CODE ) RANK_NUM,
         row_number() over(partition by APPROVAL_CODE ORDER BY APPROVAL_CODE) RANK_NUM,
         -- row_number(),
         FIRST_NAME,
         APPROVAL_CODE
  FROM (
    SELECT DISTINCT FIRST_NAME, approval_code
    FROM (
      SELECT DISTINCT t1.FIRST_NAME, t2.approval_code, t2.line_no
      FROM K2_ACCESS_USER@k2global t1
      INNER JOIN k2_approval_path t2 ON t1.DOMAIN_NAME = t2.USER_ID
      RIGHT JOIN k2_credit_limit_hist t4
        ON t2.approval_code = t4.approval_code
        AND t4.expired_date >= to_date('2012-01-01','yyyy-mm-dd')
      -- where APPROVAL_CODE='RQP0001199'
      ORDER BY t2.APPROVAL_CODE, t2.line_no
    )
  )
) t3
START WITH t3.APPROVAL_CODE = 'RQP0001105' and t3.RANK_NUM = 1 --and t3.app
CONNECT BY t3.ROW_NUM - 1 = prior t3.ROW_NUM
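The result after running it:
Well, everything above is the roundabout route I took first. In fact, to get the value we want, it is enough to combine the query with GROUP BY and concatenate per group. The code is: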
SELECT max(SUBSTR(SYS_CONNECT_BY_PATH(create_by, ','), 2)) create_by
FROM (
  SELECT ltrim(APPROVAL_CODE,'RQP') + row_number() over(ORDER BY APPROVAL_CODE) ROW_NUM,
         -- row_number() over(partition by APPROVAL_CODE ORDER BY APPROVAL_CODE ) RANK_NUM,
         create_by,
         approval_code
  FROM (
    SELECT DISTINCT create_by, approval_code
    FROM (
      SELECT DISTINCT t.create_by, t.approval_code, t.line_no
      FROM k2_approval_path t
      RIGHT JOIN K2_CREDIT_LIMIT_HIST t4
        ON t.approval_code = t4.approval_code
        AND t4.expired_date >= to_date('2010-01-01','yyyy-mm-dd')
      ORDER BY t.approval_code, t.line_no
    )
  )
) T1
START WITH t1.approval_code = hist.approval_code
-- t1.approval_code= hist.approval_code and t1.RANK_NUM=1
CONNECT BY T1.ROW_NUM - 1 = PRIOR T1.ROW_NUM
GROUP BY t1.approval_code
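The query above refers to an outer alias, hist, that is not shown here, so it cannot be run on its own. As a hedged, standalone variation, the sketch below groups over every approval code at once using inline sample data. The names CTE, the 'RQP0001106' / 'Alice' row, the approvers alias, and the decision to keep rank_num = 1 in START WITH (so that max() only ever compares paths that extend one another) are my assumptions, not the original code.

```sql
-- Hypothetical sample data standing in for the joined tables.
WITH names AS (
  SELECT 'RQP0001105' AS approval_code, 1 AS line_no, 'Kenneth' AS first_name FROM dual
  UNION ALL
  SELECT 'RQP0001105', 2, 'Lawrence' FROM dual
  UNION ALL
  SELECT 'RQP0001106', 1, 'Alice' FROM dual
),
t1 AS (
  SELECT LTRIM(approval_code, 'RQP')
           + ROW_NUMBER() OVER (ORDER BY approval_code, line_no)          AS row_num,
         ROW_NUMBER() OVER (PARTITION BY approval_code ORDER BY line_no)  AS rank_num,
         first_name,
         approval_code
  FROM names
)
SELECT approval_code,
       MAX(SUBSTR(SYS_CONNECT_BY_PATH(first_name, ','), 2)) AS approvers
FROM t1
START WITH rank_num = 1                  -- one root per approval code
CONNECT BY row_num - 1 = PRIOR row_num   -- walk to the next row of the same code
GROUP BY approval_code                   -- keep the longest path of each code
ORDER BY approval_code;
```

On this sample data the query should return 'Kenneth,Lawrence' for RQP0001105 and 'Alice' for RQP0001106, one row per approval code.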