
Notes from a PostgreSQL Data Export

Problem 1: Merging multiple rows of the same column into a single row

Method 1: array_agg(expression) collects the expression's values into an array. It is usually combined with array_to_string().

Method 2: string_agg(expression, delimiter) concatenates the values directly into a string. The second argument is the delimiter.

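Method 3: a custom aggregate function
-- Create the aggregate
-- Note: PostgreSQL 14 and later declare array_append with anycompatiblearray / anycompatible,
-- so this definition may need those pseudo-types there.
CREATE AGGREGATE group_concat(anyelement) (
    sfunc = array_append, -- per-row transition function: appends the current row's value to the array
    stype = anyarray,     -- state type: the aggregate returns an array
    initcond = '{}'       -- initial state: an empty array
);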

The SQL for each of the three methods, in order:

SELECT
    pd.project_id,
    array_agg ( pd.file_name ) AS filename 
FROM
    a pp
    LEFT JOIN b pd ON pp.ID = pd.project_id 
WHERE
    pp.ENABLEd = 'T' 
    AND pd.ENABLEd = 'T' 
GROUP BY
    pd.project_id;

SELECT
    pd.project_id,
    string_agg ( pd.file_name,',' ) AS filename 
FROM
    a pp
    LEFT JOIN b pd ON pp.ID = pd.project_id 
WHERE
    pp.ENABLEd = 'T' 
    AND pd.ENABLEd = 'T' 
GROUP BY
    pd.project_id;

SELECT
    pd.project_id,
    group_concat ( pd.file_name ) AS filename 
FROM
    a pp
    LEFT JOIN b pd ON pp.ID = pd.project_id 
WHERE
    pp.ENABLEd = 'T' 
    AND pd.ENABLEd = 'T' 
GROUP BY
    pd.project_id;
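
Methods 1 and 3 return an array rather than a string. As a minimal sketch (reusing the article's placeholder table b and its enabled column), wrapping the array in array_to_string() gives a delimited string comparable to what string_agg produces:

```sql
-- array_agg builds an array per group; array_to_string joins its elements with ','.
-- Table b and its columns are the article's placeholders, not a real schema.
SELECT
    pd.project_id,
    array_to_string(array_agg(pd.file_name), ',') AS filename
FROM
    b pd
WHERE
    pd.enabled = 'T'
GROUP BY
    pd.project_id;
```

The same wrapper should work around group_concat ( pd.file_name ), since that aggregate also returns an array.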
 

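Note: array_agg and string_agg support further operations, such as ordering, deduplication, and taking an element by index. I will add those here when I run into them.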
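
A rough sketch of those three operations, again against the placeholder table b (the output column names are made up for illustration):

```sql
SELECT
    pd.project_id,
    -- ordering: ORDER BY inside the aggregate call controls the concatenation order
    string_agg(pd.file_name, ',' ORDER BY pd.file_name) AS sorted_names,
    -- deduplication: DISTINCT drops repeated values before aggregating
    array_to_string(array_agg(DISTINCT pd.file_name), ',') AS distinct_names,
    -- indexing: PostgreSQL arrays are 1-based, so [1] is the first element
    (array_agg(pd.file_name ORDER BY pd.file_name))[1] AS first_name
FROM
    b pd
GROUP BY
    pd.project_id;
```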

Problem 2: Working with JSON columns in PostgreSQL

SELECT
    pp.duty :: json -> 0 ->> 'label' AS 負責人  -- 負責人 = person in charge
FROM
    a pp;

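How this is written depends on how the JSON is stored. The column here holds a JSON array, so the query uses the index 0 to take the first element. If the column stores a single JSON object directly, the `-> 0` step can be omitted.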
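
The difference between the two storage shapes can be tried out with literal values. This is only an illustration; the sample JSON and the output column names are invented:

```sql
SELECT
    -- JSON array: -> 0 picks the first element, ->> 'label' extracts the field as text
    '[{"label":"Alice"},{"label":"Bob"}]' :: json -> 0 ->> 'label' AS from_array,
    -- single JSON object: no index step is needed
    '{"label":"Alice"}' :: json ->> 'label' AS from_object;
```

Both columns come back as the text value Alice.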

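Problem 3: Querying parent projects together with their child projects, in order. This uses PostgreSQL's recursive queries (WITH RECURSIVE).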

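with RECURSIVE cte as
(
-- anchor part: top-level (parent) rows; assumes root rows have a NULL pid, as in the final SQL
select a.id,cast(a.name as varchar(100)) as name from tb a where a.pid is null
union all 
-- recursive part: append each child's name to its parent's path
select k.id,cast(c.name||'>'||k.name as varchar(100)) as name  from tb k inner join cte c on c.id = k.pid
)select id,name from cte ;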

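To explain the recursion: the part before UNION ALL selects the parent level, and the part after it is the recursive step. The result then behaves like a temporary table and can be joined to other tables.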
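
A self-contained sketch of the same pattern, using an inline three-row table instead of a real one (the sample rows and the path column name are made up, and root rows are assumed to have a NULL pid):

```sql
WITH RECURSIVE tb(id, pid, name) AS (
    -- hypothetical sample data standing in for the real table
    VALUES (1, NULL::int, 'root'),
           (2, 1, 'child'),
           (3, 2, 'grandchild')
),
cte AS (
    -- anchor part: rows without a parent
    SELECT id, name::text AS path
    FROM tb
    WHERE pid IS NULL
    UNION ALL
    -- recursive part: attach each child to the path built so far
    SELECT k.id, c.path || '>' || k.name
    FROM tb k
    JOIN cte c ON c.id = k.pid
)
SELECT id, path
FROM cte
ORDER BY path;
```

With this data the query yields root, root>child, and root>child>grandchild for ids 1, 2, and 3.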

Problem 4: Grouping rows and picking the largest or smallest row in each group, using row_number() OVER (PARTITION BY ...)

SELECT
    *
FROM
    (
    SELECT
        *,
        ROW_NUMBER ( ) OVER ( PARTITION BY a.proc_inst_id ORDER BY last_modified_date DESC NULLS LAST ) t1 
    FROM
        a 
    ) AS TEMP   
WHERE
    t1 < 2;

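row_number() behaves somewhat like a virtual column. This query partitions the rows by instance id (proc_inst_id), sorts each partition by modification time in descending order, and keeps the first row of each partition, that is, the most recently modified record.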
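
The effect is easy to check on a small inline data set. In this sketch the three sample rows, the status values, and the rn alias are all invented for illustration:

```sql
SELECT proc_inst_id, status, last_modified_date
FROM (
    SELECT v.*,
           -- number the rows of each instance, newest first; NULL dates sort last
           row_number() OVER (
               PARTITION BY proc_inst_id
               ORDER BY last_modified_date DESC NULLS LAST
           ) AS rn
    FROM (VALUES
        (1, 'active', DATE '2023-01-01'),
        (1, 'end',    DATE '2023-02-01'),
        (2, 'active', NULL)
    ) AS v(proc_inst_id, status, last_modified_date)
) AS ranked
WHERE rn = 1;
```

Instance 1 keeps its 'end' row dated 2023-02-01, and instance 2 keeps its only row. To pick the oldest row instead, switch the ordering to ASC.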

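Here is the final SQL. The tables are replaced with a, b, c, and so on, and some of the Chinese comments were removed rather arbitrarily, so please don't read too much into them.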

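-- Create the aggregate
CREATE AGGREGATE group_concat(anyelement)
(
    sfunc = array_append, -- per-row transition function: appends the current row's value to the array
    stype = anyarray,     -- state type: the aggregate returns an array
    initcond = '{}'       -- initial state: an empty array
);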

WITH RECURSIVE T AS (
  SELECT
    ID,
    pid,
    root_id,
    name,
    ID :: TEXT AS PATH,
    1 AS LEVEL 
  FROM
    a
  WHERE
    pid IS NULL 
    UNION ALL
  SELECT
    D.ID,
    D.pid,
    D.root_id,
        d.name,
    ( T.PATH || '/' || D.ID ) :: TEXT AS PATH,
    ( T.LEVEL + 1 ) AS LEVEL 
  FROM
    a D
    JOIN T ON D.pid = T.ID 
  ) SELECT 
    (case when T.LEVEL = 1 then '一級計劃'  -- level-1 plan
          when T.LEVEL = 2 then '二級計劃'  -- level-2 plan
          when T.LEVEL = 3 then '三級計劃'  -- level-3 plan
          when T.LEVEL = 4 then '四級計劃'  -- level-4 plan
     end
    ) as 計劃,  -- plan
    parent.name as 父級專案,  -- parent project
    ( pp.extent_1 :: json ->> 0 ) :: json ->> 'label' AS 功能,  -- function (second-level function)
    pp.assist_depts :: json -> 0  ->> 'label' AS 部門,  -- department
    ((case when ppr.operate = 'ADD' then '新增流程'     -- "add" process
           when ppr.operate = 'UPDATE' then '更新流程'  -- "update" process
           when ppr.operate = 'FINISH' then '結束流程'  -- "finish" process
      END
     ) || '/' ||
     (case when ppr.status = 'active' then '執行中'  -- running
           when ppr.status = 'end' then '已結束'     -- ended
           when ppr.status = 'cancel' then '已取消'  -- cancelled
      END
     )
    ) as 展示,  -- display
    pr.CONTENT as 日報,  -- daily report
    sub.filename as 交,  -- submitted files
    pp.ID,
  pp.pid
FROM
  a pp
  INNER JOIN T T ON T.ID = pp.ID
    LEFT JOIN a parent on pp.pid = parent.ID
  LEFT JOIN ( SELECT plan_id,operate,status
      FROM
       ( SELECT *, row_number ( ) over ( PARTITION BY b.proc_inst_id ORDER BY last_modified_date DESC nulls last ) t1 FROM   b) AS temp   WHERE 
       t1 < 2) ppr ON pp.ID = ppr.plan_id -- joined table
  LEFT JOIN ( SELECT project_id,content
      FROM
       ( SELECT *, row_number ( ) over ( PARTITION BY c.project_id ORDER BY last_modified_date DESC nulls last ) t1 FROM   c) AS temp   WHERE 
       t1 < 2) pr on pr.project_id = pp.id -- joined table
  LEFT JOIN (
  SELECT
    pd.project_id,
    group_concat ( pd.file_name ) AS filename 
  FROM
    a pp
    LEFT JOIN d pd ON pp.ID = pd.project_id  -- joined table
  WHERE
    pp.ENABLEd = 'T' 
    AND pd.ENABLEd = 'T' 
  GROUP BY
    pd.project_id 
  ) sub ON sub.project_id = pp.ID    
    
ORDER BY
  T.root_id,
  T."path";