PostgreSQL: How LIMIT Affects Index Usage

阿新 • Published: 2018-11-10

Server CPU ranking list

If you work in this field and cannot make sense of it, it may be time for some serious self-reflection.

1. Create the test table

drop table if exists test;
create table test(
    objectid serial not null,
    num integer not null,
    ref integer[] not null,
    constraint pk_test_objectid primary key(objectid)
)with (fillfactor=100);
alter table test cluster on pk_test_objectid;

To speed up inserts, the other indexes are created only after the data has been generated.

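2. Create the functions

The functions control the value distribution of num and ref so that the indexes on the num and ref columns are highly selective.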

drop function if exists saveAsTest(integer,integer[]);
drop function if exists gen_row(integer[],tweights[],tweights[]);
drop function if exists gen_array(integer[],tweights[]);
drop function if exists get_next_index(tweights[]); 

drop type if exists  tweights;
/****************************************************************************************
    Type holding a smooth weighted round-robin coefficient
        weight:    the configured weight
        curweight: the weight currently in use; initialize it to 0
****************************************************************************************/
create type tweights as 
(weight integer,curweight integer);
/****************************************************************************************
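    Smooth weighted round-robin balancing algorithm
    Example: array[((50,0)::tweights),((30,0)::tweights),((15,0)::tweights),((5,0)::tweights)]
            configures 4 weights. Note that the weights add up to 100, so over every 100 calls
                index 1 (first weight) is returned 50% of the time
                index 2 (second weight) is returned 30% of the time
                index 3 (third weight) is returned 15% of the time
                index 4 (fourth weight) is returned 5% of the time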
****************************************************************************************/
create or replace function get_next_index(tweights[])
  returns table(index integer, weights tweights[])
as $$
    declare
        v_i integer;
        v_len integer;
        v_index integer;
        v_total integer;
        v_tmp tweights;
        v_tmpindex tweights;
    begin
        v_len := array_length($1,1);
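        if (1 = v_len) then
          -- Only one weight: always return index 1. "return query" only appends
          -- to the result set, so an explicit "return" is needed to exit here.
          return query select 1,$1;
          return;
        end if;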
        v_index := -1; v_total := 0;

        for v_i in 1..v_len loop
          v_tmp := $1[v_i];
          v_tmp.curweight := (v_tmp.curweight + v_tmp.weight);
          v_total := (v_total + v_tmp.weight);
          $1[v_i] = v_tmp;
          if (-1 = v_index or ($1[v_index]).curweight < v_tmp.curweight) then
            v_index := v_i;
          end if;
        end loop;

        v_tmpindex := $1[v_index];
        v_tmpindex.curweight :=  v_tmpindex.curweight - v_total;
        $1[v_index] = v_tmpindex;
        return query select v_index,$1;
    end;
$$ language plpgsql strict;


/****************************************************************************************
    Randomly generate an array of 1 to 4 elements
drop function if exists gen_array(integer[],tweights[]);
****************************************************************************************/
create or replace function gen_array(integer[],tweights[])
    returns table(vals integer[], weights tweights[])
as $$
      with recursive cte(id,val,weights,count) as (
    			(select 1,$1[index],weights,((random()*(4-1)+1)::integer) from get_next_index($2))	 
    			union all
    			select (p.id+1),$1[a.index],a.weights,p.count from cte as p,get_next_index(p.weights) as a where p.id  < count
			) select array_agg(val),(select weights from cte where id=count) from cte;
$$ language sql strict;
/****************************************************************************************
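    Generate a row
    The arrays $1, $2 and $3 must all have the same size
    $2: smooth weighted round-robin weights used to generate the integer (num)
    $3: smooth weighted round-robin weights used to generate the integer[] (ref)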
drop function if exists gen_row(integer[],tweights[],tweights[]);
****************************************************************************************/
create or replace function gen_row(integer[],tweights[],tweights[])
    returns table(num integer,weights1 tweights[],ref integer[],weights2 tweights[])
as $$
  select $1[num.index],num.weights,ref.*  
  from get_next_index($2) as num,gen_array($1,$3) as ref;
$$ language sql strict;
/****************************************************************************************
    Test whether the functions behave as expected
****************************************************************************************/
/*
select *
from gen_row(
  array[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],
  array[
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights
  ],
  array[
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
    (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights
  ]);
*/
/****************************************************************************************
    Save data into the test table
drop function if exists saveAsTest(integer,integer[]);
****************************************************************************************/
create or replace function saveAsTest(integer,integer[])
    returns integer
as $$
  insert into test(num,ref) values($1,$2) returning objectid;
$$ language sql strict;
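
As a quick sanity check of the weighting, the recursive query below is a minimal sketch that feeds the weights returned by get_next_index back into the next call, 100 times in a row, and counts how often each index comes back. It assumes the tweights type and the get_next_index function above have already been created; the names r, step, idx, w and hits are illustrative only and do not appear in the original post.

```sql
-- Call get_next_index 100 times, carrying the updated weights from call to call,
-- then count how many times each index was returned.
with recursive r(step, idx, w) as (
    -- first call, starting from curweight = 0 for every entry
    select 1, g.index, g.weights
    from get_next_index(array[
           (50,0)::tweights, (30,0)::tweights,
           (15,0)::tweights, (5,0)::tweights
         ]) as g
  union all
    -- each following call reuses the weights returned by the previous one
    select r.step + 1, g.index, g.weights
    from r, get_next_index(r.w) as g
    where r.step < 100
)
select idx, count(*) as hits
from r
group by idx
order by idx;
```

With the example weights 50/30/15/5, the hit counts should come out proportional to the weights if the function behaves as its comment describes.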

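3. Generate the test data

num takes values from 1 to 20, evenly distributed (each value accounts for 5%).
ref takes values from 1 to 20, with a random array size of 1 to 4; across every 100 generated elements each value also accounts for 5%.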

delete from test;
select setval(pg_get_serial_sequence('test','objectid'), 1, false);
/****************************************************************************************
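    Load the test data: open 10 terminals and run the script below in each of them.
    The author's test machine is a dual-socket box with 16 cores, so 16 terminals were used. The CPU is an Intel(R) Xeon(R) CPU E5530  @ 2.40GHz, which is junk by now and sits near the bottom of the ranking...
    Because the table is simple, loading the test data writes little to disk (about 16MB/s at peak, below 2MB/s most of the time).
    This workload is mostly CPU-bound, so 16 terminals running at once pushed the CPU to 100%. After a while the fans were screaming...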
****************************************************************************************/
\timing on
do $$
    declare
        v_nums integer[];
        v_weights1 tweights[];
        v_weights2 tweights[];

        v_num integer;
        v_ref integer[];
        v_coun integer;
    begin
        v_coun := 1;
        v_nums := array[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20];
        v_weights1 := array[
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights
        ];
        v_weights2 := array[
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,
            (5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights,(5,0)::tweights
        ];

        for i in 1..1000000 loop
            select num,weights1,ref,weights2 into v_num,v_weights1,v_ref,v_weights2 from gen_row(v_nums,v_weights1,v_weights2);
            perform saveAsTest(v_num,v_ref);
            --raise notice  '%  %', v_num,v_ref;
            if ( 0 = (i % 1000) ) then
                raise notice  '%', v_coun;
                v_coun := v_coun + 1;
            end if;
        end loop;
    end;
$$;

Terminal	Elapsed time (ms)
1	1491206.016
2	1511390.919
3	1517245.568
4	1509241.432
5	1519552.252
6	1514420.896
7	1520820.174
8	1512984.280
9	1519851.215
10	1514590.502
11	1505463.332
12	1503091.390
13	1503749.024
14	1501670.722
15	1500027.669
16	1503459.150
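
Once the load has finished, the distribution described at the start of this section can be checked directly. The queries below are a minimal sketch, not part of the original post; the aliases row_count, pct, ref_value, element_count, ref_size and u(v) are illustrative.

```sql
-- Share of each num value; roughly 5% each is expected for this setup.
select num,
       count(*) as row_count,
       round(100.0 * count(*) / sum(count(*)) over (), 2) as pct
from test
group by num
order by num;

-- Share of each element value across all ref arrays.
select u.v as ref_value,
       count(*) as element_count,
       round(100.0 * count(*) / sum(count(*)) over (), 2) as pct
from test, unnest(ref) as u(v)
group by u.v
order by u.v;

-- How many rows have a ref array of each size (1 to 4).
select array_length(ref, 1) as ref_size, count(*) as row_count
from test
group by 1
order by 1;
```

These queries scan the whole table, so on 16 million rows they take a while; they are only meant as a one-off check before the indexes are built.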

4. Create the indexes

Vacuum the table after the inserts finish so that the test results are more accurate.

vacuum  freeze verbose  analyze test;
select count(*) from test;
/*
count   
----------
16000000
(1 row)

Time: 587.956 ms
*/

/* B-tree index */
create index idx_test_num on test(num);

/* Array index
 Uses gin__int_ops; for my requirements, gin__int_ops has performed best of the array index options tested so far.
 gin__int_ops depends on the intarray extension:
 create extension intarray;
*/
create index idx_test_ref on test using gin(ref gin__int_ops);
/* Other array index types; these require the corresponding extensions */
--create index idx_test_ref on test using gist(ref gist__int_ops);
--create index idx_test_ref on test using rum(ref rum_anyarray_ops);

/* Take a look at the table structure */
\dS+ test;
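
Besides the table structure, it can be useful to know how large each index is and whether the later test queries actually touch it. The following is a small sketch using the standard statistics view pg_stat_user_indexes; the column aliases are illustrative and the query is not from the original post.

```sql
-- Size of the table itself.
select pg_size_pretty(pg_relation_size('test')) as table_size;

-- Size of each index on the table, plus how many scans have used it so far.
select indexrelname as index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
       idx_scan
from pg_stat_user_indexes
where relname = 'test'
order by indexrelname;
```

Re-running the second query after each test shows whether idx_scan for idx_test_num or idx_test_ref increased, which is a coarse cross-check against the EXPLAIN output.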

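5. Query tests

Note: do not add ORDER BY. ORDER BY affects the execution plan, and for now we only want to test the relationship between LIMIT and the indexes.

Run each query several times, until it no longer reads from disk (no "Buffers: shared read" in the output).

Because every value makes up the same share of the table, querying a single value is enough.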

/* Value present in the table, B-tree index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where num=1;
--Execution time: 2568.059 ms

/* Value not present in the table, B-tree index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where num=21;
--Execution time: 0.044 ms

/* Values present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref@>array[1];
--Execution time: 6589.734 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[1,2];
--Execution time: 9037.726 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[1,2,3];
--Execution time: 11621.418 ms

/* Values not present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref@>array[21];
--Execution time: 0.065 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[21,22];
--Execution time: 0.056 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[21,22,23];
--Execution time: 0.060 ms

6. Regular LIMIT tests

/* Value present in the table, B-tree index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where num=1 limit 50;
--Execution time: 0.535 ms

/* Value not present in the table, B-tree index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where num=21 limit 50;
--Execution time: 0.050 ms


/* Values present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref@>array[1] limit 50;
--Execution time: 0.585 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[1,2] limit 50;
--Execution time: 0.561 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[1,2,3] limit 50;
--Execution time: 0.537 ms

/* Values not present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref@>array[21]  limit 50;
--Execution time: 3572.286 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[21,22] limit 50;
--Execution time: 3944.530 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[21,22,23] limit 50;
--Execution time: 4130.662 ms

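The comparison shows that with the B-tree index, adding LIMIT makes the query faster: only the rows allowed by LIMIT are returned, whether or not the table contains the searched value.

The array index has two cases: the table contains the searched value, or it does not.

6.1 Array index and LIMIT

6.1.1 The table contains the searched value

The array index is not used; a sequential scan is chosen instead. Because LIMIT lets the scan stop as soon as 50 matching rows have been found, the query is still fast.

6.1.2 The table does not contain the searched value

The array index is not used here either; a sequential scan is chosen. Because the value is not in the table, the scan never finds enough rows to stop early, so it reads the entire table and filters out every row, which is very slow.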
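
One way to see why the planner makes this choice is to compare its row estimate with the plan it picks. The sketch below uses plain EXPLAIN (without ANALYZE), which only plans the query and does not execute it, so it is cheap even for the slow case. It is an illustration added here, not a measurement from the original post.

```sql
-- Plain EXPLAIN only plans the query. The "rows=" figure on the scan node is the
-- planner's estimate of how many rows match the condition.
explain
select objectid from test where ref && array[21,22];

-- Same condition with LIMIT: check whether the plan switches to a Seq Scan
-- underneath a Limit node.
explain
select objectid from test where ref && array[21,22] limit 50;
```

If the estimate for a value that does not exist is far above zero, the planner presumably expects a sequential scan to reach 50 matches almost immediately, which would explain the plans seen in section 6. This is an interpretation, and the estimates may differ between PostgreSQL versions.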

6.1.2.1 Solution: use WITH

With a WITH query (CTE), the array index is used.

/* Values present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where ref@>array[1]
)select * from cte limit 10;
--Execution time: 293.301 ms

explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where ref&&array[1,2]
)select * from cte limit 10;
--Execution time: 464.427 ms

explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where ref&&array[1,2,3]
)select * from cte limit 10;
--Execution time: 717.172 ms

/* Values not present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where ref@>array[21]
)select * from cte limit 10;
--Execution time: 0.075 ms

explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where ref&&array[21,22]
)select * from cte limit 10;
--Execution time: 0.078 ms

explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where ref&&array[21,22,23]
)select * from cte limit 10;
--Execution time: 0.079 ms
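
The WITH results in this section rely on the CTE being planned separately from the outer query. Up to PostgreSQL 11 that was always the case. Since PostgreSQL 12, a non-recursive, side-effect-free CTE that is referenced only once can be inlined into the outer query, in which case the outer LIMIT may influence the plan again. The post does not state which version was used, so the following is a sketch under the assumption that you are on PostgreSQL 12 or later and want the older behaviour:

```sql
-- PostgreSQL 12+ only: MATERIALIZED keeps the CTE from being inlined into the
-- outer query, so it is planned on its own, independent of the outer LIMIT.
explain (analyze,verbose,costs,buffers,timing)
with cte as materialized (
  select objectid from test where ref && array[21,22,23]
)
select * from cte limit 10;
```

On versions before 12 the MATERIALIZED keyword is a syntax error, and the plain WITH form used in this section already behaves this way.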

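6.1.2.2 Solution: disable sequential scans

With sequential scans disabled, PostgreSQL picks a suitable index by itself; in this example it used idx_test_ref. This is similar to forcing an index in Oracle.

set enable_seqscan only affects the current session. Remember to turn it back on when you are done.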

set enable_seqscan = off;
/* Values present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref@>array[1] limit 50;
--Execution time: 297.018 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[1,2] limit 50;
--Execution time: 466.661 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[1,2,3] limit 50;
--Execution time: 708.372 ms

/* Values not present in the table, array index */
explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref@>array[21]  limit 50;
--Planning time: 0.089 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[21,22] limit 50;
--Execution time: 0.065 ms

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref&&array[21,22,23] limit 50;
--Execution time: 0.066 ms
set enable_seqscan = on;
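
To avoid having to remember to switch the setting back, the change can be scoped to a single transaction with SET LOCAL. This is a minimal sketch of that variant, not something the original post tested:

```sql
begin;

-- SET LOCAL lasts only until the end of the current transaction.
set local enable_seqscan = off;

explain (analyze,verbose,costs,buffers,timing)
select objectid from test where ref && array[21,22] limit 50;

commit;  -- enable_seqscan returns to its previous value here

-- Confirm that the session-level value is unchanged.
show enable_seqscan;
```

The same applies if the transaction is rolled back: the setting reverts either way, so other queries in the session are not affected.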

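6.1.3 Summary

Index scans are relatively expensive, but because few rows are returned they are still fairly fast.
LIMIT has a large effect on how the query is executed; after adding LIMIT, re-check the execution plan.
ORDER BY also has a large effect on how the query is executed; tune it against your requirements and the execution plan.
With a single condition (as in this example), where the table contains the value most of the time, the plain LIMIT queries from section 6 are recommended; the occasional lookup of a missing value has little overall impact.
With multiple conditions, the WITH approach from section 6.1.2.1 is recommended; it performs about the same as disabling sequential scans. Which one to use depends on your requirements and the execution plan. For example: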

-- Multiple conditions
explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where num=1 and ref&&array[1,2,3]
)select * from cte limit 10;

explain (analyze,verbose,costs,buffers,timing)
with cte as(
  select objectid from test where num=1 and ref&&array[21,22,23]
)select * from cte limit 10;
