Handling a Corrupted Index Block in PostgreSQL
1. A query against one table failed with the following error:
back=# select max(create_time) from public.tbl_index_table where create_time>='2010-10-08';
ERROR: could not read block 41381 of relation 16779/24769/24938: read only 0 of 8192 bytes
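On seeing this error, my first thought was that the table tbl_index_table had a bad block and would probably need to be rebuilt.
2. --Check the table definition and the execution plan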
back=# \d tbl_index_table;
Table "public.tbl_index_table"
Column | Type | Modifiers
----------------+-----------------------------+------------------------
total | integer |
logined | integer |
logining | integer |
http | integer |
rawtcp | integer |
create_time | timestamp without time zone | not null default now()
logincountdesc | character varying |
logincountaddr | character varying | not null
Indexes:
"tbl_index_table_pkey" PRIMARY KEY, btree (create_time, logincountaddr)
"index_tbl_index_table_create_time" btree (create_time)
back=# select max(create_time) from public.tbl_index_table where create_time>='2010-10-08';
ERROR: could not read block 41381 of relation 16779/24769/24938: read only 0 of 8192 bytes
back=# explain select max(create_time) from public.tbl_index_table where create_time>='2010-10-08';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.04..0.05 rows=1 width=0)
InitPlan
-> Limit (cost=0.00..0.04 rows=1 width=8)
-> Index Scan Backward using index_tbl_index_table_create_time on tbl_index_table (cost=0.00..66.28 rows=1507 width=8)
Index Cond: (create_time >= '2010-10-08 00:00:00'::timestamp without time zone)
Filter: (create_time IS NOT NULL)
(6 rows)
The query above uses the index index_tbl_index_table_create_time, so the index itself may be the problem.
3. --Analyze the numbers that follow "relation" in the error message
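One way to test this guess is to steer the planner away from index scans for the current session and rerun the query. This is a sketch that assumes the table's heap is intact. enable_indexscan and enable_bitmapscan are standard planner settings, but whether the planner then really chooses a sequential scan should be confirmed with EXPLAIN before drawing conclusions.

```sql
-- Discourage index access for this session only (server defaults are untouched).
set enable_indexscan = off;
set enable_bitmapscan = off;

-- Confirm the plan no longer mentions index_tbl_index_table_create_time.
explain select max(create_time) from public.tbl_index_table where create_time >= '2010-10-08';

-- If this now returns a result, the heap is readable and the index is the likely culprit.
select max(create_time) from public.tbl_index_table where create_time >= '2010-10-08';

-- Restore the planner defaults for the session.
reset enable_indexscan;
reset enable_bitmapscan;
```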
back=# select oid,relname from pg_class where oid=24938;
oid | relname
-------+-----------------------------------------
24938 | index_tbl_index_table_create_time
(1 row)
Time: 0.596 ms
back=# select oid,relname from pg_class where oid=24769;
oid | relname
-----+---------
(0 rows)
Time: 0.369 ms
back=# select oid,relname from pg_class where oid=16779;
oid | relname
-----+---------
(0 rows)
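24938 turns out to be index_tbl_index_table_create_time, an index on the table. The other two numbers match nothing in pg_class, which is consistent with the usual tablespace/database/relfilenode layout of this path: 16779 and 24769 would then be the tablespace OID and the database OID, not relations.
--Check the index status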
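The three numbers can be resolved in one pass. The sketch below assumes the path has the form tablespace OID / database OID / relfilenode, so it matches on pg_class.relfilenode, not on oid. The two values usually coincide until an operation such as REINDEX or TRUNCATE assigns a new file, so after an index rebuild the first query would be expected to return no rows for the old number.

```sql
-- Run while connected to the affected database (pg_class is per-database).
-- Third number: the relation's file node.
select oid, relname, relkind, relfilenode from pg_class where relfilenode = 24938;

-- Second number: assumed to be the database OID.
select oid, datname from pg_database where oid = 24769;

-- First number: assumed to be the tablespace OID.
select oid, spcname from pg_tablespace where oid = 16779;
```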
back=# select * from pg_index where indexrelid=24938;
indexrelid | indrelid | indnatts | indisunique | indisprimary | indisclustered |indisvalid| indcheckxmin | indisready | indkey | indclass | indoption | indexprs | indpred
------------+----------+----------+-------------+--------------+----------------+------------+--------------+------------+--------+----------+-----------+----------+---------
24938 | 24823 | 1 | f | f | f | t | f | t | 6 | 10053 | 0 | |
(1 row)
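indisvalid=t means the index is marked as usable for queries. The flag only records catalog state, so it does not rule out physical damage in the index file.
4. --Try rebuilding the index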
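Because the catalog flags look healthy, a rough physical check is to compare the byte offset of the failing block with the current size of the index. This is only a sketch: it takes block number 41381 from the error message and reads the block size from the block_size setting. A read that returns 0 bytes suggests the requested block lies at or beyond the end of the file, which is what an offset greater than or equal to the index size would indicate.

```sql
-- Run before rebuilding the index; afterwards the size reflects the new file.
select 41381::bigint * current_setting('block_size')::bigint as failing_block_offset,
       pg_relation_size('index_tbl_index_table_create_time') as index_bytes;
```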
back=# select current_query from pg_stat_activity;
current_query
---------------------------------------------
select current_query from pg_stat_activity;
<IDLE>
back=# \timing
Timing is on.
back=# reindex index index_tbl_index_table_create_time;
REINDEX
Time: 107796.232 ms
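REINDEX took about 108 seconds here, and it blocks writes to the table while it runs. Where that is not acceptable, one alternative is to build a replacement index without blocking writes and then swap it in. The following is a sketch, not what was done in this case: the name index_tbl_index_table_create_time_new is hypothetical, CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and writes that arrive during the build still have to update the damaged index and may fail.

```sql
-- Build a replacement from the table data; the damaged index is not read.
create index concurrently index_tbl_index_table_create_time_new
    on public.tbl_index_table (create_time);

-- Swap: drop the damaged index, then give the new one the original name.
begin;
drop index public.index_tbl_index_table_create_time;
alter index public.index_tbl_index_table_create_time_new
    rename to index_tbl_index_table_create_time;
commit;
```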
--After the rebuild, the query works again
back=# select max(create_time) from public.tbl_index_table where create_time>='2010-10-08';
max
-----
(1 row)
Time: 73.600 ms
back=# select pg_size_pretty(pg_relation_size('index_tbl_index_table_create_time'));
pg_size_pretty
----------------
327 MB
(1 row)
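Summary:
(1) A search online suggested this was PostgreSQL bug 2197, but judging from the steps above, the more likely cause was a bad block in the index: once the index was rebuilt, the query worked again.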
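(2) This kind of error usually means a bad block has appeared, and the root cause of a bad block is hard to determine.
There is a parameter, zero_damaged_pages:
test=# show zero_damaged_pages;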
zero_damaged_pages
--------------------
off
(1 row)
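If it is set to on, PostgreSQL skips damaged blocks instead of raising an error, but the data on those blocks is lost, which breaks data consistency.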
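If the parameter is used at all, it is safer to limit it to a single superuser session, after taking a file-level backup of the data directory, and to switch it back off right away. The sketch below only illustrates the session-level switch; the statement in the middle is a placeholder for whatever read is actually needed.

```sql
-- Superuser only; affects this session, not the server default.
set zero_damaged_pages = on;

-- Placeholder salvage step: damaged pages are zeroed in memory instead of raising an error.
select count(*) from public.tbl_index_table;

-- Turn the setting back off as soon as the salvage step is done.
reset zero_damaged_pages;
```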