What You Don't Know About the Oracle Background Process SMON

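SMON (system monitor process) is the system monitor background process. It is sometimes also called the system cleanup process because it is responsible for many cleanup tasks. Anyone who has studied Oracle fundamentals will know at least something about what this background process does.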

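The SMON we all know is a diligent worker that carries out a series of system-level tasks. Unlike the PMON (Process Monitor) background process, SMON handles more of the work that concerns the system as a whole, which means it ends up doing some little-known "grunt work". When the system generates this kind of "garbage work" frequently, SMON may not be able to keep up. For that reason SMON became a little lazy in 10g: if it receives too many work notifications in a short period (SMON: system monitor process posted), it may deliberately slack off to avoid becoming too busy (SMON: Posted too frequently, trans recovery disabled). This is covered in detail later.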

Understanding the SMON Functions You Don't Know: Cleaning Up Temporary Segments

Trigger scenarios

Many people misunderstand which temporary segments are meant here and assume they are the sort segments in a temporary tablespace. In fact, the temporary segments in question are mainly those in permanent tablespaces. Temporary segments in temporary tablespaces are also cleaned up by SMON, but that cleanup happens only at instance startup.

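Permanent tablespaces contain temporary segments too. For example, when we run a CREATE TABLE or CREATE INDEX DDL statement against a permanent tablespace, the server process first allocates enough extents in that tablespace. These extents remain temporary (temporary extents) until the statement finishes, and only once the table or index has been fully built is the temporary segment converted to a permanent segment. Likewise, when a segment is dropped with DROP, it is first converted to a temporary segment, which is then cleaned up (DROP object converts the segment to temporary and then cleans up the temporary segment).

Normally, cleanup follows the rule that whoever creates a temporary segment is responsible for cleaning it up. In other words, the temporary segment produced by a server process during an index rebuild should be cleaned up by that same server process once the rebuild completes. If the server process terminates unexpectedly before it has cleaned up the temporary segment, or if it hits an ORA- error that makes the statement fail, SMON is posted to complete the cleanup instead.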
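To see which temporary segments currently sit in permanent tablespaces, a query along the following lines may help. It is only a sketch: it assumes access to the DBA_SEGMENTS and DBA_TABLESPACES dictionary views, and it cannot tell whether a listed segment belongs to a statement that is still running or is waiting for SMON.

```sql
-- Sketch: temporary segments that live in PERMANENT tablespaces.
-- A row here may belong to an in-flight CREATE TABLE/INDEX or to a leftover
-- that SMON has yet to clean up; the query alone cannot tell the two apart.
SELECT s.owner,
       s.segment_name,
       s.tablespace_name,
       s.extents,
       s.bytes
  FROM dba_segments s
  JOIN dba_tablespaces t
    ON t.tablespace_name = s.tablespace_name
 WHERE s.segment_type = 'TEMPORARY'
   AND t.contents = 'PERMANENT'
 ORDER BY s.bytes DESC;
```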

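SMON cleans up temporary segments in permanent tablespaces once every three minutes, provided it has been posted. If SMON is too busy, a temporary segment may go uncleaned for a long time. A typical problem this causes: after an online index rebuild fails, any subsequent rebuild of that index requires the temporary segment left behind to have been cleaned up already, and it has to keep waiting until that cleanup completes. In 10gR2 we can use dbms_repair.online_index_clean to clean up the leftovers of an online index rebuild manually: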

The dbms_repair.online_index_clean function has been created to cleanup online index rebuilds.

Use the dbms_repair.online_index_clean function to resolve the issue.

Please note if you are unable to run the dbms_repair.online_index_clean function it is due to the fact

that you have not installed the patch for Bug 3805539 or are not running on a release that includes this fix.

The fix for this bug is a new function in the dbms_repair package called dbms_repair.online_index_clean,

which has been created to cleanup online index [[sub]partition] [re]builds.

New functionality is not allowed in patchsets;

therefore, this is not available in a patchset but is available in 10gR2.

Check your patch list to verify the database is patched for Bug 3805539

using the following command and patch for the bug if it is not listed:

opatch lsinventory -detail

Cleanup after a failed online index [re]build can be slow to occur, preventing subsequent such operations

until the cleanup has occurred.
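A call to the function described in the note above might look like the following sketch. The owner SCOTT and the index name IDX_DEMO are hypothetical placeholders, and the block assumes a release that ships dbms_repair.online_index_clean (10gR2 or a patched earlier release) and a suitably privileged user.

```sql
-- Sketch: manually clean up after a failed online index rebuild.
-- SCOTT and IDX_DEMO are hypothetical names used only for illustration.
SET SERVEROUTPUT ON
DECLARE
  v_object_id NUMBER;
  v_cleaned   BOOLEAN;
BEGIN
  -- Look up the object id of the index whose rebuild failed
  SELECT object_id
    INTO v_object_id
    FROM dba_objects
   WHERE owner = 'SCOTT'
     AND object_name = 'IDX_DEMO'
     AND object_type = 'INDEX';

  -- LOCK_NOWAIT: do not retry if the DML lock on the table is unavailable
  v_cleaned := dbms_repair.online_index_clean(
                 object_id     => v_object_id,
                 wait_for_lock => dbms_repair.lock_nowait);

  IF v_cleaned THEN
    dbms_output.put_line('index cleaned up');
  ELSE
    dbms_output.put_line('index not cleaned up, retry later');
  END IF;
END;
/
```

Omitting object_id falls back to the package default (dbms_repair.all_index_id), which attempts to clean up every qualifying index rather than a single one.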

Next, let's see in practice how SMON cleans up temporary segments in a permanent tablespace:

Set event 10500 to trace the SMON process (this diagnostic event is introduced later):

SQL> alter system set events '10500 trace name context forever,level 10';

System altered.

In the first session, run a CREATE TABLE statement, which generates a number of temporary extents:

SQL> create table smon as select * from ymon;

In another session, query the DBA_EXTENTS view to see how many temporary extents have been created:

SQL> SELECT COUNT(*) FROM DBA_EXTENTS WHERE SEGMENT_TYPE='TEMPORARY';

COUNT(*)

----------

117

Kill the CREATE TABLE session above, wait a while, and then look at the trace file of the SMON background process, where the following appears:

*** 2011-06-07 21:18:39.817

SMON: system monitor process posted msgflag:0x0200 (-/-/-/-/TMPSDROP/-/-)

*** 2011-06-07 21:18:39.818

SMON: Posted, but not for trans recovery, so skip it.

*** 2011-06-07 21:18:39.818

SMON: clean up temp segments in slave

SQL> SELECT COUNT(*) FROM DBA_EXTENTS WHERE SEGMENT_TYPE='TEMPORARY';

COUNT(*)

----------

0

As we can see, SMON cleaned up the temporary segment through a slave process.
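Once the test is finished it is probably worth switching the trace event off again, so that SMON does not keep writing to its trace file. A minimal sketch, assuming the event was set with ALTER SYSTEM as shown earlier and that the instance is a 10g one where background traces go to background_dump_dest:

```sql
-- Sketch: turn off the SMON trace event that was enabled for the test
ALTER SYSTEM SET EVENTS '10500 trace name context off';

-- SQL*Plus command: show the directory that holds the SMON trace file
-- (assumes a 10g-style background_dump_dest setting)
SHOW PARAMETER background_dump_dest
```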

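Unlike temporary segments in permanent tablespaces, extents in a temporary tablespace are, for performance reasons, not released and returned as soon as an operation completes. Instead, these temporary extents are marked as available so that the next sort operation can reuse them. SMON still cleans up these temporary segments, but only at instance startup: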

For performance issues, extents in TEMPORARY tablespaces are not released or deallocated

once the operation is complete. Instead, the extent is simply marked as available for the next sort operation.

SMON cleans up the segments at startup.

A sort segment is created by the first statement that used a TEMPORARY tablespace for sorting, after startup.

A sort segment created in a TEMPORARY tablespace is only released at shutdown.

The large number of EXTENTS is caused when the STORAGE clause has been incorrectly calculated.
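The reuse of sort-segment extents described above can be observed through the V$SORT_SEGMENT view. The query below is a sketch that assumes SELECT access to that view; it shows how many extents of each sort segment are in use and how many are merely marked free for the next sort.

```sql
-- Sketch: sort-segment usage per temporary tablespace.
-- FREE_EXTENTS are allocated to the sort segment but available for reuse;
-- they are not returned to the tablespace while the instance is up.
SELECT tablespace_name,
       current_users,
       total_extents,
       used_extents,
       free_extents
  FROM v$sort_segment;
```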

Symptoms

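The following query returns the total number of temporary extents in the database. Compare the total over a period of time: if it is decreasing, SMON is cleaning up temporary segments.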

SELECT COUNT(*) FROM DBA_EXTENTS WHERE SEGMENT_TYPE='TEMPORARY';

The "SMON posted for dropping temp segment" statistic in the v$sysstat view also shows how often SMON has been asked to clean up:

SQL> select name,value from v$sysstat where name like '%SMON%';

NAME                                                                  VALUE

---------------------------------------------------------------- ----------

total number of times SMON posted                                         8

SMON posted for undo segment recovery                                     0

SMON posted for txn recovery for other instances                          0

SMON posted for instance recovery                                         0

SMON posted for undo segment shrink                                       0

SMON posted for dropping temp segment                                     1

In addition, SMON holds the Space Transaction (ST) enqueue for a long time during cleanup, so other sessions may time out waiting for the ST enqueue and fail with ORA-01575:

01575, 00000, "timeout waiting for space management resource"

// *Cause: failed to acquire necessary resource to do space management.

// *Action: Retry the operation.
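To check who is holding or waiting for the ST enqueue at a given moment, something like the following can be used. This is a sketch assuming access to v$lock and v$session; when SMON is the holder, its row should show a program name ending in (SMON).

```sql
-- Sketch: sessions holding (LMODE > 0) or requesting (REQUEST > 0) the ST enqueue
SELECT l.sid,
       s.program,
       l.lmode,
       l.request,
       l.ctime      -- seconds since the current mode was granted
  FROM v$lock l
  JOIN v$session s
    ON s.sid = l.sid
 WHERE l.type = 'ST';
```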

How to stop SMON from cleaning up temporary segments

Setting the diagnostic event event='10061 trace name context forever, level 10' disables SMON's cleanup of temporary segments (disable SMON from cleaning temp segments):

alter system set events '10061 trace name context forever, level 10';

Related diagnostic events

Understanding the SMON Functions You Don't Know: Coalescing Free Extents

SMON's duties also include coalescing free extents (coalesces free extent).

Trigger scenarios

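Early versions of Oracle used dictionary-managed tablespaces (DMT). Unlike today's locally managed tablespaces (LMT), a DMT manages extents through recursive operations on the two dictionary base tables FET$ and UET$. Every 5 minutes SMON wakes up on its own and checks for dictionary-managed tablespaces whose default storage parameter pctincrease is not 0 (SMON wakes itself every 5 minutes and checks for tablespaces with default pctincrease != 0). Note that this work applies only to DMT; LMT needs no coalescing. On these DMT tablespaces SMON coalesces contiguous, adjacent free extents into a single larger free extent, which also means that SMON has to maintain the FET$ dictionary base table.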
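The tablespaces that qualify for this periodic check can be listed from DBA_TABLESPACES. The query below is a sketch of that condition as described above (dictionary-managed, default pctincrease not 0); it assumes access to the DBA_TABLESPACES view.

```sql
-- Sketch: dictionary-managed tablespaces whose default PCTINCREASE is non-zero,
-- i.e. the ones SMON would consider for free-extent coalescing
SELECT tablespace_name,
       extent_management,
       pct_increase
  FROM dba_tablespaces
 WHERE extent_management = 'DICTIONARY'
   AND pct_increase <> 0;
```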

Symptoms

The following query returns the total number of free extents in the database. If the total keeps decreasing, SMON is coalescing free space:

SELECT COUNT(*) FROM DBA_FREE_SPACE;

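While coalescing extents, SMON needs to hold the ST (Space Transaction) enqueue in exclusive mode, so other sessions may time out waiting for the ST enqueue and fail with ORA-01575. SMON may also consume 100% of a CPU during a heavy coalesce operation.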
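For a per-tablespace view of how much free space has already been coalesced, the DBA_FREE_SPACE_COALESCED view can be queried, and a tablespace can also be coalesced by hand at a quiet time. The following is a sketch; the tablespace name USERS_DMT is a hypothetical placeholder.

```sql
-- Sketch: how fragmented the free space of each tablespace still is
SELECT tablespace_name,
       total_extents,
       extents_coalesced,
       percent_extents_coalesced
  FROM dba_free_space_coalesced
 ORDER BY percent_extents_coalesced;

-- Manually coalesce adjacent free extents in one tablespace
-- (USERS_DMT is a hypothetical name)
ALTER TABLESPACE users_dmt COALESCE;
```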

How to stop SMON from coalescing free extents

Setting the diagnostic event event='10269 trace name context forever, level 10' disables SMON's coalescing of free extents (Don't do coalesces of free space in SMON):

10269, 00000, "Don't do coalesces of free space in SMON"
// *Cause:    setting this event prevents SMON from doing free space coalesces
alter system set events '10269 trace name context forever, level 10';

Understanding the SMON Functions You Don't Know: Cleaning Up the obj$ Base Table

SMON's duties also include cleaning up the obj$ data dictionary base table (cleanup obj$).

The OBJ$ dictionary base table is one of the key objects in Oracle's bootstrap process:

SQL> set linesize 80 ;
SQL> select sql_text from bootstrap$ where sql_text like 'CREATE TABLE OBJ$%';
SQL_TEXT
--------------------------------------------------------------------------------
CREATE TABLE OBJ$("OBJ#" NUMBER NOT NULL,"DATAOBJ#" NUMBER,"OWNER#" NUMBER NOT N
ULL,"NAME" VARCHAR2(30) NOT NULL,"NAMESPACE" NUMBER NOT NULL,"SUBNAME" VARCHAR2(
30),"TYPE#" NUMBER NOT NULL,"CTIME" DATE NOT NULL,"MTIME" DATE NOT NULL,"STIME"
DATE NOT NULL,"STATUS" NUMBER NOT NULL,"REMOTEOWNER" VARCHAR2(30),"LINKNAME" VAR
CHAR2(128),"FLAGS" NUMBER,"OID$" RAW(16),"SPARE1" NUMBER,"SPARE2" NUMBER,"SPARE3
" NUMBER,"SPARE4" VARCHAR2(1000),"SPARE5" VARCHAR2(1000),"SPARE6" DATE) PCTFREE
10 PCTUSED 40 INITRANS 1 MAXTRANS 255 STORAGE (  INITIAL 16K NEXT 1024K MINEXTEN
TS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 18 EXTENTS (FILE 1 BLOCK 121))

Trigger scenarios

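OBJ$ is a low-level data dictionary table that holds a row for almost every object in the database (tables, indexes, packages, views and so on). In many cases these entries represent non-existent objects. One possible cause is that the object itself has been dropped from the database while its entry is retained to support the negative dependency mechanism. Because such entries make OBJ$ grow continuously, the SMON process has to delete the rows that are no longer needed. SMON runs this cleanup at instance startup and then every 12 hours afterwards (the cleanup is scheduled to run after startup and then every 12 hours).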

The following demonstration shows how SMON cleans up obj$:

SQL>  BEGIN
  2      FOR i IN 1 .. 5000 LOOP
  3      execute immediate ('create synonym gustav' || i || ' for
  4  perfstat.sometable');
  5      execute immediate ('drop   synonym gustav' || i );
  6      END LOOP;
  7    END;
  8    /
PL/SQL procedure successfully completed.
SQL> startup force;
ORACLE instance started.
Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.
Database opened.
SQL>   select count(*) from user$ u, obj$ o
  2        where u.user# (+)=o.owner# and o.type#=10 and not exists
  3        (select p_obj# from dependency$ where p_obj# = o.obj#);
  COUNT(*)
----------
      5000
SQL> /
  COUNT(*)
----------
      5000
SQL> /
  COUNT(*)
----------
      4951
SQL> oradebug setospid 18457;
Oracle pid: 8, Unix process pid: 18457, image: [email protected] (SMON)
SQL> oradebug event 10046 trace name context forever ,level 1;
Statement processed.
SQL> oradebug tracefile_name;
/s01/admin/G10R2/bdump/g10r2_smon_18457.trc
select o.owner#,
       o.obj#,
       decode(o.linkname,
              null,
              decode(u.name, null, 'SYS', u.name),
              o.remoteowner),
       o.name,
       o.linkname,
       o.namespace,
       o.subname
  from user$ u, obj$ o
 where u.user#(+) = o.owner#
   and o.type# = :1
   and not exists
 (select p_obj# from dependency$ where p_obj# = o.obj#)
 order by o.obj#
   for update