
RAC 10g Cluster Startup Scripts

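OHAS (Oracle High Availability Services) is an important component newly added to Grid Infrastructure (GI) in 11gR2, together with other related components and resources.

(Figure: startup relationships among the GI components in 11gR2.)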

 

OHAS

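OHAS is an important component introduced in 11gR2. With its arrival, many aspects of Oracle's cluster management software changed, mainly the way the cluster is started and the way resources are managed.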

 

Cluster startup in 10g

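In 10g the cluster management software is CRS. From a startup point of view, a 10g cluster is started by the last three entries (h1, h2 and h3) of the /etc/inittab file shown below. The database version used here is Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production.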

 

# cat /etc/inittab

ap::sysinit:/sbin/autopush -f /etc/iu.ap

sp::sysinit:/sbin/soconfig -f /etc/sock2path

smf::sysinit:/lib/svc/bin/svc.startd    >/dev/msglog 2<>/dev/msglog </dev/console

p3:s1234:powerfail:/usr/sbin/shutdown -y -i5 -g0 >/dev/msglog 2<>/dev/msglog

h1:3:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null

h2:3:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null

h3:3:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null

 

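Although the three scripts above are invoked at the same time, the daemons depend on one another. ocssd.bin must start first and be confirmed to work properly; only then can crsd.bin start and be confirmed to work properly; evmd.bin starts last.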
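To make the inittab entries easier to read, the following minimal sketch splits each line into its four colon-separated fields (id, run levels, action, process) and keeps only the respawn entries for the three init scripts. The function name `list_crs_inittab_entries` is a hypothetical helper, not part of Oracle's tooling, and the sample text is copied from the listing above so that nothing on the local system is read.

```shell
#!/bin/sh
# list_crs_inittab_entries: read inittab-formatted text on stdin and print
# "id runlevels action script argument" for the Clusterware respawn entries.
# Hypothetical helper for illustration only.
list_crs_inittab_entries() {
    awk -F: '$3 == "respawn" && $4 ~ /init\.(cssd|crsd|evmd)/ {
        split($4, cmd, " ")          # cmd[1] = script path, cmd[2] = its argument
        print $1, $2, $3, cmd[1], cmd[2]
    }'
}

# Sample input copied from the inittab listing in the article.
sample='smf::sysinit:/lib/svc/bin/svc.startd    >/dev/msglog 2<>/dev/msglog </dev/console
h1:3:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
h2:3:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
h3:3:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null'

printf '%s\n' "$sample" | list_crs_inittab_entries
```

On a real 10g node the same function could be fed with `< /etc/inittab`. The respawn action means init restarts the script whenever it exits.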

 

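init.cssd: starts the ocssd.bin daemon and the other CSS-level daemons, which builds the cluster.

init.crsd: starts the crsd.bin daemon and calls the racg module to start the corresponding resources, which brings up the cluster's application resources.

init.evmd: starts the evmd.bin daemon, which publishes events across the cluster nodes.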
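The script-to-daemon mapping and the dependency order described above can be written down as a small lookup. This is only a sketch: `daemon_for_script` is a hypothetical name, and the loop simply prints the order the article gives rather than starting anything.

```shell
#!/bin/sh
# daemon_for_script SCRIPT: print the daemon binary the given init script starts,
# according to the descriptions in the article. Hypothetical helper.
daemon_for_script() {
    case $1 in
        init.cssd) echo ocssd.bin ;;
        init.crsd) echo crsd.bin ;;
        init.evmd) echo evmd.bin ;;
        *) echo "unknown script: $1" >&2; return 1 ;;
    esac
}

# Walk the scripts in the dependency order stated in the article.
for script in init.cssd init.crsd init.evmd; do
    printf '%s -> %s\n' "$script" "$(daemon_for_script "$script")"
done
```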

 

 

Next, look at the contents of each script. Only excerpts are listed, chosen to show the main functions.

(1) init.cssd script

...............................................................................................................

ORA_CRS_HOME=/opt/oracle/product/CRS

ORACLE_USER=oracle

 

ORACLE_HOME=$ORA_CRS_HOME

 

export ORACLE_HOME

export ORA_CRS_HOME

export ORACLE_USER

 

# Set DISABLE_OPROCD to false. Platforms that do not ship an oprocd

# binary should override this below.

DISABLE_OPROCD=false

# Default OPROCD timeout values defined here, so that it can be

# over-ridden as needed by a platform.

# default Timout of 1000 ms and a margin of 500ms

OPROCD_DEFAULT_TIMEOUT=1000

OPROCD_DEFAULT_MARGIN=500

# default Timeout for other actions

OPROCD_CHECK_TIMEOUT=2000

OPROCD_STOP_TIMEOUT=2000

OPROCD_DEFAULT_HISTORGRAM=

 

# Incase /bin/hostname is not present in a particular platform, we

# may have to do something different.

HOSTN=/bin/hostname

EXPRN=/usr/bin/expr

CUT=/usr/bin/cut

AWK='/bin/awk'

ECHO='echo'

 

TR=/bin/tr

#solaris on amd and SPARC has issue with /bin/tr

[ 'SunOS' = `/bin/uname` ] && TR=/usr/xpg4/bin/tr

#on Linux tr is at /usr/bin/tr

[ 'Linux' = `/bin/uname` ] && TR=/usr/bin/tr

 

 

 

#If the hostname is an IP address, let hostname

#remain as IP address

HOST=`$HOSTN`

len1=`$EXPRN "$HOST" : '.*'`

len2=`$EXPRN match $HOST '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*'`

 

# Strip off domain name in case /bin/hostname returns

# FQDN hostname

if [ $len1 != $len2 ]; then

 HOST=`$ECHO $HOST | $CUT -d'.' -f1 `

fi

 

HOST=`$ECHO $HOST | $TR '[:upper:]' '[:lower:]'`

 

# Default Location for commands on most platforms

PS='/bin/ps'

# ps -e is expected to search for all processes on the box and provide

# terse binary name output so that column count does not truncate binary

# names and confuse grep.

PSE='/bin/ps -e'

PSEF='/bin/ps -ef'

HEAD='/bin/head'

GREP='/bin/grep'

KILL='/bin/kill'

KILLTERM='/bin/kill -TERM'

KILLDIE='/bin/kill -9'

KILLCHECK="/bin/kill -0 $$"

SLEEP='/bin/sleep'

NULL='/dev/null'

............................................................

As shown, the script first defines the environment variables the cluster uses and the operating system commands it needs.
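The host name handling in the excerpt (leave an IP address alone, otherwise strip the domain part, then convert to lower case) can be sketched as one function. This version is an assumption-laden rewrite, not Oracle's code: it replaces the `expr` length comparison with a shell pattern that treats any name made only of digits and dots as an IP address, and `normalize_host` is a hypothetical name.

```shell
#!/bin/sh
# normalize_host NAME: strip the domain part unless NAME looks like a dotted
# numeric address, then convert the result to lower case.
normalize_host() {
    host=$1
    case $host in
        *[!0-9.]*)
            # Contains something other than digits and dots: treat it as a
            # host name and keep only the part before the first dot.
            host=${host%%.*}
            ;;
    esac
    printf '%s\n' "$host" | tr '[:upper:]' '[:lower:]'
}

normalize_host "RAC1.Example.COM"   # prints rac1
normalize_host "192.168.1.10"       # prints 192.168.1.10
```

The normalized name matters because later parts of the script use $HOST in paths such as $ORA_CRS_HOME/log/$HOST/cssd.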

...............................................................................................................

 

PLATFORM=`$UNAME`

 

MAXFILE=65536

 

case $PLATFORM in

Linux)

       LD_LIBRARY_PATH=$ORA_CRS_HOME/lib

       export LD_LIBRARY_PATH

       FAST_REBOOT="/sbin/reboot -n -f & $SLEEP 1 ; $ECHO b > /proc/sysrq-trigger"

       HEAD='/usr/bin/head'

...............................................................................................................

HP-UX) MACH_HARDWARE=`/bin/uname -m`

...............................................................................................................

     LD_LIBRARY_PATH=$ORA_CRS_HOME/lib:$NMAPIDIR_64:/usr/lib:$LD_LIBRARY_PATH

       export LD_LIBRARY_PATH

       # Presence of this file indicates that vendor clusterware is installed

       SKGXNLIB=${NMAPIDIR_64}/libnmapi2.${SO_EXT}

       if [ -f $SKGXNLIB ]; then

         USING_VC=1

       fi

...............................................................................................................

SunOS) MACH_HARDWARE=`/bin/uname -i`

       ARCH=`/usr/bin/isainfo -b`

       CLUSTERDIR=/opt/ORCLcluster

       LD_LIBRARY_PATH=$ORA_CRS_HOME/lib:$CLUSTERDIR/lib:/usr/lib:/usr/ucblib:$LD_LIBRARY_PATH

       LD_LIBRARY_PATH_64=$ORA_CRS_HOME/lib:$CLUSTERDIR/lib:/usr/lib:/usr/ucblib:$LD_LIBRARY_PATH_64

       if [ "${MACH_HARDWARE}${ARCH}" = "i86pc64" ]; then

           LD_LIBRARY_PATH=$ORA_CRS_HOME/lib:$CLUSTERDIR/lib:/usr/lib/amd64:/usr/ucblib/amd64:$LD_LIBRARY_PATH

           LD_LIBRARY_PATH_64=$ORA_CRS_HOME/lib:$CLUSTERDIR/lib:/usr/lib/amd64:/usr/ucblib/amd64:$LD_LIBRARY_PATH_64

...............................................................................................................

As shown, the script sets the environment variables that match each operating system.
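The per-platform branching can be reduced to a small function to show the pattern. This is a simplified sketch under stated assumptions: `crs_lib_path` is a hypothetical name, only the Linux and SunOS branches from the excerpt are reproduced, and the existing value of LD_LIBRARY_PATH is deliberately not appended.

```shell
#!/bin/sh
# crs_lib_path PLATFORM CRS_HOME: print a library search path for the platform.
# Simplified from the excerpt; other platforms fall through to an error.
crs_lib_path() {
    platform=$1
    crs_home=$2
    case $platform in
        Linux)
            echo "$crs_home/lib"
            ;;
        SunOS)
            # /opt/ORCLcluster is the CLUSTERDIR value used in the excerpt.
            echo "$crs_home/lib:/opt/ORCLcluster/lib:/usr/lib:/usr/ucblib"
            ;;
        *)
            echo "ERROR: Unknown Operating System" >&2
            return 1
            ;;
    esac
}

crs_lib_path Linux /opt/oracle/product/CRS
```

In the real script the platform string comes from uname, and the result is exported so that the daemons started later inherit it.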

...............................................................................................................

'stop')

    $LOGMSG "Oracle CSSD being stopped"

 

    # disable CSS startup until the next boot

    $ID/init.cssd norun

 

    # shutdown the OPROCD process if it is running

    if [ ! -f $NOOPROCD ]; then

       $OPROCD stop -t $OPROCD_STOP_TIMEOUT 2>$NULL

    fi

 

    # No steps are necessary for shutting down clsomon. It will go down

    # automatically when CSS is shutdown.

 

    # Shut down oclsvmon if it is up.

    if [ ! -f $NOCLSVMON ]; then

      $EVAL $FINDCLSVMON | $AWK '{ print $2 ; }' | $XARGS $KILLTERM > $NULL 2>&1

    fi

 

    # Invalidate init.cssd fatal pidfiles.

    $ECHO "stopped" > $CSSFBOOT

 

    $TOUCH $NOOPROCD

    $TOUCH $NOCLSVMON

    $TOUCH $NOCLSOMON

 

    # Now tell it to shut down.

    if [ -x "$CRSCTL" ]; then

      $CRSCTL stop crs

    fi

 

    $ECHO "Shutdown has begun. The daemons should exit soon."

    ;;

 

'run')

    # Foreground run, for single instance or single-node installs only.

    # If this is used in a cluster install, RDBMS datafile corruption is

    # likely.

 

    # Run the startcheck to see whether we should continue

    $ID/init.cssd startcheck

    while [ "$?" != "0" ]; do

      $SLEEP $RUNRECHECKTIME

      $ID/init.cssd startcheck

    done

 

    cd $ORA_CRS_HOME/log/$HOST/cssd

 

    # If there is an old corefile by such a collision prone name, then

    # rename it to something safe.

    if [ -f ./core ]; then

      $MVF ./core "$UNIQUECORE"

    fi

 

    # Arguments. By default none.

    OCSSD_ARGS=

    

    $ORA_CRS_HOME/bin/ocssd $OCSSD_ARGS

    ;;

 

'fatal')

    # This action is invoked to start the CSS daemon in cluster mode,

    # and one or more of its accompanying daemons oprocd or clsvmon or clsomon

    # This respawn wrapper is done in lieu of adding new entries to inittab.

 

    # Check to see if we are supposed to run this boot.

    $ID/init.cssd startcheck

    while [ "$?" != "0" ]; do

      $SLEEP $RUNRECHECKTIME

      $ID/init.cssd startcheck

    done

 

    # See discussion in LocalFence

    $EVAL $CLEANREBOOTLOCK

..........................................................................................................

    $ECHO "See documentation at the top of $0 about supported commands."

    exit 1;

    ;;

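..........................................................................................................

init.cssd decides which operation to perform based on its input argument. When the argument is fatal, it starts the cssd daemon and the other related daemons normally.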
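Both the run and fatal branches begin with the same pattern: call startcheck, and keep sleeping and retrying until it succeeds. A minimal sketch of that retry loop follows. `wait_until` is a hypothetical helper, and the check command is passed in as a parameter so the example does not depend on init.cssd being installed.

```shell
#!/bin/sh
# wait_until CHECK INTERVAL: run the command named by CHECK every INTERVAL
# seconds until it exits with status 0.
wait_until() {
    check=$1
    interval=$2
    until $check; do
        sleep "$interval"
    done
}

# Usage with a check that succeeds immediately. In the excerpt the check is
# "$ID/init.cssd startcheck" and the interval is $RUNRECHECKTIME.
wait_until true 1
echo "startcheck passed, continuing"
```

The `until` loop is equivalent to the excerpt's `while [ "$?" != "0" ]` form but avoids repeating the check command in two places.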

 

 

(2) init.crsd

ORA_CRS_HOME=/opt/oracle/product/CRS

ORACLE_HOME=$ORA_CRS_HOME

export ORA_CRS_HOME

export ORACLE_HOME

 

ORACLE_USER=oracle

 

UMASK=/bin/umask

SED=/bin/sed

CAT=/bin/cat

LOGMSG="/bin/logger -puser.err"

ECHO=/bin/echo

.............................................................

This part defines the environment variables and operating system commands that crsd needs.

---------------------------------------------------------------------------------------------------------------------------

case $PLATFORM in

Linux)

    SCRDIR=/etc/oracle/scls_scr/$HOST

    ID=/etc/init.d

    LOGGER="/usr/bin/logger"

    if [ ! -f "$LOGGER" ]; then

      LOGGER="/bin/logger"

    fi

    LOGMSG="$LOGGER -puser.err"

 

    if [ ! -f "$UMASK" ]; then

      UMASK=umask

......................................................................................................................................................

OSF1)  

    ID=/sbin/init.d

    # No restriction in opening files on TRU64. Refer b7623099.

    MAXFILE=unlimited

    ;;

*)  /bin/echo "ERROR: Unknown Operating System"

    exit -1

    ;;

esac

....................................................................................

This part sets different environment variables for different platforms.
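The Linux branch probes for logger in /usr/bin and falls back to /bin. That "first path that exists" idea can be sketched generically as below. `first_existing` is a hypothetical helper, and the final fallback to a bare `logger` resolved through PATH is an assumption modeled on how the excerpt falls back to a bare `umask`.

```shell
#!/bin/sh
# first_existing PATH...: print the first argument that is an existing regular
# file; return non-zero if none of them exists.
first_existing() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Same candidate order as the excerpt; bare "logger" is an assumed last resort.
LOGGER=$(first_existing /usr/bin/logger /bin/logger) || LOGGER=logger
LOGMSG="$LOGGER -puser.err"
echo "$LOGMSG"
```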

......................................................................................................................................................

 

case $1 in

'home')

    $ECHO $ORA_CRS_HOME

    exit 0;

    ;;

'stop')

    [ -r $PIDFILE ] && crspid=`$CAT $PIDFILE`

    $LOGMSG "Oracle CRSD $crspid set to stop"

 

    # Indicate that the next time we start up, it may be an initial startup.

    $ECHO "stopped" > $CRSDBOOT

    

    $LOGMSG "Oracle CRSD $crspid shutdown completed"

    ;;

'run') # foreground run out of init

.....................................................................................................................................................

    $ECHO "Manual invocation of $0 is not supported."

    ;;

esac

....................................................................

The script chooses its operation based on the input argument. The argument run means starting the crsd.bin daemon.
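The argument dispatch can be condensed into a skeleton that keeps only the home and stop actions shown in the excerpt. This is a sketch, not the real script: `crsd_action` is a hypothetical function, CRSDBOOT points at a demo file under the temporary directory instead of its real location, the run branch is left out because the excerpt elides it, and placing the "not supported" message in the default branch is an assumption.

```shell
#!/bin/sh
# Stand-in values so the sketch can run anywhere (CRSDBOOT path is hypothetical).
ORA_CRS_HOME=/opt/oracle/product/CRS
CRSDBOOT=${TMPDIR:-/tmp}/crsdboot.demo

# crsd_action ARG: perform the action selected by ARG.
crsd_action() {
    case $1 in
        home)
            echo "$ORA_CRS_HOME"
            ;;
        stop)
            # Record that the daemon was stopped, so the next startup may be
            # treated as an initial startup.
            echo "stopped" > "$CRSDBOOT"
            ;;
        *)
            echo "Manual invocation with '$1' is not supported." >&2
            return 1
            ;;
    esac
}

crsd_action home
```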

 

 

(3) init.evmd

ORA_CRS_HOME=/opt/oracle/product/CRS

ORACLE_USER=oracle

 

ORACLE_HOME=$ORA_CRS_HOME

export ORACLE_HOME

export ORA_CRS_HOME

 

CAT=/bin/cat

RMF="/bin/rm -f"

LOGMSG="/bin/logger -puser.err"

ECHO=/bin/echo

KILL=/bin/kill

..............................................................................

This part defines the environment variables and operating system commands that evmd needs.

 

case $PLATFORM in

Linux)

       ID=/etc/init.d

       LOGGER="/usr/bin/logger"

       if [ ! -f "$LOGGER" ];then

        LOGGER="/bin/logger"

       fi

       LOGMSG="$LOGGER -puser.err"

       SU="/bin/su -l"

 

       ;;

HP-UX)

       ID=/sbin/init.d

       ;;

.....................................................................................................................................................

       ;;

esac

.......................................................................

This part sets different environment variables for different platforms.

....................................................................................................................................................

case $1 in

'home')

    $ECHO $ORA_CRS_HOME

    exit 0;

    ;;

'user')

    $ECHO $ORACLE_USER

    exit 0;

    ;;

'stop')

    $LOGMSG "Oracle EVMD set to stop"

      

    ;;

'run') # foreground run out of init

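....................................................................

The script chooses its operation based on the input argument. The argument run means starting the evmd.bin daemon.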

 

 

(4) Summary

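After reading init.cssd, init.crsd and init.evmd, it is clear that the three scripts share the same basic structure: first define variables and operating system commands, then set the environment variables that match the operating system platform, and finally choose the operation based on the input argument. This approach also creates problems for the cluster management software. If the content or permissions of a script are changed for any reason, the cluster may fail to start, and the failure is hard to diagnose. Keeping every operation in scripts also raises security concerns. For these reasons, the way the cluster starts changed beginning with version 11.2.0.2.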
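Because a changed or non-executable script can silently prevent startup, a quick sanity check of the three files is a reasonable first diagnostic step on a 10g node. The sketch below is a hypothetical helper (`check_init_scripts` is not an Oracle tool) that only tests existence and the execute bit; it does not verify script contents.

```shell
#!/bin/sh
# check_init_scripts DIR: report init scripts in DIR that are missing or not
# executable. Returns non-zero if any problem was found.
check_init_scripts() {
    dir=$1
    status=0
    for name in init.cssd init.crsd init.evmd; do
        if [ ! -f "$dir/$name" ]; then
            echo "missing: $dir/$name"
            status=1
        elif [ ! -x "$dir/$name" ]; then
            echo "not executable: $dir/$name"
            status=1
        fi
    done
    return $status
}

# /etc/init.d is the Linux location used in the excerpts; HP-UX uses /sbin/init.d.
check_init_scripts /etc/init.d || echo "init scripts need attention"
```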