1. 程式人生 > >seq_trace叢集訊息鏈跟蹤利器

seq_trace叢集訊息鏈跟蹤利器

做過網路叢集伺服器的的同學都知道,叢集服務通常由不同的伺服器組成,這些不同角色的伺服器組合在一起共同完成了特定的服務。一個服務通常需要一個協調者,和不同的工作者。 協調者負責派發任務,接收工作者的完成情況,最終回饋給使用者。舉個例子來講,撥打電話:首先需要確認你的號碼是在有效的,同時還要看下你的帳號裡面有錢不,還要看下你撥打的電話號碼是不是由許可權,電話打的時候需要扣錢,等等。 這些服務中間的任何一個環節出問題了,服務就不正常了。那麼我們在服務出問題的時候,如何定位問題呢?通常的辦法是打日誌,在所有的參與服務的節點上開啟日誌記錄,之後到所有的節點上收集日誌,集中分析日誌,相當的麻煩。

這時候seq_trace來救助了,seq_trace的目標就是能夠跟蹤一條訊息經過的所有環節,最終把路徑展現給使用者。

鋪墊材料:

seq_trace工作原理,請參考這裡

ttb對seq_trace的支援參考這裡

tdbg對seq_trace的支援參考這裡

我們來設計個案例來演示下這個過程:

該服務功能是對一個列表進行消重,翻倍,求和,這3個動作分別由不同的角色做。

叢集共有3個節點提供服務, [email protected], [email protected], [email protected],角色設計上比較簡單:y先上面提供消重服務,z提供翻倍, x負責最後的求和。

流程是這樣的:

1. 使用者給x發列表.

2. x收到列表後,轉發發給y.

3. y收到後先進行列表消重, 再發給z.

4. z收到後進行列表翻倍, 然後回給y.

5. y收到回給x.

6. x收到後進行最後的加和,回給使用者。

我們首先看下我們的測試程式碼:

$ cat test.erl

-module(test).

-export([start/0, stop/1, demo/2]).



x_fun(Y)->

    receive

    {From, L} ->

        Y ! {self(), L},

        receive

        {Y, L1} ->

            From ! {self(), lists:sum(L1)}

        end,

        x_fun(Y);

    quit ->

        ok

    end.





y_fun(Z)->

    receive

    {From, L} ->

        Z ! {self(), sets:to_list(sets:from_list(L))},

        receive

        {Z, L1}->

            From ! {self(), L1}

        end,

        y_fun(Z);

    quit ->

        ok

    end.



z_fun()->

    receive

    {From, L} ->

        From ! {self(), [2 * R || R <-L]},

        z_fun();

    quit ->

        ok

    end.



start()->

    pong = net_adm:ping('
[email protected]
'), pong = net_adm:ping('[email protected]'), pong = net_adm:ping('[email protected]'), Z = spawn('[email protected]', fun() -> global:register_name(z_p, self()), z_fun() end), Y = spawn('[email protected]', fun() -> global:register_name(y_p, self()), y_fun(Z) end), X = spawn('[email protected]', fun() -> global:register_name(x_p, self()), x_fun(Y) end), global:sync(), {X, Y, Z}. stop({X,Y,Z})-> [P ! quit || P <- [X,Y,Z]], ok. demo(L, {X,_,_}) when is_list(L) -> X ! {self(), L}, %% io:format("sent ~p~n", [L]), receive {X, _Res} -> %% io:format("got ~p~n",[_Res]), _Res end.

我們首先準備下叢集環境,開3個節點Erlang節點:

#terminal 1

$ erl -name [email protected]

Erlang R14B04 (erts-5.8.5) 1 [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]



Eshell V5.8.5  (abort with ^G)

([email protected])1>



#terminal 2

$ erl -name [email protected]

Erlang R14B04 (erts-5.8.5) 1 [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]



Eshell V5.8.5  (abort with ^G)

([email protected])1>



#terminal 3

$ erl -name [email protected]

Erlang R14B04 (erts-5.8.5) 1 [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]



Eshell V5.8.5  (abort with ^G)

([email protected])1>

好了,現在開始我們的測試:

$ erlc test.erl $ erl -name [email protected] Erlang R14B04 (erts-5.8.5) 1 [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false] Eshell V5.8.5 (abort with ^G) 
([email protected])1> Ref=test:start(). {<6584.42.0>,<6585.42.0>,<6586.42.0>} 
([email protected])2> test:demo([1,1,3,5],Ref). 18 
([email protected])3> nodes(). ['[email protected]','[email protected]','[email protected]'] 
([email protected])4> global:registered_names(). [z_p,y_p,x_p] 
([email protected])5> test:stop(Ref). ok

我們看到結果算出來了,參與的節點x,y,z也都看到了,服務的程序名z_p,y_p,x_p也都是正常的,可是我們無法看到訊息是如何在叢集裡面流通的。

神奇的ttb加上seq_trace可以幫我們的忙。原理是用ttb在呼叫test:demo的時候,給我們的程序設定個token,demo呼叫把列表通過訊息傳遞給X的時候,seq_trace會在後臺把這個token順著服務鏈條傳遞出去,最終訊息回到我們手裡,我們就知道這個訊息所走過的路徑,包括節點間的路徑。

$ erl -name [email protected]
Erlang R14B04 (erts-5.8.5) 1 [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]
 
Eshell V5.8.5  (abort with ^G)
([email protected])1> Ref=test:start().
{<6584.60.0>,<6585.60.0>,<6586.60.0>}
([email protected])2>
([email protected])2> nodes().
['[email protected]','[email protected]','[email protected]']
([email protected])3> global:registered_names().
[z_p,y_p,x_p]
([email protected])5> ttb:tracer(all).
{ok,['[email protected]','[email protected]','[email protected]',
     '[email protected]']}
([email protected])6> ttb:p(self(),call).
{ok,[{<0.37.0>,[{matched,'[email protected]',1}]}]}
([email protected])7> ttb:tp(test,demo,ttb:seq_trigger_ms()).
{ok,[{matched,'[email protected]',1},
     {matched,'[email protected]',1},
     {matched,'[email protected]',1},
     {matched,'[email protected]',1},
     {saved,1}]}
([email protected])8> test:demo([1,1,3,5],Ref),seq_trace:reset_trace().
true
([email protected])9> ttb:stop([fetch]).
Stored logs in /Users/yufeng/ttb_upload-20111003-012347
stopped

到現在為止,日誌已經儲存下來了, 檔名是ttb_upload-20111003-012347, 我們先文字方式格式化下訊息流:

([email protected])10> ttb:format("ttb_upload-20111003-012347").

({<0.37.0>,{erlang,apply,2},'[email protected]'}) call test:demo([1,1,3,5],{<6584.60.0>,<6585.60.0>,<6586.60.0>})

SeqTrace {1317,576219,213176} [0]: ({<0.37.0>,{erlang,apply,2},'[email protected]'}) {<6584.60.0>,{global,z_p},'[email protected]'} ! {<0.37.0>,[1,1,3,5]} [Serial: {0,1}]SeqTrace {1317,576219,213398} [0]: ({<6584.60.0>,{global,z_p},'[email protected]'}) << {<0.37.0>,[1,1,3,5]} [Serial: {0,1}, From: {<0.37.0>,{erlang, apply,2},'[email protected]'}]

SeqTrace {1317,576219,213409} [0]: ({<6584.60.0>,{global,z_p},'[email protected]'}) {<6585.60.0>,{global,y_p},'[email protected]'} ! {<6584.60.0>,[1,1,3,5]} [Serial: {1, 2}]SeqTrace {1317,576219,213601} [0]: ({<6585.60.0>,{global,y_p},'[email protected]'}) << {<6584.60.0>,[1,1,3,5]} [Serial: {1,2}, From: {<6584.60.0>,{global,z_p},'[email protected]'}]

SeqTrace {1317,576219,213640} [0]: ({<6585.60.0>,{global,y_p},'[email protected]'}) {<6586.60.0>,
{global,x_p},'[email protected]'} ! {<6585.60.0>,[3,5,1]} [Serial: {2,3}]

SeqTrace {1317,576219,213814} [0]: ({<6586.60.0>,{global,x_p},'[email protected]'}) << {<6585.60.0>,[3,5,1]} [Serial: {2,3}, From: {<6585.60.0>,{global,y_p},'[email protected]'}]

SeqTrace {1317,576219,213825} [0]: ({<6586.60.0>,{global,x_p},'[email protected]'}) {<6585.60.0>,{global,y_p},'[email protected]'} ! {<6586.60.0>,[6,10,2]} [Serial: {3,4}]
SeqTrace {1317,576219,213929} [0]: ({<6585.60.0>,{global,y_p},'[email protected]'}) << {<6586.60.0>,[6,10,2]} [Serial: {3,4}, From: {<6586.60.0>,{global,x_p},'[email protected]'}]

SeqTrace {1317,576219,213935} [0]: ({<6585.60.0>,{global,y_p},'[email protected]'}) {<6584.60.0>,{global,z_p},'[email protected]'} ! {<6585.60.0>,[6,10,2]} [Serial: {4,5}]

SeqTrace {1317,576219,214054} [0]: ({<6584.60.0>,{global,z_p},'[email protected]'}) << {<6585.60.0>,[6,10,2]} [Serial: {4,5}, From: {<6585.60.0>,{global,y_p},'[email protected]'}]

SeqTrace {1317,576219,214062} [0]: ({<6584.60.0>,{global,z_p},'[email protected]'}) {<0.37.0>,{erlang,apply,2},'[email protected]'} ! {<6584.60.0>,18} [Serial: {5,6}]

SeqTrace {1317,576219,214241} [0]: ({<0.37.0>,{erlang,apply,2},'[email protected]'}) << {<6584.60.0>,18} [Serial: {5,6}, From: {<6584.60.0>,{global,z_p},'[email protected]'}]

ok

([email protected])11>

透過訊息互動的文字,我們可以看到訊息在流動,整個過程都在掌握中,如果哪個環節出問題,也一目瞭然。

我們還可以用圖形方式顯示路徑,神奇的et可以幫忙,et可以參考這裡

演示下這個功能:

([email protected])12> ttb:format("ttb_upload-20111003-012347",[{handler,et}]).

圖形化得路徑看起來就更清楚了:

結論: seq_trace+ttb在跟蹤訊息方面真是無敵!

祝大家玩得開心!