Observing the Differences Between Running Erlang with SMP Enabled and Disabled
## The following observations are based on OTP 19.1 (otp_19.1)
# Observing SMP enabled
Step 1. Start erl with SMP enabled (erl -smp enable; SMP is the default on a multi-core machine, so plain erl works as well)
sexxx484: ~/erlang/test > erl
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V8.1 (abort with ^G)
1>
# [smp:8:8] indicates that 8 schedulers were started on an 8-core machine: one scheduler per core, and each scheduler runs in its own operating-system thread. The number of schedulers can be set at startup with the +S option, or adjusted at runtime via erlang:system_flag/2.
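As a quick check, the scheduler count can be inspected and adjusted from the erl shell. This is a minimal sketch using the standard BIFs erlang:system_info/1 and erlang:system_flag/2; the values shown depend on your machine.

```erlang
%% Inspect and adjust schedulers at runtime (erl shell, SMP build).
1> erlang:system_info(schedulers).           %% schedulers started at boot
8
2> erlang:system_info(schedulers_online).    %% schedulers currently in use
8
3> erlang:system_flag(schedulers_online, 4). %% use only 4; returns the old value
8
4> erlang:system_info(schedulers_online).
4
```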
Note: to start the VM without SMP support, use:
$ erl -smp disable
To see if your Erlang VM runs with or without SMP support in the first place, start a new VM without any options and look at the first line of output. If you can spot the text [smp:2:2] [rq:2], SMP support is enabled.
If you wanted to know, [smp:2:2] means there are two cores available, with two schedulers. [rq:2] means there are two run queues active. In earlier versions of Erlang, you could have multiple schedulers, but with only one shared run queue. Since R13B, there is one run queue per scheduler by default; this allows for better parallelism.
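The run-queue figure from the banner can likewise be sampled at runtime. A small sketch: erlang:statistics(run_queue) returns the aggregate length of all run queues, which is typically 0 on an idle node.

```erlang
%% Sample the aggregate run-queue length (erl shell, idle node).
1> erlang:statistics(run_queue).
0
```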
# [async-threads:10] indicates that an asynchronous thread pool with 10 async I/O threads was started. The pool size can be set at startup with the +A option.
Since the async thread pool has come up, here is what it is for: when you perform a large amount of file I/O, the async thread pool improves the system's responsiveness, because these relatively slow I/O operations are carried out in the pool and therefore do not interfere with normal scheduler work.
The following passage is quoted from Stack Overflow:
The BEAM runs Erlang code in special threads it calls schedulers. By default it will start a scheduler for every core in your processor. This can be controlled at start-up time, for instance if you don't want to run Erlang on all cores but "reserve" some for other things. Normally when you do a file I/O operation it is run in a scheduler, and as file I/O operations are relatively slow they will block that scheduler while they are running, which can affect the real-time properties. Normally you don't do that much file I/O so it is not a problem.
The asynchronous thread pool consists of OS threads which are used for I/O operations. Normally the pool is empty, but if you use the +A option at start-up time then the BEAM will create extra threads for this pool. These threads will then only be used for file I/O operations, which means that the scheduler threads no longer block waiting for file I/O and the real-time properties are improved. Of course this costs, as OS threads aren't free. The threads don't mix: scheduler threads are just scheduler threads and async threads are just async threads.
If you are writing linked-in drivers for ports, these can also use the async thread pool, but you have to detect yourself whether it has been started.
How many threads you need is very much up to your application. By default none are started. Like @demeshchuk, I have also heard that Riak likes to have a large async thread pool as it opens many files. My only advice is to try it and measure, as with all optimisation.
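To confirm the pool size the VM was actually started with, it can be queried from the shell. A minimal sketch; the value 10 here matches the [async-threads:10] banner shown above, and would change if the node were started with, say, erl +A 20.

```erlang
%% Query the async thread pool size (erl shell).
1> erlang:system_info(thread_pool_size).
10
```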
Step 2. Find the process ID of the beam.smp process
sexxx484: ~/erlang/test > ps -ef | grep beam
Step 3. View all the threads of that process
sexxx484: ~/erlang/test > top -H -p 699
top - 10:41:40 up 205 days, 22:31, 72 users, load average: 1.22, 1.65, 1.97
Tasks: 22 total, 0 running, 22 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.0%us, 4.7%sy, 0.0%ni, 88.8%id, 0.0%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 32241M total, 28556M used, 3684M free, 842M buffers
Swap: 0M total, 0M used, 0M free, 14000M cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
699 20 0 1757m 20m 2980 S 0 0.1 0:00.05 beam.smp
703 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
704 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
705 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
706 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
707 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
708 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
709 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
710 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
711 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
712 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
713 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
714 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
716 20 0 1757m 20m 2980 S 0 0.1 0:00.03 beam.smp
717 20 0 1757m 20m 2980 S 0 0.1 0:00.22 beam.smp
718 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
719 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
720 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
721 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
722 20 0 1757m 20m 2980 S 0 0.1 0:00.02 beam.smp
723 20 0 1757m 20m 2980 S 0 0.1 0:00.02 beam.smp
724 20 0 1757m 20m 2980 S 0 0.1 0:00.00 beam.smp
# Observing SMP disabled
Step 1. Start erl with SMP disabled (erl -smp disable)
sexxx484: ~/erlang/test > erl -smp disable
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V8.1 (abort with ^G)
1>
The smp flag is gone from the banner here, so a scheduler is not started on every core. Since each scheduler corresponds to its own operating-system thread, the beam process contains fewer internal threads than when SMP is enabled.
Step 2. Find the process ID of the beam process (note: without SMP the emulator binary is beam, not beam.smp)
sexxx484: ~/erlang/test >ps -ef | grep beam
Step 3. View all the threads of that process
484: ~/erlang/test > top -H -p 31599
top - 10:39:35 up 205 days, 22:29, 72 users, load average: 1.21, 1.88, 2.08
Tasks: 11 total, 0 running, 11 sleeping, 0 stopped, 0 zombie
Cpu(s): 50.0%us, 0.0%sy, 0.0%ni, 50.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32241M total, 28549M used, 3691M free, 842M buffers
Swap: 0M total, 0M used, 0M free, 14000M cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31599 20 0 1579m 15m 2836 S 0 0.0 0:00.32 beam
31607 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31608 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31609 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31610 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31611 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31612 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31613 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31614 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31615 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
31616 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
# Observing SMP enabled
# Spawn 4 Erlang processes with the SMP option enabled and then disabled, and observe the difference in how the processes execute on the CPU cores
1. Start erl with SMP enabled (erl -smp enable)
2. Compile the program below (test_smp.erl) and run test_smp:start(99999999).
3. Find the process ID of the beam.smp process
4. View all the threads of that process
5. View the top output with per-CPU lines (press 1 in top)
# Step 4 output: four threads are running flat out (100%). In this example I spawned four Erlang processes, which shows that the four processes were distributed to different cores and ran simultaneously.
top - 10:31:49 up 205 days, 22:21, 72 users, load average: 3.49, 2.40, 2.16
Tasks: 22 total, 4 running, 18 sleeping, 0 stopped, 0 zombie
Cpu(s): 54.1%us, 2.3%sy, 0.0%ni, 43.4%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 32241M total, 28554M used, 3686M free, 842M buffers
Swap: 0M total, 0M used, 0M free, 14000M cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3448 20 0 1695m 21m 2988 R 100 0.1 0:41.07 beam.smp
3449 20 0 1695m 21m 2988 R 100 0.1 0:40.60 beam.smp
3450 20 0 1695m 21m 2988 R 100 0.1 0:40.75 beam.smp
3451 20 0 1695m 21m 2988 R 100 0.1 0:40.54 beam.smp
3428 20 0 1695m 21m 2988 S 0 0.1 0:00.05 beam.smp
3432 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3433 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3434 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3435 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3436 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3437 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3438 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3439 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3440 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3442 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3443 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3444 20 0 1695m 21m 2988 S 0 0.1 0:00.00 beam.smp
3452 20 0 1695m 21m 2988 S 0 0.1 0:00.20 beam.smp
3453 20 0 1695m 21m 2988 S 0 0.1 0:00.21 beam.smp
3454 20 0 1695m 21m 2988 S 0 0.1 0:00.24 beam.smp
3455 20 0 1695m 21m 2988 S 0 0.1 0:00.04 beam.smp
3456 20 0 1695m 21m 2988 S 0 0.1 0:00.01 beam.smp
# Step 5 output: beam.smp uses 398% CPU, which is essentially 4 CPU cores running at full speed.
top - 10:27:42 up 205 days, 22:17, 72 users, load average: 3.25, 2.35, 2.11
Tasks: 562 total, 1 running, 538 sleeping, 21 stopped, 2 zombie
Cpu0 : 42.2%us, 6.3%sy, 0.0%ni, 51.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 78.1%us, 0.0%sy, 0.0%ni, 21.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 13.9%us, 7.3%sy, 0.0%ni, 78.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 99.7%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu4 : 79.1%us, 1.3%sy, 0.0%ni, 19.2%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu5 : 59.6%us, 6.4%sy, 0.0%ni, 34.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 14.7%us, 7.4%sy, 0.0%ni, 77.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 72.8%us, 0.3%sy, 0.0%ni, 26.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32241M total, 28552M used, 3689M free, 842M buffers
Swap: 0M total, 0M used, 0M free, 14000M cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23244 20 0 1759m 20m 2992 S 398 0.1 1:56.44 beam.smp
15903 20 0 88656 42m 3760 S 41 0.1 9427:46 .vasd
27877 20 0 24932 5728 1876 S 2 0.0 1:00.69 dtworker
26756 20 0 24932 5624 1864 S 2 0.0 0:58.43 dtworker
15886 20 0 24932 5708 1868 S 1 0.0 0:55.55 dtworker
16540 20 0 24836 5616 1868 S 1 0.0 0:56.49 dtworker
# Observing SMP disabled
1. Start erl with SMP disabled (erl -smp disable)
2. Compile the program below (test_smp.erl) and run test_smp:start(99999999).
3. Find the process ID of the beam process
4. View all the threads of that process
5. View the top output with per-CPU lines (press 1 in top)
# Step 4 output: only one thread is running flat out (100%). In this example I spawned four Erlang processes, so at any moment only one of the four is actually running while the other three wait.
top - 10:33:20 up 205 days, 22:23, 72 users, load average: 3.00, 2.72, 2.31
Tasks: 11 total, 1 running, 10 sleeping, 0 stopped, 0 zombie
Cpu(s): 20.0%us, 3.8%sy, 0.0%ni, 76.1%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 32241M total, 28549M used, 3692M free, 842M buffers
Swap: 0M total, 0M used, 0M free, 14000M cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11044 20 0 1579m 15m 2836 R 100 0.0 0:23.10 beam
11048 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11050 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11051 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11052 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11053 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11056 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11057 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11058 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11059 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
11060 20 0 1579m 15m 2836 S 0 0.0 0:00.00 beam
# Step 5 output: the beam process uses 98% CPU, so only one CPU core runs at full speed. The remaining cores sit idle but are never used by the VM, because the SMP option is disabled.
top - 10:26:35 up 205 days, 22:16, 72 users, load average: 2.12, 2.07, 2.01
Tasks: 563 total, 2 running, 538 sleeping, 21 stopped, 2 zombie
Cpu0 : 14.6%us, 0.7%sy, 0.0%ni, 84.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 21.1%us, 0.7%sy, 0.0%ni, 77.9%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu2 : 14.3%us, 1.7%sy, 0.0%ni, 84.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 14.7%us, 1.7%sy, 0.0%ni, 83.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 99.0%us, 1.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 15.0%us, 0.3%sy, 0.0%ni, 84.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 15.3%us, 1.3%sy, 0.0%ni, 83.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 14.8%us, 0.7%sy, 0.0%ni, 84.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32241M total, 28548M used, 3692M free, 842M buffers
Swap: 0M total, 0M used, 0M free, 14000M cached
PID PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11851 20 0 1579m 15m 2844 R 98 0.0 2:54.56 beam
25650 20 0 2120m 685m 33m S 2 2.1 592:22.53 java
27877 20 0 24932 5728 1876 S 1 0.0 1:00.18 dtworker
4804 20 0 828m 50m 7596 S 1 0.2 19:23.46 java
15781 20 0 191m 81m 2756 S 1 0.3 1:20.85 beam.smp
20531 20 0 22424 1928 1164 R 1 0.0 0:00.19 top
1570 20 0 87352 2204 1188 S 0 0.0 264:23.39 vmtoolsd
3207 20 0 32032 2452 540 S 0 0.0 101:07.79 avahi-daemon
# test_smp.erl
-module(test_smp).
-compile(export_all).

%% Spawn four CPU-bound Erlang processes, each running the same
%% tail-recursive accumulation loop on N.
start(N) ->
    spawn(?MODULE, tail_fac, [N]),
    spawn(?MODULE, tail_fac, [N]),
    spawn(?MODULE, tail_fac, [N]),
    spawn(?MODULE, tail_fac, [N]).

tail_fac(N) -> tail_fac(N, 1).

tail_fac(0, Acc) -> Acc;
tail_fac(N, Acc) when N > 0 -> tail_fac(N - 1, N + Acc).
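As an illustrative variant (a hypothetical module, not part of the original test): each spawned process can report which scheduler it runs on via erlang:system_info(scheduler_id), making the SMP difference visible directly in the shell. With SMP enabled the processes should report several different scheduler IDs; with SMP disabled, every process reports scheduler 1.

```erlang
%% test_smp_sched.erl -- hypothetical variant of test_smp.erl that also
%% prints the scheduler each worker process ran on.
-module(test_smp_sched).
-export([start/1, worker/1]).

start(N) ->
    [spawn(?MODULE, worker, [N]) || _ <- lists:seq(1, 4)],
    ok.

worker(N) ->
    R = tail_fac(N, 1),
    %% scheduler_id is the ID of the scheduler the calling process is on.
    io:format("pid ~p on scheduler ~p, result ~p~n",
              [self(), erlang:system_info(scheduler_id), R]).

tail_fac(0, Acc) -> Acc;
tail_fac(N, Acc) when N > 0 -> tail_fac(N - 1, N + Acc).
```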
## This blog post is only a personal record ##