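[PXC] Flow control parameters and status values explained
I. What is flow control (FC), and how does it work?
A node receives write sets and arranges them in global order. Transactions that have been received but not yet applied and committed are held in the receive queue.
When the receive queue reaches a certain size, flow control is triggered: the node pauses replication and works through the receive queue first.
Once the receive queue shrinks to a manageable size, replication resumes.
Flow control is common to Galera-based cluster systems.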
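The pause/resume behaviour described above can be sketched as a small toy model. This is only an illustration, not Galera's implementation: the `ReceiveQueue` class and the `high`/`low` limits (100 and 50) are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ReceiveQueue:
    """Toy model of a node's receive queue with flow-control hysteresis."""
    high: int             # hypothetical upper limit that triggers flow control
    low: int              # hypothetical lower limit at which replication resumes
    size: int = 0
    paused: bool = False

    def receive(self, n: int) -> None:
        """Enqueue n write sets; trigger flow control at the upper limit."""
        self.size += n
        if not self.paused and self.size >= self.high:
            self.paused = True

    def apply(self, n: int) -> None:
        """Apply up to n queued write sets; resume once the queue is small."""
        self.size = max(0, self.size - n)
        if self.paused and self.size <= self.low:
            self.paused = False


def simulate() -> list:
    """Fill the queue past the upper limit, then drain it in two steps."""
    q = ReceiveQueue(high=100, low=50)
    states = []
    q.receive(120)            # 120 >= 100: flow control triggers
    states.append(q.paused)
    q.apply(40)               # 80 is still above the lower limit
    states.append(q.paused)
    q.apply(40)               # 40 <= 50: replication resumes
    states.append(q.paused)
    return states
```

Calling `simulate()` walks through one trigger, one still-paused step, and one resume.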
II. What happens during flow control, and which global status values expose it?
mysql> show global status like 'wsrep_flow%';
+----------------------------------+----------------+
| Variable_name                    | Value          |
+----------------------------------+----------------+
| wsrep_flow_control_paused_ns     | 0              |
| wsrep_flow_control_paused        | 0.000000       |
| wsrep_flow_control_sent          | 0              |
| wsrep_flow_control_recv          | 0              |
| wsrep_flow_control_interval      | [ 1024, 1024 ] |
| wsrep_flow_control_interval_low  | 1024           |
| wsrep_flow_control_interval_high | 1024           |
| wsrep_flow_control_status        | OFF            |
+----------------------------------+----------------+
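wsrep_flow_control_paused_ns
Total time replication has been paused by flow control, in nanoseconds (this can happen on any node, so it is not a good monitoring item).
wsrep_flow_control_paused
The fraction of time that replication was paused by flow control since the last SHOW GLOBAL STATUS command
(recent Galera documentation measures it since the last FLUSH STATUS). It starts at 0.0 and should ideally stay close to 0.0.
When it grows large (above 0.6), take action to relieve flow control:
add a node, remove the slow node, or raise wsrep_slave_threads.
wsrep_flow_control_sent
Number of flow control events the local node has sent to the cluster. It works well as a monitoring item for identifying which node is causing flow control.
wsrep_flow_control_recv
Number of flow control events the local node has received from the cluster.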
wsrep_flow_control_interval
wsrep_flow_control_interval_low
wsrep_flow_control_interval_high
wsrep_flow_control_status
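As a rough sketch of how the counters above might be used together, the function below flags nodes whose wsrep_flow_control_paused exceeds the 0.6 threshold mentioned earlier and ranks nodes by wsrep_flow_control_sent to find the likely source of flow control. The function name, the input shape (one dict of status values per node), and the node names in the usage note are assumptions made for illustration; collecting the values from each node is left out.

```python
def flow_control_report(nodes, paused_threshold=0.6):
    """Summarise flow-control status for several nodes.

    nodes: {node_name: {status_name: value_as_string}} (hypothetical shape,
    e.g. built from SHOW GLOBAL STATUS LIKE 'wsrep_flow%' on each node).
    Returns (report, suspects), where suspects lists node names ordered by
    wsrep_flow_control_sent, highest first.
    """
    report = {}
    for name, status in nodes.items():
        paused = float(status["wsrep_flow_control_paused"])
        sent = int(status["wsrep_flow_control_sent"])
        report[name] = {
            "paused": paused,
            "sent": sent,
            # Above the threshold, the text suggests taking corrective action.
            "needs_action": paused > paused_threshold,
        }
    # The node that sent the most flow-control events is the first suspect.
    suspects = sorted(report, key=lambda n: report[n]["sent"], reverse=True)
    return report, suspects
```

For example, with hypothetical nodes `pxc1` (sent 0), `pxc2` (sent 42) and `pxc3` (sent 3), `suspects[0]` would be `pxc2`, which is the node to inspect first.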
How do you tune flow control?
1. wsrep_slave_threads
The number of threads to use for applying slave write sets.
Sets the number of threads a receiving node uses to apply write sets.
mysql> show global variables like 'wsrep_slave_threads';
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| wsrep_slave_threads | 24    |
+---------------------+-------+
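The default value of 1 is far from enough; it should be tuned based on another status value.
Our company's PXC installations set this parameter to 16 by default, which is also not enough.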
2. wsrep_cert_deps_distance
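The average distance between the highest and lowest sequence number (seqno) values that can be applied in parallel.
In other words, it indicates how many write sets can be applied in parallel.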
mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-----------+
| Variable_name            | Value     |
+--------------------------+-----------+
| wsrep_cert_deps_distance | 95.288623 |
+--------------------------+-----------+
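wsrep_slave_threads can be set according to the value of wsrep_cert_deps_distance.
Note: right after an SST this status value is very high and then falls slowly; during that period it is not a useful reference.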
mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-------------+
| Variable_name            | Value       |
+--------------------------+-------------+
| wsrep_cert_deps_distance | 6015.952973 |
+--------------------------+-------------+
1 row in set (0.00 sec)

mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-------------+
| Variable_name            | Value       |
+--------------------------+-------------+
| wsrep_cert_deps_distance | 5840.756210 |
+--------------------------+-------------+
1 row in set (0.00 sec)

mysql> show global status like 'wsrep_cert_deps_distance';
+--------------------------+-------------+
| Variable_name            | Value       |
+--------------------------+-------------+
| wsrep_cert_deps_distance | 5421.252076 |
+--------------------------+-------------+
1 row in set (0.00 sec)
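One way to turn the advice above into a check is sketched below: take a few consecutive wsrep_cert_deps_distance readings, refuse to suggest anything while the value is still drifting (as it does after an SST), and otherwise round the latest reading. The function name, the 5% `settle_tolerance`, and the `max_threads` cap are assumptions for this example, not values from PXC or Galera.

```python
def suggest_applier_threads(samples, max_threads=64, settle_tolerance=0.05):
    """Suggest a wsrep_slave_threads value from wsrep_cert_deps_distance.

    samples: consecutive readings of wsrep_cert_deps_distance, oldest first.
    Returns None when there are too few readings or the value has not
    settled yet (for example, shortly after an SST).
    """
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    peak = max(first, last)
    if peak <= 0:
        return 1  # nothing can be applied in parallel
    # Relative change between the oldest and newest reading.
    drift = abs(first - last) / peak
    if drift > settle_tolerance:
        return None  # still settling: not a useful reference
    # Round the latest reading and clamp it to [1, max_threads].
    return max(1, min(max_threads, round(last)))
```

With the three post-SST readings shown above (6015.95, 5840.76, 5421.25) the relative drift is about 10%, so the function returns `None` rather than an inflated thread count.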
Other parameters and status values
1. wsrep_local_recv_queue_%
mysql> show global status like 'wsrep_local_recv_queue_avg';
+----------------------------+----------+
| Variable_name              | Value    |
+----------------------------+----------+
| wsrep_local_recv_queue_avg | 0.110581 |
+----------------------------+----------+
1 row in set (0.00 sec)
When the node returns a value higher than 0.0 it means that the node cannot apply write-sets as fast as it receives them,
which can lead to replication throttling.
In short: a value above 0.0 means the node is lagging in applying write sets, which can lead to flow control.
mysql> show global status like 'wsrep_local_recv_queue_m%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| wsrep_local_recv_queue_max | 3788  |
| wsrep_local_recv_queue_min | 0     |
+----------------------------+-------+
2 rows in set (0.00 sec)
In addition to this status variable, you can also use wsrep_local_recv_queue_max and wsrep_local_recv_queue_min
to see the maximum and minimum sizes the node recorded for the local received queue.
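For scripting these checks, the table output of the mysql client can be parsed into a dictionary and tested against the rule above (an average receive queue above 0.0 means the node is lagging). This is a minimal sketch: both function names are hypothetical, and the parser only handles the simple two-column tables shown in this article.

```python
def parse_status_table(text):
    """Parse a two-column mysql client table into a {name: value} dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts and "N rows in set" lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            values[cells[0]] = cells[1]
    return values


def apply_lag_detected(status):
    """True when wsrep_local_recv_queue_avg is above 0.0."""
    return float(status["wsrep_local_recv_queue_avg"]) > 0.0
```

Fed the wsrep_local_recv_queue_avg output shown earlier (0.110581), `apply_lag_detected` returns `True`.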