1. 程式人生 > >MySQL壓測--註意事項和FAQ

MySQL壓測--註意事項和FAQ

一鍵 2.3 strong fan 同步 com 5.4 flush backtrace

上次我們講了TPCC的安裝和一些使用方法,今天先不著急真正的做MySQL壓力測試,先把我最近做壓力測試的遇到的一些問題羅列出來,這樣

後面做測試可以事半功倍。


1.註意事項


(1).提前規劃好具體要測試什麽,即測試目的,比如

  • MySQL5.6與5.7的性能差異;

  • 異步復制和半同步復制的TPS;

  • 設置雙1(innodb_flush_log_at_trx_commit=1、sync_binlog=1)性能對數據庫性能影響多少;

  • 確認即將上線的新業務對MySQL負載影響多少,是否能承載得住,是否需要對服務器進行擴容或升級配置;


(2).壓力測試比較耗時,我們不可能時刻監督每次壓測什麽時候結束,然後手動的進行下一次測試,所以要寫個腳本幫助我們一鍵測試;


(3).根據測試環境的硬件配置給予機器適當的壓力,對於有限的硬件配置,不能無限的增壓,這樣數據庫裏全是lock wait timeout等報錯,最後會使MySQL Crash的,這個可以參考大師葉老師的博客《MySQL壓力測試基準值》

http://imysql.com/2015/07/28/mysql-benchmark-reference.shtml


2.FAQ

這一塊主要講測試的時候系統和數據庫出現的問題,導致壓測無法進行


(1)tpcc數據庫在預熱rampup時候的報錯

sht-sgmhadoopcm-01:mysqladmin:/usr/local/mysql/tpcc-mysql-master/logs:>less tpcc_runlog_parameter_20180620234749_512_THREADS
RAMP-UP TIME.(120 sec.)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)
1461, 42000, Can't create more than max_prepared_stmt_count statements (current value: 16382)


解決方法:

錯誤很明顯,參數max_prepared_stmt_count(默認16382)現在的大小已經無法滿足負載需求了,這個參數是限制數據庫中已經準備的statements數量,即sql數量,我們把它增大十倍。

sht-sgmhadoopdn-02.telenav.cn:mysqladmin:/usr/local/mysql/data:>vim /ect/my.cnf

max_prepared_stmt_count = 163820

然後重啟MySQL

mysql> show variables like '%max_prepared_stmt_count%';

+-------------------------+-------+

| Variable_name | Value |

+-------------------------+-------+

| max_prepared_stmt_count = | 163820 |

+-------------------------+-------+


參數官方解析:

This variable limits the total number of prepared statements in the server. It can be used in environments where there is the potential for denial-of-service attacks based on running the server out of memory by preparing huge numbers of statements. If the value is set lower than the current number of prepared statements, existing statements are not affected and can be used, but no new statements can be prepared until the current number drops below the limit. The default value is 16,382. The permissible range of values is from 0 to 1 million. Setting the value to 0 disables prepared statements.


(2)MySQL負載過高,導致crash

數據庫日誌也出現大量的日誌信息:

sht-sgmhadoopdn-02.telenav.cn:mysqladmin:/usr/local/mysql/data:>vim sht-sgmhadoopdn-01.err
2018-06-20T17:04:42.935422Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5690ms. The settings might not be optimal. (flushed=1119 and evicted=0, during the time.)
2018-06-20T17:05:45.061790Z 0 [ERROR] Error in accept: Too many open files
2018-06-20T17:05:47.846070Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5057ms. The settings might not be optimal. (flushed=912 and evicted=0, during the time.)
2018-06-20T17:06:09.564302Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4569ms. The settings might not be optimal. (flushed=913 and evicted=0, during the time.)
2018-06-20T17:06:35.456128Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5822ms. The settings might not be optimal. (flushed=959 and evicted=0, during the time.)
2018-06-20T17:06:47.324711Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4312ms. The settings might not be optimal. (flushed=673 and evicted=0, during the time.)
2018-06-20T17:06:55.651725Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 5018ms. The settings might not be optimal. (flushed=592 and evicted=0, during the time.)
2018-06-20T17:07:09.104978Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4983ms. The settings might not be optimal. (flushed=936 and evicted=0, during the time.)
2018-06-20T17:07:21.873005Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4723ms. The settings might not be optimal. (flushed=677 and evicted=0, during the time.)
2018-06-20T17:07:40.903065Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 4432ms. The settings might not be optimal. (flushed=431 and evicted=0, during the time.)
2018-06-20T17:08:13.283422Z 1085 [ERROR] Can't generate a unique log-filename /usr/local/mysql/arch/mysql-bin.(1-999)
2018-06-20T17:08:13.283501Z 1085 [ERROR] mysqld: Binary logging not possible. Message: Either disk is full or file system is read only while rotating the binlog. Aborting the server.
17:08:13 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=268435456
read_buffer_size=2097152
max_used_connections=981
max_threads=214
thread_count=981
connection_count=981
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1272281 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f55ddfa9710
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f55bac30ea8 thread_stack 0x40000
mysqld(my_print_stacktrace+0x35)[0xf4a495]
mysqld(handle_fatal_signal+0x4a4)[0x7ce2f4]
/lib64/libpthread.so.0(+0xf5e0)[0x7f5774a0b5e0]
/lib64/libc.so.6(gsignal+0x37)[0x7f57733fd1f7]
/lib64/libc.so.6(abort+0x148)[0x7f57733fe8e8]
mysqld[0xee249a]
mysqld(_ZN13MYSQL_BIN_LOG13new_file_implEbP28Format_description_log_event+0x50c)[0xeeec4c]
mysqld(_ZN13MYSQL_BIN_LOG6rotateEbPb+0x3e)[0xeeed0e]
mysqld(_ZN13MYSQL_BIN_LOG14ordered_commitEP3THDbb+0x497)[0xef1137]
mysqld(_ZN13MYSQL_BIN_LOG6commitEP3THDb+0x585)[0xef1825]
mysqld(_Z15ha_commit_transP3THDbb+0x174)[0x81f594]
mysqld(_Z12trans_commitP3THD+0x49)[0xdd1589]
mysqld(_Z21mysql_execute_commandP3THDb+0x2be0)[0xd186d0]
mysqld(_Z11mysql_parseP3THDP12Parser_state+0x40d)[0xd1af7d]
mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x119a)[0xd1c19a]
mysqld(_Z10do_commandP3THD+0x194)[0xd1d044]
mysqld(handle_connection+0x29c)[0xded7ac]
mysqld(pfs_spawn_thread+0x174)[0xf707b4]
/lib64/libpthread.so.0(+0x7e25)[0x7f5774a03e25]
/lib64/libc.so.6(clone+0x6d)[0x7f57734c034d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f55de4918b0): is an invalid pointer
Connection ID (thread ID): 1085
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.


解決方法:

建議:不要給與數據庫太大的壓力,控制好thread並發數


(3)linux某一用戶達到max user processes後,導致暫時無法使用該用戶mysqladmin登錄和無法使用任何bash命令

-bash-4.2$ sudo su mysqladmin

su: failed to execute /bin/bash: Resource temporarily unavailable


sht-sgmhadoopcm-01:mysqladmin:/usr/local/mysql/tpcc-mysql-master/logs:>ls

bash: fork: retry: No child processes

bash: fork: retry: No child processes

bash: fork: retry: No child processes

bash: fork: retry: No child processes

bash: fork: Resource temporarily unavailable



解決方法:

查看系統的一些限制設置,發現max user processes 只有4096,進程被用完,所以無論寫入什麽命令,系統都無法及時作出響應

[root@sht-sgmhadoopcm-01 ~]# ulimit -a

core file size (blocks, -c) 0

data seg size (kbytes, -d) unlimited

scheduling priority (-e) 0

file size (blocks, -f) unlimited

pending signals (-i) 31207

max locked memory (kbytes, -l) 64

max memory size (kbytes, -m) unlimited

open files (-n) 65536

pipe size (512 bytes, -p) 8

POSIX message queues (bytes, -q) 819200

real-time priority (-r) 0

stack size (kbytes, -s) 8192

cpu time (seconds, -t) unlimited

max user processes (-u) 4096

virtual memory (kbytes, -v) unlimited

file locks (-x) unlimited



[root@sht-sgmhadoopcm-01 ~]# vim /etc/security/limits.conf

* soft nproc 65536

* hard nproc 65536

* soft nofile 65536

* hard nofile 65536



還需要修改一個默認的配置文件才能真正的生效:

[root@sht-sgmhadoopcm-01 ~]# vim /etc/security/limits.d/20-nproc.conf

# Default limit for number of user's processes to prevent

# accidental fork bombs.

# See rhbz #432903 for reasoning.

* soft nproc 65536 #把這裏的4096改大為65536或者為unlimited

root soft nproc unlimited


查看已經生效:

sht-sgmhadoopcm-01:mysqladmin:/usr/local/mysql/tpcc-mysql-master/logs:>ulimit -u

65536


(4)並發太大,導致大量的lock

當線程數達到1536及以上的時候,日誌中出現大量的lock信息:

$less tpcc_runlog_parameter_20180621170023_1536_THREADS
payment 1466:1
payment 247:1
1205, HY000, Lock wait timeout exceeded; try restarting transaction
1205, HY000, Lock wait timeout exceeded; try restarting transaction
payment 954:1
1205, HY000, Lock wait timeout exceeded; try restarting transaction
payment 485:1
1205, HY000, Lock wait timeout exceeded; try restarting transaction
payment 266:1
1205, HY000, Lock wait timeout exceeded; try restarting transaction
......
.......
2013, HY000, Lost connection to MySQL server at 'reading initial communication packet', system error: 104
2013, HY000, Lost connection to MySQL server at 'reading initial communication packet', system error: 104
2013, HY000, Lost connection to MySQL server at 'reading initial communication packet', system error: 104
2013, HY000, Lost connection to MySQL server at 'reading initial communication packet', system error: 104
2013, HY000, Lost connection to MySQL server at 'reading initial communication packet', system error: 104


解決方法:

此時查看MySQL中的一些Lock狀態信息,可以看出來全部都是一些x鎖,說明現在並發過大,導致數據庫處理不過來,產生大量排它鎖。

mysql> select * from INFORMATION_SCHEMA.INNODB_LOCKS;
+------------------+-------------+-----------+-----------+----------------------+------------+------------+-----------+----------+-----------+
| lock_id          | lock_trx_id | lock_mode | lock_type | lock_table           | lock_index | lock_space | lock_page | lock_rec | lock_data |
+------------------+-------------+-----------+-----------+----------------------+------------+------------+-----------+----------+-----------+
| 15043927:63:3:8  | 15043927    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043750:63:3:8  | 15043750    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043742:63:3:8  | 15043742    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043700:63:3:8  | 15043700    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043656:63:3:8  | 15043656    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043646:63:3:8  | 15043646    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043607:63:3:8  | 15043607    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043399:63:3:8  | 15043399    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
| 15043388:63:3:8  | 15043388    | X         | RECORD    | `tpccdb`.`warehouse` | PRIMARY    |         63 |         3 |        8 | 7         |
......
| 15043727:64:3:56 | 15043727    | X         | RECORD    | `tpccdb`.`district`  | PRIMARY    |         64 |         3 |       56 | 6, 5      |
| 15035687:64:3:62 | 15035687    | X         | RECORD    | `tpccdb`.`district`  | PRIMARY    |         64 |         3 |       62 | 7, 1      |
| 15043855:64:3:62 | 15043855    | X         | RECORD    | `tpccdb`.`district`  | PRIMARY    |         64 |         3 |       62 | 7, 1      |
| 15032784:64:3:56 | 15032784    | X         | RECORD    | `tpccdb`.`district`  | PRIMARY    |         64 |         3 |       56 | 6, 5      |
+------------------+-------------+-----------+-----------+----------------------+------------+------------+-----------+----------+-----------+
1756 rows in set, 1 warning (3.87 sec)

建議:不要給與數據庫太大的壓力,控制好thread並發數


(5)由於磁盤空間不足,導致MySQL啟動失敗

在啟動MySQL的時候err日誌報的錯誤:

sht-sgmhadoopdn-01.telenav.cn:mysqladmin:/usr/local/mysql/data:>tail -f hostname.err 
2018-06-23T08:21:35.417981Z 0 [Note] --secure-file-priv is set to NULL. Operations related to importing and exporting data are disabled
2018-06-23T08:21:35.418138Z 0 [Note] mysqld (mysqld 5.7.21-log) starting as process 16512 ...
2018-06-23T08:21:35.438102Z 0 [Note] InnoDB: PUNCH HOLE support available
2018-06-23T08:21:35.438190Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-06-23T08:21:35.438202Z 0 [Note] InnoDB: Uses event mutexes
2018-06-23T08:21:35.438242Z 0 [Note] InnoDB: GCC builtin __sync_synchronize() is used for memory barrier
2018-06-23T08:21:35.438340Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.3
2018-06-23T08:21:35.438390Z 0 [Note] InnoDB: Using Linux native AIO
2018-06-23T08:21:35.449551Z 0 [Note] InnoDB: Number of pools: 1
2018-06-23T08:21:35.449862Z 0 [Note] InnoDB: Using CPU crc32 instructions
2018-06-23T08:21:35.456460Z 0 [Note] InnoDB: Initializing buffer pool, total size = 5G, instances = 8, chunk size = 128M
2018-06-23T08:21:36.274392Z 0 [Note] InnoDB: Completed initialization of buffer pool
2018-06-23T08:21:36.426894Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2018-06-23T08:21:36.441471Z 0 [Note] InnoDB: Opened 3 undo tablespaces
2018-06-23T08:21:36.441504Z 0 [Note] InnoDB: 3 undo tablespaces made active
2018-06-23T08:21:36.441913Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2018-06-23T08:21:36.629202Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2018-06-23T08:21:36.629339Z 0 [Note] InnoDB: Setting file '/usr/local/mysql/data/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2018-06-23T08:21:36.646819Z 0 [Warning] InnoDB: 1048576 bytes should have been written. Only 794624 bytes written. Retrying for the remaining bytes.
2018-06-23T08:21:36.790869Z 0 [Warning] InnoDB: 1048576 bytes should have been written. Only 921600 bytes written. Retrying for the remaining bytes.
2018-06-23T08:21:38.380026Z 0 [Warning] InnoDB: 1048576 bytes should have been written. Only 925696 bytes written. Retrying for the remaining bytes.
2018-06-23T08:21:38.390611Z 0 [Warning] InnoDB: 1048576 bytes should have been written. Only 929792 bytes written. Retrying for the remaining bytes.
2018-06-23T08:21:38.399767Z 0 [Warning] InnoDB: 1048576 bytes should have been written. Only 933888 bytes written. Retrying for the remaining bytes.
2018-06-23T08:21:38.412285Z 0 [Warning] InnoDB: 1048576 bytes should have been written. Only 937984 bytes written. Retrying for the remaining bytes.
2018-06-23T08:21:38.419670Z 0 [Warning] InnoDB: 1048576 bytes should have been written. Only 942080 bytes written. Retrying for the remaining bytes.
2018-06-23T08:21:38.436278Z 0 [Warning] InnoDB: Retry attempts for writing partial data failed.
2018-06-23T08:21:38.436345Z 0 [ERROR] InnoDB: Write to file /usr/local/mysql/data/ibtmp1failed at offset 11534336, 1048576 bytes should have been written, only 942080 were written. Operating system error number 28. Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.
2018-06-23T08:21:38.436408Z 0 [ERROR] InnoDB: Error number 28 means 'No space left on device'
2018-06-23T08:21:38.436426Z 0 [Note] InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
2018-06-23T08:21:38.436442Z 0 [ERROR] InnoDB: Could not set the file size of '/usr/local/mysql/data/ibtmp1'. Probably out of disk space
2018-06-23T08:21:38.436457Z 0 [ERROR] InnoDB: Unable to create the shared innodb_temporary
2018-06-23T08:21:38.436467Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2018-06-23T08:21:39.040518Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2018-06-23T08:21:39.040570Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2018-06-23T08:21:39.040581Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2018-06-23T08:21:39.040597Z 0 [ERROR] Failed to initialize builtin plugins.
2018-06-23T08:21:39.040673Z 0 [ERROR] Aborting
2018-06-23T08:21:39.040699Z 0 [Note] Binlog end
2018-06-23T08:21:39.041472Z 0 [Note] mysqld: Shutdown complete


解決方法:

sht-sgmhadoopdn-01.telenav.cn:mysqladmin:/usr/local/mysql/data:>df -Th

Filesystem Type Size Used Avail Use% Mounted on

/dev/mapper/centos-root xfs 77G 77G 12M 100% /

devtmpfs devtmpfs 2.9G 0 2.9G 0% /dev

tmpfs tmpfs 2.9G 0 2.9G 0% /dev/shm

tmpfs tmpfs 2.9G 65M 2.8G 3% /run

tmpfs tmpfs 2.9G 0 2.9G 0% /sys/fs/cgroup

/dev/sda1 xfs 497M 132M 366M 27% /boot


建議:刪除無用的文件釋放disk space,或者mount一個更大的磁盤



參考鏈接:

MySQL壓測工具--TPCC安裝,測試http://blog.51cto.com/darrenmemos/2130202

http://imysql.com/2015/07/28/mysql-benchmark-reference.shtml


MySQL壓測--註意事項和FAQ