MySQL 8.0.11: Quickly Generating Millions or Even Tens of Millions of Rows of Test Data
阿新 • Published: 2019-02-03
Background and requirements:
Randomly generate tens of millions of rows of data for testing and verification.
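1. Use existing production data.
Survey the existing production environment. If a table already holds more than ten million rows, it can be used directly (note that table_rows is only an estimate for InnoDB tables):
SELECT table_schema, table_name, table_rows FROM information_schema.tables WHERE table_rows > 10000000;
Then back the table up and restore it into the test environment.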
2. Use sysbench to generate single tables with tens or hundreds of millions of rows.
sysbench is installed here from its RPM package. First create the data, then run the workload:
sysbench /usr/share/sysbench/oltp_read_only.lua --mysql-host=172.16.1.81 --mysql-port=3306 --mysql-db=sbtest --mysql-user=root --mysql-password=xxxxxx --table_size=10000000 --tables=20 --threads=50 --time=240 --report-interval=20 --db-driver=mysql prepare
sysbench /usr/share/sysbench/oltp_read_only.lua --mysql-host=172.16.1.81 --mysql-port=3306 --mysql-db=sbtest --mysql-user=root --mysql-password=xxxxxx --table_size=10000000 --tables=20 --threads=50 --time=240 --report-interval=20 --db-driver=mysql run
Note that table_size sets the number of rows per table and tables sets the number of tables to generate. When you have finished with the test data, delete it manually.
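Because table_size applies per table, the options above ask for 20 tables of 10,000,000 rows each, which is 200,000,000 rows in total. The following is a minimal Python sketch for checking that arithmetic and assembling the two command lines before running them. The helper names `sysbench_argv` and `total_rows` are hypothetical and exist only for this illustration; the flags and values are the ones used above.

```python
import shlex

# Option values copied from the commands above; the password is a placeholder.
COMMON_OPTIONS = {
    "mysql-host": "172.16.1.81",
    "mysql-port": 3306,
    "mysql-db": "sbtest",
    "mysql-user": "root",
    "mysql-password": "xxxxxx",
    "table_size": 10_000_000,
    "tables": 20,
    "threads": 50,
    "time": 240,
    "report-interval": 20,
    "db-driver": "mysql",
}
SCRIPT = "/usr/share/sysbench/oltp_read_only.lua"


def sysbench_argv(command, options=COMMON_OPTIONS):
    """Hypothetical helper: build the argument list for one sysbench phase."""
    flags = [f"--{name}={value}" for name, value in options.items()]
    return ["sysbench", SCRIPT, *flags, command]


def total_rows(options=COMMON_OPTIONS):
    """Rows created by `prepare`: rows per table times number of tables."""
    return options["table_size"] * options["tables"]


if __name__ == "__main__":
    for phase in ("prepare", "run"):
        print(shlex.join(sysbench_argv(phase)))
    print(f"rows to be generated: {total_rows():,}")
```

The sketch only prints the commands; it does not invoke sysbench or connect to a database.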
3. Write your own SQL to generate tens of millions of rows.
Create a table that stores the ten digits 0-9, then create a second table to hold the large data set. Cross-joining the digit table with itself eight times treats each copy as one decimal position, so every combination maps to a unique number from 0 to 99999999:
CREATE TABLE a (i int);
INSERT INTO a(i) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);
CREATE TABLE bigtable (i bigint unsigned);
INSERT INTO bigtable(i)
SELECT a.i*1 + a1.i*10 + a2.i*100 + a3.i*1000 + a4.i*10000 + a5.i*100000 + a6.i*1000000 + a7.i*10000000 AS id
FROM a
  CROSS JOIN a AS a1
  CROSS JOIN a AS a2
  CROSS JOIN a AS a3
  CROSS JOIN a AS a4
  CROSS JOIN a AS a5
  CROSS JOIN a AS a6
  CROSS JOIN a AS a7;
Query OK, 100000000 rows affected (8 min 47.86 sec)
Records: 100000000  Duplicates: 0  Warnings: 0
-- Verify with a query:
mysql> SELECT MIN(b.i), MAX(b.i), COUNT(1) FROM bigtable b;
+----------+----------+-----------+
| MIN(b.i) | MAX(b.i) | COUNT(1)  |
+----------+----------+-----------+
|        0 | 99999999 | 100000000 |
+----------+----------+-----------+
1 row in set (1 min 24.20 sec)
MariaDB 10.2 and 10.3 and MySQL 8.0.11 support the WITH clause, so the whole process above can be written as a single SQL statement:
WITH a AS (
  SELECT 1 AS i UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5
  UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL SELECT 0
),
b AS (
  SELECT a.i*1 + a1.i*10 + a2.i*100 + a3.i*1000 + a4.i*10000 + a5.i*100000 + a6.i*1000000 + a7.i*10000000 AS id
  FROM a
    CROSS JOIN a AS a1
    CROSS JOIN a AS a2
    CROSS JOIN a AS a3
    CROSS JOIN a AS a4
    CROSS JOIN a AS a5
    CROSS JOIN a AS a6
    CROSS JOIN a AS a7
)
SELECT MIN(b.id), MAX(b.id), COUNT(1) FROM b;
min(b.id)  max(b.id)  count(1)
0          99999999   100000000
This query took 3 min 24 sec; the actual time depends on the machine's performance. Because the INSERT has to write a large amount of data to disk, binary logging can be turned off temporarily before running it (for example with SET sql_log_bin = 0; in the current session, given sufficient privileges).
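The digit cross-join idea can be checked at a small scale before committing minutes of server time. The sketch below is an illustration under stated assumptions, not the setup used above: it runs the same WITH query shape against an in-memory SQLite database from Python's standard library instead of MySQL, and it uses only 4 digit positions (10,000 rows) rather than 8. The function names `build_query` and `check` are hypothetical.

```python
import sqlite3


def build_query(positions):
    """Build the digit cross-join CTE for the given number of decimal positions."""
    # CTE "a": the ten digits 0-9, one row each.
    digits = " UNION ALL ".join(
        ["SELECT 0 AS i"] + [f"SELECT {d}" for d in range(1, 10)]
    )
    # One alias of "a" per decimal position: a0 is the ones place, a1 the tens, ...
    aliases = [f"a{p}" for p in range(positions)]
    terms = " + ".join(f"{alias}.i*{10 ** p}" for p, alias in enumerate(aliases))
    joins = " CROSS JOIN ".join(f"a AS {alias}" for alias in aliases)
    return (
        f"WITH a AS ({digits}), "
        f"b AS (SELECT {terms} AS id FROM {joins}) "
        "SELECT MIN(b.id), MAX(b.id), COUNT(1) FROM b"
    )


def check(positions=4):
    """Run the query in an in-memory SQLite database; return (min, max, count)."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(build_query(positions)).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(build_query(4))
    print(check(4))
```

With n positions the query yields 10^n rows covering 0 through 10^n - 1, so the row count can be tuned by adding or removing one CROSS JOIN and its matching term.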