MySQL 8.0.11: Quickly Generating Millions or Tens of Millions of Test Rows

Background and requirement:

Randomly generate tens of millions of rows for testing and validation.

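1. Use existing production data.

Check the row counts in the existing production environment. Any table that already holds tens of millions of rows can be used directly (note that for InnoDB tables, table_rows is an estimate rather than an exact count):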
SELECT table_schema,table_name,table_rows FROM information_schema.tables WHERE table_rows >10000000;

Then back the table up and restore it into the test environment.
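
Because table_rows is only approximate for InnoDB, it can help to narrow the search to user tables and then confirm the size of a candidate before copying it. The following is a minimal sketch; the schema and table name `mydb.orders` are hypothetical placeholders, and the exact count forces a full scan, so it may be slow on a busy server.

```sql
-- List user tables whose estimated row count exceeds 10 million,
-- skipping the built-in system schemas.
SELECT table_schema, table_name, table_rows
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema')
  AND table_rows > 10000000
ORDER BY table_rows DESC;

-- Confirm the exact size of one candidate.
-- mydb.orders is a hypothetical name; substitute a table from the list above.
SELECT COUNT(*) FROM mydb.orders;
```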

2. Use sysbench to generate tables with tens or hundreds of millions of rows:

sysbench here was installed from the RPM package:
sysbench /usr/share/sysbench/oltp_read_only.lua --mysql-host=172.16.1.81 --mysql-port=3306 --mysql-db=sbtest --mysql-user=root --mysql-password=xxxxxx --table_size=10000000 --tables=20 --threads=50 --time=240 --report-interval=20 --db-driver=mysql prepare
sysbench /usr/share/sysbench/oltp_read_only.lua --mysql-host=172.16.1.81 --mysql-port=3306 --mysql-db=sbtest --mysql-user=root --mysql-password=xxxxxx --table_size=10000000 --tables=20 --threads=50 --time=240 --report-interval=20 --db-driver=mysql run

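Note that table_size sets the number of rows per table and tables sets the number of tables to generate, so the commands above create 20 tables of 10 million rows each. When you are done with the test data, delete it yourself, for example by running the same command with cleanup in place of prepare.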

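3. Write your own SQL to generate the rows.

Create a table that holds the ten digits 0-9, then create a second table to hold the generated rows. Each cross join of the digit table multiplies the row count by 10, so the eight-way join below yields 10^8 = 100 million rows; drop the a7 join and its term to get 10 million instead: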
CREATE TABLE a (i int);
INSERT INTO a(i) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

CREATE TABLE bigtable (i BIGINT UNSIGNED);
INSERT INTO bigtable (i)
SELECT
    a.i*1
   +a1.i*10
   +a2.i*100
   +a3.i*1000
   +a4.i*10000
   +a5.i*100000
   +a6.i*1000000
   +a7.i*10000000
   AS id
FROM  a 
CROSS JOIN a AS a1
CROSS JOIN a AS a2
CROSS JOIN a AS a3
CROSS JOIN a AS a4
CROSS JOIN a AS a5
CROSS JOIN a AS a6
CROSS JOIN a AS a7;
Query OK, 100000000 rows affected (8 min 47.86 sec)
Records: 100000000  Duplicates: 0  Warnings: 0
-- Verify with a query:
mysql> SELECT MIN(b.i),MAX(b.i),COUNT(1) from bigtable b;  
+----------+----------+-----------+
| MIN(b.i) | MAX(b.i) | COUNT(1)  |
+----------+----------+-----------+
|        0 | 99999999 | 100000000 |
+----------+----------+-----------+
1 row in set (1 min 24.20 sec)
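
The table above contains only a sequential number, while the stated goal is random test data. One way to get there, sketched below, is to use the sequence as a driver and derive the other columns with RAND(). The table `bigtable_random` and its column names are hypothetical and chosen only for illustration; the sketch assumes bigtable has already been filled as shown above.

```sql
-- Hypothetical target table; names and types are illustrative only.
CREATE TABLE bigtable_random (
    id         BIGINT UNSIGNED NOT NULL PRIMARY KEY,
    user_name  VARCHAR(32)     NOT NULL,
    score      INT             NOT NULL,
    created_at DATETIME        NOT NULL
);

INSERT INTO bigtable_random (id, user_name, score, created_at)
SELECT
    b.i,
    CONCAT('user_', LPAD(b.i, 8, '0')),       -- unique name derived from the sequence
    FLOOR(RAND() * 100),                       -- random integer from 0 to 99
    NOW() - INTERVAL FLOOR(RAND() * 365) DAY   -- random day within the past year
FROM bigtable AS b
WHERE b.i < 10000000;                          -- take the first 10 million values
```

RAND() is evaluated once per row, so each row gets its own score and timestamp. Adjust the WHERE bound to control how many rows are produced.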


MariaDB 10.2 and 10.3 and MySQL 8.0.11 support the WITH clause (common table expressions), so the whole process above can be done in a single SQL statement:
WITH a AS (
SELECT 1 AS i
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
UNION ALL SELECT 6
UNION ALL SELECT 7
UNION ALL SELECT 8
UNION ALL SELECT 9
UNION ALL SELECT 0)
,b as (
SELECT
    a.i*1
   +a1.i*10
   +a2.i*100
   +a3.i*1000
   +a4.i*10000
   +a5.i*100000
   +a6.i*1000000
   +a7.i*10000000
   AS id
FROM  a 
CROSS JOIN a AS a1
CROSS JOIN a AS a2
CROSS JOIN a AS a3
CROSS JOIN a AS a4
CROSS JOIN a AS a5
CROSS JOIN a AS a6
CROSS JOIN a AS a7)
SELECT MIN(b.id),MAX(b.id),COUNT(1) FROM b;
min(b.id)	max(b.id)	count(1)
0	99999999	100000000
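
As an alternative to cross-joining a digit list, MySQL 8.0 also accepts WITH RECURSIVE, which can generate the sequence directly. The sketch below is not from the original measurements, so no timing is claimed for it. It relies on the cte_max_recursion_depth system variable, whose default of 1000 is far too low for this purpose; the value chosen here is simply a generous upper bound.

```sql
-- The default recursion limit is 1000 iterations; raise it for this session only.
SET SESSION cte_max_recursion_depth = 20000000;

-- Generate the integers 0 through 9999999 (10 million rows).
WITH RECURSIVE seq (n) AS (
    SELECT 0
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 9999999
)
SELECT MIN(n), MAX(n), COUNT(*) FROM seq;
```

The recursive form makes the row count an explicit bound in the WHERE clause instead of a consequence of how many joins are written.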
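The cross-join WITH query took 3 min 24 s; the actual time depends on the machine's performance. Because an INSERT of this size writes a large amount of data to disk, binary logging (binlog) can be temporarily disabled before the insert.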
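
A minimal sketch of turning binary logging off for the current session only, assuming the account has the privileges needed to change sql_log_bin. The short INSERT is a stand-in that reuses the digit table a from earlier; in practice the full cross-join INSERT would go in its place.

```sql
-- Stop writing this session's changes to the binary log.
-- Other sessions are unaffected. Must be run outside a transaction.
SET SESSION sql_log_bin = 0;

-- Stand-in for the bulk insert (1000 rows); replace with the full statement.
INSERT INTO bigtable (i)
SELECT a.i + a1.i * 10 + a2.i * 100
FROM a
CROSS JOIN a AS a1
CROSS JOIN a AS a2;

-- Turn binary logging back on for this session.
SET SESSION sql_log_bin = 1;
```

Rows inserted while sql_log_bin is 0 are not replicated and cannot be replayed from the binary log, so this is only appropriate for disposable test data.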