Deleting Data from a Large MySQL Table by Primary Key ID
阿新 • Published: 2021-08-05
Preparing the test environment
Creating the test tables
-- Example table structure
CREATE TABLE `g_device_action_base` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `uid` char(32) DEFAULT '',
  `domain_id` char(16) DEFAULT '',
  `machine` char(200) NOT NULL DEFAULT '',
  `app_type` char(32) NOT NULL DEFAULT '',
  `app_id` char(32) NOT NULL DEFAULT '',
  `action_time` int(11) NOT NULL DEFAULT '0',
  `action_status` int(11) NOT NULL DEFAULT '0',
  `source` char(32) NOT NULL DEFAULT '',
  `url` varchar(512) NOT NULL DEFAULT '' COMMENT 'url',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

-- Example record
mysql> select * from g_device_action_base limit 1\G
           id: 24244024
          uid: 779085e3ac9
    domain_id: LhziEhqb8W
      machine: DBA
     app_type: wechat
       app_id: 3e261dcf5485fb0f1c00
  action_time: 1595222484
action_status: 1
       source: jw_app_hard
          url: https://www.cnblogs.com/zhenxing/

-- Generate data
-- Insert one seed row
set session sql_log_bin=off;
insert into g_device_action_base(uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url) values('779085e3ac9a32e8927099c2be506228','LhziEhqb8WgS','IOS','jw_app_thirdapp','3e261dcf5485fb0f1c0052f838ae6779',1595222484,1,'zhenxing','https://www.cnblogs.com/zhenxing/');

-- Run repeatedly; each run doubles the row count
insert into g_device_action_base(id,uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url) select null,uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url from g_device_action_base;

-- Continue until 1 million test rows exist
select count(*) from g_device_action_base;

-- Create the test table from the base table
create table g_device_action like g_device_action_base;
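Loading the test data
Assume g_device_action_base has been filled with 1 million test rows. To simulate deleting 50 million rows, loop 50 times and insert those 1 million rows into g_device_action on each pass. The basic loading script is as follows: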
#!/bin/bash
for ((i=1;i<=50;i++))
do
echo "load batch $i"
mysql <<EOF
set session sql_log_bin=off;
use demo
insert into g_device_action(uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url) select uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url from g_device_action_base;
select sleep(2);
EOF
done
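Creating the log table
The log table records the status, execution time, and other details of each deletion batch, so that the delete operation can be traced afterwards.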
CREATE TABLE `delete_batch_log` (
  `ID` bigint(20) PRIMARY KEY AUTO_INCREMENT,
  `BATCH_ID` bigint(20) NOT NULL COMMENT 'Batch number',
  `SCHEMA_NAME` varchar(64) NOT NULL COMMENT 'Database name',
  `TABLE_NAME` varchar(64) NOT NULL COMMENT 'Table name',
  `BATCH_COUNT` bigint(20) NOT NULL COMMENT 'Number of rows involved',
  `BEGIN_RECORD` varchar(100) DEFAULT NULL COMMENT 'Minimum ID',
  `END_RECORD` varchar(100) DEFAULT NULL COMMENT 'Maximum ID',
  `BEGIN_TIME` datetime(6) DEFAULT NULL COMMENT 'Start time',
  `END_TIME` datetime(6) DEFAULT NULL COMMENT 'End time',
  `ERROR_NO` bigint(20) DEFAULT NULL COMMENT 'Error code',
  `crc32_values` varchar(64) DEFAULT NULL COMMENT 'Checksum'
);

-- Indexes needed by the related queries
CREATE INDEX IDX_DELETE_BATCH_LOG_M1 ON delete_batch_log(BEGIN_RECORD,END_RECORD);
CREATE INDEX IDX_DELETE_BATCH_LOG_M2 ON delete_batch_log(BEGIN_TIME,END_TIME);
CREATE INDEX IDX_DELETE_BATCH_LOG_M3 ON delete_batch_log(TABLE_NAME,SCHEMA_NAME);
Running the delete
Script
batch_delete_table.sh
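The script does the following:
- Sets the concurrency of the batch delete
- Connects to MySQL and queries the table's minimum and maximum primary key ID
- From the minimum and maximum primary key ID, computes how many loop iterations are needed when each batch covers an ID range of 10,000
- Sets the session isolation level for the delete to REPEATABLE READ and the binlog format to STATEMENT, which reduces the amount of binlog written (less I/O pressure and less replay pressure on replicas)
- Writes basic information about each delete batch to the log table, including:
  - Database name
  - Table name
  - Batch number
  - Number of rows deleted in the batch
  - Starting ID of the batch
  - Ending ID of the batch
  - Start time of the batch delete
  - End time of the batch delete
  - Whether the batch delete hit an error (the error code is recorded)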
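The ID-range arithmetic described in the list can be sketched on its own, without a database. The helper names `batch_total` and `batch_bounds` and the sample values are assumptions made for illustration and are not part of batch_delete_table.sh. The sketch counts the inclusive range (max - min + 1) so that the row holding the largest ID always falls inside the last half-open batch.

```shell
#!/bin/bash
# Hypothetical helpers illustrating how ID ranges are split into batches.

# Number of batches needed to cover the inclusive ID range [min_id, max_id].
batch_total() {
    local min_id=$1 max_id=$2 batch_rows=$3
    # Integer ceiling of (max_id - min_id + 1) / batch_rows
    echo $(( (max_id - min_id + 1 + batch_rows - 1) / batch_rows ))
}

# Half-open ID range [begin, end) covered by batch number i (1-based).
batch_bounds() {
    local i=$1 min_id=$2 batch_rows=$3
    local begin=$(( (i - 1) * batch_rows + min_id ))
    local end=$(( i * batch_rows + min_id ))
    echo "$begin $end"
}

# Made-up example: IDs 1..25000 split into ranges of 10000 IDs each.
total=$(batch_total 1 25000 10000)
for ((i = 1; i <= total; i++)); do
    echo "batch $i: $(batch_bounds "$i" 1 10000)"
done
```

Because the ranges are defined over IDs rather than rows, a batch may contain fewer than 10,000 rows when the ID sequence has gaps.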
#!/bin/bash
## SET MySQL CONN INFO
MYSQL_HOST=10.186.61.162
MYSQL_USER=zhenxing
MYSQL_PASS=zhenxing
MYSQL_PORT=3306
MYSQL_DB=demo
BATCH_ROWS=10000
MYSQL_TABLE=g_device_action
PARALLEL_WORKERS=5
## Create Named pipe And File descriptor
[ -e /tmp/fd1 ] || mkfifo /tmp/fd1
exec 3<>/tmp/fd1
rm -rf /tmp/fd1
## Set the parallel
for ((i=1;i<= $PARALLEL_WORKERS;i++))
do
echo >&3
done
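MINID=$(mysql -h$MYSQL_HOST -u$MYSQL_USER -p$MYSQL_PASS -P$MYSQL_PORT -sse "select min(id) from ${MYSQL_DB}.${MYSQL_TABLE};")
## +1 so that the row holding max(id) falls inside the last batch
BATCH_TOTAL=$(mysql -h$MYSQL_HOST -u$MYSQL_USER -p$MYSQL_PASS -P$MYSQL_PORT -sse "select ceil((max(id)-min(id)+1)/${BATCH_ROWS}) from ${MYSQL_DB}.${MYSQL_TABLE};")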
## PARALLEL DELETE DATA
for ((i=1;i<=$BATCH_TOTAL;i++))
do
read -u3
{
BEGIN_RECORD=$(( (i-1)*BATCH_ROWS+MINID ))
END_RECORD=$(( i*BATCH_ROWS+MINID ))
mysql -h$MYSQL_HOST -u$MYSQL_USER -p$MYSQL_PASS -P$MYSQL_PORT << EOF
set session transaction_isolation='REPEATABLE-READ';
set session binlog_format='statement';
-- set session sql_log_bin=off;
set @BEGIN_TIME=now(6);
select count(*),CONV(bit_xor(crc32(concat_ws('',id,uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url))),10,16) into @row_count,@crc32_values from ${MYSQL_DB}.${MYSQL_TABLE} where id>=${BEGIN_RECORD} and id<${END_RECORD} and action_time<1595222485;
delete from ${MYSQL_DB}.${MYSQL_TABLE} where id>=${BEGIN_RECORD} and id<${END_RECORD} and action_time<1595222485;
-- Read diagnostics right after the DELETE, since any later non-diagnostic statement clears the diagnostics area
GET DIAGNOSTICS @p1=NUMBER,@p2=ROW_COUNT;
set @END_TIME=now(6);
insert into ${MYSQL_DB}.delete_batch_log(BATCH_ID,SCHEMA_NAME,TABLE_NAME,BATCH_COUNT,BEGIN_RECORD,END_RECORD,BEGIN_TIME,END_TIME,ERROR_NO,crc32_values) values (${i},'${MYSQL_DB}','${MYSQL_TABLE}',@row_count,${BEGIN_RECORD},${END_RECORD},@BEGIN_TIME,@END_TIME,@p1,@crc32_values);
EOF
echo >&3
} &
done
wait
exec 3<&-
exec 3>&-
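The concurrency limit in the script comes from treating a named pipe as a pool of tokens: each worker must read one line from file descriptor 3 before it starts and writes one line back when it finishes. The following is a minimal, database-free sketch of that pattern. The function name `run_limited`, the temporary FIFO path, and the `sleep` standing in for a delete batch are all assumptions made for illustration.

```shell
#!/bin/bash
# run_limited WORKERS JOBS: run JOBS background jobs, at most WORKERS at a time.
run_limited() {
    local workers=$1 jobs=$2 fifo i
    fifo=$(mktemp -u)          # hypothetical temporary path for the FIFO
    mkfifo "$fifo"
    exec 3<>"$fifo"            # open read/write so the open itself never blocks
    rm -f "$fifo"              # the descriptor stays usable after the unlink

    for ((i = 1; i <= workers; i++)); do
        echo >&3               # one token per allowed worker
    done

    for ((i = 1; i <= jobs; i++)); do
        read -r -u3            # take a token; blocks while all workers are busy
        {
            sleep 0.1          # stand-in for one delete batch
            echo "job $i done"
            echo >&3           # hand the token back
        } &
    done
    wait                       # let the remaining background jobs finish
    exec 3>&-
}

run_limited 2 6
```

The jobs finish in no guaranteed order, which is also why the real script records a batch number and timestamps for every batch in the log table.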
Wrapping up after the delete
Once the delete has finished, the following SQL summarizes what was deleted:
select SCHEMA_NAME,TABLE_NAME,
       min(BATCH_ID) as "Min batch",
       max(BATCH_ID) as "Max batch",
       sum(BATCH_COUNT) as "Rows deleted",
       min(BEGIN_TIME) as "Start time",
       max(END_TIME) as "End time",
       TIMESTAMPDIFF(SECOND,min(BEGIN_TIME),max(END_TIME)) as "Elapsed (s)"
from delete_batch_log
group by SCHEMA_NAME,TABLE_NAME;
*************************** 1. row ***************************
 SCHEMA_NAME: demo
  TABLE_NAME: g_device_action
   Min batch: 1
   Max batch: 5415
Rows deleted: 51534336
  Start time: 2020-07-16 10:56:46.347799
    End time: 2020-07-16 11:00:29.617498
 Elapsed (s): 223
1 row in set (0.01 sec)
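After a large amount of data is deleted from a big table this way, the tablespace on disk is not released, so the table has to be rebuilt to shrink it. How long the rebuild takes depends on the size of the tablespace. In this test environment the tablespace was 32 GB, 50 million rows had been deleted, and the rebuild took about 1 minute.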
alter table g_device_action engine=innodb;
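To judge whether the rebuild is worthwhile, one option is to compare allocated and free space in `information_schema.TABLES` before and after it. The helper below is a sketch, not part of the original procedure: the function name `table_space_sql` is hypothetical, it only prints the query, and the figures MySQL reports in `DATA_LENGTH`, `INDEX_LENGTH`, and `DATA_FREE` are estimates.

```shell
#!/bin/bash
# Hypothetical helper: print a query showing used vs. free space for one table.
table_space_sql() {
    local schema=$1 table=$2
    cat <<EOF
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024) AS used_mb,
       ROUND(data_free / 1024 / 1024) AS free_mb
FROM information_schema.TABLES
WHERE table_schema = '${schema}' AND table_name = '${table}';
EOF
}

table_space_sql demo g_device_action
```

The printed statement can be piped into the `mysql` client with the same connection options used by the delete script.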
Deleting 50 million rows in statement mode produced about 20 MB of binlog.
Verification command
select count(*),CONV(bit_xor(crc32(concat_ws('',id,uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url))),10,16) as crc32_values from g_device_action where id>=12520001 and id<12530001;
Collapsing a table's column names into a single row in MySQL
set global group_concat_max_len=102400;
set group_concat_max_len=102400;
SELECT @@global.group_concat_max_len;
SELECT @@group_concat_max_len;
select table_name,concat(group_concat(COLUMN_NAME order by ORDINAL_POSITION separator ',')) as all_columns
from information_schema.COLUMNS tb1
where table_schema='demo'
and table_name='g_device_action'
group by table_name;
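The comma-separated column list returned by this query can be dropped straight into the checksum statement, which avoids typing the columns by hand for each table. The sketch below only builds and prints the SQL. The function name `checksum_sql` is hypothetical, and the column list is hard-coded here where a real run would presumably capture it from the query above (for example with `mysql -sse`). It uses `CONCAT_WS('', ...)` as the delete script does, so that a NULL column does not turn the whole row's checksum into NULL.

```shell
#!/bin/bash
# Hypothetical helper: build the range checksum statement from a column list.
checksum_sql() {
    local schema=$1 table=$2 columns=$3 begin_id=$4 end_id=$5
    cat <<EOF
SELECT COUNT(*),
       CONV(BIT_XOR(CRC32(CONCAT_WS('', ${columns}))), 10, 16) AS crc32_values
FROM ${schema}.${table}
WHERE id >= ${begin_id} AND id < ${end_id};
EOF
}

# Column list as produced by the information_schema query (hard-coded for the example).
cols="id,uid,domain_id,machine,app_type,app_id,action_time,action_status,source,url"
checksum_sql demo g_device_action "$cols" 12520001 12530001
```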
Please credit the source when reposting.