再戰mysql 數據去重
阿新 • • 發佈:2018-11-13
去重 dna 數據去重 value rop 現在 order by drop rep
年初時,寫過一篇去重的,在小表中還能用用,在大表中真的是效率低下,現在給了一次優化
https://www.cnblogs.com/jarjune/p/8328013.html
繼上一篇文章
方法三:
DELIMITER // DROP PROCEDURE IF EXISTS delete_rows_2; CREATE PROCEDURE delete_rows_2(IN TABLENAME VARCHAR(50), IN FIELDNAMES VARCHAR(100), IN AUTOFIELD VARCHAR(50)) BEGIN DECLARE DELETE_TABLE_ROWS_SQL VARCHAR(1000); SET DELETE_TABLE_ROWS_SQL = CONCAT(' DELETE FROM ', TABLENAME ,' WHERE (', FIELDNAMES ,') IN ( SELECT ', FIELDNAMES ,' FROM ( SELECT ', FIELDNAMES ,' FROM ', TABLENAME ,' GROUP BY ', FIELDNAMES ,' HAVING COUNT(1) > 1 ) t1 ) AND ', AUTOFIELD ,' NOT IN ( SELECT ', AUTOFIELD ,' FROM ( SELECT MAX(', AUTOFIELD ,') ', AUTOFIELD ,' FROM ', TABLENAME ,' GROUP BY ', FIELDNAMES ,' HAVING COUNT(1) > 1 ) t2 ) '); SET @DELETE_TABLE_ROWS_SQL = DELETE_TABLE_ROWS_SQL; PREPARE DELETE_TABLE_ROWS_SQL_PRE FROM @DELETE_TABLE_ROWS_SQL; EXECUTE DELETE_TABLE_ROWS_SQL_PRE; END// DELIMITER ; CALL delete_rows_1('表名', '字段1,字段2,字段3...', '主鍵(唯一)字段');
之後發現刪除的效率還是挺低,又優化成
方法三(優化):
DELIMITER // DROP PROCEDURE IF EXISTS delete_rows_2; CREATE PROCEDURE delete_rows_2(IN TABLENAME VARCHAR(50), IN FIELDNAMES VARCHAR(100), IN AUTOFIELD VARCHAR(50)) BEGIN DECLARE DELETE_TABLE_ROWS_SQL VARCHAR(1000); SET DELETE_TABLE_ROWS_SQL = CONCAT(' DELETE FROM ', TABLENAME ,' WHERE ', AUTOFIELD ,' IN ( SELECT ', AUTOFIELD ,' FROM ( SELECT ', AUTOFIELD ,' FROM ', TABLENAME ,' WHERE (', FIELDNAMES ,') IN ( SELECT ', FIELDNAMES ,' FROM ', TABLENAME ,' GROUP BY ', FIELDNAMES ,' HAVING COUNT(1) > 1 ) AND ', AUTOFIELD ,' NOT IN ( SELECT MAX(', AUTOFIELD ,') FROM ', TABLENAME ,' GROUP BY ', FIELDNAMES ,' HAVING COUNT(1) > 1 ) ) t2 ) '); SET @DELETE_TABLE_ROWS_SQL = DELETE_TABLE_ROWS_SQL; PREPARE DELETE_TABLE_ROWS_SQL_PRE FROM @DELETE_TABLE_ROWS_SQL; EXECUTE DELETE_TABLE_ROWS_SQL_PRE; END// DELIMITER ; CALL delete_rows_2('表名', '字段1,字段2,字段3...', '主鍵字段');
由於上述都要group by 兩次,又換了一種思路
方法四
delete t1 FROM l_weijij_47 t1, ( SELECT f01, f02, f03, MAX(seq_value) seq_value FROM l_weijij_47 GROUP BY f01, f02, f03 HAVING COUNT(1) > 1 ORDER BY NULL ) t2 where t1.f01 = t2.f01 AND t1.f02 = t2.f02 AND t1.f03 = t2.f03 and t1.seq_value < t2.seq_value
註:group by默認會進行排序,所以要加上order by NULL就避免了排序
綜上,方法四是目前在用的去重。
再戰mysql 數據去重