如何正確的使用一條SQL刪除重復(fù)數(shù)據(jù)

數(shù)據(jù)庫中表存在重復(fù)數(shù)據(jù),需要清理重復(fù)數(shù)據(jù),清理后保留其中一條的情況是比較常見的需求,如何通過1條SQL準(zhǔn)確的刪除數(shù)據(jù)呢?
1. 創(chuàng)建表及測試數(shù)據(jù)
1.1 數(shù)據(jù)庫中創(chuàng)建一張測試表
CREATE?TABLE?`test`?(`id` INT NOT NULL AUTO_INCREMENT,`c1` VARCHAR(20) DEFAULT NULL,`c2` VARCHAR(20) DEFAULT NULL,`c3` INT DEFAULT NULL,`c4` DATETIME DEFAULT NULL,PRIMARY KEY (`id`)) ENGINE=INNODB DEFAULT CHARSET=utf8;
1.2 插入測試數(shù)據(jù)
INSERT INTO test(c1,c2,c3,c4) VALUES( 'a','b',10, '2022-05-24 18:00:46'),('a','c',20, '2022-05-24 18:00:46');INSERT INTO test(c1,c2,c3,c4) VALUES( 'a','c',10, '2022-05-24 18:00:46'),('a','b',20, '2022-05-24 18:00:46');INSERT INTO test(c1,c2,c3,c4) VALUES( 'b','c',10, '2022-05-24 18:00:46'),('d','b',20, '2022-05-24 18:00:46');INSERT INTO test(c1,c2,c3,c4) VALUES( 'b','c',20, '2022-05-24 18:00:46'),('d','b',30, '2022-05-24 18:00:46');INSERT INTO test(c1,c2,c3,c4) VALUES( 'b','c',20, '2022-05-24 18:00:46'),('a','b',40, '2022-05-24 18:00:46');INSERT INTO test(c1,c2,c3,c4) VALUES( 'd','b',40, '2022-05-24 18:00:46'),('r','f',40, '2022-05-24 18:00:46');
1.3 查看重復(fù)數(shù)據(jù)
例如c1,c2 這2個字段組合作為唯一條件,則查詢重復(fù)數(shù)據(jù)的SQL如下
SELECTc1,c2,COUNT(*)FROMtestGROUP BY c1,c2HAVING COUNT(*) > 1;
可見,結(jié)果如下:

2. 如何刪除重復(fù)數(shù)據(jù)
2.1? 方案一
很多研發(fā)同學(xué)習(xí)慣的思路如下:
先查出重復(fù)的記錄(使用in)
再查出在重復(fù)記錄但id不在每組id最大值的記錄
直接將select 改為delete進(jìn)行刪除
查詢SQL如下
SELECT * FROM testWHERE (c1,c2) IN (SELECT c1,c2FROM testGROUP BY c1,c2HAVING COUNT(*)>1 )AND id NOT IN (SELECT MAX(id)FROM testGROUP BY c1,c2HAVING COUNT(*)>1)ORDER BY c1,c2;

看上去比較符合結(jié)果了,但是改為delete執(zhí)行的時候結(jié)果如下:
--??delete SQLDELETE FROM testWHERE (c1,c2) IN (SELECT c1,c2FROM testGROUP BY c1,c2HAVING COUNT(*)>1 )AND id NOT IN (SELECT MAX(id)FROM testGROUP BY c1,c2HAVING COUNT(*)>1)
出現(xiàn)報錯信息:
錯誤代碼:1093You can't specify target table 'test' for update in FROM clause
也就是說MySQL里需刪除的目標(biāo)表在in子查詢中時,不能直接執(zhí)行刪除操作。
3. 推薦寫法
基于以上情況,使用單條SQL刪除的方式如下:
查詢SQL:
SELECT a.*FROM test a ,(SELECT c1,c2,MAX(id)id FROM test GROUP BY c1,c2 HAVING COUNT(*)>1)bWHERE a.c1=b.c1 AND a.c2=b.c2AND?a.id?<>b.id

刪除SQL
DELETE aFROM test a ,(SELECT c1,c2,MAX(id)id FROM test GROUP BY c1,c2 HAVING COUNT(*)>1)bWHERE a.c1=b.c1 AND a.c2=b.c2AND a.id <>b.id
結(jié)果:
查詢:delete a FROM test a , (select c1,c2,max(id)id from test group by c1,c2 having count(*)>1)b where a.c1=b.c1 and a.c2=b.c2 and a.... 共 7 行受到影響
刪除后數(shù)據(jù)如下:

無重復(fù)數(shù)據(jù)了。

1.??MySQL高可用之MHA集群部署
2.??mysql8.0新增用戶及加密規(guī)則修改的那些事
3.? 比hive快10倍的大數(shù)據(jù)查詢利器-- presto
4. ?監(jiān)控利器出鞘:Prometheus+Grafana監(jiān)控MySQL、Redis數(shù)據(jù)庫
5.? PostgreSQL主從復(fù)制--物理復(fù)制
6.??MySQL傳統(tǒng)點位復(fù)制在線轉(zhuǎn)為GTID模式復(fù)制



