MySQL deduplication: should you use DISTINCT or GROUP BY?

阿新 • Published: 2020-05-18

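Preface

The usual claims found online about GROUP BY versus DISTINCT performance are: when no index is used, DISTINCT is faster on small data sets and GROUP BY is faster on large ones; when an index is used, GROUP BY is faster, except that DISTINCT is faster when there are few distinct groups. This article puts those claims to the test.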

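Preparation

Disabling the query cache

First check whether the query cache is configured in MySQL. It has to be switched off so that cached results do not distort the measurements.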

show variables like '%query_cache%';

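Whether the query cache is active depends on query_cache_type and query_cache_size.

Method 1: turn the query cache off in my.ini by changing query_cache_type.

Edit the configuration file C:\ProgramData\MySQL\MySQL Server 5.7\my.ini and set query_cache_type=0 or 2.
Method 2: set query_cache_size to 0 by running the following statement.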

set global query_cache_size = 0;

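Method 3: if you would rather not turn the query cache off, you can clear it with RESET QUERY CACHE instead.

The test environment here uses query_cache_type=2, which means caching on demand: queries are not cached by default, and a query is cached only when it includes sql_cache.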
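As a hedged sketch of a lighter-weight alternative to the three methods above, the cache can also be switched off for the current session only, or bypassed per statement. This assumes MySQL 5.7 as in the path above (the query cache was removed in MySQL 8.0), and it borrows the table t0 that is created in the next section.

```sql
-- Turn the query cache off for the current session only (MySQL 5.7).
SET SESSION query_cache_type = OFF;

-- Or bypass the cache for one statement with the SQL_NO_CACHE modifier.
SELECT DISTINCT SQL_NO_CACHE a FROM t0;

-- Qcache_hits counts results served from the cache; if it does not grow
-- between test runs, the timings are not being served from the cache.
SHOW STATUS LIKE 'Qcache_hits';
```

Comparing Qcache_hits before and after a test run is a quick way to confirm that whichever method was chosen actually took effect.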

Data preparation

Table t0 holds 100,000 rows: a small data set with few distinct values

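drop table if exists t0;
create table t0(
id bigint primary key auto_increment,a varchar(255) not null
) engine=InnoDB default charset=utf8mb4 collate=utf8mb4_bin;
-- "if exists" avoids an error on the first run, when the procedure does not exist yet
drop procedure if exists insert_t0_simple_category_data_sp;
delimiter //
create procedure insert_t0_simple_category_data_sp(IN num int)
begin
set @i = 0;
while @i < num do
	-- every 1000 consecutive rows share one value: 100000 rows give 100 distinct values
	insert into t0(a) values(truncate(@i/1000,0));
	set @i = @i + 1;
end while;
end
//
-- restore the default delimiter so that the call below is terminated by ";"
delimiter ;
call insert_t0_simple_category_data_sp(100000);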

Table t1 holds 10,000 rows: a small data set with many distinct values

drop table if exists t1;
create table t1 like t0;
drop procedure if exists insert_t1_complex_category_data_sp;
delimiter //
create procedure insert_t1_complex_category_data_sp(IN num int)
begin
set @i = 0;
while @i < num do
	-- every 10 consecutive rows share one value: 10000 rows give 1000 distinct values
	insert into t1(a) values(truncate(@i/10,0));
	set @i = @i + 1;
end while;
end
//
delimiter ;
call insert_t1_complex_category_data_sp(10000);

Table t2 holds 5,000,000 rows: a large data set with many distinct values

drop table if exists t2;
create table t2 like t1;
drop procedure if exists insert_t2_complex_category_data_sp;
delimiter //
create procedure insert_t2_complex_category_data_sp(IN num int)
begin
set @i = 0;
while @i < num do
	-- insert into t2 (not t1): 5000000 rows give 500000 distinct values
	insert into t2(a) values(truncate(@i/10,0));
	set @i = @i + 1;
end while;
end
//
delimiter ;
call insert_t2_complex_category_data_sp(5000000);
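Before timing anything, it may be worth confirming that each table really has the intended shape. The query below is a small sanity check rather than part of the original test; the column aliases tbl, total_rows and distinct_values are names chosen here for illustration.

```sql
-- Row count and number of distinct values of column a, per test table.
SELECT 't0' AS tbl, COUNT(*) AS total_rows, COUNT(DISTINCT a) AS distinct_values FROM t0
UNION ALL
SELECT 't1', COUNT(*), COUNT(DISTINCT a) FROM t1
UNION ALL
SELECT 't2', COUNT(*), COUNT(DISTINCT a) FROM t2;
```

If each procedure wrote only to its own table, the rows should read 100000 / 100 for t0, 10000 / 1000 for t1 and 5000000 / 500000 for t2. Any other figures suggest that a procedure inserted into the wrong table or was run more than once.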

Testing

Small data set, few distinct values

Without an index

-- record the execution time of each statement in this session
set profiling = 1;
select distinct a from t0;
show profiles;
select a from t0 group by a;
show profiles;

Result: on a small data set with few distinct values and no index, DISTINCT and GROUP BY perform almost identically.
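When two totals are this close, a per-stage breakdown can show where the time goes. The following is a minimal sketch using the same profiling facility as the test above. The query IDs 1 and 2 are assumptions and should be replaced with the Query_ID values that SHOW PROFILES actually prints.

```sql
SET profiling = 1;
SELECT DISTINCT a FROM t0;
SELECT a FROM t0 GROUP BY a;

-- List recent statements with their Query_ID and total Duration.
SHOW PROFILES;

-- Per-stage timings for one statement (IDs below are placeholders).
SHOW PROFILE FOR QUERY 1;
SHOW PROFILE FOR QUERY 2;
```

Running each statement several times and comparing the durations guards against reading too much into a single run.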

With an index

alter table t0 add index `a_t0_index`(a);

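Run the same two queries against t0 again, this time with the index in place.

Result: on a small data set with few distinct values and an index, DISTINCT and GROUP BY perform almost identically.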

Small data set, many distinct values

Without an index

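Run the same kind of queries as in the no-index test above, this time against t1.

Result: on a small data set with many distinct values and no index, DISTINCT is slightly faster than GROUP BY, but the gap is small.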

With an index

alter table t1 add index `a_t1_index`(a);

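Run the same queries against t1 again, this time with the index in place.

Result: on a small data set with many distinct values and an index, DISTINCT and GROUP BY perform almost identically.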

Large data set, many distinct values

Without an index

SELECT count(1) FROM t2;

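Run the same kind of queries as in the no-index tests above, this time against t2.

Result: on a large data set with many distinct values and no index, DISTINCT is faster than GROUP BY.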
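One plausible reason for this gap, which the test itself does not confirm, is that MySQL 5.7 sorts the result of GROUP BY by default while DISTINCT only has to remove duplicates. The sketch below is one way to check this on your own server. It assumes MySQL 5.7, since implicit GROUP BY sorting was removed in 8.0.

```sql
-- Compare the Extra column of the two plans; on 5.7 the GROUP BY plan
-- typically adds "Using filesort" to "Using temporary".
EXPLAIN SELECT DISTINCT a FROM t2;
EXPLAIN SELECT a FROM t2 GROUP BY a;

-- ORDER BY NULL tells MySQL 5.7 not to sort the grouped result,
-- which makes the comparison with DISTINCT closer to like-for-like.
SELECT a FROM t2 GROUP BY a ORDER BY NULL;
```

If the ORDER BY NULL variant runs about as fast as DISTINCT, the sort accounts for the difference rather than the grouping itself.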

With an index

alter table t2 add index `a_t2_index`(a);

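Run the same kind of queries as in the indexed tests above, this time against t2.

Result: on a large data set with many distinct values and an index, DISTINCT and GROUP BY perform almost identically.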
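The near-identical timings are consistent with both statements being answered from the index alone. As a hedged check, assuming the index a_t2_index created above is in place, the plans can be compared like this:

```sql
-- Confirm that the index exists on column a.
SHOW INDEX FROM t2;

-- Both plans would be expected to list a_t2_index in the key column,
-- with an Extra value such as "Using index" or "Using index for group-by".
EXPLAIN SELECT DISTINCT a FROM t2;
EXPLAIN SELECT a FROM t2 GROUP BY a;
```

If the two EXPLAIN outputs match, the optimizer is treating the statements the same way, which would account for the equal timings.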

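Summary

Performance comparison | Small data set, few distinct values | Small data set, many distinct values | Large data set, many distinct values
No index | About equal | DISTINCT slightly faster | DISTINCT faster
With index | About equal | About equal | About equal

For deduplication, prefer DISTINCT when the column is not indexed. When it is indexed, either DISTINCT or GROUP BY will do.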
