MONGODB05 - Using MongoDB Execution Plans to Examine How the find and aggregate Commands Execute

阿新 • Published: 2020-12-24

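Background: we use MongoDB in development, and some of our lookups across related data rely on the aggregate command. In a few paginated queries, changing the order of the aggregate stages made the query return something other than what we intended. In this post we analyze MongoDB execution plans to see how aggregate is executed and how it differs from find.

The query is as follows: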

db.classifiedOperationLog.aggregate([
	{$sort:{createDate:-1}},
	{$skip:0},
	{$limit:5},
	{$project:{_id:1,createDate:1}}
]);

1. The find command

The same query written with find looks like this:

db.classifiedOperationLog.find({},{_id:1,createDate:1})
	.sort({createDate:-1})
	.skip(0)
	.limit(5)

The query returns:

{ 
    "_id" : "6CCC129FC8BD4BA1B9F89B053E86112E", 
    "createDate" : ISODate("2020-12-24T11:25:04.675+0800")
}
{ 
    "_id" : "8B325EC1E7DA4AD390EC301EF5012BE0", 
    "createDate" : ISODate("2020-12-24T11:25:00.176+0800")
}
{ 
    "_id" : "498781F2606D4001977BDB8FAE038DF9", 
    "createDate" : ISODate("2020-12-24T11:24:54.748+0800")
}
{ 
    "_id" : "FBE41B432313469588CE5A80BAB7F7BB", 
    "createDate" : ISODate("2020-12-24T09:52:27.219+0800")
}
{ 
    "_id" : "37D2D451BC53489E945828559AE1EDC0", 
    "createDate" : ISODate("2020-12-24T08:57:12.702+0800")
}
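
One detail is worth keeping in mind for the comparison with aggregate later: with find, the order in which sort(), skip() and limit() are chained does not matter, because the server always sorts first, then skips, then limits. Below is a minimal in-memory sketch of that fixed order in plain JavaScript. The `findPage` helper and the sample data are hypothetical and only model the evaluation order, not MongoDB's query engine.

```javascript
// Hypothetical in-memory model of find().sort().skip().limit().
// It mimics only the fixed evaluation order (sort, then skip, then limit).
function findPage(docs, { sortKey, direction = 1, skip = 0, limit = Infinity }) {
  const sorted = [...docs].sort((a, b) => direction * (a[sortKey] - b[sortKey]));
  return sorted.slice(skip, skip + limit);
}

// Made-up documents in insertion (natural) order; createDate is a plain number here.
const findSample = [
  { _id: "a", createDate: 3 },
  { _id: "b", createDate: 1 },
  { _id: "c", createDate: 5 },
  { _id: "d", createDate: 2 },
  { _id: "e", createDate: 4 },
];

const findResult = findPage(findSample, {
  sortKey: "createDate",
  direction: -1,
  skip: 2,
  limit: 3,
});
console.log(findResult.map((d) => d._id)); // [ 'a', 'd', 'b' ]
```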

Now let's look at the execution plan for find by appending explain() to the query (see db.collection.explain()):

db.classifiedOperationLog.find({},{_id:1,createDate:1})
	.sort({createDate:-1})
	.skip(0)
	.limit(5)
	.explain()

// By default, explain returns two parts: queryPlanner and serverInfo.
// With the executionStats or allPlansExecution verbosity, executionStats is returned as well.
{ 
    "queryPlanner" : {
        "plannerVersion" : 1.0,                         // query planner version
        "namespace" : "xxx.classifiedOperationLog",     // namespace (database.collection) being queried
        "indexFilterSet" : false,                       // whether an index filter was applied to this query shape
        "parsedQuery" : {                               // parsed query, i.e. the filter conditions (none here)

        }, 
        "winningPlan" : {                               // plan selected by the query optimizer
            "stage" : "LIMIT",                          // limits the number of returned documents
            "limitAmount" : 5.0,                        // the limit value
            "inputStage" : {
                "stage" : "PROJECTION",                 // returns only the requested fields
                "transformBy" : {                       // projection specification
                    "_id" : 1.0, 
                    "createDate" : 1.0
                }, 
                "inputStage" : {
                    "stage" : "FETCH",                  // fetches the documents
                    "inputStage" : {
                        "stage" : "IXSCAN",             // index scan: createDate is indexed, so the sort uses the index
                        "keyPattern" : {
                            "createDate" : 1.0
                        }, 
                        "indexName" : "createDate_1",   // index name
                        "isMultiKey" : false,           // whether the index is multikey (built over an array field)
                        "multiKeyPaths" : {
                            "createDate" : [

                            ]
                        }, 
                        "isUnique" : false, 
                        "isSparse" : false, 
                        "isPartial" : false, 
                        "indexVersion" : 2.0, 
                        "direction" : "backward", 
                        "indexBounds" : {
                            "createDate" : [
                                "[MaxKey, MinKey]"
                            ]
                        }
                    }
                }
            }
        }, 
        "rejectedPlans" : [                             // rejected candidate plans (none here)

        ]
    }, 
    "serverInfo" : {                                    // server information: host, port, version, etc.
        "host" : "node-0", 
        "port" : 28000.0, 
        "version" : "3.6.8", 
        "gitVersion" : "6bc9ed599c3fa164703346a22bad17e33fa913e4"
    }, 
    "ok" : 1.0, 
    "operationTime" : Timestamp(1608789303, 1)
}

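Common stage values and their meanings:

COLLSCAN: collection scan
IXSCAN: index scan
FETCH: fetch documents
SHARD_MERGE: merge the results from the shards
SHARDING_FILTER: filter out orphaned documents on a shard
LIMIT: limit the number of returned documents
SKIP: skip documents
IDHACK: query by _id
COUNT: count operation, for example db.coll.explain().count(); a count that cannot use an index shows COUNT on top of COLLSCAN
COUNT_SCAN: returned when a count is answered from an index
SUBPLAN: returned for $or queries, where each clause is planned separately
TEXT: returned when a query uses a text index
PROJECTION: returned when the query restricts the returned fields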
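
The winningPlan is a nested tree: each stage reads from its inputStage, so the innermost stage runs first. As a reading aid, here is a small sketch in plain JavaScript that flattens a plan into execution order. The `listStages` helper is hypothetical, and the plan object is a trimmed copy of the find explain output shown earlier.

```javascript
// Collect stage names from a winningPlan, innermost (executed first) to outermost.
// Assumes each stage has at most one child under "inputStage"; plans that use
// "inputStages" (several children) are not handled by this sketch.
function listStages(plan) {
  const names = [];
  for (let node = plan; node; node = node.inputStage) {
    names.unshift(node.stage);
  }
  return names;
}

// Trimmed copy of the winningPlan from the find explain output.
const winningPlan = {
  stage: "LIMIT",
  limitAmount: 5,
  inputStage: {
    stage: "PROJECTION",
    inputStage: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "createDate_1", direction: "backward" },
    },
  },
};

console.log(listStages(winningPlan).join(" -> ")); // IXSCAN -> FETCH -> PROJECTION -> LIMIT
```

Read this way, the find plan walks the createDate index backward, fetches each document, projects the two fields, and stops after five documents.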

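2. The aggregate command

Before looking at the aggregate execution plan, here is an interesting observation. To make the effect visible, we change skip to 2 and limit to 3.

find query and result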

db.classifiedOperationLog.find({},{_id:1,createDate:1})
	.sort({createDate:-1})
	.skip(2)
	.limit(3)

{ 
    "_id" : "498781F2606D4001977BDB8FAE038DF9", 
    "createDate" : ISODate("2020-12-24T11:24:54.748+0800")
}
{ 
    "_id" : "FBE41B432313469588CE5A80BAB7F7BB", 
    "createDate" : ISODate("2020-12-24T09:52:27.219+0800")
}
{ 
    "_id" : "37D2D451BC53489E945828559AE1EDC0", 
    "createDate" : ISODate("2020-12-24T08:57:12.702+0800")
}

The aggregate equivalent

db.classifiedOperationLog.aggregate([
	{$sort:{createDate:-1}},
	{$skip:2},
	{$limit:3},
	{$project:{_id:1,createDate:1}}
]);

{ 
    "_id" : "498781F2606D4001977BDB8FAE038DF9", 
    "createDate" : ISODate("2020-12-24T11:24:54.748+0800")
}
{ 
    "_id" : "FBE41B432313469588CE5A80BAB7F7BB", 
    "createDate" : ISODate("2020-12-24T09:52:27.219+0800")
}
{ 
    "_id" : "37D2D451BC53489E945828559AE1EDC0", 
    "createDate" : ISODate("2020-12-24T08:57:12.702+0800")
}

The aggregate result matches the find result. Now let's move $sort down so that it comes after $limit:

db.classifiedOperationLog.aggregate([
	{$skip:2},
	{$limit:3},
	{$sort:{createDate:-1}},
	{$project:{_id:1,createDate:1}}
]);

{ 
    "_id" : "A3D65CCB5F7144F080B9B972A9595F04", 
    "createDate" : ISODate("2020-09-22T20:57:41.260+0800")
}
{ 
    "_id" : "7C7F4D28F2794B3CB4E361ACAF797646", 
    "createDate" : ISODate("2020-09-22T20:50:21.702+0800")
}
{ 
    "_id" : "4344982784CE454B83D73764986F65E1", 
    "createDate" : ISODate("2020-09-22T20:48:38.409+0800")
}

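We still get 3 documents, still in descending order, but the timestamps and IDs are different. The result has drifted and is no longer what we wanted. Next, let's move $skip below $limit: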

db.classifiedOperationLog.aggregate([	
	{$limit:3},
	{$skip:2},
	{$sort:{createDate:-1}},
	{$project:{_id:1,createDate:1}}
]);

{ 
    "_id" : "4344982784CE454B83D73764986F65E1", 
    "createDate" : ISODate("2020-09-22T20:48:38.409+0800")
}

Now only one document comes back, and it is not in the expected result set. Let's move $skip below $sort:

db.classifiedOperationLog.aggregate([	
	{$limit:3},
	{$sort:{createDate:-1}},
	{$skip:2},
	{$project:{_id:1,createDate:1}}
]);

{ 
    "_id" : "EA42C914214C4C18A6B788F897C5F29A"
}

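The result has changed yet again (this document has no createDate field, so the projection returns only _id). With each attempt the pattern becomes clearer. Let's look at the aggregate execution plan: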

db.classifiedOperationLog.aggregate([
	{$sort:{createDate:-1}},
	{$skip:2},
	{$limit:3},
	{$project:{_id:1,createDate:1}}
],{explain:true});                                      // note how the explain option is passed

{ 
    "stages" : [                                        // pipeline stages
        {
            "$cursor" : {                               // 1. cursor query (sort served by the index)
                "query" : {                             // query filter (none here)

                }, 
                "sort" : {
                    "createDate" : NumberInt(-1)        // sort order
                }, 
                "limit" : NumberLong(5),                // documents to fetch; notably this is skip + limit
                "fields" : {
                    "createDate" : NumberInt(1), 
                    "_id" : NumberInt(1)
                }, 
                "queryPlanner" : {                      // same as for find, not repeated here
                    "plannerVersion" : NumberInt(1), 
                    "namespace" : "xx.classifiedOperationLog", 
                    "indexFilterSet" : false, 
                    "parsedQuery" : {

                    }, 
                    "winningPlan" : {
                        "stage" : "FETCH", 
                        "inputStage" : {
                            "stage" : "IXSCAN",         // index scan, same as find
                            "keyPattern" : {
                                "createDate" : NumberInt(1)
                            }, 
                            "indexName" : "createDate_1", 
                            "isMultiKey" : false, 
                            "multiKeyPaths" : {
                                "createDate" : [

                                ]
                            }, 
                            "isUnique" : false, 
                            "isSparse" : false, 
                            "isPartial" : false, 
                            "indexVersion" : NumberInt(2), 
                            "direction" : "backward", 
                            "indexBounds" : {
                                "createDate" : [
                                    "[MaxKey, MinKey]"
                                ]
                            }
                        }
                    }, 
                    "rejectedPlans" : [

                    ]
                }
            }
        }, 
        {
            "$skip" : NumberLong(2)                     // 2. skip two documents
        }, 
        {
            "$project" : {                              // 3. project the requested fields
                "_id" : true, 
                "createDate" : true
            }
        }
    ], 
    "ok" : 1.0,
    "operationTime" : Timestamp(1608792723, 3), 
    "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0), 
        "electionId" : ObjectId("7fffffff0000000000000002")
    }, 
    "$configServerState" : {
        "opTime" : {
            "ts" : Timestamp(1608792719, 3), 
            "t" : NumberLong(1)
        }
    }, 
    "$clusterTime" : {
        "clusterTime" : Timestamp(1608792723, 3), 
        "signature" : {
            "hash" : BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="), 
            "keyId" : NumberLong(0)
        }
    }
}
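
The `limit: 5` inside the `$cursor` stage can be checked by hand. When $sort is followed by $skip: 2 and $limit: 3, only the first 2 + 3 = 5 sorted documents can ever reach the output, so the cursor needs to return no more than that, and the later $skip stage drops the first two. The sketch below, in plain JavaScript with made-up numbers standing in for createDate values, compares the two ways of computing the page. The helper name `pushedDownLimit` is hypothetical.

```javascript
// Number of documents the cursor has to return for a sort + skip + limit page.
function pushedDownLimit(skip, limit) {
  return skip + limit;
}

// Made-up createDate values standing in for real documents.
const dates = [7, 3, 9, 1, 8, 2, 6];
const sortedDesc = [...dates].sort((a, b) => b - a);

// Naive evaluation: sort everything, skip 2, then take 3.
const naivePage = sortedDesc.slice(2).slice(0, 3);

// What the explain output suggests: take skip + limit from the sorted cursor, then skip.
const cursorBatch = sortedDesc.slice(0, pushedDownLimit(2, 3));
const optimizedPage = cursorBatch.slice(2);

console.log(naivePage, optimizedPage); // both are [ 7, 6, 3 ]
```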

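aggregate is MongoDB's aggregation pipeline, and $project, $limit and $sort are all pipeline operators (stages). Documents are processed by one stage, and its output is passed as input to the next stage. The same stage may appear more than once in a pipeline.

P.S. An earlier post implemented a grouped distinct count with two $group stages, which relies on exactly this ability to repeat a stage. See: MONGODB03 - Grouped Count / Grouped Distinct Count (based on spring-data-mongodb)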

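Commonly used operators in the aggregation framework:

$project: reshapes the input documents. It can rename, add or remove fields, and can also create computed values and nested documents.
$match: filters the data and outputs only the documents that match. $match uses MongoDB's standard query operators.
$limit: limits the number of documents passed on by the pipeline.
$skip: skips the specified number of documents and passes on the rest.
$unwind: splits an array field into multiple documents, each containing one element of the array.
$group: groups the documents, typically to compute aggregated results.
$sort: sorts the input documents and outputs them in order.
$geoNear: outputs documents ordered by proximity to a geographic location.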
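
Because each stage only sees the output of the stage before it, reordering $sort, $skip and $limit changes which documents survive. The four orderings tried earlier can be replayed on a small in-memory array. This is a plain JavaScript sketch: `stageImpl`, `runPipeline` and the sample documents are hypothetical stand-ins, not MongoDB's implementation.

```javascript
// Hypothetical in-memory stand-ins for three pipeline stages.
const stageImpl = {
  $sort: (docs, spec) => {
    const [key, dir] = Object.entries(spec)[0];
    return [...docs].sort((a, b) => dir * (a[key] - b[key]));
  },
  $skip: (docs, n) => docs.slice(n),
  $limit: (docs, n) => docs.slice(0, n),
};

// Run the stages strictly in the order given, feeding each output to the next stage.
function runPipeline(docs, pipeline) {
  return pipeline.reduce((current, stage) => {
    const [name, arg] = Object.entries(stage)[0];
    return stageImpl[name](current, arg);
  }, docs);
}

// Made-up documents in natural (insertion) order; createDate is a plain number here.
const logs = [
  { _id: "a", createDate: 3 },
  { _id: "b", createDate: 1 },
  { _id: "c", createDate: 5 },
  { _id: "d", createDate: 2 },
  { _id: "e", createDate: 4 },
];

const idsOf = (pipeline) => runPipeline(logs, pipeline).map((d) => d._id);

console.log(idsOf([{ $sort: { createDate: -1 } }, { $skip: 2 }, { $limit: 3 }])); // [ 'a', 'd', 'b' ]
console.log(idsOf([{ $skip: 2 }, { $limit: 3 }, { $sort: { createDate: -1 } }])); // [ 'c', 'e', 'd' ]
console.log(idsOf([{ $limit: 3 }, { $skip: 2 }, { $sort: { createDate: -1 } }])); // [ 'c' ]
console.log(idsOf([{ $limit: 3 }, { $sort: { createDate: -1 } }, { $skip: 2 }])); // [ 'b' ]
```

Only the first ordering sorts the whole collection before paging. The other three page over the documents in natural order and sort whatever is left, which matches the drifting results seen above.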

Now that we understand how aggregate executes, let's look at the execution plan for the pipeline with the stages in the reversed order:

db.classifiedOperationLog.aggregate([
	{$skip:2},
	{$limit:3},
	{$sort:{createDate:-1}},
	{$project:{_id:1,createDate:1}}
],{explain:true});

{ 
    "stages" : [
        {
            "$cursor" : {                               // 1. the cursor applies the limit and fetches 5 documents (skip + limit)
                "query" : {

                }, 
                "limit" : NumberLong(5), 	  		
                "fields" : {
                    "createDate" : NumberInt(1), 
                    "_id" : NumberInt(1)
                }, 
                "queryPlanner" : {					
                    "plannerVersion" : NumberInt(1), 
                    "namespace" : "xxx.classifiedOperationLog", 
                    "indexFilterSet" : false, 
                    "parsedQuery" : {

                    }, 
                    "winningPlan" : {
                        "stage" : "COLLSCAN",           // collection scan; reads up to the limit number of documents
                        "direction" : "forward"
                    }, 
                    "rejectedPlans" : [

                    ]
                }
            }
        }, 
        {
            "$skip" : NumberLong(2)                     // 2. skip two documents
        }, 
        {
            "$sort" : {                                 // 3. sort what is left
                "sortKey" : {
                    "createDate" : NumberInt(-1)
                }
            }
        }, 
        {
            "$project" : {                              // 4. project the requested fields
                "_id" : true, 
                "createDate" : true
            }
        }
    ], 
    "ok" : 1.0, 
    "operationTime" : Timestamp(1608794303, 1), 
    "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0), 
        "electionId" : ObjectId("7fffffff0000000000000002")
    }, 
    "$configServerState" : {
        "opTime" : {
            "ts" : Timestamp(1608794306, 3), 
            "t" : NumberLong(1)
        }
    }, 
    "$clusterTime" : {
        "clusterTime" : Timestamp(1608794306, 4), 
        "signature" : {
            "hash" : BinData(0, "AAAAAAAAAAAAAAAAAAAAAAAAAAA="), 
            "keyId" : NumberLong(0)
        }
    }
}

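From the execution plans above we can see how find and aggregate work, so you can choose whichever command fits your needs. With aggregate, the order of the stages and the ability to repeat a stage can be used to cover some specific scenarios.

References: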
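
For the paging scenario that motivated this post, one way to avoid the ordering mistake is to build the pipeline in a single place so the stages always appear as filter, sort, skip, limit, project. The helper below is a hypothetical sketch in plain JavaScript. The function name, its parameters and the 1-based page convention are assumptions, not part of any MongoDB API.

```javascript
// Hypothetical helper: builds a paging pipeline with the stages in a safe order.
// page is 1-based; sortSpec, projection and filter use the usual aggregate syntax.
function buildPagePipeline({ page, pageSize, sortSpec, projection, filter = {} }) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return [
    { $match: filter },               // 1. narrow the input first
    { $sort: sortSpec },              // 2. fix the order before paging
    { $skip: (page - 1) * pageSize }, // 3. drop the earlier pages
    { $limit: pageSize },             // 4. keep one page
    { $project: projection },         // 5. shape the output last
  ];
}

const pagePipeline = buildPagePipeline({
  page: 2,
  pageSize: 5,
  sortSpec: { createDate: -1 },
  projection: { _id: 1, createDate: 1 },
});
console.log(JSON.stringify(pagePipeline));
```

In the mongo shell the resulting array could be passed straight to db.classifiedOperationLog.aggregate(pagePipeline). If several documents can share the same createDate, adding a unique field such as _id to the sort specification keeps the page boundaries stable.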

https://www.runoob.com/mongodb/mongodb-aggregate.html

https://blog.csdn.net/user_longling/article/details/83957085

https://docs.mongodb.com/manual/aggregation/
