mongodb 聚合詳解

阿新 • • 發佈：2019-01-22

聚合（aggregate）框架提供一種方法來計算彙總值，雖然對映化簡是強大的，但它往往比簡單的彙總任務更困難，如欄位值總和或平均值。MongoDB的聚合框架實現sum()、avg()、group by等聚合操作。通過聚合框架，還可對返回的結果進行處理，實現一些特殊需求，例如資料過濾、別名顯示、增加欄位、提取子欄位等。

1、聚合框架的基礎

管道和表示式

1.1管道：

集合管道從一個管道運算子流運文件到下一個的方式來處理文件。管道與unix管道類似，實質就是把掃描的資料輸入聚合程序，進行一些過濾、分組、求和等操作，這些操作是通過管道操作符完成的。在管道中，管道運算子可以重複。所有的管道操作符都是處理文件流的，如果操作符掃描 collection 管道將起作用，並且將所有匹配的文件插入管道的”頂部”，管道里操作符轉換通過管道的每個文件。管道操作符不需要為每一個輸入文件生成一個輸出文件:操作符也可能會產生新的文件或過濾文件。管道不能運算以下型別的值: Binary, Symbol, MinKey, MaxKey, DBRef, Code, 以及 CodeWScope。

$project：用於設定資料的篩選欄位，就像我們SQL中select需要的欄位一樣。例子：

	db.runCommand({ aggregate : "article", pipeline : [
 	   { $match : { author : "dave" } },
	   { $project : {
	        _id : 0,
		author : 1,
	        tags : 1
	    }}
	]});

上面就是將所有author為dave的記錄的author和tags兩個欄位取出來。（_id:0 表示去掉預設會返回的_id欄位）。利用find（）功能實現：

	> db.article.find({ author : "dave" }, { _id : 0, author : 1, tags : 1);

$match：$match的作用是過濾資料，通過設定一個條件，將資料進行篩選過濾。例子：

	db.runCommand({ aggregate : "article", pipeline : [
	    { $match : { author : "dave" } }
	]});

這相當於將article這個collection中的記錄進行篩選，篩選條件是author屬性值為dave，其作用其實相當於普通的find命令，如：

	> db.article.find({ author : "dave" });

與find不同，find的結果是直接作為最終資料返回，而$match只是pipeline中的一環，它篩選的結果資料可以再進行下一級的統計操作。

$limit：限制了的文件數量看一下由從當前位置開始的給定數。例子：

	>db.mycol.find({},{"title":1,_id:0}).limit(2)
	{"title":"MongoDB Overview"}
	{"title":"NoSQL Overview"}
	>

$skip：在聚合管道中跳過指定數量的文件，並返回餘下的文件。例子：

	>db.mycol.find({},{"title":1,_id:0}).limit(1).skip(1)
	{"title":"NoSQL Overview"}
	>

$unwind：將文件中的某一個數組型別欄位拆分成多條，每條包含陣列中的一個值。例子：

比如你使用下面命令新增一條記錄：

	db.article.save( {
	    title : "this is your title" ,
	    author : "dave" ,
	    posted : new Date(4121381470000) ,
	    pageViews : 7 ,
	    tags : [ "fun" , "nasty" ] ,
	    comments : [
	        { author :"barbara" , text : "this is interesting" } ,
	        { author :"jenny" , text : "i like to play pinball", votes: 10 }
	    ],
	    other : { bar : 14 }
	});

這裡面tags欄位就是一個array。下面我們在這個欄位上應用$unwind操作

	db.runCommand({ aggregate : "article", pipeline : [
	    { $unwind : "$tags" }
	]});

$group：按某一個key將key值相同的多條資料組織成一條。例子：

	db.runCommand({ aggregate : "article", pipeline : [
	    { $unwind : "$tags" },
	    { $group : {
		_id : "$tags",
	        count : { $sum : 1 },
		authors : { $addToSet : "$author" }
	    }}
	]});

$sort：將輸入文件排序後輸出

1.2表示式：

表示式基於對輸入檔案進行計算而生成輸出檔案. 聚合框架定義表示式中使用的一種檔案格式，使用字首。表示式是無狀態的，並只在被集合程序看到時才計算。管道里的表示式只能操作當前文件，無法整合其他文件中的資料。

表示式：$sum、$avg、$min、$max、$push、$addToSet、$first、$last
$sum：總結從集合中的所有檔案所定義的值。例子：

	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}])

$avg：從所有文件集合中所有給定值計算的平均。

	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}])

$min：獲取集合中的所有檔案中的相應值最小。

	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}])

$max：獲取集合中的所有檔案中的相應值的最大。

	db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}])

$push：值插入到一個數組生成文件中。

	db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push: "$url"}}}])

$addToSet：值插入到一個數組中所得到的文件，但不會建立重複。

	db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}])

$first：根據分組從源文件中獲取的第一個文件。通常情況下，這才有意義，連同以前的一些應用 “$sort”-stage。

	db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}])

$last：根據分組從源文件中獲取最後的文件。通常，這才有意義，連同以前的一些應用 “$sort”-stage。

	db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}])

2、聚合使用

在 mongo 殼或者資料庫命令 aggregate 裡使用 aggregate() 包裝器呼叫聚合`。通常在集合物件裡呼叫 :method:`~db.collection.aggregate() 來決定集合管道輸入文件。:method:~db.collection.aggregate() 方法的引數指定一系列管道運算元,其中每個操作符可以有許多運算元。

資料：

{
   _id: ObjectId(7df78ad8902c)
   title: 'MongoDB Overview', 
   description: 'MongoDB is no sql database',
   by_user: 'w3cschool.cc',
   url: 'http://www.w3cschool.cc',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId(7df78ad8902d)
   title: 'NoSQL Overview', 
   description: 'No sql database is very fast',
   by_user: 'w3cschool.cc',
   url: 'http://www.w3cschool.cc',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId(7df78ad8902e)
   title: 'Neo4j Overview', 
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
},

使用及結果：

> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      {
         "_id" : "w3cschool.cc",
         "num_tutorial" : 2
      },
      {
         "_id" : "Neo4j",
         "num_tutorial" : 1
      }
   ],
   "ok" : 1
}
>

$push 和$addToSet 區別

$addToSet同樣可以向陣列中新增元素，與$push不同的是$addToSet會保證元素的唯一性，利用$unwind可以在聚合中實現distinct的功能，這裡是stackoverflow 上的一篇介紹，點選。

mongodb 聚合詳解

1、聚合框架的基礎

1.1管道：

1.2表示式：

2、聚合使用

mongodb 聚合詳解

MongoDB oplog詳解

MongoDb 查詢詳解

MongoDB 操作符詳解

mongoDB使用詳解（在node中使用）

MongoDB引數詳解之enableLocalhostAuthBypass

Etherchannel鏈路聚合詳解

MongoDB資料庫詳解

【.NET】繼承，組合，聚合詳解

Mongodb連線詳解

MongoDB 聚合管道查詢詳解 aggregate

MongoDB增刪改查操作詳解

MongoDB執行計劃分析詳解（1）

每篇半小時1天入門MongoDB——3.MongoDB可視化及shell詳解

UML類圖詳解_聚合關系

mongodb 用戶權限設置詳解

mongodb.conf配置文件詳解

mongodb 數據庫詳解

MongoDB 數據庫詳解，以及 MongoDB4.0版本的安裝

mongodb replica set 配置高性能多服務器詳解

mongodb 聚合詳解

1、聚合框架的基礎

1.1管道：

1.2表示式：

2、聚合使用

相關推薦