MongoDB sharding分片
背景
當MongoDB儲存海量的資料時,一臺機器可能不足以儲存資料,也可能不足以提供可接受的讀寫吞吐量。這時,我們就可以通過在多臺機器上分割資料,使得資料庫系統能儲存和處理更多的資料。
1、MongoDB sharding簡介
三種角色:
配置伺服器(config):是一個獨立的mongod程序,儲存叢集和分片的元資料,即各分片包含了哪些資料的資訊。
路由伺服器(mongos):起到一個路由的功能,供程式連線。本身不儲存資料,在啟動時從配置伺服器載入叢集資訊.
分片伺服器(sharding):是一個獨立mongod程序,儲存資料資訊。可以是一臺伺服器,如果想要高可用也可以配置成副本集。
2、實驗環境
兩臺機器的IP:
172.16.101.54 sht-sgmhadoopcm-01
172.16.101.55 sht-sgmhadoopnn-01
config server:
172.16.101.55:27017
mongos:
172.16.101.55:27018
sharding:
172.16.101.54:27017
172.16.101.54:27018
172.16.101.54:27019
2、啟動config服務
修改配置伺服器的配置檔案,主要是引數clusterRole指定角色為configsvr
[[email protected]mongodb]# cat /etc/mongod27017.conf systemLog: destination: file path: "/usr/local/mongodb/log/mongod27017.log" logAppend: true storage: dbPath: /usr/local/mongodb/data/db27017 journal: enabled: true processManagement: fork: true pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pid net: port: 27017 bindIp: 0.0.0.0 setParameter: enableLocalhostAuthBypass: false sharding: clusterRole: configsvr archiveMovedChunks: true
[[email protected] mongodb]# bin/mongod --config /etc/mongod27017.conf
warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default
about to fork child process, waiting until server is ready for connections.
forked process: 31033
child process started successfully, parent exiting
3、啟動mongos服務
修改路由伺服器的配置檔案,主要是引數configDB指定config伺服器的IP和port,不需要配置有關資料檔案的資訊,因為路由伺服器不儲存資料
[[email protected] mongodb]# cat /etc/mongod27018.conf systemLog: destination: file path: "/usr/local/mongodb/log/mongod27018.log" logAppend: true processManagement: fork: true pidFilePath: /usr/local/mongodb/data/db27018/mongod27018.pid net: port: 27018 bindIp: 0.0.0.0 setParameter: enableLocalhostAuthBypass: false sharding: autoSplit: true configDB: 172.16.101.55:27017 chunkSize: 64
[[email protected] mongodb]# bin/mongos --config /etc/mongod27018.conf
warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default
2018-11-10T18:57:13.705+0800 W SHARDING running with 1 config server should be done only for testing purposes and is not recommended for production
about to fork child process, waiting until server is ready for connections.
forked process: 31167
child process started successfully, parent exiting
4、啟動sharding服務
就是一個普通的mongodb程序,普通的配置檔案
[[email protected] mongodb]# cat /etc/mongod27017.conf systemLog: destination: file path: "/usr/local/mongodb/log/mongod27017.log" logAppend: true storage: dbPath: /usr/local/mongodb/data/db27017 journal: enabled: true processManagement: fork: true pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pid net: port: 27017 bindIp: 0.0.0.0 setParameter: enableLocalhostAuthBypass: false
[[email protected] mongodb]# bin/mongod --config /etc/mongod27017.conf
[[email protected] mongodb]# bin/mongod --config /etc/mongod27018.conf
[[email protected] mongodb]# bin/mongod --config /etc/mongod27019.conf
5、登陸mongos服務並新增sharding資訊
[[email protected] mongodb]# bin/mongo --port=27018
mongos> sh.addShard("172.16.101.54:27017")
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard("172.16.101.54:27018")
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.addShard("172.16.101.54:27019")
{ "shardAdded" : "shard0002", "ok" : 1 }
檢視叢集分片資訊
mongos> sh.status() --- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4") } shards: { "_id" : "shard0000", "host" : "172.16.101.54:27017" } { "_id" : "shard0001", "host" : "172.16.101.54:27018" } { "_id" : "shard0002", "host" : "172.16.101.54:27019" } balancer: Currently enabled: yes Currently running: no Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: No recent migrations databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" }
mongos> db.runCommand({listshards:1}) { "shards" : [ { "_id" : "shard0000", "host" : "172.16.101.54:27017" }, { "_id" : "shard0001", "host" : "172.16.101.54:27018" }, { "_id" : "shard0002", "host" : "172.16.101.54:27019" } ], "ok" : 1 }
6、開啟分片
需要執行分片的庫和集合,以及分片模式,分片模式分為兩種hash和range
[[email protected] mongodb]# bin/mongo --port=27018
分片庫是testdb
mongos> sh.enableSharding("testdb")
{ "ok" : 1 }
(1)hash分片模式測試
分片的集合是collection1,根據id進行hash分片
mongos> sh.shardCollection("testdb.collection1",{"_id":"hashed"})
{ "collectionsharded" : "testdb.collection1", "ok" : 1 }
共插入10個測試document
mongos> use testdb
switched to db testdb
mongos> for(var i=0;i<10;i++){db.collection1.insert({name:"jack"+i});}
WriteResult({ "nInserted" : 1 })
mongos> sh.status() --- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4") } shards: { "_id" : "shard0000", "host" : "172.16.101.54:27017" } { "_id" : "shard0001", "host" : "172.16.101.54:27018" } { "_id" : "shard0002", "host" : "172.16.101.54:27019" } balancer: Currently enabled: yes Currently running: no Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: 2 : Success databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" } { "_id" : "test", "partitioned" : false, "primary" : "shard0000" } { "_id" : "testdb", "partitioned" : true, "primary" : "shard0000" } testdb.collection1 shard key: { "_id" : "hashed" } chunks: shard0000 2 shard0001 2 shard0002 2 { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2) { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3) { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4) { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5) { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6) { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7)
檢視每個sharding上的資料分佈情況db.collection.stats()
mongos> db.collection1.stats() { "sharded" : true, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "ns" : "testdb.collection1", "count" : 10, #共十個document資料 "numExtents" : 3, "size" : 480, "storageSize" : 24576, "totalIndexSize" : 49056, "indexSizes" : { "_id_" : 24528, "_id_hashed" : 24528 }, "avgObjSize" : 48, "nindexes" : 2, "nchunks" : 6, "shards" : { "shard0000" : { "ns" : "testdb.collection1", "count" : 0, "size" : 0, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "_id_hashed" : 8176 }, "ok" : 1 }, "shard0001" : { "ns" : "testdb.collection1", "count" : 6, "size" : 288, "avgObjSize" : 48, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "_id_hashed" : 8176 }, "ok" : 1 }, "shard0002" : { "ns" : "testdb.collection1", "count" : 4, "size" : 192, "avgObjSize" : 48, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "_id_hashed" : 8176 }, "ok" : 1 } }, "ok" : 1 }
分別登陸sharding節點檢視資料分佈,和通過命令db.collection.stats()看到的結果一致
可以發現節點27017上沒有資料,節點27018上有6個document,節點27019上有4個document,出現這種情況的原因可能是插入的資料量太小,沒有分佈均勻,資料量越大,分佈越均勻。
[[email protected] mongodb]# bin/mongo 172.16.101.54:27017/testdb
> db.collection1.find()
[[email protected] mongodb]# bin/mongo 172.16.101.54:27018/testdb
> db.collection1.find()
{ "_id" : ObjectId("5be6c467e6467cc8077da816"), "name" : "jack1" }
{ "_id" : ObjectId("5be6c467e6467cc8077da817"), "name" : "jack2" }
{ "_id" : ObjectId("5be6c467e6467cc8077da818"), "name" : "jack3" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81a"), "name" : "jack5" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81c"), "name" : "jack7" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81e"), "name" : "jack9" }
[[email protected] mongodb]# bin/mongo 172.16.101.54:27019/testdb
> db.collection1.find()
{ "_id" : ObjectId("5be6c467e6467cc8077da815"), "name" : "jack0" }
{ "_id" : ObjectId("5be6c467e6467cc8077da819"), "name" : "jack4" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81b"), "name" : "jack6" }
{ "_id" : ObjectId("5be6c467e6467cc8077da81d"), "name" : "jack8" }
(2)range分片模式測試
分片的集合是collection2,根據name進行range分片
mongos> sh.shardCollection("testdb.collection2",{"name":1})
{ "collectionsharded" : "testdb.collection2", "ok" : 1 }
mongos> for(var i=0;i<1000;i++){db.collection2.insert({name:"jack"+i});}
WriteResult({ "nInserted" : 1 })
mongos> sh.status() --- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4") } shards: { "_id" : "shard0000", "host" : "172.16.101.54:27017" } { "_id" : "shard0001", "host" : "172.16.101.54:27018" } { "_id" : "shard0002", "host" : "172.16.101.54:27019" } balancer: Currently enabled: yes Currently running: no Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: 4 : Success databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" } { "_id" : "test", "partitioned" : false, "primary" : "shard0000" } { "_id" : "testdb", "partitioned" : true, "primary" : "shard0000" } testdb.collection1 shard key: { "_id" : "hashed" } chunks: shard0000 2 shard0001 2 shard0002 2 { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2) { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3) { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4) { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5) { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6) { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7) testdb.collection2 shard key: { "name" : 1 } chunks: shard0000 1 shard0001 1 shard0002 1 { "name" : { "$minKey" : 1 } } -->> { "name" : "jack1" } on : shard0001 Timestamp(2, 0) { "name" : "jack1" } -->> { "name" : "jack5" } on : shard0002 Timestamp(3, 0) { "name" : "jack5" } -->> { "name" : { "$maxKey" : 1 } } on : shard0000 Timestamp(3, 1)
mongos> db.collection2.stats() { "sharded" : true, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "ns" : "testdb.collection2", "count" : 1000, "numExtents" : 6, "size" : 48032, "storageSize" : 221184, "totalIndexSize" : 130816, "indexSizes" : { "_id_" : 65408, "name_1" : 65408 }, "avgObjSize" : 48.032, "nindexes" : 2, "nchunks" : 3, "shards" : { "shard0000" : { "ns" : "testdb.collection2", "count" : 555, "size" : 26656, "avgObjSize" : 48, "numExtents" : 3, "storageSize" : 172032, "lastExtentSize" : 131072, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 65408, "indexSizes" : { "_id_" : 32704, "name_1" : 32704 }, "ok" : 1 }, "shard0001" : { "ns" : "testdb.collection2", "count" : 1, "size" : 48, "avgObjSize" : 48, "numExtents" : 1, "storageSize" : 8192, "lastExtentSize" : 8192, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 16352, "indexSizes" : { "_id_" : 8176, "name_1" : 8176 }, "ok" : 1 }, "shard0002" : { "ns" : "testdb.collection2", "count" : 444, "size" : 21328, "avgObjSize" : 48, "numExtents" : 2, "storageSize" : 40960, "lastExtentSize" : 32768, "paddingFactor" : 1, "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.", "userFlags" : 1, "capped" : false, "nindexes" : 2, "totalIndexSize" : 49056, "indexSizes" : { "_id_" : 24528, "name_1" : 24528 }, "ok" : 1 } }, "ok" : 1 }
參考連結