1. 程式人生 > >MongoDB sharding分片

MongoDB sharding分片

背景

當MongoDB儲存海量的資料時,一臺機器可能不足以儲存資料,也可能不足以提供可接受的讀寫吞吐量。這時,我們就可以通過在多臺機器上分割資料,使得資料庫系統能儲存和處理更多的資料。


1、MongoDB sharding簡介

三種角色:

配置伺服器(config):是一個獨立的mongod程序,儲存叢集和分片的元資料,即各分片包含了哪些資料的資訊。

路由伺服器(mongos):起到一個路由的功能,供程式連線。本身不儲存資料,在啟動時從配置伺服器載入叢集資訊.

分片伺服器(sharding):是一個獨立mongod程序,儲存資料資訊。可以是一臺伺服器,如果想要高可用也可以配置成副本集。


2、實驗環境

兩臺機器的IP:

172.16.101.54 sht-sgmhadoopcm-01

172.16.101.55 sht-sgmhadoopnn-01


config server:

172.16.101.55:27017


mongos:

172.16.101.55:27018


sharding:

172.16.101.54:27017

172.16.101.54:27018

172.16.101.54:27019


2、啟動config服務

修改配置伺服器的配置檔案,主要是引數clusterRole指定角色為configsvr

[[email protected]
 mongodb]# cat /etc/mongod27017.conf systemLog:    destination: file    path: "/usr/local/mongodb/log/mongod27017.log"    logAppend: true     storage:    dbPath: /usr/local/mongodb/data/db27017    journal:       enabled: true        processManagement:    fork: true    pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pid net:    port: 27017    bindIp: 0.0.0.0     setParameter:    enableLocalhostAuthBypass: false     sharding:    clusterRole: configsvr    archiveMovedChunks: true

[[email protected] mongodb]# bin/mongod --config /etc/mongod27017.conf

warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default

about to fork child process, waiting until server is ready for connections.

forked process: 31033

child process started successfully, parent exiting


3、啟動mongos服務

修改路由伺服器的配置檔案,主要是引數configDB指定config伺服器的IP和port,不需要配置有關資料檔案的資訊,因為路由伺服器不儲存資料

[[email protected] mongodb]# cat /etc/mongod27018.conf
systemLog:
   destination: file
   path: "/usr/local/mongodb/log/mongod27018.log"
   logAppend: true
   
processManagement:
   fork: true
   pidFilePath: /usr/local/mongodb/data/db27018/mongod27018.pid
   
net:
   port: 27018
   bindIp: 0.0.0.0
   
setParameter:
   enableLocalhostAuthBypass: false
   
sharding:
   autoSplit: true
   configDB: 172.16.101.55:27017
   chunkSize: 64


[[email protected] mongodb]# bin/mongos --config /etc/mongod27018.conf

warning: bind_ip of 0.0.0.0 is unnecessary; listens on all ips by default

2018-11-10T18:57:13.705+0800 W SHARDING running with 1 config server should be done only for testing purposes and is not recommended for production

about to fork child process, waiting until server is ready for connections.

forked process: 31167

child process started successfully, parent exiting


4、啟動sharding服務

就是一個普通的mongodb程序,普通的配置檔案

[[email protected] mongodb]# cat /etc/mongod27017.conf
systemLog:
   destination: file
   path: "/usr/local/mongodb/log/mongod27017.log"
   logAppend: true
   
storage:
   dbPath: /usr/local/mongodb/data/db27017
   journal:
      enabled: true
      
processManagement:
   fork: true
   pidFilePath: /usr/local/mongodb/data/db27017/mongod27017.pid
   
net:
   port: 27017
   bindIp: 0.0.0.0
   
setParameter:
   enableLocalhostAuthBypass: false

[[email protected] mongodb]# bin/mongod --config /etc/mongod27017.conf

[[email protected] mongodb]# bin/mongod --config /etc/mongod27018.conf

[[email protected] mongodb]# bin/mongod --config /etc/mongod27019.conf


5、登陸mongos服務並新增sharding資訊

[[email protected] mongodb]# bin/mongo --port=27018

mongos> sh.addShard("172.16.101.54:27017")

{ "shardAdded" : "shard0000", "ok" : 1 }

mongos> sh.addShard("172.16.101.54:27018")

{ "shardAdded" : "shard0001", "ok" : 1 }

mongos> sh.addShard("172.16.101.54:27019")

{ "shardAdded" : "shard0002", "ok" : 1 }


檢視叢集分片資訊

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4")
}
  shards:
    {  "_id" : "shard0000",  "host" : "172.16.101.54:27017" }
    {  "_id" : "shard0001",  "host" : "172.16.101.54:27018" }
    {  "_id" : "shard0002",  "host" : "172.16.101.54:27019" }
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours:
        No recent migrations
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
mongos> db.runCommand({listshards:1})
{
    "shards" : [
        {
            "_id" : "shard0000",
            "host" : "172.16.101.54:27017"
        },
        {
            "_id" : "shard0001",
            "host" : "172.16.101.54:27018"
        },
        {
            "_id" : "shard0002",
            "host" : "172.16.101.54:27019"
        }
    ],
    "ok" : 1
}


6、開啟分片

需要執行分片的庫和集合,以及分片模式,分片模式分為兩種hash和range

[[email protected] mongodb]# bin/mongo --port=27018

分片庫是testdb

mongos> sh.enableSharding("testdb")

{ "ok" : 1 }


(1)hash分片模式測試

分片的集合是collection1,根據id進行hash分片

mongos> sh.shardCollection("testdb.collection1",{"_id":"hashed"})

{ "collectionsharded" : "testdb.collection1", "ok" : 1 }


共插入10個測試document

mongos> use testdb

switched to db testdb

mongos> for(var i=0;i<10;i++){db.collection1.insert({name:"jack"+i});}

WriteResult({ "nInserted" : 1 })

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4")
}
  shards:
    {  "_id" : "shard0000",  "host" : "172.16.101.54:27017" }
    {  "_id" : "shard0001",  "host" : "172.16.101.54:27018" }
    {  "_id" : "shard0002",  "host" : "172.16.101.54:27019" }
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours:
        2 : Success
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }
    {  "_id" : "testdb",  "partitioned" : true,  "primary" : "shard0000" }
        testdb.collection1
            shard key: { "_id" : "hashed" }
            chunks:
                shard0000    2
                shard0001    2
                shard0002    2
            { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2)
            { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3)
            { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4)
            { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5)
            { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6)
            { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7)

檢視每個sharding上的資料分佈情況db.collection.stats()

mongos> db.collection1.stats()
{
    "sharded" : true,
    "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
    "userFlags" : 1,
    "capped" : false,
    "ns" : "testdb.collection1",
    "count" : 10,   #共十個document資料
    "numExtents" : 3,
    "size" : 480,
    "storageSize" : 24576,
    "totalIndexSize" : 49056,
    "indexSizes" : {
        "_id_" : 24528,
        "_id_hashed" : 24528
    },
    "avgObjSize" : 48,
    "nindexes" : 2,
    "nchunks" : 6,
    "shards" : {
        "shard0000" : {
            "ns" : "testdb.collection1",
            "count" : 0,
            "size" : 0,
            "numExtents" : 1,
            "storageSize" : 8192,
            "lastExtentSize" : 8192,
            "paddingFactor" : 1,
            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
            "userFlags" : 1,
            "capped" : false,
            "nindexes" : 2,
            "totalIndexSize" : 16352,
            "indexSizes" : {
                "_id_" : 8176,
                "_id_hashed" : 8176
            },
            "ok" : 1
        },
        "shard0001" : {
            "ns" : "testdb.collection1",
            "count" : 6,
            "size" : 288,
            "avgObjSize" : 48,
            "numExtents" : 1,
            "storageSize" : 8192,
            "lastExtentSize" : 8192,
            "paddingFactor" : 1,
            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
            "userFlags" : 1,
            "capped" : false,
            "nindexes" : 2,
            "totalIndexSize" : 16352,
            "indexSizes" : {
                "_id_" : 8176,
                "_id_hashed" : 8176
            },
            "ok" : 1
        },
        "shard0002" : {
            "ns" : "testdb.collection1",
            "count" : 4,
            "size" : 192,
            "avgObjSize" : 48,
            "numExtents" : 1,
            "storageSize" : 8192,
            "lastExtentSize" : 8192,
            "paddingFactor" : 1,
            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
            "userFlags" : 1,
            "capped" : false,
            "nindexes" : 2,
            "totalIndexSize" : 16352,
            "indexSizes" : {
                "_id_" : 8176,
                "_id_hashed" : 8176
            },
            "ok" : 1
        }
    },
    "ok" : 1
}

分別登陸sharding節點檢視資料分佈,和通過命令db.collection.stats()看到的結果一致

可以發現節點27017上沒有資料,節點27018上有6個document,節點27019上有4個document,出現這種情況的原因可能是插入的資料量太小,沒有分佈均勻,資料量越大,分佈越均勻。

[[email protected] mongodb]# bin/mongo 172.16.101.54:27017/testdb

> db.collection1.find()


[[email protected] mongodb]# bin/mongo 172.16.101.54:27018/testdb

> db.collection1.find()

{ "_id" : ObjectId("5be6c467e6467cc8077da816"), "name" : "jack1" }

{ "_id" : ObjectId("5be6c467e6467cc8077da817"), "name" : "jack2" }

{ "_id" : ObjectId("5be6c467e6467cc8077da818"), "name" : "jack3" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81a"), "name" : "jack5" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81c"), "name" : "jack7" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81e"), "name" : "jack9" }


[[email protected] mongodb]# bin/mongo 172.16.101.54:27019/testdb

> db.collection1.find()

{ "_id" : ObjectId("5be6c467e6467cc8077da815"), "name" : "jack0" }

{ "_id" : ObjectId("5be6c467e6467cc8077da819"), "name" : "jack4" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81b"), "name" : "jack6" }

{ "_id" : ObjectId("5be6c467e6467cc8077da81d"), "name" : "jack8" }


(2)range分片模式測試

分片的集合是collection2,根據name進行range分片

mongos> sh.shardCollection("testdb.collection2",{"name":1})

{ "collectionsharded" : "testdb.collection2", "ok" : 1 }

mongos> for(var i=0;i<1000;i++){db.collection2.insert({name:"jack"+i});}

WriteResult({ "nInserted" : 1 })

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5be6b98a507b3e0370eb36b4")
}
  shards:
    {  "_id" : "shard0000",  "host" : "172.16.101.54:27017" }
    {  "_id" : "shard0001",  "host" : "172.16.101.54:27018" }
    {  "_id" : "shard0002",  "host" : "172.16.101.54:27019" }
  balancer:
    Currently enabled:  yes
    Currently running:  no
    Failed balancer rounds in last 5 attempts:  0
    Migration Results for the last 24 hours:
        4 : Success
  databases:
    {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
    {  "_id" : "test",  "partitioned" : false,  "primary" : "shard0000" }
    {  "_id" : "testdb",  "partitioned" : true,  "primary" : "shard0000" }
        testdb.collection1
            shard key: { "_id" : "hashed" }
            chunks:
                shard0000    2
                shard0001    2
                shard0002    2
            { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6148914691236517204") } on : shard0000 Timestamp(3, 2)
            { "_id" : NumberLong("-6148914691236517204") } -->> { "_id" : NumberLong("-3074457345618258602") } on : shard0000 Timestamp(3, 3)
            { "_id" : NumberLong("-3074457345618258602") } -->> { "_id" : NumberLong(0) } on : shard0001 Timestamp(3, 4)
            { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("3074457345618258602") } on : shard0001 Timestamp(3, 5)
            { "_id" : NumberLong("3074457345618258602") } -->> { "_id" : NumberLong("6148914691236517204") } on : shard0002 Timestamp(3, 6)
            { "_id" : NumberLong("6148914691236517204") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0002 Timestamp(3, 7)
        testdb.collection2
            shard key: { "name" : 1 }
            chunks:
                shard0000    1
                shard0001    1
                shard0002    1
            { "name" : { "$minKey" : 1 } } -->> { "name" : "jack1" } on : shard0001 Timestamp(2, 0)
            { "name" : "jack1" } -->> { "name" : "jack5" } on : shard0002 Timestamp(3, 0)
            { "name" : "jack5" } -->> { "name" : { "$maxKey" : 1 } } on : shard0000 Timestamp(3, 1)


mongos> db.collection2.stats()
{
    "sharded" : true,
    "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
    "userFlags" : 1,
    "capped" : false,
    "ns" : "testdb.collection2",
    "count" : 1000,
    "numExtents" : 6,
    "size" : 48032,
    "storageSize" : 221184,
    "totalIndexSize" : 130816,
    "indexSizes" : {
        "_id_" : 65408,
        "name_1" : 65408
    },
    "avgObjSize" : 48.032,
    "nindexes" : 2,
    "nchunks" : 3,
    "shards" : {
        "shard0000" : {
            "ns" : "testdb.collection2",
            "count" : 555,
            "size" : 26656,
            "avgObjSize" : 48,
            "numExtents" : 3,
            "storageSize" : 172032,
            "lastExtentSize" : 131072,
            "paddingFactor" : 1,
            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
            "userFlags" : 1,
            "capped" : false,
            "nindexes" : 2,
            "totalIndexSize" : 65408,
            "indexSizes" : {
                "_id_" : 32704,
                "name_1" : 32704
            },
            "ok" : 1
        },
        "shard0001" : {
            "ns" : "testdb.collection2",
            "count" : 1,
            "size" : 48,
            "avgObjSize" : 48,
            "numExtents" : 1,
            "storageSize" : 8192,
            "lastExtentSize" : 8192,
            "paddingFactor" : 1,
            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
            "userFlags" : 1,
            "capped" : false,
            "nindexes" : 2,
            "totalIndexSize" : 16352,
            "indexSizes" : {
                "_id_" : 8176,
                "name_1" : 8176
            },
            "ok" : 1
        },
        "shard0002" : {
            "ns" : "testdb.collection2",
            "count" : 444,
            "size" : 21328,
            "avgObjSize" : 48,
            "numExtents" : 2,
            "storageSize" : 40960,
            "lastExtentSize" : 32768,
            "paddingFactor" : 1,
            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
            "userFlags" : 1,
            "capped" : false,
            "nindexes" : 2,
            "totalIndexSize" : 49056,
            "indexSizes" : {
                "_id_" : 24528,
                "name_1" : 24528
            },
            "ok" : 1
        }
    },
    "ok" : 1
}

參考連結

Sharding