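Online migration of an entire mongos sharded cluster: plan and detailed practice

Environment:

MongoDB version: 3.0
mongos: 1
config servers: 3, running in normal mode for high availability (not as a replica set)
shards: 2, each a replica set of three nodes (1 primary + 1 secondary + 1 arbiter)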
mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("582d45c13eb0b8dc133631d9")
}
  shards:
{  "_id" : "udb-1ywcjr",  "host" : "udb-1ywcjr/10.19.27.79:27017,10.19.51.238:27017" }
{  "_id" : "udb-dwk5m0",  "host" : "udb-dwk5m0/10.19.35.18:27017,10.19.89.184:27017" }
  balancer:
Currently enabled:  yes
Currently running:  no
Failed balancer rounds in last 5 attempts:  0
Migration Results for the last 24 hours: 
No recent migrations
  databases:
{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
{  "_id" : "test",  "partitioned" : false,  "primary" : "udb-dwk5m0" }
{  "_id" : "test1",  "partitioned" : false,  "primary" : "udb-1ywcjr" }

mongos> use test
switched to db test
mongos> db.test.find()
{ "_id" : ObjectId("582d483bc5bab52092bf33fd"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d483cc5bab52092bf33fe"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d483dc5bab52092bf33ff"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d483dc5bab52092bf3400"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d483ec5bab52092bf3401"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d483ec5bab52092bf3402"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d483fc5bab52092bf3403"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4840c5bab52092bf3404"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4840c5bab52092bf3405"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4841c5bab52092bf3406"), "name" : "jiangjian" }
mongos> use test1
switched to db test1
mongos> db.test1.find()
{ "_id" : ObjectId("582d4875c5bab52092bf3407"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4876c5bab52092bf3408"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4876c5bab52092bf3409"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4877c5bab52092bf340a"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4878c5bab52092bf340b"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4878c5bab52092bf340c"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4879c5bab52092bf340d"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d4879c5bab52092bf340e"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d487ac5bab52092bf340f"), "name" : "jiangjian" }
{ "_id" : ObjectId("582d487ac5bab52092bf3410"), "name" : "jiangjian" }


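Migration steps

0 Confirm that the source mongos cluster and the target mongos cluster use the same keyfile

1 Migrate the shards

1.1 Create two new secondaries for the primary of shard 1, plus one arbiter; these three new nodes will serve as target shard 1
1.2 Create two new secondaries for the primary of shard 2, plus one arbiter; these three new nodes will serve as target shard 2
1.3 Running sh.status() on mongos at this point gives the output below. Shard 1 has gained the nodes 10.19.39.207 and 10.19.111.102 and the arbiter 10.19.136.209 (not visible in sh.status()),
shard 2 has gained the nodes 10.19.191.159 and 10.19.88.149 and the arbiter 10.19.187.4 (not visible in sh.status())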
mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("582d45c13eb0b8dc133631d9")
}
  shards:
{  "_id" : "udb-1ywcjr",  "host" : "udb-1ywcjr/10.19.111.102:27017,10.19.27.79:27017,10.19.39.207:27017,10.19.51.238:27017" }
{  "_id" : "udb-dwk5m0",  "host" : "udb-dwk5m0/10.19.191.159:27017,10.19.35.18:27017,10.19.88.149:27017,10.19.89.184:27017" }
  balancer:
Currently enabled:  yes
Currently running:  no
Failed balancer rounds in last 5 attempts:  0
Migration Results for the last 24 hours: 
No recent migrations
  databases:
{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
{  "_id" : "test",  "partitioned" : false,  "primary" : "udb-dwk5m0" }
{  "_id" : "test1",  "partitioned" : false,  "primary" : "udb-1ywcjr" }
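1.4 Promote one of the data nodes of target shard 1 to primary, and once the promotion completes, remove the source shard 1 nodes;
promote one of the data nodes of target shard 2 to primary, and once the promotion completes, remove the source shard 2 nodes;
the shard migration is now complete, and sh.status() gives the following output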
mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("582d45c13eb0b8dc133631d9")
}
  shards:
{  "_id" : "udb-1ywcjr",  "host" : "udb-1ywcjr/10.19.111.102:27017,10.19.39.207:27017" }
{  "_id" : "udb-dwk5m0",  "host" : "udb-dwk5m0/10.19.191.159:27017,10.19.88.149:27017" }
  balancer:
Currently enabled:  yes
Currently running:  no
Failed balancer rounds in last 5 attempts:  0
Migration Results for the last 24 hours: 
No recent migrations
  databases:
{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
{  "_id" : "test",  "partitioned" : false,  "primary" : "udb-dwk5m0" }
{  "_id" : "test1",  "partitioned" : false,  "primary" : "udb-1ywcjr" }
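
Steps 1.1 to 1.4 amount to a series of replica set reconfigurations on each shard. The sketch below is one hedged way to express them as plain JavaScript helpers that transform a document shaped like the output of rs.conf(). The helper names (copyConf, withTargetMembers, withPreferredPrimary, withoutHosts) are hypothetical, the arbiter port and the sample config are assumptions for illustration, and the original procedure does not say which commands were actually used. On a real shard the result of each helper would be passed to rs.reconfig() on the current primary, after the new members have finished their initial sync.

```javascript
// Hypothetical helpers: each takes a config shaped like rs.conf() output
// and returns a modified copy. Nothing here talks to a server.

// Copy the top-level fields and each member document (one level deep).
function copyConf(conf) {
  var next = {};
  for (var key in conf) { next[key] = conf[key]; }
  next.members = conf.members.map(function (m) {
    var c = {};
    for (var k in m) { c[k] = m[k]; }
    return c;
  });
  return next;
}

// Steps 1.1 / 1.2: add the target data nodes and the target arbiter.
function withTargetMembers(conf, dataHosts, arbiterHost) {
  var next = copyConf(conf);
  var nextId = 0;
  next.members.forEach(function (m) { nextId = Math.max(nextId, m._id + 1); });
  dataHosts.forEach(function (host) {
    next.members.push({ _id: nextId++, host: host });
  });
  next.members.push({ _id: nextId++, host: arbiterHost, arbiterOnly: true });
  return next;
}

// Step 1.4, first half: give one target node the highest priority so that
// it is preferred in elections (members without a priority count as 1).
function withPreferredPrimary(conf, host) {
  var next = copyConf(conf);
  var top = 1;
  next.members.forEach(function (m) {
    if (typeof m.priority === "number") { top = Math.max(top, m.priority); }
  });
  next.members.forEach(function (m) {
    if (m.host === host) { m.priority = top + 1; }
  });
  return next;
}

// Step 1.4, second half: drop the source members from the config.
function withoutHosts(conf, hosts) {
  var next = copyConf(conf);
  next.members = next.members.filter(function (m) {
    return hosts.indexOf(m.host) === -1;
  });
  return next;
}

// Illustrative config for shard 1 using addresses from this article.
// The source arbiter address is not given in the article, so it is omitted,
// and port 27017 on the target arbiter is an assumption.
var shard1 = {
  _id: "udb-1ywcjr",
  version: 1,
  members: [
    { _id: 0, host: "10.19.27.79:27017" },
    { _id: 1, host: "10.19.51.238:27017" }
  ]
};
var grown = withTargetMembers(
  shard1, ["10.19.39.207:27017", "10.19.111.102:27017"], "10.19.136.209:27017");
var promoted = withPreferredPrimary(grown, "10.19.39.207:27017");
var migrated = withoutHosts(promoted, ["10.19.27.79:27017", "10.19.51.238:27017"]);
```

Keeping the three changes as separate configs mirrors the order of the steps: grow the set, move the primary, then shrink it, so the shard keeps a primary throughout.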

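2 Migrate the config servers and mongos (normal-mode config servers, i.e. not a replica set)

2.1 Create config servers and a mongos with the same configuration as the source (in this case the target mongos is 10.19.180.185 and the target config servers are 10.19.47.36, 10.19.24.53 and 10.19.82.250)
2.2 Start the target mongos and config servers (note: when starting mongos, pass the target config servers' IPs on the command line)
2.3 Shut down the target mongos and config servers (2.2 only confirms that the target mongos starts without problems)
2.4 Stop the balancer on the source mongos so that data does not change while the config servers are migrated (this does not affect production traffic)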
mongos> sh.getBalancerState()
true
mongos> sh.stopBalancer()
Waiting for active hosts...
Waiting for active host 222eeafc4436:27017 to recognize new settings... (ping : Fri Nov 18 2016 13:06:12 GMT+0800 (CST))
Waited for active ping to change for host 222eeafc4436:27017, a migration may be in progress or the host may be down.
Waiting for the balancer lock...
Waiting again for active hosts after balancer is off...
Waiting for active host 222eeafc4436:27017 to recognize new settings... (ping : Fri Nov 18 2016 13:06:12 GMT+0800 (CST))
Waited for active ping to change for host 222eeafc4436:27017, a migration may be in progress or the host may be down.
Warning : host 222eeafc4436:27017 seems to have been offline since Fri Nov 18 2016 13:06:12 GMT+0800 (CST)
mongos> sh.getBalancerState()
false
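2.5 Shut down one of the source config servers
2.6 Copy that source config server's data directory into the data directories of the three target config servers
2.7 Shut down all nodes (all mongos, config servers and shard servers)
2.8 Start the shard servers
2.9 Start the target config servers
2.10 Start the target mongos
2.11 Change the IP the application connects to from the source mongos IP to the target mongos IP
2.12 Confirm that the application works normally
2.13 Enable the balancer on the target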
mongos> sh.setBalancerState(true)
mongos> sh.getBalancerState()
true
2.14 The migration is complete
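
Step 2.2 depends on the --configdb value passed to the target mongos. As a small illustration, the sketch below builds that comma-separated string from the target config server addresses in this article, in the order and with the port (27017) that appear in the error message quoted in the notes at the end. The function name buildConfigdb is hypothetical and not part of the original steps.

```javascript
// Hypothetical helper: join config server hosts into a --configdb value.
function buildConfigdb(hosts, port) {
  return hosts.map(function (h) { return h + ":" + port; }).join(",");
}

// Target config servers from step 2.1, ordered as in the quoted error message.
var targetConfigdb = buildConfigdb(
  ["10.19.24.53", "10.19.47.36", "10.19.82.250"], 27017);

// The value would then be used on the command line as:
//   mongos --configdb <targetConfigdb>
```

With normal-mode (non-replica-set) config servers, every mongos in the cluster should be given the same string with the hosts in the same order, so generating it in one place avoids accidental differences.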

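Notes

Although MongoDB allows two sets of config servers and mongos to exist against the same sharded cluster, when I ran both at the same time I found that the set started later could write but not read, while the other set could both read and write.
Which set is fully usable depends on which mongos starts first after the underlying shard servers restart; the mongos that starts first can both read and write.
So during the migration I chose to stop the old mongos cluster first and then start the new one, which causes a brief service interruption.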
mongos> db.xxoo.insert({xx:"llllll"})
WriteResult({ "nInserted" : 1 })
mongos> db.xxoo.find()
Error: error: {
"$err" : "setShardVersion failed shard: udb-dwk5m0:udb-dwk5m0/10.19.191.159:27017,10.19.88.149:27017 { configdb: { stored: \"10.19.24.53:27017,10.19.47.36:27017,10.19.82.250:27017\", given: \"10.19.19.32:27017,10.19.18.239:27017,10.19.93.32:27017\" }, ok: 0.0, errmsg: \"mongos specified a different config database string : stored : 10.19.24.53:27017,10.19.47.36:27017,10.19.82.250:27017 vs given : 10.19.19.32:27017,10....\", $gleStats: { lastOpTime: Timestamp 0|0, electionId: ObjectId('582e8c4fdcb96a0831b5cef9') } }",
"code" : 10429,
"shard" : "udb-dwk5m0"
}
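
The error above reports a "stored" and a "given" config database string. As a hedged illustration of how the two can be compared before pointing a mongos at the shards, the sketch below classifies a pair of strings. The function name compareConfigdb is hypothetical, and the assumption that the shard compares the strings literally is inferred from the wording of the error message, not confirmed by the original article.

```javascript
// Hypothetical pre-flight check: classify how two configdb strings differ.
// Assumption: the shard rejects any string that is not literally identical
// to the one it has stored, so even a different host order counts.
function compareConfigdb(stored, given) {
  if (stored === given) { return "identical"; }
  var a = stored.split(",").sort().join(",");
  var b = given.split(",").sort().join(",");
  return a === b ? "same hosts, different order" : "different hosts";
}

// The two strings from the error message above.
var stored = "10.19.24.53:27017,10.19.47.36:27017,10.19.82.250:27017";
var given = "10.19.19.32:27017,10.19.18.239:27017,10.19.93.32:27017";
var verdict = compareConfigdb(stored, given);
```

Here the two strings name entirely different config servers, which matches the situation described in the notes: two mongos sets with different config servers in front of the same shards.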