mongodb副本集搭建
筆記日期:2018-01-09
- 21.33 mongodb副本集介紹
- 21.34 mongodb副本集搭建
- 21.35 mongodb副本集測試
21.33 mongodb副本集介紹
副本集(Replica Set)是一組MongoDB實例組成的集群,由一個主(Primary)服務器和多個備份(Secondary)服務器構成。通過Replication,將數據的更新由Primary推送到其他實例上,在一定的延遲之後,每個MongoDB實例維護相同的數據集副本。通過維護冗余的數據庫副本,能夠實現數據的異地備份,讀寫分離和自動故障轉移。
也就是說如果主服務器崩潰了,備份服務器會自動將其中一個成員升級為新的主服務器。使用復制功能時,如果有一臺服務器宕機了,仍然可以從副本集的其他服務器上訪問數據。如果服務器上的數據損壞或者不可訪問,可以從副本集的某個成員中創建一份新的數據副本。
早期的MongoDB版本使用master-slave,一主一從和MySQL類似,但slave在此架構中為只讀,當主庫宕機後,從庫不能自動切換為主。目前已經淘汰master-slave模式,改為副本集,這種模式下有一個主(primary),和多個從(secondary),只讀。支持給它們設置權重,當主宕掉後,權重最高的從切換為主。在此架構中還可以建立一個仲裁(arbiter)的角色,它只負責裁決,而不存儲數據。此架構中讀寫數據都是在主上,要想實現負載均衡的目的需要手動指定讀庫的目標server。
簡而言之MongoDB 副本集是有自動故障恢復功能的主從集群,有一個Primary節點和一個或多個Secondary節點組成。類似於MySQL的MMM架構。更多關於副本集的介紹請見官方文檔:
官方文檔地址:
https://docs.mongodb.com/manual/replication/
副本集架構圖:
21.34 mongodb副本集搭建
我這裏使用了三臺機器搭建副本集:
192.168.77.128 (primary)
192.168.77.130 (secondary)
192.168.77.134 (secondary)
這三臺機器上都已經安裝好了MongoDB。
開始搭建:
1.編輯三臺機器的配置文件,更改或增加以下內容:
[root@localhost ~]# vim /etc/mongod.conf replication: # 取消這行的註釋 oplogSizeMB: 20 # 增加這一行配置定義oplog的大小,註意前面需要有兩個空格 replSetName: zero # 定義復制集的名稱,同樣的前面需要有兩個空格
註:需要確保每臺機器的配置文件中的bindIp都有配置監聽自身的內網IP
2.編輯完成後,分別重啟三臺機器的MongoDB服務:
[root@localhost ~]# systemctl restart mongod.service
[root@localhost ~]# ps aux |grep mongod
mongod 2578 0.7 8.9 1034696 43592 ? Sl 18:21 0:00 /usr/bin/mongod -f /etc/mongod.conf
root 2605 0.0 0.1 112660 964 pts/0 S+ 18:21 0:00 grep --color=auto mongod
[root@localhost ~]# netstat -lntp |grep mongod
tcp 0 0 192.168.77.134:27017 0.0.0.0:* LISTEN 2578/mongod
tcp 0 0 127.0.0.1:27017 0.0.0.0:* LISTEN 2578/mongod
[root@localhost ~]#
3.關閉三臺機器的防火墻,或者清空iptables規則
4.連接主機器的MongoDB,在主機器上運行命令mongo,然後配置副本集:
[root@localhost ~]# mongo
> use admin
switched to db admin
> config={_id:"zero",members:[{_id:0,host:"192.168.77.128:27017"},{_id:1,host:"192.168.77.130:27017"},{_id:2,host:"192.168.77.134:27017"}]} # 分別配置三臺機器的ip
{
"_id" : "zero", # 副本集的名稱
"members" : [
{
"_id" : 0,
"host" : "192.168.77.128:27017"
},
{
"_id" : 1,
"host" : "192.168.77.130:27017"
},
{
"_id" : 2,
"host" : "192.168.77.134:27017"
}
]
}
> rs.initiate(config) # 初始化
{
"ok" : 1,
"operationTime" : Timestamp(1515465317, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1515465317, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
zero:PRIMARY> rs.status() # 查看狀態
{
"set" : "zero",
"date" : ISODate("2018-01-09T02:37:13.713Z"),
"myState" : 1,
"term" : NumberLong(1),
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"appliedOpTime" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
}
},
"members" : [
{
"_id" : 0,
"name" : "192.168.77.128:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 527,
"optime" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-01-09T02:37:09Z"),
"infoMessage" : "could not find member to sync from",
"electionTime" : Timestamp(1515465327, 1),
"electionDate" : ISODate("2018-01-09T02:35:27Z"),
"configVersion" : 1,
"self" : true
},
{
"_id" : 1,
"name" : "192.168.77.130:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 116,
"optime" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-01-09T02:37:09Z"),
"optimeDurableDate" : ISODate("2018-01-09T02:37:09Z"),
"lastHeartbeat" : ISODate("2018-01-09T02:37:13.695Z"),
"lastHeartbeatRecv" : ISODate("2018-01-09T02:37:13.661Z"),
"pingMs" : NumberLong(0),
"syncingTo" : "192.168.77.128:27017",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "192.168.77.134:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 116,
"optime" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1515465429, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2018-01-09T02:37:09Z"),
"optimeDurableDate" : ISODate("2018-01-09T02:37:09Z"),
"lastHeartbeat" : ISODate("2018-01-09T02:37:13.561Z"),
"lastHeartbeatRecv" : ISODate("2018-01-09T02:37:13.660Z"),
"pingMs" : NumberLong(0),
"syncingTo" : "192.168.77.128:27017",
"configVersion" : 1
}
],
"ok" : 1,
"operationTime" : Timestamp(1515465429, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1515465429, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
zero:PRIMARY>
以上我們需要關註三臺機器的stateStr狀態,主機器的stateStr狀態需要為PRIMARY,兩臺從機器的stateStr狀態需要為SECONDARY才是正常。
如果出現兩個從上的stateStr狀態為"stateStr" : "STARTUP", 則需要進行如下操作:
> config={_id:"zero",members:[{_id:0,host:"192.168.77.128:27017"},{_id:1,host:"192.168.77.130:27017"},{_id:2,host:"192.168.77.134:27017"}]}
> rs.reconfig(config)
然後再次查看狀態:rs.status(),確保從的狀態變為SECONDARY。
21.35 mongodb副本集測試
1.在主機器上創建庫以及創建集合:
zero:PRIMARY> use testdb # 創建庫
switched to db testdb
zero:PRIMARY> db.test.insert({AccountID:1,UserName:"zero",password:"123456"}) # 創建集合,並且插入一條數據
WriteResult({ "nInserted" : 1 })
zero:PRIMARY> show dbs # 查看所有的庫
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
zero:PRIMARY> show tables # 查看當前庫的集合
test
zero:PRIMARY>
2.然後到從機器上查看是否有同步主機器上的數據:
[root@localhost ~]# mongo
zero:SECONDARY> show dbs
2018-01-09T18:46:09.959+0800 E QUERY [thread1] Error: listDatabases failed:{
"operationTime" : Timestamp(1515466399, 1),
"ok" : 0,
"errmsg" : "not master and slaveOk=false",
"code" : 13435,
"codeName" : "NotMasterNoSlaveOk",
"$clusterTime" : {
"clusterTime" : Timestamp(1515466399, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
Mongo.prototype.getDBs@src/mongo/shell/mongo.js:65:1
shellHelper.show@src/mongo/shell/utils.js:813:19
shellHelper@src/mongo/shell/utils.js:703:15
@(shellhelp2):1:1
zero:SECONDARY> rs.slaveOk() # 如果出現以上錯誤,需要執行這條命令
zero:SECONDARY> show dbs # 然後就不會報錯了
admin 0.000GB
config 0.000GB
local 0.000GB
testdb 0.000GB
zero:SECONDARY> use testdb
switched to db testdb
zero:SECONDARY> show tables
test
zero:SECONDARY>
如上可以看到數據已經成功同步到從機器上了。
副本集更改權重模擬主宕機
使用rs.config()命令可以查看每臺機器的權重:
zero:PRIMARY> rs.config()
{
"_id" : "zero",
"version" : 1,
"protocolVersion" : NumberLong(1),
"members" : [
{
"_id" : 0,
"host" : "192.168.77.128:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 1,
"host" : "192.168.77.130:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 2,
"host" : "192.168.77.134:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : 2000,
"heartbeatTimeoutSecs" : 10,
"electionTimeoutMillis" : 10000,
"catchUpTimeoutMillis" : -1,
"catchUpTakeoverDelayMillis" : 30000,
"getLastErrorModes" : {
},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
},
"replicaSetId" : ObjectId("5a542a65e491a43160eb92f0")
}
}
zero:PRIMARY>
priority的值表示該機器的權重,默認都為1。
增加一條防火墻規則來阻斷通信模擬主機器宕機:
# 註意這是在主機器上執行
[root@localhost ~]# iptables -I INPUT -p tcp --dport 27017 -j DROP
然後到從機器上查看狀態:
zero:SECONDARY> rs.status()
{
"set" : "zero",
"date" : ISODate("2018-01-09T14:06:24.127Z"),
"myState" : 1,
"term" : NumberLong(4),
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1515506782, 1),
"t" : NumberLong(4)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1515506782, 1),
"t" : NumberLong(4)
},
"appliedOpTime" : {
"ts" : Timestamp(1515506782, 1),
"t" : NumberLong(4)
},
"durableOpTime" : {
"ts" : Timestamp(1515506782, 1),
"t" : NumberLong(4)
}
},
"members" : [
{
"_id" : 0,
"name" : "192.168.77.128:27017",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDurable" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2018-01-09T14:06:20.243Z"),
"lastHeartbeatRecv" : ISODate("2018-01-09T14:06:23.491Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "Couldn‘t get a connection within the time limit",
"configVersion" : -1
},
{
"_id" : 1,
"name" : "192.168.77.130:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 1010,
"optime" : {
"ts" : Timestamp(1515506782, 1),
"t" : NumberLong(4)
},
"optimeDurable" : {
"ts" : Timestamp(1515506782, 1),
"t" : NumberLong(4)
},
"optimeDate" : ISODate("2018-01-09T14:06:22Z"),
"optimeDurableDate" : ISODate("2018-01-09T14:06:22Z"),
"lastHeartbeat" : ISODate("2018-01-09T14:06:23.481Z"),
"lastHeartbeatRecv" : ISODate("2018-01-09T14:06:23.178Z"),
"pingMs" : NumberLong(0),
"syncingTo" : "192.168.77.134:27017",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "192.168.77.134:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 1250,
"optime" : {
"ts" : Timestamp(1515506782, 1),
"t" : NumberLong(4)
},
"optimeDate" : ISODate("2018-01-09T14:06:22Z"),
"electionTime" : Timestamp(1515506731, 1),
"electionDate" : ISODate("2018-01-09T14:05:31Z"),
"configVersion" : 1,
"self" : true
}
],
"ok" : 1,
"operationTime" : Timestamp(1515506782, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1515506782, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
zero:PRIMARY>
如上,可以看到 192.168.77.128 的 stateStr 的值變成 not reachable/healthy 了,而 192.168.77.134 自動切換成主了,也可以看到192.168.77.134 的 stateStr 的值變成 了PRIMARY。因為權重是相同的,所以切換是有一定的隨機性的。
接下來我們指定每臺機器權重,讓權重高的機器自動切換為主。
1.先刪除192.168.77.128上的防火墻規則:
[root@localhost ~]# iptables -D INPUT -p tcp --dport 27017 -j DROP
2.回到192.168.77.134機器上,指定各個機器的權重:
zero:PRIMARY> cfg = rs.conf()
zero:PRIMARY> cfg.members[0].priority = 3
3
zero:PRIMARY> cfg.members[1].priority = 2
2
zero:PRIMARY> cfg.members[2].priority = 1
zero:PRIMARY> rs.reconfig(cfg)
{
"ok" : 1,
"operationTime" : Timestamp(1515507322, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1515507322, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
zero:PRIMARY>
3.這時候192.168.77.128應該就切換成主了,到192.168.77.128上執行rs.config()進行查看:
zero:PRIMARY> rs.config()
{
"_id" : "zero",
"version" : 2,
"protocolVersion" : NumberLong(1),
"members" : [
{
"_id" : 0,
"host" : "192.168.77.128:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 3,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 1,
"host" : "192.168.77.130:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 2,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 2,
"host" : "192.168.77.134:27017",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : 2000,
"heartbeatTimeoutSecs" : 10,
"electionTimeoutMillis" : 10000,
"catchUpTimeoutMillis" : -1,
"catchUpTakeoverDelayMillis" : 30000,
"getLastErrorModes" : {
},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
},
"replicaSetId" : ObjectId("5a542a65e491a43160eb92f0")
}
}
zero:PRIMARY>
如上,可以看到每個機器權重的變化,192.168.77.128也自動切換回主角色了。如果192.168.77.128再宕掉的話,那麽192.168.77.130就會是候選主節點,因為除了192.168.77.128之外它的權重最高。
mongodb副本集搭建