MongoDB High-Availability Architecture: Replica Set

A Replica Set uses n mongod nodes to build a high-availability solution with automatic failover (auto-failover) and automatic recovery (auto-recovery).

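A Replica Set can also be used to separate reads from writes. With slaveOk set on the client connection, either when connecting or afterwards in the shell, the secondaries share the read load and the primary only has to handle writes.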

By default, the secondary nodes of a Replica Set cannot be read from.
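
As a rough illustration of "specifying slaveOk when connecting", the sketch below assembles a connection string for the set used in this article. The helper name buildReplicaSetUri is hypothetical, and the slaveOk=true option is assumed to be accepted by the driver in use; it belongs to the older connection-string format, and newer drivers use a read preference instead.

```javascript
// Hypothetical helper: build a connection string for a replica set.
// Assumption: the driver understands the legacy "slaveOk" option.
function buildReplicaSetUri(hosts, setName, allowSecondaryReads) {
  const options = ["replicaSet=" + encodeURIComponent(setName)];
  if (allowSecondaryReads) {
    // Ask the driver to allow reads from secondaries.
    options.push("slaveOk=true");
  }
  return "mongodb://" + hosts.join(",") + "/?" + options.join("&");
}

const uri = buildReplicaSetUri(
  ["192.168.198.131:27017", "192.168.198.129:27017"],
  "myset",
  true
);
console.log(uri);
```

Only the two data-bearing members are listed here; the arbiter holds no data, so there is nothing to read from it.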

The environment is as follows:

192.168.198.131 192.168.198.129 192.168.198.132

Install the mongod service on each of the three servers as follows:

# wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.0.3.tgz

# tar zxvf mongodb-linux-x86_64-2.0.3.tgz -C ../software/

# ln -s mongodb-linux-x86_64-2.0.3 /usr/local/mongodb

# useradd mongodb

# mkdir -p /data/mongodb/myset

# cd /usr/local/mongodb/bin

# ./mongod --replSet myset --dbpath /data/mongodb/myset --oplogSize 100 --logpath /data/mongodb/myset/myset.log --logappend --fork

# ./mongo // run the following on any one of the three servers

> config={_id:"myset",members:[
... {_id:0,host:"192.168.198.131:27017"},
... {_id:1,host:"192.168.198.129:27017"},
... {_id:2,host:"192.168.198.132:27017",arbiterOnly:true}]}

Output:

{
    "_id" : "myset",
    "members" : [
        {
            "_id" : 0,
            "host" : "192.168.198.131:27017"
        },
        {
            "_id" : 1,
            "host" : "192.168.198.129:27017"
        },
        {
            "_id" : 2,
            "host" : "192.168.198.132:27017",
            "arbiterOnly" : true
        }
    ]
}

> rs.initiate(config) // initialize the replica set

Output:

{
    "info" : "Config now saved locally. Should come online in about a minute.",
    "ok" : 1
}

> rs.conf() // view the configuration

{
    "_id" : "myset",
    "version" : 1,
    "members" : [
        {
            "_id" : 0,
            "host" : "192.168.198.131:27017"
        },
        {
            "_id" : 1,
            "host" : "192.168.198.129:27017"
        },
        {
            "_id" : 2,
            "host" : "192.168.198.132:27017",
            "arbiterOnly" : true
        }
    ]
}
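
The configuration document is plain JavaScript, so it can also be generated instead of typed by hand. The following is a minimal sketch; the helper name buildReplSetConfig is hypothetical and not part of the mongo shell. It numbers the members in order and marks the arbiters with arbiterOnly, matching the layout shown by rs.conf().

```javascript
// Hypothetical helper: build a replica set config document from host lists.
function buildReplSetConfig(setName, dataHosts, arbiterHosts) {
  const members = [];
  // Data-bearing members come first and get consecutive _id values.
  dataHosts.forEach(function (host) {
    members.push({ _id: members.length, host: host });
  });
  // Arbiters only vote in elections; they store no data.
  arbiterHosts.forEach(function (host) {
    members.push({ _id: members.length, host: host, arbiterOnly: true });
  });
  return { _id: setName, members: members };
}

const generatedConfig = buildReplSetConfig(
  "myset",
  ["192.168.198.131:27017", "192.168.198.129:27017"],
  ["192.168.198.132:27017"]
);
console.log(JSON.stringify(generatedConfig, null, 2));
```

In the shell, the resulting object could then be passed to rs.initiate() in the same way as the hand-written config.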

> rs.status() // view status information

{
    "set" : "myset",
    "date" : ISODate("2012-03-01T08:45:01Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.198.131:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "optime" : {
                "t" : 1330591378000,
                "i" : 1
            },
            "optimeDate" : ISODate("2012-03-01T08:42:58Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "192.168.198.129:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 121,
            "optime" : {
                "t" : 1330591378000,
                "i" : 1
            },
            "optimeDate" : ISODate("2012-03-01T08:42:58Z"),
            "lastHeartbeat" : ISODate("2012-03-01T08:45:01Z"),
            "pingMs" : 0
        },
        {
            "_id" : 2,
            "name" : "192.168.198.132:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 121,
            "optime" : {
                "t" : 0,
                "i" : 0
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2012-03-01T08:45:01Z"),
            "pingMs" : 1
        }
    ],
    "ok" : 1
}

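state: 1 means the member is the PRIMARY and accepts both reads and writes; 2 means it is a SECONDARY, which rejects writes and serves reads only when slaveOk is set; 7 means it is an ARBITER, which holds no data.

health: 1 means the member is healthy; 0 means it is down or unreachable.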
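
The numeric codes can be turned into something readable with a small lookup. The sketch below is illustrative only: describeMember is a hypothetical helper, and the table lists the state codes as I understand them for this generation of MongoDB (the status output in this article shows 1, 2, 7 and 8).

```javascript
// Replica set member state codes (assumed mapping for MongoDB 2.0-era servers).
const MEMBER_STATE_NAMES = {
  0: "STARTUP", 1: "PRIMARY", 2: "SECONDARY", 3: "RECOVERING", 4: "FATAL",
  5: "STARTUP2", 6: "UNKNOWN", 7: "ARBITER", 8: "DOWN", 9: "ROLLBACK"
};

// Hypothetical helper: summarize one entry of rs.status().members.
function describeMember(member) {
  const healthy = member.health === 1;
  return {
    name: member.name,
    role: MEMBER_STATE_NAMES[member.state] || "UNKNOWN",
    healthy: healthy,
    // Only a healthy primary accepts writes.
    acceptsWrites: healthy && member.state === 1,
    // A secondary serves reads only when the client has set slaveOk.
    readableWithSlaveOk: healthy && (member.state === 1 || member.state === 2)
  };
}

console.log(describeMember({ name: "192.168.198.129:27017", health: 1, state: 2 }));
console.log(describeMember({ name: "192.168.198.132:27017", health: 1, state: 7 }));
```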

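At any given moment, a Replica Set has only one primary, which accepts the write operations. Writes are then replicated asynchronously to the other members. If the primary dies, the remaining members automatically vote to elect a new primary, and when the original server recovers it rejoins as an ordinary member. If some data had not yet been replicated from the former primary to the other members, that data may be lost.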
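
The voting step relies on a strict majority of the voting members being able to see each other, which is why this three-node set includes an arbiter. The sketch below only illustrates that arithmetic; canElectPrimary is a hypothetical function, not MongoDB's election code, and it ignores priorities and vetoes.

```javascript
// Hypothetical helper: a new primary can be elected only if a strict
// majority of all voting members is reachable.
function canElectPrimary(totalVoters, reachableVoters) {
  const majority = Math.floor(totalVoters / 2) + 1;
  return reachableVoters >= majority;
}

// Three voters (two data nodes plus one arbiter), primary down: 2 of 3 remain.
console.log(canElectPrimary(3, 2)); // true
// Only one voter can be reached: no majority, so no primary.
console.log(canElectPrimary(3, 1)); // false
```

With only two data nodes and no arbiter, losing either node would leave 1 of 2 voters, which is not a majority, so the survivor could not become primary.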

PRIMARY> db.test.insert({"name":"foobar","age":25})
PRIMARY> db.test.find()
{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }
SECONDARY> db.test.find()
error: { "$err" : "not master and slaveok=false", "code" : 13435 }
SECONDARY> db.test.insert({"name":"foobar","age":25})
not master

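Enable slaveOk on the shell connection to the secondary; the setting applies per connection, so it must be issued in the session that will do the reading:

SECONDARY> db.getMongo().setSlaveOk()
SECONDARY> use test
switched to db test
SECONDARY> db.test.find()
{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }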

Run pkill mongod on 192.168.198.131. Its log shows the shutdown:

Thu Mar 1 17:17:51 got kill or ctrl c or hup signal 15 (Terminated), will terminate after current cmd ends
Thu Mar 1 17:17:51 [interruptThread] now exiting
Thu Mar 1 17:17:51 dbexit:
Thu Mar 1 17:17:51 [interruptThread] shutdown: going to close listening sockets...
Thu Mar 1 17:17:51 [interruptThread] closing listening socket: 7
Thu Mar 1 17:17:51 [interruptThread] closing listening socket: 8
Thu Mar 1 17:17:51 [interruptThread] closing listening socket: 9
Thu Mar 1 17:17:51 [interruptThread] removing socket file: /tmp/mongodb-27017.sock
Thu Mar 1 17:17:51 [interruptThread] shutdown: going to flush diaglog...
Thu Mar 1 17:17:51 [interruptThread] shutdown: going to close sockets...
Thu Mar 1 17:17:51 [conn1] end connection 127.0.0.1:58614
Thu Mar 1 17:17:51 [interruptThread] shutdown: waiting for fs preallocator...
Thu Mar 1 17:17:51 [interruptThread] shutdown: lock for final commit...
Thu Mar 1 17:17:51 [interruptThread] shutdown: final commit...
Thu Mar 1 17:17:52 [interruptThread] shutdown: closing all files...
Thu Mar 1 17:17:52 [interruptThread] closeAllFiles() finished
Thu Mar 1 17:17:52 [interruptThread] journalCleanup...
Thu Mar 1 17:17:52 [interruptThread] removeJournalFiles
Thu Mar 1 17:17:52 [interruptThread] shutdown: removing fs lock...
Thu Mar 1 17:17:52 dbexit: really exiting now

192.168.198.129 is elected as the new primary. Its log shows:

Thu Mar 1 00:17:51 [conn144] end connection 192.168.198.131:35714
Thu Mar 1 00:17:51 [rsSync] replSet syncThread: 10278 dbclient error communicating with server: 192.168.198.131:27017
Thu Mar 1 00:17:52 [rsHealthPoll] DBClientCursor::init call() failed
Thu Mar 1 00:17:52 [rsHealthPoll] replSet info 192.168.198.131:27017 is down (or slow to respond): DBClientBase::findN: transport error: 192.168.198.131:27017 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "192.168.198.129:27017" }
Thu Mar 1 00:17:52 [rsHealthPoll] replSet member 192.168.198.131:27017 is now in state DOWN
Thu Mar 1 00:17:52 [rsMgr] not electing self, 192.168.198.132:27017 would veto
Thu Mar 1 00:17:58 [rsMgr] replSet info electSelf 1
Thu Mar 1 00:17:58 [rsMgr] replSet PRIMARY

Log on the arbiter, 192.168.198.132:

Thu Mar 1 04:17:51 [conn143] end connection 192.168.198.131:56260
Thu Mar 1 04:17:53 [rsHealthPoll] DBClientCursor::init call() failed
Thu Mar 1 04:17:53 [rsHealthPoll] replSet info 192.168.198.131:27017 is down (or slow to respond): DBClientBase::findN: transport error: 192.168.198.131:27017 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "192.168.198.132:27017" }
Thu Mar 1 04:17:53 [rsHealthPoll] replSet member 192.168.198.131:27017 is now in state DOWN
Thu Mar 1 04:17:58 [conn144] replSet info voting yea for 192.168.198.129:27017 (1)
Thu Mar 1 04:17:59 [rsHealthPoll] replSet member 192.168.198.129:27017 is now in state PRIMARY
Thu Mar 1 04:18:05 [rsHealthPoll] couldn't connect to 192.168.198.131:27017: couldn't connect to server 192.168.198.131:27017

PRIMARY> rs.status();

{
    "set" : "myset",
    "date" : ISODate("2012-03-01T09:20:37Z"),
    "myState" : 1,
    "syncingTo" : "192.168.198.131:27017",
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.198.131:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "t" : 1330591997000,
                "i" : 1
            },
            "optimeDate" : ISODate("2012-03-01T08:53:17Z"),
            "lastHeartbeat" : ISODate("2012-03-01T09:17:50Z"),
            "pingMs" : 0,
            "errmsg" : "socket exception"
        },
        {
            "_id" : 1,
            "name" : "192.168.198.129:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "optime" : {
                "t" : 1330591997000,
                "i" : 1
            },
            "optimeDate" : ISODate("2012-03-01T08:53:17Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "192.168.198.132:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 2244,
            "optime" : {
                "t" : 0,
                "i" : 0
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2012-03-01T09:20:36Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}
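
After a failover, the two things worth reading off this output are which member is now primary and which members are unhealthy. The following is a minimal sketch of that check; summarizeStatus is a hypothetical helper, and the sample object is a trimmed copy of the status above that keeps only the fields the function reads.

```javascript
// Hypothetical helper: pick out the current primary and any unhealthy members
// from an object shaped like the result of rs.status().
function summarizeStatus(status) {
  let primary = null;
  const unhealthy = [];
  status.members.forEach(function (m) {
    if (m.health !== 1) {
      unhealthy.push(m.name);
    } else if (m.state === 1) {
      primary = m.name;
    }
  });
  return { primary: primary, unhealthy: unhealthy };
}

// Trimmed copy of the status shown above.
const statusAfterFailover = {
  set: "myset",
  members: [
    { name: "192.168.198.131:27017", health: 0, state: 8 },
    { name: "192.168.198.129:27017", health: 1, state: 1 },
    { name: "192.168.198.132:27017", health: 1, state: 7 }
  ]
};

const failoverSummary = summarizeStatus(statusAfterFailover);
console.log(failoverSummary.primary);   // 192.168.198.129:27017
console.log(failoverSummary.unhealthy); // [ '192.168.198.131:27017' ]
```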

PRIMARY> db.test.find()

{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }

{ "_id" : ObjectId("4f4f3fe2a7c9a9d1eb78392f"), "name" : "ttlsa", "age" : 1 }

Start the mongod service on 192.168.198.131 again. Its log shows the node rejoining the set as a secondary:

Thu Mar 1 17:23:24 [initandlisten] MongoDB starting : pid=6977 port=27017 dbpath=/data/mongodb/myset 64-bit host=node2
Thu Mar 1 17:23:24 [initandlisten] db version v2.0.3, pdfile version 4.5
Thu Mar 1 17:23:24 [initandlisten] git version: 05bb8aa793660af8fce7e36b510ad48c27439697
Thu Mar 1 17:23:24 [initandlisten] build info: Linux ip-10-110-9-236 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Thu Mar 1 17:23:24 [initandlisten] options: { dbpath: "/data/mongodb/myset", fork: true, logappend: true, logpath: "/data/mongodb/myset/myset.log", oplogSize: 100, replSet: "myset" }
Thu Mar 1 17:23:24 [initandlisten] journal dir=/data/mongodb/myset/journal
Thu Mar 1 17:23:24 [initandlisten] recover : no journal files present, no recovery needed
Thu Mar 1 17:23:26 [initandlisten] waiting for connections on port 27017
Thu Mar 1 17:23:26 [websvr] admin web console waiting for connections on port 28017
Thu Mar 1 17:23:27 [initandlisten] connection accepted from 192.168.198.129:43753 #1
Thu Mar 1 17:23:27 [initandlisten] connection accepted from 127.0.0.1:37253 #2
Thu Mar 1 17:23:27 [rsStart] trying to contact 192.168.198.129:27017
Thu Mar 1 17:23:27 [rsStart] replSet STARTUP2
Thu Mar 1 17:23:27 [rsSync] replSet SECONDARY

SECONDARY> use test

switched to db test

SECONDARY> db.test.find()

{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }

{ "_id" : ObjectId("4f4f3fe2a7c9a9d1eb78392f"), "name" : "ttlsa", "age" : 1 }
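
Seeing both documents on the restarted node shows it has caught up. A more general check is to compare each secondary's optimeDate with the primary's in rs.status(). The sketch below assumes the status object carries JavaScript Date values in optimeDate, as the shell output suggests; replicationLagSeconds is a hypothetical helper, and the sample values are made up for illustration, since the article does not show the status after the restart.

```javascript
// Hypothetical helper: seconds each secondary is behind the primary,
// based on the optimeDate fields of an rs.status()-shaped object.
function replicationLagSeconds(status) {
  const primary = status.members.find(function (m) { return m.state === 1; });
  if (!primary) {
    return null; // no primary, so lag cannot be computed
  }
  const lags = {};
  status.members.forEach(function (m) {
    if (m.state === 2) { // secondaries only; arbiters hold no data
      lags[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
    }
  });
  return lags;
}

// Illustrative sample (hypothetical values, not taken from the article).
const statusAfterRestart = {
  members: [
    { name: "192.168.198.131:27017", state: 2, optimeDate: new Date("2012-03-01T08:53:17Z") },
    { name: "192.168.198.129:27017", state: 1, optimeDate: new Date("2012-03-01T08:53:17Z") },
    { name: "192.168.198.132:27017", state: 7, optimeDate: new Date("1970-01-01T00:00:00Z") }
  ]
};

const lagBySecondary = replicationLagSeconds(statusAfterRestart);
console.log(lagBySecondary);
```

A lag of 0 means the secondary has applied everything the primary has written so far.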