MongoDB -> Kafka: A High-Performance Solution for Real-Time Syncing (Collecting) of MongoDB Data into Kafka (2)

Date: 2021-06-06 Category: Programmer Life

I am using MongoShake v2.2.1, and a high-availability deployment is very simple: just enable master election in collector.conf:

# high availability option.
# enable master election if set true. only one mongoshake can become master
# and do sync, the others will wait and at most one of them become master once
# previous master die. The master information stores in the `mongoshake` db in the source
# database by default.
# This option must be enabled when primary and standby mongoshake instances pull from the same source.
master_quorum = true
# Where checkpoints are stored: database stores them in MongoDB, api writes them through an HTTP interface.
context.storage = database
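Before starting several instances, it can help to confirm that each copy of collector.conf really carries these two settings. The following is a minimal sketch, assuming the file consists of plain `key = value` lines with `#` comments, as in the excerpt above. The helper names `parse_conf` and `ha_problems` are hypothetical and are not part of MongoShake.

```python
def parse_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def ha_problems(settings):
    """List the ways a config differs from the HA setup described in this post."""
    problems = []
    if settings.get("master_quorum") != "true":
        problems.append("master_quorum is not true")
    if settings.get("context.storage") != "database":
        problems.append("context.storage is not database")
    return problems


# Sample text mirroring the excerpt above (not a full collector.conf).
SAMPLE = """\
# high availability option.
master_quorum = true
context.storage = database
"""

if __name__ == "__main__":
    print(ha_problems(parse_conf(SAMPLE)))
```

An empty list means the file matches the setup used here. Note that `api` is also a legal value for context.storage; the check only flags it because this post relies on `database`.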

I also left the checkpoint storage at its default, database, so checkpoints are stored in the mongoshake db. We can query some of the information recorded there.

rs0:PRIMARY> use mongoshake
switched to db mongoshake
rs0:PRIMARY> show collections;
ckpt_default
ckpt_default_oplog
election
rs0:PRIMARY> db.election.find()
{ "_id" : ObjectId("5204af979955496907000001"), "pid" : 6545, "host" : "192.168.31.175", "heartbeat" : NumberLong(1582045562) }
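To make the election record easier to read, here is a small sketch that turns it into a summary of who the master is and how old its heartbeat is. It assumes the `heartbeat` field is a Unix timestamp in seconds, which the value suggests but which I have not verified against the MongoShake source. The function `describe_master` is a hypothetical helper.

```python
from datetime import datetime, timezone

# The election document from the shell output above, minus the ObjectId.
election = {"pid": 6545, "host": "192.168.31.175", "heartbeat": 1582045562}


def describe_master(doc, now_epoch):
    """Summarise the current master, assuming heartbeat is Unix epoch seconds."""
    beat = datetime.fromtimestamp(doc["heartbeat"], tz=timezone.utc)
    return {
        "host": doc["host"],
        "pid": doc["pid"],
        "heartbeat_utc": beat.isoformat(),
        "age_seconds": now_epoch - doc["heartbeat"],
    }


if __name__ == "__main__":
    import time
    print(describe_master(election, int(time.time())))
```

With a driver such as pymongo, the same document could be fetched with `client["mongoshake"]["election"].find_one()`. A heartbeat age that keeps growing would suggest that the recorded master is no longer alive.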

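I started three MongoShake instances in total, on 192.168.31.174, 192.168.31.175, and 192.168.31.176, and the election record shows that the process on 192.168.31.175 is the one currently doing the work. In my own test, I wrote data to MongoDB at a high rate and manually killed the collector process on 192.168.31.175. After 192.168.31.174 became master, I killed that one as well, leaving only the process on 192.168.31.176 running. When I tallied the data at the end, I found that some records had been collected more than once. My guess is that an instance was killed before it had a chance to write its checkpoint, so the next master presumably resumed from an older position and re-read those entries.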
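Since a failover can re-deliver data, the consumer side has to tolerate duplicates. One common approach is to drop messages whose identity has already been seen. The sketch below assumes each message can be reduced to a unique key. The fields `ts`, `ns`, and `id` are hypothetical placeholders, since the actual message layout written to Kafka is not shown in this post.

```python
def dedupe(messages, key_of):
    """Yield each message once, skipping any whose key was already seen."""
    seen = set()
    for msg in messages:
        key = key_of(msg)
        if key in seen:
            continue  # re-collected after a failover, skip it
        seen.add(key)
        yield msg


# Hypothetical messages; the second and third are the same entry delivered twice.
messages = [
    {"ts": 1, "ns": "test.users", "id": "a"},
    {"ts": 2, "ns": "test.users", "id": "b"},
    {"ts": 2, "ns": "test.users", "id": "b"},
    {"ts": 3, "ns": "test.users", "id": "c"},
]

if __name__ == "__main__":
    unique = list(dedupe(messages, lambda m: (m["ts"], m["ns"], m["id"])))
    print(len(messages), "received,", len(unique), "unique")
```

The in-memory set grows without bound, so for a long-running consumer it would need to be bounded or replaced by idempotent writes (for example, upserts keyed on the same fields) in the downstream store.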

When reposting, please cite the source: https://www.heiqu.com/wpxpzz.html
