Notes on configuring DRBD + HeartBeat + NFS on CentOS 6.3

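----------Small talk-----------

First, thanks to 酒哥 for his book 构建高可用的Linux服务器 (Building Highly Available Linux Servers). Reading it and following its configurations made the DRBD+HeartBeat+NFS approach much clearer to me. One gripe before starting: installing through yum turned out to be a real trap, so from now on, unless yum is strictly required, I will install from source packages wherever possible.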

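----------Getting started-----------

OS version: CentOS 6.3 x64 (kernel 2.6.32)

DRBD: DRBD-8.4.3

HeartBeat: from the EPEL repository (a real trap)

NFS: bundled with the system

HeartBeat VIP: 192.168.7.90

Primary DRBD+HeartBeat: 192.168.7.88 (drbd1.example.com)

Secondary DRBD+HeartBeat: 192.168.7.89 (drbd2.example.com)

(Primary) marks steps performed on the primary server only

(Secondary) marks steps performed on the secondary server only

(Primary, Secondary) marks steps performed on both servers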

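I. DRBD configuration: covered in a separate post.

II. Heartbeat configuration:

This part continues from the system environment and installation set up in the DRBD part:

1. Install Heartbeat (CentOS 6.3 does not ship a Heartbeat package by default, so it has to be downloaded from a third-party repository) (Primary, Secondary)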

# wget ftp://mirror.switch.ch/pool/1/mirror/scientificlinux/6rolling/i386/os/Packages/epel-release-6-5.noarch.rpm

# rpm -Uvh epel-release-6-5.noarch.rpm

# yum --enablerepo=epel install heartbeat -y

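2. Configure Heartbeat

(Primary)

# vi /etc/ha.d/ha.cf

---------------

# Logging
logfile /var/log/ha-log
logfacility local0

# Heartbeat interval (seconds)
keepalive 2

# Time (seconds) without a heartbeat before the peer is declared dead
deadtime 5

# IP address of the peer node
ucast eth0 192.168.7.89

# With auto_failback off, resources stay on the node that currently holds them
# after the failed primary recovers; set it to on to have the primary take them back
auto_failback off

# Define the cluster nodes
node drbd1.example.com drbd2.example.com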

---------------

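(Secondary)

# vi /etc/ha.d/ha.cf

---------------

# Logging
logfile /var/log/ha-log
logfacility local0

# Heartbeat interval (seconds)
keepalive 2

# Time (seconds) without a heartbeat before the peer is declared dead
deadtime 5

# IP address of the peer node
ucast eth0 192.168.7.88

# With auto_failback off, resources stay on the node that currently holds them
# after the failed primary recovers; set it to on to have the primary take them back
auto_failback off

# Define the cluster nodes
node drbd1.example.com drbd2.example.com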

---------------
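The two ha.cf files above differ only in the ucast line, so they can be generated from one template. The following is a minimal sketch, not part of the original setup: the function names render_ha_cf and node_listed are made up for illustration. The second helper relies on the Heartbeat requirement that the names in the node directive match the output of uname -n on each machine.

```shell
#!/bin/bash
# Hypothetical helper: print an ha.cf like the ones above for a given peer IP.
# Usage: render_ha_cf PEER_IP [IFACE]
render_ha_cf() {
    local peer_ip="$1" iface="${2:-eth0}"
    cat <<EOF
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast ${iface} ${peer_ip}
auto_failback off
node drbd1.example.com drbd2.example.com
EOF
}

# Hypothetical helper: succeed if NAME appears in a "node" directive of FILE.
# Usage: node_listed FILE [NAME]   (NAME defaults to the output of uname -n)
node_listed() {
    local file="$1" name="${2:-$(uname -n)}"
    awk -v n="$name" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }' "$file"
}
```

On the primary this would be used as render_ha_cf 192.168.7.89 > /etc/ha.d/ha.cf, on the secondary with 192.168.7.88, and node_listed /etc/ha.d/ha.cf can then confirm that the local host name is one of the configured nodes.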

Edit the authentication file shared by the two nodes: (Primary, Secondary)

# vi /etc/ha.d/authkeys

--------------

auth 1

1 crc

--------------

# chmod 600 /etc/ha.d/authkeys
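The crc method above only detects corrupted packets and does not authenticate the peer. Heartbeat's authkeys also accepts sha1 (and md5) with a shared key. As a hedged alternative, and not what this setup uses, a key file could be generated along these lines; the function name write_authkeys is hypothetical, and the generated file must be copied unchanged to the other node so that both share the same key.

```shell
#!/bin/bash
# Hypothetical helper: write an authkeys file that uses sha1 with a random key.
# Usage: write_authkeys FILE
write_authkeys() {
    local file="$1" key
    # 16 random bytes rendered as 32 hex characters
    key="$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')"
    # Create the file with restrictive permissions from the start
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" )
    chmod 600 "$file"
}
```

For example, write_authkeys /etc/ha.d/authkeys on one node, followed by copying the file to the same path on the other node.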

Edit the cluster resource file: (Primary, Secondary)

# vi /etc/ha.d/haresources

--------------

drbd1.example.com IPaddr::192.168.7.90/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext4 killnfsd

--------------

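The first field of that line is the node that normally owns the resources; the remaining fields are resource scripts with their arguments separated by ::, started from left to right and stopped in reverse order. The scripts referenced in this file, such as IPaddr and Filesystem, live under /etc/ha.d/resource.d/.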
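Since every resource named in haresources has to exist as a script, it can help to check that before starting Heartbeat. The sketch below is only an illustration with made-up function names (list_resource_scripts, missing_resource_scripts). It makes two simplifying assumptions: it looks in a single directory, although Heartbeat also searches /etc/init.d, and it does not handle the shorthand of writing a bare IP address instead of IPaddr::.

```shell
#!/bin/bash
# Hypothetical helper: print the script name of every resource in a haresources file.
# Usage: list_resource_scripts FILE
list_resource_scripts() {
    local file="$1"
    # Skip comments and blank lines; field 1 is the node, the rest are resources.
    awk '!/^[ \t]*(#|$)/ {
             for (i = 2; i <= NF; i++) { split($i, a, "::"); print a[1] }
         }' "$file"
}

# Hypothetical helper: report resource scripts that are not executable files in DIR.
# Usage: missing_resource_scripts FILE DIR   (returns 1 if anything is missing)
missing_resource_scripts() {
    local file="$1" dir="$2" name rc=0
    for name in $(list_resource_scripts "$file"); do
        if [ ! -x "$dir/$name" ]; then
            echo "missing: $dir/$name"
            rc=1
        fi
    done
    return "$rc"
}
```

Running missing_resource_scripts /etc/ha.d/haresources /etc/ha.d/resource.d prints one line per script it cannot find in that directory.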

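Create the script file killnfsd, which restarts the NFS service:

Note: after the NFS service fails over, the directory exported over NFS has to be mounted again, otherwise errors are reported.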

# vi /etc/ha.d/resource.d/killnfsd

-----------------

killall -9 nfsd; /etc/init.d/nfs restart;exit 0

-----------------

Make it executable:

# chmod 755 /etc/ha.d/resource.d/killnfsd

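Create the DRBD script file drbddisk: (Primary, Secondary)

Note:

This is another big pitfall. Anyone unfamiliar with Heartbeat's directory layout can get stuck here for a long time, because a default yum install of Heartbeat does not create a drbddisk script in /etc/ha.d/resource.d/, and the file cannot be found anywhere else on the local system after installation either.

I ran into this myself: after starting Heartbeat I could not ping the virtual IP. Checking the /var/log/ha-log log eventually turned up this line:

ERROR: Cannot locate resource script drbddisk

Looking in /etc/ha.d/resource.d/ then confirmed that there really was no drbddisk script. I finally found the code via Google, created the script, and the test passed at last: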

# vi /etc/ha.d/resource.d/drbddisk

-----------------------

#!/bin/bash
#
# This script is intended to be used as resource script by heartbeat
#
# Copyright 2003-2008 LINBIT Information Technologies
# Philipp Reisner, Lars Ellenberg
#
###

DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"

if [ -f $DEFAULTFILE ]; then
    . $DEFAULTFILE
fi

if [ "$#" -eq 2 ]; then
    RES="$1"
    CMD="$2"
else
    RES="all"
    CMD="$1"
fi

## EXIT CODES
# since this is a "legacy heartbeat R1 resource agent" script,
# exit codes actually do not matter that much as long as we conform to
#
# but it does not hurt to conform to lsb init-script exit codes,
# where we can.
#
#LSB-Core-generic/LSB-Core-generic/iniscrptact.html
####

drbd_set_role_from_proc_drbd()
{
    local out
    if ! test -e /proc/drbd; then
        ROLE="Unconfigured"
        return
    fi

    dev=$( $DRBDADM sh-dev $RES )
    minor=${dev#/dev/drbd}
    if [[ $minor = *[!0-9]* ]] ; then
        # sh-minor is only supported since drbd 8.3.1
        minor=$( $DRBDADM sh-minor $RES )
    fi
    if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
        ROLE=Unknown
        return
    fi

    if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
        set -- $out
        ROLE=${5%/**}
        : ${ROLE:=Unconfigured} # if it does not show up
    else
        ROLE=Unknown
    fi
}

case "$CMD" in
    start)
        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time
        try=6
        while true; do
            $DRBDADM primary $RES && break
            let "--try" || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    stop)
        # heartbeat (haresources mode) will retry failed stop
        # for a number of times in addition to this internal retry.
        try=3
        while true; do
            $DRBDADM secondary $RES && break
            # We used to lie here, and pretend success for anything != 11,
            # to avoid the reboot on failed stop recovery for "simple
            # config errors" and such. But that is incorrect.
            # Don't lie to your cluster manager.
            # And don't do config errors...
            let --try || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    status)
        if [ "$RES" = "all" ]; then
            echo "A resource name is required for status inquiries."
            exit 10
        fi
        ST=$( $DRBDADM role $RES )
        ROLE=${ST%/**}
        case $ROLE in
            Primary|Secondary|Unconfigured)
                # expected
                ;;
            *)
                # unexpected. whatever...
                # If we are unsure about the state of a resource, we need to
                # report it as possibly running, so heartbeat can, after failed
                # stop, do a recovery by reboot.
                # drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is
                # suddenly readonly. So we retry by parsing /proc/drbd.
                drbd_set_role_from_proc_drbd
        esac
        case $ROLE in
            Primary)
                echo "running (Primary)"
                exit 0 # LSB status "service is OK"
                ;;
            Secondary|Unconfigured)
                echo "stopped ($ROLE)"
                exit 3 # LSB status "service is not running"
                ;;
            *)
                # NOTE the "running" in below message.
                # this is a "heartbeat" resource script,
                # the exit code is _ignored_.
                echo "cannot determine status, may be running ($ROLE)"
                exit 4 # LSB status "service status is unknown"
                ;;
        esac
        ;;
    *)
        echo "Usage: drbddisk [resource] {start|stop|status}"
        exit 1
        ;;
esac

exit 0

-----------------------

Make it executable:

# chmod 755 /etc/ha.d/resource.d/drbddisk
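The status branch of drbddisk works out the local role in two ways: by trimming the output of drbdadm role, and, as a fallback, by picking the fifth field out of the matching /proc/drbd line once the colons have been turned into spaces. The sketch below isolates both steps so they can be tried without a running DRBD. The function names are made up, and the pattern is written as /* rather than the script's /**, which gives the same result for a two-part value such as Primary/Secondary.

```shell
#!/bin/bash
# Hypothetical helper: local role from "drbdadm role" style output (Local/Peer).
# Usage: role_from_status STATUS
role_from_status() {
    local st="$1"
    # Strip the shortest suffix that starts with "/", leaving the local role
    echo "${st%/*}"
}

# Hypothetical helper: local role for a minor number, reading /proc/drbd text on stdin.
# Usage: role_from_proc_drbd MINOR < /proc/drbd
role_from_proc_drbd() {
    local minor="$1" out
    # Take the first line for this minor and turn every ":" into a space
    out="$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }")"
    # Word-split into positional parameters: minor, cs, <state>, ro, <roles>, ...
    set -- $out
    echo "${5%/*}"
}
```

A line such as " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----" becomes the words 0, cs, Connected, ro, Primary/Secondary and so on, so the fifth word carries both roles and the trimming leaves the local one.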

Start the HeartBeat service on both nodes, starting with the Primary: (Primary, Secondary)

# service heartbeat start

# chkconfig heartbeat on
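Once Heartbeat is running, the node holding the resources should have the VIP 192.168.7.90 on its interface and /dev/drbd0 mounted on /data. A rough way to check both is sketched below; the helper names are hypothetical, and both functions read their input on stdin so they only parse text and need no special privileges.

```shell
#!/bin/bash
# Hypothetical helper: succeed if VIP appears as an IPv4 address in
# "ip -o -4 addr show" output read from stdin.
# Usage: ip -o -4 addr show | vip_present VIP
vip_present() {
    local vip="$1"
    awk -v want="$vip" '
        { for (i = 1; i < NF; i++)
              if ($i == "inet") { split($(i + 1), a, "/"); if (a[1] == want) found = 1 } }
        END { exit found ? 0 : 1 }'
}

# Hypothetical helper: succeed if DEV is mounted on DIR according to
# /proc/mounts-format text read from stdin.
# Usage: mounted_on DEV DIR < /proc/mounts
mounted_on() {
    local dev="$1" dir="$2"
    awk -v d="$dev" -v m="$dir" '
        $1 == d && $2 == m { found = 1 }
        END { exit found ? 0 : 1 }'
}

# Example on the active node:
#   ip -o -4 addr show | vip_present 192.168.7.90 && echo "VIP is up"
#   mounted_on /dev/drbd0 /data < /proc/mounts && echo "/data is mounted"
```

Running the same two checks on the standby node should fail for both, which is a quick way to see which machine currently owns the resources.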
