Providing High Availability with Heartbeat Alone (2)

tar xf heartbeat\ 3.0.6.bz2
cd Heartbeat-3-0-958e11be8686/
./bootstrap
export CFLAGS="$CFLAGS -I/usr/local/heartbeat/include -L/usr/local/heartbeat/lib"
./configure --prefix=/usr/local/heartbeat --with-daemon-user=hacluster --with-daemon-group=haclient LIBS=/lib64/libuuid.so.1
make
make install
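
The configure line above names a daemon user and group (hacluster and haclient). A minimal sketch for confirming that both accounts exist before building is shown below. The function name ha_accounts_present is made up for this illustration, and the check assumes a Linux system where id and getent are available.

```shell
# Return 0 if the daemon user and group passed to configure both exist.
# Defaults match the names used on the configure line in this article.
ha_accounts_present() {
    user=${1:-hacluster}
    group=${2:-haclient}
    id -u "$user" >/dev/null 2>&1 || { echo "missing user: $user" >&2; return 1; }
    getent group "$group" >/dev/null 2>&1 || { echo "missing group: $group" >&2; return 1; }
}

# Example: only run configure when both accounts are present.
# ha_accounts_present && ./configure --prefix=/usr/local/heartbeat ...
```

If the function reports a missing account, create it before running configure so that the installed files get the expected ownership.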

If the build fails with one of the following errors:

[Error during configure:]
configure: error: in `/root/Heartbeat-3-0-958e11be8686':
configure: error: Core development headers were not found

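Fix: point the compiler at the headers and libraries installed under /usr/local/heartbeat, then re-run configure: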
export CFLAGS="$CFLAGS -I/usr/local/heartbeat/include -L/usr/local/heartbeat/lib"

[Error during make:]
/usr/local/heartbeat/include/heartbeat/glue_config.h:105:1: error: "HA_HBCONF_DIR" redefined
In file included from ../include/lha_internal.h:38,
                from strlcpy.c:1:
../include/config.h:390:1: error: this is the location of the previous definition
gmake[1]: *** [strlcpy.lo] Error 1
gmake[1]: Leaving directory `/root/Heartbeat-3-0-958e11be8686/replace'
make: *** [all-recursive] Error 1

Fix 1:
Delete line 105 of /usr/local/heartbeat/include/heartbeat/glue_config.h, which holds the duplicate definition:
#define HA_HBCONF_DIR "/etc/ha.d/"
sed -i '105d' /usr/local/heartbeat/include/heartbeat/glue_config.h

Fix 2: add an option to configure so that compiler warnings are not treated as fatal errors
./configure --prefix=/usr/local/heartbeat --with-daemon-user=hacluster --with-daemon-group=haclient LIBS=/lib64/libuuid.so.1 --enable-fatal-warnings=no
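
Fix 1 deletes by line number, which is only safe if line 105 really is the HA_HBCONF_DIR definition in your copy of the header. The sketch below checks that first and keeps a backup. The function name remove_dup_define is hypothetical, and `sed -i` is assumed to be GNU sed as found on Linux.

```shell
# Delete line 105 of the given header only if it defines HA_HBCONF_DIR.
# A .bak copy is kept next to the file so the edit can be undone.
remove_dup_define() {
    hdr=$1
    if sed -n '105p' "$hdr" | grep -q '^#define HA_HBCONF_DIR'; then
        cp -p "$hdr" "$hdr.bak"
        sed -i '105d' "$hdr"
    else
        echo "line 105 of $hdr is not the HA_HBCONF_DIR define; file left unchanged" >&2
        return 1
    fi
}

# Example, using the path from this article:
# remove_dup_define /usr/local/heartbeat/include/heartbeat/glue_config.h
```

If the function refuses to edit the file, look up the actual line with `grep -n HA_HBCONF_DIR` before deleting anything.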

(5). Post-build configuration.

mkdir -p /usr/local/heartbeat/usr/lib/ocf/lib/heartbeat
cp -a /usr/lib/ocf/lib/heartbeat/ocf-* /usr/local/heartbeat/usr/lib/ocf/lib/heartbeat/
ln -s /usr/local/heartbeat/lib64/heartbeat/plugins/RAExec/* /usr/local/heartbeat/lib/heartbeat/plugins/RAExec/
ln -s /usr/local/heartbeat/lib64/heartbeat/plugins/* /usr/local/heartbeat/lib/heartbeat/plugins/
ln -s /usr/local/heartbeat/share/heartbeat /usr/share/heartbeat

Provide the configuration files:

cd /usr/local/heartbeat
cp -a share/doc/heartbeat/{ha.cf,haresources,authkeys} etc/ha.d/
chmod 600 etc/ha.d/authkeys

Add heartbeat to the services started at boot:

chkconfig --add heartbeat
chkconfig --level 2345 heartbeat on

Set the PATH environment variable:

echo 'export PATH=/usr/local/heartbeat/sbin:/usr/local/heartbeat/bin:$PATH' >/etc/profile.d/ha.sh
chmod +x /etc/profile.d/ha.sh
source /etc/profile.d/ha.sh

Set the man page search path:

echo 'MANPATH /usr/local/heartbeat/share/man' >>/etc/man.config
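
3. Heartbeat configuration files
Heartbeat has three configuration files:

authkeys: the key file used to authenticate nodes to each other at the messaging layer, so that outside hosts cannot join the cluster at will (it must have mode 600);
ha.cf: the core Heartbeat configuration file;
haresources: the resource management configuration file.
They take effect from the /etc/ha.d/ directory, but initially none of the three exists there. Sample versions are shipped in /usr/share/doc/heartbeat-$version/ and can be copied into /etc/ha.d/.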

# The following applies to a yum install, not to the source build above
cp /usr/share/doc/heartbeat-3.0.4/{authkeys,ha.cf,haresources} /etc/ha.d/
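
The sample authkeys file is fully commented out, so it has to be filled in before heartbeat will start. Below is a minimal sketch that writes a two-line authkeys file (an `auth` line selecting key 1, then key 1 itself using sha1) with a random key and mode 600. The function name write_authkeys and the 20-byte key length are choices made for this illustration only.

```shell
# Write an authkeys file with a random sha1 key into the directory given as $1.
write_authkeys() {
    dir=$1
    # 20 random bytes rendered as 40 hex characters; any hard-to-guess string works
    key=$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')
    (
        umask 077    # never create the file world-readable, even briefly
        printf 'auth 1\n1 sha1 %s\n' "$key" > "$dir/authkeys"
    )
    chmod 600 "$dir/authkeys"
}

# Example: generate once, then copy the same file to every node.
# write_authkeys /etc/ha.d
```

Every node in the cluster must carry an identical authkeys file, so generate it on one node and copy it to the others rather than running the function on each.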

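3.1 The ha.cf configuration file
Part of ha.cf is shown below. The file looks long, but without pacemaker only a few directives need changing: node, bcast/mcast and auto_failback, and sometimes ping and the logging options. The file is read from top to bottom and the position of each directive matters, so it is generally best not to reorder them.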

#  If none of logfile/debugfile/logfacility is set, this is equivalent to "use_logd yes";
#  once use_logd is set to yes, any logfile/debugfile/logfacility settings are ignored
#
#      Note on logging:
#      If all of debugfile, logfile and logfacility are not defined,
#      logging is the same as use_logd yes. In other case, they are
#      respectively effective. if detering the logging to syslog,
#      logfacility must be "none".
#
#      File to write debug messages to
#debugfile /var/log/ha-debug
#
#
#      File to write other messages to
#
#logfile        /var/log/ha-log
#
#
#      Facility to use for syslog()/logger
#
logfacility    local0
#
#
#      A note on specifying "how long" times below...
#
#      The default time unit is seconds
#              10 means ten seconds
#
#      You can also specify them in milliseconds
#              1500ms means 1.5 seconds
#
#
#      keepalive: how long between heartbeats?
#  Interval between heartbeat messages; by default one is sent every 2 seconds
#keepalive 2
#
#      deadtime: how long-to-declare-host-dead?
#
#              If you set this too low you will get the problematic
#              split-brain (or cluster partition) problem.
#              See the FAQ for how to use warntime to tune deadtime.
#  If the standby node receives no heartbeat from the primary within 30 seconds, it declares the primary dead and takes over its resources
#deadtime 30
#
#      warntime: how long before issuing "late heartbeat" warning?
#      See the FAQ for how to use warntime to tune deadtime.
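#  Late-heartbeat threshold of 10 seconds: if the standby receives no heartbeat from the primary within 10 seconds, a warning is written to the log, but no failover takes place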
#warntime 10
#
#
#      Very first dead time (initdead)
#
#      On some machines/OSes, etc. the network takes a while to come up
#      and start working right after you've been rebooted.  As a result
#      we have a separate dead time for when things first come up.
#      It should be at least twice the normal dead time.
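#  Initial dead time: after the first heartbeat instance starts, it waits for the second one;
#  the HA services and the VIP are started only once the second node is up. If the second node
#  has not started within this time, it is declared dead and the HA services and VIP are then
#  started anyway. This is how long the two nodes wait for each other to form the cluster.
#  It should be at least twice deadtime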
#initdead 120
#
#
#      What UDP port to use for bcast/ucast communication?
#  Port used for heartbeat messages
#udpport        694
#
#      Baud rate for serial ports...
#  Heartbeats can be sent in two ways: over Ethernet (broadcast, multicast or unicast) or over a serial line. In heartbeat 3, baud is deprecated
#baud  19200
#
#      serial  serialportname ...
#serial /dev/ttyS0      # Linux
#serial /dev/cuaa0      # FreeBSD
#serial /dev/cuad0      # FreeBSD 6.x
#serial /dev/cua/a      # Solaris
#
#
#      What interfaces to broadcast heartbeats over?
#
#bcast  eth0            # Linux
#bcast  eth1 eth2      # Linux
#bcast  le0            # Solaris
#bcast  le1 le2        # Solaris
#
#      Set up a multicast heartbeat medium
#      mcast [dev] [mcast group] [port] [ttl] [loop]
#
#      [dev]          device to send/rcv heartbeats on
#      [mcast group]  multicast group to join (class D multicast address
#                      224.0.0.0 - 239.255.255.255)
#      [port]          udp port to sendto/rcvfrom (set this value to the
#                      same value as "udpport" above)
#      [ttl]          the ttl value for outbound heartbeats.  this effects
#                      how far the multicast packet will propagate.  (0-255)
#                      Must be greater than zero.
#      [loop]          toggles loopback for outbound multicast heartbeats.
#                      if enabled, an outbound packet will be looped back and
#                      received by the interface it was sent on. (0 or 1)
#                      Set this value to zero.
#
#
#mcast eth0 225.0.0.1 694 1 0
#
#      Set up a unicast / udp heartbeat medium
#      ucast [dev] [peer-ip-addr]
#
#      [dev]          device to send/rcv heartbeats on
#      [peer-ip-addr]  IP address of peer to send packets to
#
#  Unicast heartbeat; the address of the peer's heartbeat interface must be given
#ucast eth0 192.168.1.2
#
#
#      About boolean values...
#
#      Any of the following case-insensitive values will work for true:
#              true, on, yes, y, 1
#      Any of the following case-insensitive values will work for false:
#              false, off, no, n, 0
#
#
#
#      auto_failback:  determines whether a resource will
#      automatically fail back to its "primary" node, or remain
#      on whatever node is serving it until that node fails, or
#      an administrator intervenes.
#
#      The possible values for auto_failback are:
#              on      - enable automatic failbacks
#              off    - disable automatic failbacks
#              legacy  - enable automatic failbacks in systems
#                      where all nodes do not yet support
#                      the auto_failback option.
#
#      auto_failback "on" and "off" are backwards compatible with the old
#              "nice_failback on" setting.
#
#      See the FAQ for information on how to convert
#              from "legacy" to "on" without a flash cut.
#              (i.e., using a "rolling upgrade" process)
#
#      The default value for auto_failback is "legacy", which
#      will issue a warning at startup.  So, make sure you put
#      an auto_failback directive in your ha.cf file.
#      (note: auto_failback can be any boolean or "legacy")
#  Whether the primary node automatically takes its services back after it recovers and comes online again
auto_failback on
#
#  The following directives concern fence devices
#      Basic STONITH support
#      Using this directive assumes that there is one stonith
#      device in the cluster.  Parameters to this device are
#      read from a configuration file. The format of this line is:
#
#        stonith <stonith_type> <configfile>
#
#      NOTE: it is up to you to maintain this file on each node in the
#      cluster!
#
#stonith baytech /etc/ha.d/conf/stonith.baytech
#
#      STONITH support
#      You can configure multiple stonith devices using this directive.
#      The format of the line is:
#        stonith_host <hostfrom> <stonith_type> <params...>
#        <hostfrom> is the machine the stonith device is attached
#              to or * to mean it is accessible from any host.
#        <stonith_type> is the type of stonith device (a list of
#              supported drives is in /usr/lib/stonith.)
#        <params...> are driver specific parameters.  To see the
#              format for a particular device, run:
#          stonith -l -t <stonith_type>
#
#
#      Note that if you put your stonith device access information in
#      here, and you make this file publically readable, you're asking
#      for a denial of service attack ;-)
#
#      To get a list of supported stonith devices, run
#              stonith -L
#      For detailed information on which stonith devices are supported
#      and their detailed configuration options, run this command:
#              stonith -h
#
#stonith_host *    baytech 10.0.0.3 mylogin mysecretpassword
#stonith_host ken3  rps10 /dev/ttyS1 kathy 0
#stonith_host kathy rps10 /dev/ttyS1 ken3 0
#
#
#  The watchdog is a timer: if this node's own heartbeat stops for 60 seconds, the node reboots
#      Watchdog is the watchdog timer.  If our own heart doesn't beat for
#      a minute, then our machine will reboot.
#      NOTE: If you are using the software watchdog, you very likely
#      wish to load the module with the parameter "nowayout=0" or
#      compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
#      an orderly shutdown of heartbeat will trigger a reboot, which is
#      very likely NOT what you want.
#
#watchdog /dev/watchdog  # watchdog fence device; the software watchdog built into Linux
#     
#      Tell what machines are in the cluster
#      node    nodename ...    -- must match uname -n
#  node entries are mandatory and must match the output of uname -n
#node  ken3
#node  kathy
#
#      Less common options...
#
#      Treats 10.10.10.254 as a psuedo-cluster-member
#      Used together with ipfail below...
#      note: don't use a cluster node as ping node
#
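#  ping checks this node's outbound network connectivity against a reference IP and works together with the ipfail process; when the reference IP cannot be reached, this node is taken out of service.
#  ping_group checks connectivity against a group of IPs, so that a fault on a reference host is not mistaken for a fault on this node:
#  the node considers itself faulty only when every IP in the group is unreachable. Use either ping or ping_group, not both.
#  Do not use a cluster node as the ping reference IP; the usual choice is the gateway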
#ping 10.10.10.254
#ping_group group1 172.16.103.254 172.16.103.212

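#  respawn names a process that is started and stopped together with heartbeat and restarted if it exits; here it runs ipfail, which the ping directives above rely on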
#respawn hacluster  /usr/local/lib/heartbeat/ipfail
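
With the comments stripped away, the handful of directives discussed in this section fit in a dozen lines. The sketch below writes such a minimal ha.cf, keeping the directives in the same order as the sample file. The function name write_minimal_hacf and the hostnames in the usage line are hypothetical; the interface eth0 and the timer values are simply the ones shown in the sample and may not suit your network.

```shell
# Write a minimal ha.cf to the path given as $1; remaining arguments are node names.
# Directive order follows the sample file: logging, timers, transport, failback, nodes.
write_minimal_hacf() {
    out=$1
    shift
    {
        cat <<'EOF'
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth0
auto_failback on
EOF
        # One node line per cluster member; each name must match `uname -n` on that host.
        for n in "$@"; do
            printf 'node %s\n' "$n"
        done
    } > "$out"
}

# Example with hypothetical hostnames:
# write_minimal_hacf /etc/ha.d/ha.cf node1.example.com node2.example.com
```

Writing the file from a function rather than editing the sample in place makes it easy to produce identical copies for every node, which is what heartbeat expects apart from node-specific lines such as ucast.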
