对日志文件中的error进行监控,当日志文件中出现error关键字时,就截取日志(grep -i error 不区分大小写进行搜索"error"关键字,但是会将包含error大小写字符的单词搜索出来),大家可以去看这编 文章
1)第一类日志
在每天的日志目录下生产的error日志,此日志文件每天都会自动生成,里面有没有error日志内容不一定,日志内容写入不频繁,日志文件比较小。
举例说明:
采用sendemail发送告警邮件,sendemail安装参考:http://www.cnblogs.com/kevingrace/p/5961861.html
监控脚本路径: [root@fk-databus01 ~]# cd /opt/log_error_script/ [root@fk-databus01 log_error_script]# ll total 20 -rw-r--r-- 1 root root 3782 Jun 29 12:13 DEJ_0001_error.log -rwxr-xr-x 1 root root 4274 Jun 29 11:38 prcc_log_error.sh -rwxr-xr-x 1 root root 1142 Feb 13 10:51 sendemail.sh 监控脚本内容 [root@fk-databus01 log_error_script]# cat prcc_log_error.sh #!/bin/sh ERROR_LOG=`/bin/ls /data/log/sedsb/$(date +%Y%m%d)/DEJ_0001_error*` ERROR_NEW_LOG=http://www.likecs.com/opt/log_error_script/DEJ_0001_error.log DATE=`date +%Y年%m月%d日%H时%M分%S秒` HOST=`/bin/hostname` IP=`/sbin/ifconfig|grep "inet addr"|grep "Bcast"|cut -d":" -f2|awk -F" " \'{print $1}\'` ERROR_MESSAGE=$(/bin/grep -A20 "$(grep "ERROR" $ERROR_LOG|tail -1|awk \'{print $1,$2,$3,$4}\')" $ERROR_LOG) DIR=http://www.likecs.com/data/log/sedsb/$(date +%Y%m%d) FILE=http://www.likecs.com/data/log/sedsb/$(date +%Y%m%d)/DEJ_0001_error_$(date +%Y%m%d).0.log if [ ! -d $DIR ];then /bin/mkdir $DIR fi if [ ! -f $FILE ];then /bin/touch $FILE fi /bin/chown -R zquser.zquser $DIR sleep 3 if [ ! -f $ERROR_NEW_LOG ];then /bin/touch $ERROR_NEW_LOG fi NUM1=$(/bin/cat $ERROR_LOG|wc -l) NUM2=$(/bin/cat $ERROR_NEW_LOG|wc -l) if [ -f $ERROR_LOG ] && [ $NUM1 -ne 0 ] && [ $NUM2 -eq 0 ];then /bin/bash /opt/log_error_script/sendemail.sh wangshibo@kevin.com "风控系统${HOSTNAME}机器prcc服务日志的error监控" "告警主机:${HOSTNAME} \n告警IP:${IP} \n告警时间:${DATE} \n告警等级:严重,抓紧解决啊! \n告警人员:王士博 \n告警详情:prcc服务日志中出现error了! \n告警日志文件:${ERROR_LOG} \n当前状态: PROBLEM \n \nerror信息:\n$ERROR_MESSAGE" /bin/cat $ERROR_LOG > $ERROR_NEW_LOG fi /usr/bin/cmp $ERROR_LOG $ERROR_NEW_LOG >/dev/null 2>&1 if [ $? -ne 0 ];then /bin/bash /opt/log_error_script/sendemail.sh wangshibo@kevin.com "风控系统${HOSTNAME}机器prcc服务日志的error监控" "告警主机:${HOSTNAME} \n告警IP:${IP} \n告警时间:${DATE} \n告警等级:严重,抓紧解决啊! \n告警人员:王士博 \n告警详情:prcc服务日志中出现error了! \n告警日志文件:${ERROR_LOG} \n当前状态: PROBLEM \n \nerror信息:\n$ERROR_MESSAGE" /bin/cat $ERROR_LOG > $ERROR_NEW_LOG fi 结合crontab进行定时监控(每15秒执行一次) [root@fk-databus01 ~]# crontab -l #监控pcrr日志的error * * * * * /bin/bash -x /opt/log_error_script/prcc_log_error.sh >/dev/null 2>&1 * * * * * sleep 15;/bin/bash -x /opt/log_error_script/prcc_log_error.sh >/dev/null 2>&1 * * * * * sleep 30;/bin/bash -x /opt/log_error_script/prcc_log_error.sh >/dev/null 2>&1 * * * * * sleep 45;/bin/bash -x /opt/log_error_script/prcc_log_error.sh >/dev/null 2>&1 ================================================================================== 针对上面脚本中的某些变量说明 [root@fk-databus01 ~]# /bin/ls /data/log/sedsb/$(date +%Y%m%d)/DEJ_0001_error* /data/log/sedsb/20180629/DEJ_0001_error_20180629.0.log [root@fk-databus01 ~]# grep "ERROR" /data/log/sedsb/20180629/DEJ_0001_error_20180629.0.log ERROR DEJ 2018-06-29 12:13:29.191 [pool-4-thread-10] n.s.p.r.thread.OuterCheThdInterface - cx201806291213288440016车300接口异常! ERROR DEJ 2018-06-29 12:13:29.196 [nioEventLoopGroup-3-12] n.s.p.r.c.MessageControllerImpl - cx201806291213288440016: [root@fk-databus01 ~]# grep "ERROR" /data/log/sedsb/20180629/DEJ_0001_error_20180629.0.log |tail -1|awk \'{print $1,$2,$3,$4}\' ERROR DEJ 2018-06-29 12:13:29.196 [root@fk-databus01 ~]# /bin/grep -A20 "$(grep "ERROR" /data/log/sedsb/20180629/DEJ_0001_error_20180629.0.log |tail -1|awk \'{print $1,$2,$3,$4}\')" /data/log/sedsb/20180629/DEJ_0001_error_20180629.0.log ERROR DEJ 2018-06-29 12:13:29.196 [nioEventLoopGroup-3-12] n.s.p.r.c.MessageControllerImpl - cx201806291213288440016: net.sinocredit.pre.rcc.utils.exception.OuterDataException: 外部数据:cheFixPrice:mile里程 is null; at net.sinocredit.pre.rcc.datafactory.OuterDataProcess.execute(OuterDataProcess.java:51) at net.sinocredit.pre.rcc.datafactory.OuterDataProcess.execute(OuterDataProcess.java:23) at net.sinocredit.pre.rcc.service.getOtherDataService.MessageServiceImpl.getOrderData(MessageServiceImpl.java:34) at net.sinocredit.pre.rcc.controller.MessageControllerImpl.divMessage(MessageControllerImpl.java:110) at net.sinocredit.pre.rcc.handler.ServerHandler.channelRead(ServerHandler.java:28) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:351) at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:351) at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:351) at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373)