Building a Real-Time Log Analysis System with Flume + Kafka + Storm

Kafka-Storm Integration and Deployment

1. Installing and Using Flume
    Download the Flume binary package.
    Extract it: $ tar -xzvf apache-flume-1.5.2-bin.tar.gz -C /opt/flume
    Flume's configuration files live in the conf directory, and its executables in the bin directory.
    1) Configure Flume
    Go into the conf directory, make a copy of flume-conf.properties.template, and name it whatever you like:
    $ cp flume-conf.properties.template flume.conf
    Edit flume.conf. Here we use an exec source, a memory channel, and a file_roll sink to write the channel's data to disk. The configuration is as follows:

agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = loggerSink

# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = exec
agent.sources.seqGenSrc.command = tail -F /data/mongodata/mongo.log
#agent.sources.seqGenSrc.bind = 172.168.49.130

# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel

# Each sink's type must be defined
agent.sinks.loggerSink.type = file_roll
agent.sinks.loggerSink.sink.directory = /data/flume

#Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel

# Each channel's type is defined.
agent.channels.memoryChannel.type = memory

# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent.channels.memoryChannel.capacity = 1000
agent.channels.memoryChannel.transactionCapacity = 100
    2) Run the Flume agent
    Switch to the bin directory and run the following command (-n names the agent defined in the config file):
    $ ./flume-ng agent --conf ../conf -f ../conf/flume.conf -n agent -Dflume.root.logger=INFO,console
    The rolled log files then show up in the /data/flume directory.
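    As a quick check, run $ ls /data/flume while the agent is running: with the default settings the file_roll sink rolls a new file every 30 seconds (tunable via agent.sinks.loggerSink.sink.rollInterval), so the file count should grow over time.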

2. Integrating with Kafka
    Flume 1.5.2 does not ship with a Kafka sink, so you need to develop your own.
    The Kafka sink in Flume 1.6 is a useful reference, but pay attention to the Kafka version you build against, since some Kafka APIs are incompatible across versions.
    Only the core code is shown here, the body of process(); a sketch of the surrounding sink class follows the excerpt.

Sink.Status status = Status.READY;

Channel ch = getChannel();
Transaction transaction = null;
Event event = null;
String eventTopic = null;
String eventKey = null;

try {
    transaction = ch.getTransaction();
    transaction.begin();
    messageList.clear();

    if (type.equals("sync")) {
        // Synchronous mode: take a single event and send it right away.
        event = ch.take();

        if (event != null) {
            byte[] tempBody = event.getBody();
            String eventBody = new String(tempBody, "UTF-8");
            Map<String, String> headers = event.getHeaders();

            // A per-event header may override the topic; otherwise fall
            // back to the topic configured on the sink.
            if ((eventTopic = headers.get(TOPIC_HDR)) == null) {
                eventTopic = topic;
            }

            eventKey = headers.get(KEY_HDR);

            if (logger.isDebugEnabled()) {
                logger.debug("{Event} " + eventTopic + " : " + eventKey
                        + " : " + eventBody);
            }

            ProducerData<String, Message> data =
                    new ProducerData<String, Message>(eventTopic, new Message(tempBody));

            long startTime = System.nanoTime();
            producer.send(data);
            long endTime = System.nanoTime();
        }
    } else {
        // Batch mode: drain up to batchSize events from the channel before
        // sending them as one batch.
        long processedEvents = 0;
        for (; processedEvents < batchSize; processedEvents += 1) {
            event = ch.take();

            if (event == null) {
                break;
            }

            byte[] tempBody = event.getBody();
            String eventBody = new String(tempBody, "UTF-8");
            Map<String, String> headers = event.getHeaders();

            if ((eventTopic = headers.get(TOPIC_HDR)) == null) {
                eventTopic = topic;
            }

            eventKey = headers.get(KEY_HDR);

            // The original excerpt stops here. The loop body would go on to
            // buffer each event in messageList, which is then sent with
            // producer.send(...) after the loop, mirroring the sync branch.
        }
    }

    // Assumed (not part of the original excerpt): the standard Flume sink
    // transaction pattern, commit on success, roll back and back off on error.
    transaction.commit();
} catch (Exception ex) {
    if (transaction != null) {
        transaction.rollback();
    }
    status = Status.BACKOFF;
} finally {
    if (transaction != null) {
        transaction.close();
    }
}

return status;
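
    For context, below is a minimal sketch of the sink class that a process() body like the one above would live in. This is an assumption-laden illustration, not the article's code: the class name, the property names ("topic", "type", "batchSize"), the header names, and the producer settings are all hypothetical, and the imports reflect the Kafka 0.7-era producer API that the ProducerData/Message calls above imply.

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.javaapi.producer.ProducerData;
import kafka.message.Message;
import kafka.producer.ProducerConfig;

import org.apache.flume.Context;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;

// Hypothetical skeleton: names and defaults are illustrative assumptions.
public class KafkaSink extends AbstractSink implements Configurable {

    // Event headers that allow per-event routing, as used in process().
    public static final String TOPIC_HDR = "topic";
    public static final String KEY_HDR = "key";

    private String topic;       // default topic when the event has no topic header
    private String type;        // "sync" for per-event sends, anything else batches
    private int batchSize;      // max events drained per transaction in batch mode
    private Producer<String, Message> producer;
    private final List<ProducerData<String, Message>> messageList =
            new ArrayList<ProducerData<String, Message>>();

    @Override
    public void configure(Context context) {
        // Read the sink's settings from flume.conf (defaults are assumptions).
        topic = context.getString("topic", "flume-logs");
        type = context.getString("type", "sync");
        batchSize = context.getInteger("batchSize", 100);
    }

    @Override
    public synchronized void start() {
        // Kafka 0.7-style producer configuration: brokers are discovered
        // through ZooKeeper, and event bodies are sent as raw bytes.
        Properties props = new Properties();
        props.put("zk.connect", "127.0.0.1:2181");
        props.put("serializer.class", "kafka.serializer.DefaultEncoder");
        producer = new Producer<String, Message>(new ProducerConfig(props));
        super.start();
    }

    @Override
    public synchronized void stop() {
        producer.close();
        super.stop();
    }

    @Override
    public Status process() throws EventDeliveryException {
        // ... the process() body shown above goes here ...
        return Status.READY;
    }
}

    Build the sink into a jar, drop it on Flume's classpath (for example the lib directory), and point the sink type in flume.conf at the fully qualified class name; messages arriving in Kafka can then be spot-checked with the console consumer shipped with Kafka.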
