From the configuration above, the nodes with ids 0, 1, 2, and 3 correspond to server0, server1, server2, and server3 respectively. In the earlier example, partition 0 is distributed across the nodes with ids 0, 1, and 2, so a kafka-0 folder appears on server0, server1, and server2. That kafka-0 folder holds partition 0 of the topic kafka.
Part 2: Integrating Kafka with Log4j
First, take a look at the Maven project structure.
As a demo, there are only a few files. Start with the jar dependencies pulled in by pom.xml:
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.9.2</artifactId>
    <version>0.8.2.1</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.8.2.1</version>
</dependency>
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>18.0</version>
</dependency>
The important piece is log4j.properties:
log4j.rootLogger=INFO,console
# for package com.demo.kafka, log would be sent to kafka appender.
log4j.logger.com.demo.kafka=DEBUG,kafka
# appender kafka
log4j.appender.kafka=kafka.producer.KafkaLog4jAppender
log4j.appender.kafka.topic=kafka
# multiple brokers are separated by comma ",".
log4j.appender.kafka.brokerList=localhost:9092,localhost:9093,localhost:9094,localhost:9095
log4j.appender.kafka.compressionType=none
log4j.appender.kafka.syncSend=true
log4j.appender.kafka.layout=org.apache.log4j.PatternLayout
log4j.appender.kafka.layout.ConversionPattern=%d [%-5p] [%t] - [%l] %m%n
# appender console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d [%-5p] [%t] - [%l] %m%n
App.java is straightforward; it simply writes log messages through log4j:
package com.demo.kafka;

import org.apache.log4j.Logger;

public class App {
    private static final Logger LOGGER = Logger.getLogger(App.class);

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 20; i++) {
            LOGGER.info("Info [" + i + "]");
            Thread.sleep(1000);
        }
    }
}
MyConsumer.java consumes the messages from Kafka:
package com.demo.kafka;

import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import com.google.common.collect.ImmutableMap;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class MyConsumer {
    private static final String ZOOKEEPER = "localhost:2181";
    // The group name can be anything: every consumer group gets its own
    // complete copy of each message in the topic.
    private static final String GROUP_NAME = "test_group";
    private static final String TOPIC_NAME = "kafka";
    private static final int CONSUMER_NUM = 4;
    private static final int PARTITION_NUM = 4;

    public static void main(String[] args) {
        // specify some consumer properties
        Properties props = new Properties();
        props.put("zookeeper.connect", ZOOKEEPER);
        props.put("zookeeper.connectiontimeout.ms", "1000000");
        props.put("group.id", GROUP_NAME);

        // create the connection to the cluster
        ConsumerConfig consumerConfig = new ConsumerConfig(props);
        ConsumerConnector consumerConnector =
                Consumer.createJavaConsumerConnector(consumerConfig);

        // create 4 streams for the topic "kafka", one per partition,
        // so that 4 threads can consume in parallel
        Map<String, List<KafkaStream<byte[], byte[]>>> topicMessageStreams =
                consumerConnector.createMessageStreams(
                        ImmutableMap.of(TOPIC_NAME, PARTITION_NUM));
        List<KafkaStream<byte[], byte[]>> streams = topicMessageStreams.get(TOPIC_NAME);

        // a fixed pool of 4 threads, one per stream
        ExecutorService executor = Executors.newFixedThreadPool(CONSUMER_NUM);

        // consume the messages in the threads
        for (final KafkaStream<byte[], byte[]> stream : streams) {
            executor.submit(new Runnable() {
                public void run() {
                    for (MessageAndMetadata<byte[], byte[]> msgAndMetadata : stream) {
                        // process message (msgAndMetadata.message())
                        System.out.println(new String(msgAndMetadata.message()));
                    }
                }
            });
        }
    }
}
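The fan-out pattern above, one worker thread per KafkaStream, can be sketched without a running cluster. In the following self-contained sketch a BlockingQueue stands in for each KafkaStream and a sentinel string marks stream exhaustion; the class name FanOutSketch and the sentinel are illustrative assumptions, not part of the Kafka API:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class FanOutSketch {
    private static final String SENTINEL = "EOF"; // marks the end of a simulated stream

    public static void main(String[] args) throws InterruptedException {
        final int streamCount = 4; // mirrors PARTITION_NUM in MyConsumer
        ExecutorService executor = Executors.newFixedThreadPool(streamCount);
        final CountDownLatch done = new CountDownLatch(streamCount);

        for (int s = 0; s < streamCount; s++) {
            // each queue stands in for one KafkaStream<byte[], byte[]>
            final BlockingQueue<String> stream = new LinkedBlockingQueue<String>();
            stream.add("msg-from-stream-" + s);
            stream.add(SENTINEL);

            executor.submit(new Runnable() {
                public void run() {
                    try {
                        String msg;
                        while (!(msg = stream.take()).equals(SENTINEL)) {
                            System.out.println(msg); // stands in for processing a message
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                }
            });
        }

        done.await(); // wait until every simulated stream is drained
        executor.shutdown();
    }
}
```

Because the four workers run concurrently, the four lines may print in any order, just as messages from different partitions arrive in no global order.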
MyProducer.java sends messages to Kafka directly, without going through the log4j appender. It is not strictly needed for this demo, but I include it here anyway: