Apache Beam实战指南(8)

设计思路:Kafka消息生产程序发送testmsg到Kafka集群,Apache Beam 程序读取Kafka的消息,经过简单的业务逻辑,最后发送到Kafka集群,然后Kafka消费端消费消息。

Apache Beam实战指南 | 玩转KafkaIO与Flink

Apache Beam 内部数据处理流程图

Apache Beam 程序通过kafkaIO读取Kafka集群的数据,进行数据格式转换。数据统计后,通过KafkaIO写操作把消息写入Kafka集群。最后把程序运行在Flink的计算平台上。

软件环境和版本说明

Kafka集群和Flink单机或集群配置,大家可以去网上搜一下配置文章,操作比较简单,这里就不赘述了。

实践步骤

1)新建一个Maven项目

Apache Beam实战指南 | 玩转KafkaIO与Flink

2)在pom文件中添加jar引用

<dependency> <groupId>org.apache.beam</groupId> <artifactId>beam-sdks-java-io-kafka</artifactId> <version>2.4.0</version> </dependency> <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-clients</artifactId> <version>0.10.1.1</version> </dependency> <dependency> <groupId>org.apache.beam</groupId> <artifactId>beam-runners-core-java</artifactId> <version>2.4.0</version> </dependency> <dependency> <groupId>org.apache.beam</groupId> <artifactId>beam-runners-flink_2.11</artifactId> <version>2.4.0</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-java</artifactId> <version>1.5.2</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-clients_2.11</artifactId> <version>1.5.2</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-core</artifactId> <version>1.5.2</version> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-runtime_2.11</artifactId> <version>1.5.2</version> <!--<scope>provided</scope>--> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-streaming-java_2.11</artifactId> <version>1.5.2</version> <!--<scope>provided</scope>--> </dependency> <dependency> <groupId>org.apache.flink</groupId> <artifactId>flink-metrics-core</artifactId> <version>1.5.2</version> <!--<scope>provided</scope>--> </dependency>

3)新建BeamFlinkKafka.java类

Apache Beam实战指南 | 玩转KafkaIO与Flink

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:https://www.heiqu.com/b0750d480662aeccb8a7dfff6579955c.html