Flink Development: Setting Up and Testing an IDEA Environment (4)

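Words are counted in 5-second time windows (processing time, tumbling windows) and printed to stdout. To see this, monitor the TaskManager's output file and type some text into nc; the input is sent to Flink line by line each time you press Enter.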
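The tumbling-window grouping described here can be illustrated without Flink. What follows is a minimal plain-Java sketch, not Flink's own implementation: the class name `TumblingWindowSketch`, its methods, and the sample words and timestamps are all hypothetical, chosen only for illustration. It assumes non-negative timestamps in milliseconds and windows aligned to time zero, and it assigns each word to the 5-second window that contains its arrival time.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration (no Flink dependency) of counting words in tumbling windows.
// All names here are hypothetical and exist only for this sketch.
class TumblingWindowSketch {

    // Start of the tumbling window that contains the timestamp.
    // Assumes timestampMs >= 0 and windows aligned to time zero.
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Counts words per window; words[i] is assumed to arrive at timestampsMs[i].
    static Map<Long, Map<String, Integer>> countPerWindow(
            String[] words, long[] timestampsMs, long windowSizeMs) {
        Map<Long, Map<String, Integer>> result = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {
            long start = windowStart(timestampsMs[i], windowSizeMs);
            result.computeIfAbsent(start, k -> new TreeMap<>())
                  .merge(words[i], 1, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = {"lorem", "ipsum", "ipsum", "bye"};
        long[] times = {1_000L, 2_500L, 4_999L, 5_000L};
        // Prints {0={ipsum=2, lorem=1}, 5000={bye=1}}
        System.out.println(countPerWindow(words, times, 5_000L));
    }
}
```

Because the windows do not overlap, a word arriving at 4,999 ms and one arriving at 5,000 ms land in different windows, which is why counts in the TaskManager output restart every 5 seconds.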


III. Developing Offline (Batch) Programs in IDEA

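DataSet programs are a common kind of Flink program. A data set is initialized from a source, for example by reading a file or from a serialized collection. It is then converted by transformations (filtering, mapping, joining, grouping), and the result is stored through a sink, which can write to a distributed file system such as HDFS or print to the console. Flink programs can run in many ways, for example locally, on a Flink cluster, or on YARN.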

1. Scala program

package com.xyg.batch

import org.apache.flink.api.scala.ExecutionEnvironment
import org.apache.flink.api.scala._

/**
 * Author: Mr.Deng
 * Date: 2018/10/19
 * Desc:
 */
object WordCountScala {
  def main(args: Array[String]): Unit = {
    // Initialize the execution environment
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Load the data from in-memory strings
    val text = env.fromElements(
      "Who's there?",
      "I think I hear them. Stand, ho! Who's there?")

    // Split each line into words, map every word to a (word, 1) tuple,
    // group by the word (field 0), and sum the counts (field 1)
    val counts = text
      .flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }
      .map { (_, 1) }
      .groupBy(0)
      .sum(1)

    // Print the result
    counts.print()
  }
}

2. Java program

package com.xyg.batch;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

/**
 * Author: Mr.Deng
 * Date: 2018/10/19
 * Desc:
 */
public class WordCountJava {

    public static void main(String[] args) throws Exception {
        // Build the execution environment
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Build the data set from in-memory strings
        DataSet<String> text = env.fromElements(
                "Who's there?",
                "I think I hear them. Stand, ho! Who's there?");

        // Split each line into (word, 1) tuples, group by the word (field 0),
        // and sum the counts (field 1)
        DataSet<Tuple2<String, Integer>> wordCounts = text
                .flatMap(new LineSplitter())
                .groupBy(0)
                .sum(1);

        // Print the result
        wordCounts.print();
    }

    // Splits a line on single spaces and emits one (word, 1) tuple per token
    public static class LineSplitter implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            for (String word : line.split(" ")) {
                out.collect(new Tuple2<>(word, 1));
            }
        }
    }
}
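Note that the two programs above do not tokenize the input in the same way. The Scala version lowercases each line and splits on non-word characters (`\\W+`), while the Java `LineSplitter` splits on single spaces, so it keeps case and punctuation in tokens such as `Who's` and `there?`. The following is a minimal plain-Java sketch, with no Flink dependency, of a helper that mirrors the Scala logic. The class `TokenizeSketch` and its methods `tokenize` and `countWords` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class; not part of the original programs or of Flink.
class TokenizeSketch {

    // Mirrors the Scala version: lowercase, split on non-word characters, drop empty tokens.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Counts tokens across all lines, sorted by word for a stable printed order.
    static Map<String, Integer> countWords(String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : tokenize(line)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {hear=1, ho=1, i=2, s=2, stand=1, them=1, there=2, think=1, who=2}
        System.out.println(countWords(
                "Who's there?",
                "I think I hear them. Stand, ho! Who's there?"));
    }
}
```

Under this assumption, looping over `tokenize(line)` instead of `line.split(" ")` inside `LineSplitter.flatMap` should make the Java job count the same words as the Scala job.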

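3. Run

Run the main method of either program directly from IDEA. When started this way, ExecutionEnvironment.getExecutionEnvironment provides a local environment, and print() writes the word counts to the console.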


