Flink入门宝典(详细截图版) (3)

举例:

dataStream.flatMap(new FlatMapFunction<String, String>() {     @Override     public void flatMap(String value, Collector<String> out)         throws Exception {         for(String word: value.split(" ")){             out.collect(word);         }     } });

(3)Filter方式:DataStream -> DataStream

功能:针对每个element判断函数是否返回true,最后只保留返回true的element

举例:

dataStream.filter(new FilterFunction<Integer>() {     @Override     public boolean filter(Integer value) throws Exception {         return value != 0;     } });

(4)KeyBy方式:DataStream -> KeyedStream

功能:逻辑上将流分割成不相交的分区,每个分区都是相同key的元素

举例:

dataStream.keyBy("someKey") // Key by field "someKey" dataStream.keyBy(0) // Key by the first element of a Tuple

(5)Reduce方式:KeyedStream -> DataStream

功能:在keyed data stream中进行轮训reduce。

举例:

keyedStream.reduce(new ReduceFunction<Integer>() {     @Override     public Integer reduce(Integer value1, Integer value2)     throws Exception {         return value1 + value2;     } });

(6)Aggregations方式:KeyedStream -> DataStream

功能:在keyed data stream中进行聚合操作

举例:

keyedStream.sum(0); keyedStream.sum("key"); keyedStream.min(0); keyedStream.min("key"); keyedStream.max(0); keyedStream.max("key"); keyedStream.minBy(0); keyedStream.minBy("key"); keyedStream.maxBy(0); keyedStream.maxBy("key");

(7)Window方式:KeyedStream -> WindowedStream

功能:在KeyedStream中进行使用,根据某个特征针对每个key用windows进行分组。

举例:

dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data

(8)WindowAll方式:DataStream -> AllWindowedStream

功能:在DataStream中根据某个特征进行分组。

举例:

dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data

(9)Union方式:DataStream* -> DataStream

功能:合并多个数据流成一个新的数据流

举例:

dataStream.union(otherStream1, otherStream2, ...);

(10)Split方式:DataStream -> SplitStream

功能:将流分割成多个流

举例:

SplitStream<Integer> split = someDataStream.split(new OutputSelector<Integer>() {     @Override     public Iterable<String> select(Integer value) {         List<String> output = new ArrayList<String>();         if (value % 2 == 0) {             output.add("even");         }         else {             output.add("odd");         }         return output;     } });

(11)Select方式:SplitStream -> DataStream

功能:从split stream中选择一个流

举例:

SplitStream<Integer> split; DataStream<Integer> even = split.select("even"); DataStream<Integer> odd = split.select("odd"); DataStream<Integer> all = split.select("even","odd"); 4、输出数据 writeAsText() writeAsCsv(...) print() / printToErr() writeUsingOutputFormat() / FileOutputFormat writeToSocket addSink

更多Flink相关原理:

穿梭时空的实时计算框架——Flink对时间的处理

大数据实时处理的王者-Flink

统一批处理流处理——Flink批流一体实现原理

Flink快速入门--安装与示例运行

快速构建第一个Flink工程

更多实时计算,Flink,Kafka等相关技术博文,欢迎关注实时流式计算:

file

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:https://www.heiqu.com/zywzdp.html