Spark学习进度11-Spark Streaming&Structured Streaming

Spark Streaming Spark Streaming 介绍

批量计算

Spark学习进度11-Spark Streaming&Structured Streaming

 

 流计算

Spark学习进度11-Spark Streaming&Structured Streaming

 

Spark Streaming 入门  Netcat 的使用

Spark学习进度11-Spark Streaming&Structured Streaming

 项目实例

目标:使用 Spark Streaming 程序和 Socket server 进行交互, 从 Server 处获取实时传输过来的字符串, 拆开单词并统计单词数量, 最后打印出来每一个小批次的单词数量

Spark学习进度11-Spark Streaming&Structured Streaming

 步骤: 

package cn.itcast.streaming import org.apache.spark.SparkConf import org.apache.spark.storage.StorageLevel import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream} import org.apache.spark.streaming.{Seconds, StreamingContext} object StreamingWordCount { def main(args: Array[String]): Unit = { //1.初始化 val sparkConf=new SparkConf().setAppName("streaming").setMaster("local[2]") val ssc=new StreamingContext(sparkConf,Seconds(5)) ssc.sparkContext.setLogLevel("WARN") val lines: ReceiverInputDStream[String] = ssc.socketTextStream( hostname = "192.168.31.101", port = 9999, storageLevel = StorageLevel.MEMORY_AND_DISK_SER ) //2.数据处理 //2.1把句子拆单词 val words: DStream[String] =lines.flatMap(_.split(" ")) val tuples: DStream[(String, Int)] =words.map((_,1)) val counts: DStream[(String, Int)] =tuples.reduceByKey(_+_) //3.展示 counts.print() ssc.start() ssc.awaitTermination() } }

内容版权声明:除非注明,否则皆为本站原创文章。

转载注明出处:https://www.heiqu.com/zywxjd.html