Spark Streaming reduceByKeyAndWindow() shows no results

xu3bshqb · published 2021-07-12 · in Spark

I am trying to count words in a Twitter stream, using a 20-minute window that slides every 30 seconds. I want to display the result as a DataFrame of words and the top-10 word counts. The Twitter stream source runs fine, but the code below produces no output:


# split each tweet into words
words = dataStream.flatMap(lambda line: line.split(" "))

# pair each word with an initial count of 1
wordCounts = words.map(lambda w: (w, 1))

# aggregate the counts over a sliding window with reduceByKeyAndWindow
# (window length 600 s, slide interval 30 s)
word_totals = wordCounts.reduceByKeyAndWindow(aggregate_words_count, 600, 30)
word_totals.pprint()

# do the processing for each RDD generated in each interval
word_totals.foreachRDD(process_rdd)

# start the streaming computation

ssc.start()

# wait for the streaming to finish

ssc.awaitTermination()
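For reference, the behavior I expect from the sliding-window count can be sketched in plain Python (no Spark here; `windowed_counts` and the batch sizes are my own illustrative names and parameters, not the real Streaming API):

```python
from collections import Counter, deque

def windowed_counts(batches, window):
    """For each incoming batch of lines, yield the word counts
    aggregated over the last `window` batches (a sliding window)."""
    recent = deque(maxlen=window)  # batches older than the window drop out
    for lines in batches:
        # count the words in this batch
        counts = Counter(w for line in lines for w in line.split(" "))
        recent.append(counts)
        # total over everything still inside the window
        total = Counter()
        for c in recent:
            total += c
        yield total

# three batches, window covering the last two batches
batches = [["a b a"], ["b c"], ["a"]]
results = list(windowed_counts(batches, window=2))
# after the third batch, the window covers batches 2-3 only
```

This is only an analogue of what I understand `reduceByKeyAndWindow` to do: re-aggregate the per-key counts over the window each time it slides.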

What am I doing wrong?
Thanks.

No answers yet.
