spark流

ajsxfq5m 于 2021-05-29 发布在 Hadoop

关注(0)|答案(1)|浏览(283)

任何人请帮助我如何从现有的rdd创建一个数据流。我的代码是：

JavaSparkContext ctx = new JavaSparkContext(conf);
JavaRDD<String> rddd = ctx.parallelize(arraylist);

现在我需要使用这些rddd作为javastreamingcontext的输入。

Java hadoop apache-spark spark-streaming

来源：https://stackoverflow.com/questions/35088171/spark-streaming-from-existing-rdd

1条答案

按热度按时间

j13ufse21#

试试QueueStreamAPI。
rdd队列作为一个流，推送到队列中的每个rdd都将被视为数据流中的一批数据，并像流一样进行处理。

public <T> InputDStream<T> queueStream(scala.collection.mutable.Queue<RDD<T>> queue,
                              boolean oneAtATime,
                              scala.reflect.ClassTag<T> evidence$15)

Create an input stream from a queue of RDDs. In each batch, it will process either one or all of the RDDs returned by the queue.
NOTE: Arbitrary RDDs can be added to queueStream, there is no way to recover data of those RDDs, so queueStream doesn't support checkpointing.

赞(0）回复(0）举报 2021-05-29

我来回答

spark流

1条答案

相关问题

热门标签

最新问答