I'm trying to create an S3 sink for my stream output. I thought a BucketingSink
would work, since it's designed for HDFS. But the s3 URL doesn't seem to be recognized as an HDFS path. I get the following error:
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the BucketingSink.
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Cannot support file system for 's3' via Hadoop, because Hadoop is not in the classpath, or some classes are missing from the classpath.
Is there a way to make S3 work with BucketingSink,
or is there an alternative to BucketingSink
that I could use? I'm currently running 1.5.2. Happy to provide more information.
Thank you!
Edit:
My sink is created and used as follows:
val s3Sink = new BucketingSink[String]("s3://s3bucket/sessions")
s3Sink.setBucketer(new DateTimeBucketer[String]("yyyy-MM-dd--HHmm"))
s3Sink.setWriter(new StringWriter[String]())
s3Sink.setBatchSize(200)
s3Sink.setPendingPrefix("sessions-")
s3Sink.setPendingSuffix(".csv")
// Create stream and do stuff here
stream.addSink(s3Sink)
1 Answer
You probably have to include
hadoop-aws
in your Flink job. This link should help: https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/deployment/aws.html#provide-s3-filesystem-dependency
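As a rough sketch of what pulling in that dependency might look like in an sbt build (the version numbers below are assumptions; pick ones matching your Flink 1.5.2 setup and its bundled Hadoop version):

```scala
// build.sbt (sketch, versions are illustrative assumptions):
// hadoop-aws provides the S3A FileSystem implementation that Flink's
// Hadoop-based filesystem loading needs on the classpath for s3:// URLs.
libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-connector-filesystem" % "1.5.2",
  "org.apache.hadoop" %  "hadoop-aws"                % "2.8.5"
)
```

You would typically also need to point Hadoop at the S3A implementation and your credentials (e.g. via a core-site.xml referenced from the Flink configuration), as described in the linked AWS deployment page.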