I'm thinking about building something with big data. Ideally, what I'd like to do is:
Drop a .csv file into Flume, send it on to Kafka, perform some ETL steps and put the result back into Kafka, then consume from Kafka through Flume again and land the data in HDFS. Once the data is in HDFS, I want to run a MapReduce job or some Hive queries and then chart whatever I want.
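For the Kafka-to-HDFS leg of that pipeline, one common approach is a second Flume agent with a Kafka source and an HDFS sink. The sketch below is an assumption about your setup, not tested config: the agent name `a2`, the topic `mytopic`, the broker address, and the HDFS path are all placeholders, and the `kafka.bootstrap.servers` / `kafka.topics` property names assume Flume 1.7+ (older releases used `zookeeperConnect` and `topic` on the Kafka source).

```
# Second Flume agent: consumes from Kafka and writes to HDFS
# (agent name, topic, addresses, and paths are placeholders -- adjust to your setup)
a2.sources = kafka-in
a2.sinks = hdfs-out
a2.channels = c2

a2.sources.kafka-in.type = org.apache.flume.source.kafka.KafkaSource
a2.sources.kafka-in.kafka.bootstrap.servers = localhost:9092
a2.sources.kafka-in.kafka.topics = mytopic
a2.sources.kafka-in.channels = c2

a2.sinks.hdfs-out.type = hdfs
a2.sinks.hdfs-out.hdfs.path = /user/xyz/flume/events
# DataStream writes the raw event bodies instead of Flume's default SequenceFile
a2.sinks.hdfs-out.hdfs.fileType = DataStream
a2.sinks.hdfs-out.channel = c2

a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100
```

Once files accumulate in the HDFS path, you can point a Hive external table or a MapReduce job at that directory for the analysis step.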
How do I get a .csv file into Flume and delivered to Kafka? I have this config, but I'm not sure it works:
myagent.sources = r1
myagent.sinks = k1
myagent.channels = c1
myagent.sources.r1.type = spooldir
myagent.sources.r1.spoolDir = /home/xyz/source
myagent.sources.r1.fileHeader = true
myagent.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
myagent.channels.c1.type = memory
myagent.channels.c1.capacity = 1000
myagent.channels.c1.transactionCapacity = 100
myagent.sources.r1.channels = c1
myagent.sinks.k1.channel = c1
Any help or suggestions? If this config is correct, how should I proceed?
Thanks, everyone!
1 Answer
Your sink configuration is incomplete. Try the following (note that this example uses the agent name a1, so substitute your own agent name, myagent):
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = mytopic
a1.sinks.k1.brokerList = localhost:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
a1.sinks.k1.channel = c1
https://flume.apache.org/FlumeUserGuide.html#kafka-sink
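To try this out end to end, you would save the agent config to a file, create the topic, and launch the agent with `flume-ng`. The file name, agent name, and addresses below are assumptions for a local single-node setup; also note that `kafka-topics.sh --zookeeper` matches the older Kafka releases implied by the `brokerList` property above (newer Kafka CLIs use `--bootstrap-server` for topic creation):

```
# Create the target topic (older Kafka CLI; newer releases use --bootstrap-server)
kafka-topics.sh --create --zookeeper localhost:2181 \
  --replication-factor 1 --partitions 1 --topic mytopic

# Start the Flume agent; --name must match the prefix used in the config file
flume-ng agent --conf conf --conf-file myagent.conf --name myagent \
  -Dflume.root.logger=INFO,console

# In another terminal, watch messages arrive as .csv files are
# dropped into the spool directory (/home/xyz/source)
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic mytopic
```

With the agent running, any file placed in the spool directory is read line by line, each line becoming one Kafka message on `mytopic`; Flume renames processed files with a `.COMPLETED` suffix so they are not re-ingested.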