Error loading data from Twitter into HDFS

zqry0prt posted on 2021-06-04 in Flume

Hi, I am getting an error when loading data from Twitter into HDFS.
I am using the Ambari sandbox (Hortonworks, Hadoop 2.7).
This is my flume.conf file.

flume.conf (Flume configuration):

    TwitterAgent.sources = Twitter
    TwitterAgent.channels = MemChannel
    TwitterAgent.sinks = HDFS

    TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
    TwitterAgent.sources.Twitter.channels = MemChannel
    TwitterAgent.sources.Twitter.consumerKey = oblBU8btK3OpuSoFce8fJTOz9
    TwitterAgent.sources.Twitter.consumerSecret = ofsGWmx1T4GHvi8qDcAySUAC3mVdvSS8VcfD9CPTejxzQ52izk
    TwitterAgent.sources.Twitter.accessToken = 3479003538-2OP1N7wKqSkAohXscehBdhbMfJhoXqSPkng7cPY
    TwitterAgent.sources.Twitter.accessTokenSecret = 0vrKLzdUplRnPjcTWiSNKhu9Ohe18FcoOXYMmD7OUazTt
    TwitterAgent.sources.Twitter.keywords = hadoop, big data, analytics
    TwitterAgent.sinks.HDFS.channel = MemChannel
    TwitterAgent.sinks.HDFS.type = hdfs
    TwitterAgent.sinks.HDFS.hdfs.path = /flume/tweets
    TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
    TwitterAgent.sinks.HDFS.hdfs.filePrefix = twitter
    TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
    TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
    TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
    TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000
    TwitterAgent.sinks.HDFS.hdfs.rollInterval = 10

    TwitterAgent.channels.MemChannel.type = memory
    TwitterAgent.channels.MemChannel.capacity = 10000
    TwitterAgent.channels.MemChannel.transactionCapacity = 100

15/09/11 07:21:03 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:03 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:03 INFO twitter4j.TwitterStreamImpl: Waiting for 1000 milliseconds
15/09/11 07:21:04 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:04 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:04 INFO twitter4j.TwitterStreamImpl: Waiting for 2000 milliseconds
15/09/11 07:21:06 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:06 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:06 INFO twitter4j.TwitterStreamImpl: Waiting for 4000 milliseconds
15/09/11 07:21:10 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:10 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:10 INFO twitter4j.TwitterStreamImpl: Waiting for 8000 milliseconds
15/09/11 07:21:18 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:18 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:18 INFO twitter4j.TwitterStreamImpl: Waiting for 16000 milliseconds
15/09/11 07:21:34 INFO twitter4j.TwitterStreamImpl: Establishing connection.
15/09/11 07:21:34 INFO twitter4j.TwitterStreamImpl: stream.twitter.com
15/09/11 07:21:34 INFO twitter4j.TwitterStreamImpl: Waiting for 16000 milliseconds
^C15/09/11 07:21:45 INFO lifecycle.LifecycleSupervisor: Stopping lifecycle supervisor 10
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: HDFS stopped
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.start.time==1441956061906
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.stop.time==1441956105092
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.batch.complete==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.batch.empty==7
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.batch.underflow==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.connection.closed.count==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.connection.creation.count==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.connection.failed.count==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.attempt==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.sucess==0
15/09/11 07:21:45 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider stopping
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: MemChannel stopped
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.start.time==1441956061903
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.stop.time==1441956105094
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity==10000
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.current.size==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.success==0
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt==7
15/09/11 07:21:45 INFO instrumentation.MonitoredCounterGroup: Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.success==0
[root@sandbox bin]#

ux6nzvsh #1

It looks like you have not given the full HDFS path:

TwitterAgent.sinks.HDFS.hdfs.path =hdfs://localhost:8020/flume/tweets

Here localhost is the hostname and 8020 is the HDFS port.
Hope this helps. Let me know if you have any questions.
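
To make the difference between the two hdfs.path values concrete, here is a minimal Python sketch that splits a path into scheme, host, and port. The helper name describe_hdfs_path is hypothetical and only for illustration; it is not part of Flume or Hadoop. The host and port are taken from the example above, so your cluster may use different values.

```python
from urllib.parse import urlsplit


def describe_hdfs_path(path):
    """Split an hdfs.path value into its parts (hypothetical helper, illustration only)."""
    parts = urlsplit(path.strip())
    return {
        # Fully qualified means an hdfs:// scheme plus an explicit host.
        "fully_qualified": parts.scheme == "hdfs" and parts.hostname is not None,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


# Path from the question: no scheme, host, or port.
print(describe_hdfs_path("/flume/tweets"))
# Path suggested above: host and port are stated explicitly.
print(describe_hdfs_path("hdfs://localhost:8020/flume/tweets"))
```

The first call reports no host or port, while the second reports localhost and 8020 with the same /flume/tweets directory.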
