Hadoop: using MultipleOutputs with EMR (Java)

Posted by a8jjtwal on 2021-05-29 in Hadoop


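I'm using Hadoop 2.6.5, and my program writes 3 output files.
When I run it locally, the program works fine and creates the 3 output files. When I run it on EMR, it crashes with: File already exists: o
I understand that this is not the right way to use Hadoop together with EMR.
I have seen this thread: https://forums.aws.amazon.com/thread.jspa?threadid=131036
However, I don't have the method it mentions, mos.getCollector, and I couldn't find any documentation on how to create a mos (MultipleOutputs) in Java for EMR.
Here is my reducer code: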

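    // Field declaration is not shown in the original post; the types are
    // inferred from the Reducer<Text, IntWritable, Text, IntWritable> signature.
    private MultipleOutputs<Text, IntWritable> mos;

    @Override
    protected void setup(Context context)
            throws IOException, InterruptedException {
        // Create the MultipleOutputs wrapper once per reduce task.
        mos = new MultipleOutputs<>(context);
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        // Close all writers opened through mos so the output is flushed.
        mos.close();
    }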

... end of the reduce method, the part that writes the result files (ones, twos, threes):

        if (keyArr.length == 1) {
            mos.write(key, result, "ones.txt");
        } else if (keyArr.length == 2) {
            mos.write(key, result, "twos.txt");
        } else {
            mos.write(key, result, "threes.txt");
        }
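The branching above can be isolated into a small helper so that the routing rule is testable without a cluster. The following is only a sketch under assumptions: the class name `OutputRouter` and the method `baseOutputPath` are hypothetical, and it assumes `keyArr` comes from splitting the key on whitespace, which the post does not show. It does not address the EMR error by itself.

```java
// Hypothetical helper, not part of the original post.
class OutputRouter {

    // Returns the base output name for a key, mirroring the if/else chain
    // in the reduce method. Assumption: keyArr is the key split on whitespace.
    static String baseOutputPath(String key) {
        String[] keyArr = key.trim().split("\\s+");
        if (keyArr.length == 1) {
            return "ones.txt";
        } else if (keyArr.length == 2) {
            return "twos.txt";
        } else {
            return "threes.txt";
        }
    }

    public static void main(String[] args) {
        // Print the routing decision for a few sample keys.
        String[] samples = {"hadoop", "hadoop emr", "hadoop emr java"};
        for (String key : samples) {
            System.out.println(key + " -> " + baseOutputPath(key));
        }
    }
}
```

Inside the reducer, the call would then become something like `mos.write(key, result, OutputRouter.baseOutputPath(key.toString()));`. Note that `MultipleOutputs` appends a part suffix such as `-r-00000` to the base path, so the files on disk are not named exactly `ones.txt`.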

Java hadoop emr

Source: https://stackoverflow.com/questions/47733586/hadoop-using-multipleoutputs-with-emr-java