Global variables or properties in MapReduce?

zlhcx6iw  posted on 2021-06-03  in  Hadoop

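I want to be able to set some kind of variable or flag during the map phase of an MR job, so that I can check it after the job completes. I think the best way to show what I want is with some code. P.S. I am using Hadoop 2.2.0.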

public class MRJob {

  public static class MapperTest 
       extends Mapper<Object, Text, Text, IntWritable>{

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
        //Do some computation to get new value and key
        ...
        //Check if new value equal to some condition e.g if(value < 1) set global variable to true

        context.write(newKey, newValue);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    Job job = Job.getInstance(conf, "word_count");
    // set job configs

    job.waitForCompletion(true);

    //Here I want to be able to check if my global variable has been set to true by any one of the mappers

  }
}
lmyy7pcs1#

Use a Counter for that.

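public static enum UpdateCounter {
  UPDATED
}

// Inside the Mapper class. The key type must match the Mapper's declared input key type
// (Object in the question's MapperTest, LongWritable here).
@Override
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    // Compute newKey and newValue as in the question.

    // Condition from the question; newValue is assumed to be an IntWritable.
    if (newValue.get() < 1) {
      context.getCounter(UpdateCounter.UPDATED).increment(1);
    }

    context.write(newKey, newValue);
}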

After the job completes, you can check the counter:

Configuration conf = new Configuration();

Job job = Job.getInstance(conf, "word_count");
// set job configs

job.waitForCompletion(true);
// Counters are aggregated across all tasks once the job has finished.
long counter = job.getCounters().findCounter(UpdateCounter.UPDATED).getValue();

if (counter > 0) {
  // some mapper has seen the condition
}
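
The reason a counter works where a static field would not: map tasks run in separate JVMs, often on different nodes, so a variable set in one mapper is never visible to the driver. Counters are instead reported by each task and summed by the framework. The following is only a plain-Java sketch of that roll-up, with no Hadoop dependency; CounterSketch, runMapTask and runJob are hypothetical names made up for illustration and are not part of the Hadoop API.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of how task-local counters become one job-level total.
// This is an illustration only, not how Hadoop implements counters internally.
class CounterSketch {

    // Plays the same role as the UpdateCounter enum in the answer.
    enum UpdateCounter { UPDATED }

    // Hypothetical stand-in for one map task: counts values below 1 in its own split.
    static Map<UpdateCounter, Long> runMapTask(List<Integer> split) {
        Map<UpdateCounter, Long> local = new EnumMap<>(UpdateCounter.class);
        for (int value : split) {
            if (value < 1) {
                local.merge(UpdateCounter.UPDATED, 1L, Long::sum);
            }
        }
        return local;
    }

    // Hypothetical stand-in for the framework: sums the task-local counters.
    static long runJob(List<List<Integer>> splits) {
        Map<UpdateCounter, Long> totals = new EnumMap<>(UpdateCounter.class);
        for (List<Integer> split : splits) {
            runMapTask(split).forEach((k, v) -> totals.merge(k, v, Long::sum));
        }
        return totals.getOrDefault(UpdateCounter.UPDATED, 0L);
    }

    public static void main(String[] args) {
        List<List<Integer>> splits = Arrays.asList(
            Arrays.asList(5, 3, 7),    // no value below 1
            Arrays.asList(2, 0, 9),    // one value below 1
            Arrays.asList(-4, 1, 0));  // two values below 1
        long updated = runJob(splits);
        // The "global flag" is simply whether the aggregated count is non-zero.
        System.out.println("UPDATED = " + updated + ", flag = " + (updated > 0));
    }
}
```

With the three splits above, the aggregated count is 3, so the flag evaluates to true. A count also carries more information than a boolean: it tells you how many records met the condition, not just whether any did.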
