I am trying to output {key, list(values)} in MapReduce, but I only get sorted {key, value} pairs

r6l8ljro  posted on 2021-05-29  in  Hadoop

My requirement is as follows:

input file 
key    value
eid    ename
1      a
2      b
3      c

output file

key   values
eid   1,2,3
ename a,b,c

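I wrote the logic in the mapper using a header array and a data array, and tried two cases:
Case 1: without a reducer (i.e. setNumReduceTasks(0))
Case 2: with the default reducer
In both cases, the only output I get is: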

eid   1
eid   2
eid   3
ename a
ename b
ename c

ugmeyewa1#

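To achieve this, you have to use a reducer. The reason is that you want all the records for eid to go to the same reducer, and all the records for ename to go to the same reducer. That is what lets you aggregate the values for eid and for ename.
If you use only a mapper (without a reducer), the eid records may be handled by different mappers, so they are never brought together.
The following code achieves this: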

package com.myorg.hadooptests;    

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class EidTest {

    public static class EidTestMapper
            extends Mapper<LongWritable, Text , Text, Text > {

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {

            String line = value.toString();
            String[] words = line.split("\t");

            if(words.length == 2) {
                context.write(new Text("eid"), new Text(words[0]));
                context.write(new Text("ename"), new Text(words[1]));
            }
        }
    }

    public static class EidTestReducer
            extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {

            // Join all values for this key with commas
            StringBuilder finalVal = new StringBuilder();

            for (Text val : values) {
                if (finalVal.length() > 0) {
                    finalVal.append(",");
                }
                finalVal.append(val.toString());
            }

            context.write(key, new Text(finalVal.toString()));
        }
    }

    public static void main(String[] args) throws Exception {

        Configuration conf = new Configuration();

        Job job = Job.getInstance(conf, "EidTest");
        job.setJarByClass(EidTest.class);
        job.setMapperClass(EidTestMapper.class);
        job.setReducerClass(EidTestReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path("/in/in9.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/out/"));

        // Exit with a non-zero status if the job fails
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

For your input, I got the following output (the mapper assumes the key/value columns are tab-separated):

eid     3,2,1
ename   c,b,a
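
To see why the reducer is what produces one line per key, here is a minimal in-memory sketch of the map, shuffle and reduce steps in plain Java, with no Hadoop dependency. The class name ShuffleSketch and the method run are hypothetical names made up for this illustration. Unlike the job above, the sketch takes the output keys from the first (header) line instead of hardcoding "eid" and "ename"; that is an assumption about the input layout, not something the job itself does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Hadoop or of the job above.
public class ShuffleSketch {

    // Simulates map -> shuffle -> reduce in memory for tab-separated lines.
    // Assumption: the first line is a header that supplies the output keys.
    public static Map<String, String> run(List<String> lines) {
        String[] header = lines.get(0).split("\t");

        // "Map" + "shuffle": emit (columnName, cellValue) pairs and group
        // them by key. A TreeMap keeps the keys sorted, as the shuffle does.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split("\t");
            if (fields.length != header.length) {
                continue; // skip malformed rows
            }
            for (int i = 0; i < header.length; i++) {
                grouped.computeIfAbsent(header[i], k -> new ArrayList<>())
                       .add(fields[i]);
            }
        }

        // "Reduce": called once per key, joins all values of that key.
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            result.put(e.getKey(), String.join(",", e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("eid\tename", "1\ta", "2\tb", "3\tc");
        run(input).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

In this sketch the values stay in input order because everything runs in one list. In a real job, Hadoop does not guarantee the order of the values passed to reduce for a given key, which is why the output above shows 3,2,1 rather than 1,2,3. With a map-only job there is no grouping step at all, so each emitted pair is written out as its own line.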
