My WordCount example is as follows:
```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class WordCount extends Configured implements Tool {

    public static class Map extends
            Mapper<LongWritable, Text, Text, IntWritable> {
        // map() body omitted in the original post
    }

    public static class Reduce extends
            Reducer<Text, IntWritable, Text, IntWritable> {
        // reduce() body omitted in the original post
    }

    public static void main(String[] args) throws Exception {
        BasicConfigurator.configure();
        Logger.getRootLogger().setLevel(Level.WARN);
        int res = ToolRunner.run(new Configuration(), new WordCount(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Delete an existing output directory so the job can be rerun
        if (fs.exists(new Path(args[1]))) {
            fs.delete(new Path(args[1]), true);
        }

        Job job = Job.getInstance(conf, "wordcount");
        long startTime = System.currentTimeMillis();
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setJarByClass(WordCount.class);
        // job.setJar(WordCount.class.getSimpleName());
        job.waitForCompletion(true);

        System.out.println("Job Finished in "
                + (System.currentTimeMillis() - startTime) / 1000.0
                + " seconds");
        return 0;
    }
}
```
This `job.setJarByClass()` call doesn't work: I get a "No job jar file set" message, and `job.getJar()` returns null after the call. Does anyone know what the problem is here?

I also tried `job.setJarByClass(this.getClass())`, `job.setJar("WordCount")`, and `job.setJar(WordCount.class.getSimpleName())`. The first has no effect and `job.getJar()` still returns null; the second and third both give me a `FileNotFoundException: File WordCount does not exist`. I then tried `job.setJar("src/wordcount/WordCount.java")` and `job.setJar("bin/wordcount/WordCount.class")`; both succeed in Eclipse (no warning message), but still fail with a `FileNotFoundException` when executed as a standalone jar file on the command line. I suspect the problem is related to the classpath setup, if not some unresolved dependency.
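For reference, my understanding is that `setJarByClass()` can only fill in the job jar if the class was actually loaded from a jar; when launching from Eclipse, classes are loaded from the `bin/` directory, so there is no jar to find. A small standalone check of where a class was loaded from (my own diagnostic sketch, the class name is made up):

```java
public class WhereLoadedFrom {
    public static void main(String[] args) {
        // The CodeSource location is the classpath entry this class came from.
        java.security.CodeSource cs =
                WhereLoadedFrom.class.getProtectionDomain().getCodeSource();
        String loc = (cs == null) ? "unknown" : cs.getLocation().toString();
        if (loc.endsWith(".jar")) {
            // setJarByClass has a jar it can point the job at
            System.out.println("loaded from a jar: " + loc);
        } else {
            // the Eclipse case: a plain classes directory, no jar to set
            System.out.println("loaded from a directory: " + loc);
        }
    }
}
```

If this prints a directory when run the way the job is launched, the "no job jar file set" warning is expected.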
2 Answers

svmlkihl 1#
Please use this Java code for word counting. It takes two arguments, one for the input file and one for the results file, and you need to add all the jar files from the `mapreduce` and `common` folders of your Hadoop directory. Alternatively, if you want the advanced version, use this code with three arguments, where the third is a file of words you don't want counted:
```
package org.samples.mapreduce.training;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCountV2 {
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    private Set<String> patternsToSkip = new HashSet<String>();
    @Override
    public void setup(Context context) throws IOException, InterruptedException {
      // Load the optional skip-words file shipped via the distributed cache.
      Configuration conf = context.getConfiguration();
      if (conf.getBoolean("wordcount.skip.patterns", false)) {
        for (URI patternsURI : Job.getInstance(conf).getCacheFiles()) {
          BufferedReader fis = new BufferedReader(
              new FileReader(new Path(patternsURI.getPath()).getName()));
          for (String p = fis.readLine(); p != null; p = fis.readLine()) {
            patternsToSkip.add(p);
          }
          fis.close();
        }
      }
    }
    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        String token = itr.nextToken();
        if (!patternsToSkip.contains(token)) {
          word.set(token);
          context.write(word, one);
        }
      }
    }
  }
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] remainingArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    // A valid call has exactly 2 arguments, or 4 with the -skip option.
    if (remainingArgs.length != 2 && remainingArgs.length != 4) {
      System.err.println("Usage: wordcount <in> <out> [-skip skipPatternFile]");
      System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCountV2.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    List<String> otherArgs = new ArrayList<String>();
    for (int i = 0; i < remainingArgs.length; ++i) {
      if ("-skip".equals(remainingArgs[i])) {
        job.addCacheFile(new Path(remainingArgs[++i]).toUri());
        job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
      } else {
        otherArgs.add(remainingArgs[i]);
      }
    }
    FileInputFormat.addInputPath(job, new Path(otherArgs.get(0)));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs.get(1)));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```
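One bug in this answer as it was originally posted is worth calling out: its argument check read `if (!(remainingArgs.length != 2 || remainingArgs.length != 4))`, which is always false (no length can equal both 2 and 4, so the `||` is always true), meaning bad argument counts were never rejected. The correct condition is `length != 2 && length != 4`. A standalone demonstration (class name made up for illustration):

```java
public class ArgCheckDemo {
    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            boolean broken = !(n != 2 || n != 4); // as posted: never true
            boolean fixed = (n != 2 && n != 4);   // true exactly when n is invalid
            System.out.println("n=" + n + " broken=" + broken + " fixed=" + fixed);
        }
    }
}
```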
4uqofj5v 2#
I think you should add the appropriate jar files. In your case, you must have the jar containing `org.apache.hadoop.mapreduce.Job` in your project. I imported the necessary classes and interfaces and your program ran fine as well. Please check after importing all of those classes; if there is any problem, leave me a comment.
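To verify that the jar containing `org.apache.hadoop.mapreduce.Job` is actually on the classpath, a quick standalone probe (my own sketch; the class name `CheckHadoopJar` is made up) is:

```java
public class CheckHadoopJar {
    public static void main(String[] args) {
        String name = "org.apache.hadoop.mapreduce.Job";
        try {
            // Resolves only if some classpath entry provides the class.
            Class<?> c = Class.forName(name);
            System.out.println(name + " found in: "
                    + c.getProtectionDomain().getCodeSource());
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is NOT on the classpath");
        }
    }
}
```

If it reports the class as missing, add `hadoop-mapreduce-client-core` (and the jars under Hadoop's `common` folder) to the build path.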