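I would like to run the HBase export without specifying an outputDir (that is, without writing output files locally). Instead, I want the export job to stream its output directly to a remote host, so that no local temporary storage is needed. This is the relevant part of the export job setup: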
String tableName = args[0];
Path outputDir = new Path(args[1]); // <-- the output directory I would like to avoid
Job job = new Job(conf, "export" + "_" + tableName);
job.setJobName(NAME + "_" + tableName);
job.setJarByClass(Exporter.class);
// Set optional scan parameters
Scan s = getConfiguredScanForJob(conf, args);
TableMapReduceUtil.initTableMapperJob(tableName, s, Exporter.class, null,
null, job);
// No reducers. Just write straight to output files.
job.setNumReduceTasks(0);
job.setOutputFormatClass(SequenceFileOutputFormat.class);
job.setOutputKeyClass(ImmutableBytesWritable.class);
job.setOutputValueClass(Result.class);
FileOutputFormat.setOutputPath(job, outputDir);
job.waitForCompletion(true);
Some posts mention that we can override the getRecordWriter() method of the output format class, but I don't understand how that would work.
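
For reference, the idea those posts seem to describe is roughly this: instead of SequenceFileOutputFormat plus FileOutputFormat.setOutputPath(...), the job would use a custom subclass of org.apache.hadoop.mapreduce.OutputFormat whose getRecordWriter(TaskAttemptContext) returns a RecordWriter that writes each record to a network stream rather than to a file. Below is a minimal sketch of only the writer core, in plain Java so it can be tried without a cluster. The class name StreamingRecordWriter, the connect(host, port) factory, and the length-prefixed wire format are all assumptions made for illustration, not part of HBase or Hadoop, and how a Result is turned into bytes is left out.

```java
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;

// Hypothetical helper, NOT an HBase or Hadoop class.
// In a real job this logic would sit inside a subclass of
// org.apache.hadoop.mapreduce.RecordWriter<ImmutableBytesWritable, Result>,
// created by a custom OutputFormat's getRecordWriter(TaskAttemptContext).
class StreamingRecordWriter implements Closeable {
    private final DataOutputStream out;

    // Wraps any sink, so the same code works for a socket or an in-memory buffer.
    StreamingRecordWriter(OutputStream sink) {
        this.out = new DataOutputStream(new BufferedOutputStream(sink));
    }

    // Assumed setup: each map task opens one TCP connection to a
    // hypothetical receiver listening on host:port.
    static StreamingRecordWriter connect(String host, int port) throws IOException {
        return new StreamingRecordWriter(new Socket(host, port).getOutputStream());
    }

    // Assumed wire format: 4-byte length + row key, then 4-byte length + value.
    // The caller is responsible for serializing the row's Result to bytes.
    void write(byte[] rowKey, byte[] serializedResult) throws IOException {
        out.writeInt(rowKey.length);
        out.write(rowKey);
        out.writeInt(serializedResult.length);
        out.write(serializedResult);
    }

    // Flushes buffered records and closes the underlying stream
    // (for a socket stream, this also closes the socket).
    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

In the job setup, job.setOutputFormatClass(...) would then point at the custom format, and the outputDir argument and the FileOutputFormat.setOutputPath(job, outputDir) call would no longer be needed. One thing to keep in mind with this design: since nothing is committed through the usual file output committer, a retried or speculatively executed map task may send the same rows twice, so the receiving side would probably need to tolerate duplicates.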