java - Loading data into HBase via HFiles not working

sirbozc5 · published 2021-06-01 in Hadoop

I wrote a mapper to load data from disk into HBase via HFiles. The program runs to completion successfully, but no data shows up in my HBase table. Any ideas what is wrong?
Here is my Java driver:

protected void writeToHBaseViaHFile() throws Exception {
        try {
            System.out.println("In try...");
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", "XXXX");
            Connection connection = ConnectionFactory.createConnection(conf);
            System.out.println("got connection");

            String inputPath = "/tmp/nuggets_from_Hive/part-00000";
            String outputPath = "/tmp/mytemp" + new Random().nextInt(1000);
            final TableName tableName = TableName.valueOf("steve1");
            System.out.println("got table steve1, outputPath = " + outputPath);

            // tag::SETUP[]
            Table table = connection.getTable(tableName);

            Job job = Job.getInstance(conf, "ConvertToHFiles");
            System.out.println("job is setup...");

            HFileOutputFormat2.configureIncrementalLoad(job, table,
                connection.getRegionLocator(tableName)); // <1>
            System.out.println("done configuring incremental load...");

            job.setInputFormatClass(TextInputFormat.class); // <2>

            job.setJarByClass(Importer.class); // <3>

            job.setMapperClass(LoadDataMapper.class); // <4>
            job.setMapOutputKeyClass(ImmutableBytesWritable.class); // <5>
            job.setMapOutputValueClass(KeyValue.class); // <6>

            FileInputFormat.setInputPaths(job, inputPath);
            HFileOutputFormat2.setOutputPath(job, new org.apache.hadoop.fs.Path(outputPath));
            System.out.println("Setup complete...");
            // end::SETUP[]

            if (!job.waitForCompletion(true)) {
                System.out.println("Failure");
            } else {
                System.out.println("Success");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Here is my mapper class:

public class LoadDataMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Cell> {

    public static final byte[] FAMILY = Bytes.toBytes("pd");
    public static final byte[] COL = Bytes.toBytes("bf");
    public static final ImmutableBytesWritable rowKey = new ImmutableBytesWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line = value.toString().split("\t"); // <1>
        byte[] rowKeyBytes = Bytes.toBytes(line[0]);
        rowKey.set(rowKeyBytes);
        KeyValue kv = new KeyValue(rowKeyBytes, FAMILY, COL, Bytes.toBytes(line[1])); // <6>
        context.write(rowKey, kv); // <7>
        System.out.println("line[0] = " + line[0] + "\tline[1] = " + line[1]);
    }

}
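(One side note on the mapper: `value.toString().split("\t")` assumes every input line contains at least one tab. A line without one makes `line[1]` throw `ArrayIndexOutOfBoundsException` and fail the task. A minimal standalone sketch of a more defensive parse — the `parseLine` helper is illustrative, not part of the original code:)

```java
import java.util.Optional;

public class LineParser {
    // Illustrative helper: split a TSV line into {rowKey, value},
    // returning empty for lines that do not have both fields.
    static Optional<String[]> parseLine(String line) {
        // limit=2 keeps any further tabs inside the value field
        String[] parts = line.split("\t", 2);
        if (parts.length < 2 || parts[0].isEmpty()) {
            return Optional.empty(); // skip malformed lines instead of crashing the mapper
        }
        return Optional.of(parts);
    }

    public static void main(String[] args) {
        System.out.println(parseLine("row1\tvalue1").isPresent());  // true
        System.out.println(parseLine("noTabHere").isPresent());     // false
    }
}
```

In the mapper, a malformed line would then be skipped (optionally bumping a counter) rather than killing the attempt.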

I had already created the table steve1 on my cluster, but I get 0 rows after the program runs successfully:

hbase(main):007:0> count 'steve1'
0 row(s) in 0.0100 seconds

=> 0

What I have tried:
I added print statements in the mapper class to check whether it actually reads the data, but the output never showed up in my console. I don't know how to debug this.
Any ideas are greatly appreciated!
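(On the debugging point: when the job runs on a YARN cluster, a mapper's `System.out` goes to the task's container logs, not the client console, so the prints may well be there. A sketch of fetching them; the application ID here is a placeholder for whatever ID the job prints at submission:)

```shell
# Replace application_XXXX with the ID printed when the job starts
yarn logs -applicationId application_XXXX | grep 'line\[0\]'
```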

csbfibhn1#

This job only creates the HFiles; you still need to load the HFiles into the table. You need to do something like this:

LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
loader.doBulkLoad(new Path(outputPath), admin, hTable, regionLocator);
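Putting it together, the end of the driver could look roughly like this. This is a sketch against the HBase 1.x client API, where `LoadIncrementalHFiles.doBulkLoad(Path, Admin, Table, RegionLocator)` exists; it assumes the `conf`, `connection`, `tableName`, `outputPath`, and `job` variables from the driver above are in scope:

```java
// Sketch only: run the MapReduce job, then bulk-load the HFiles it produced.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

if (job.waitForCompletion(true)) {
    // The job only wrote HFiles under outputPath; they are not yet in the table.
    try (Admin admin = connection.getAdmin();
         Table table = connection.getTable(tableName);
         RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
        LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
        loader.doBulkLoad(new Path(outputPath), admin, table, regionLocator);
        System.out.println("Bulk load complete");
    }
} else {
    System.out.println("Failure");
}
```

After `doBulkLoad` returns, `count 'steve1'` in the HBase shell should reflect the loaded rows. (In HBase 2.x this tool was moved and later replaced by `BulkLoadHFiles`, so the exact class and signature depend on your version.)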
