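Bucketing is not enforced by default and has to be switched on (this applies to Hive 1.x and earlier; the property was removed in Hive 2.0, where bucketing is always enforced). Check the current value: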
set hive.enforce.bucketing;
+-------------------------------+--+
|              set              |
+-------------------------------+--+
| hive.enforce.bucketing=false  |
+-------------------------------+--+
Enable it:
set hive.enforce.bucketing = true;
+------------------------------+--+
|             set              |
+------------------------------+--+
| hive.enforce.bucketing=true  |
+------------------------------+--+
The insert is ultimately compiled into a MapReduce job, so set the number of reducers to 4, one per bucket:
set mapreduce.job.reduces = 4;
+--------------------------+--+
|           set            |
+--------------------------+--+
| mapreduce.job.reduces=4  |
+--------------------------+--+
Create the bucketed table:
create table t1(id int, name string, age int, dept string)
clustered by(id)
into 4 buckets
row format delimited
fields terminated by ',';
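The insert that follows reads from t2, which the text never defines. A minimal sketch of what it could look like, assuming t2 is a plain, non-bucketed staging table with the same four columns; the local file path is hypothetical:

```sql
-- Assumed staging table: same columns as t1, but not bucketed.
create table t2(id int, name string, age int, dept string)
row format delimited
fields terminated by ',';

-- Hypothetical path; point it at a comma-separated file of id,name,age,dept rows.
load data local inpath '/tmp/t2.csv' into table t2;

-- Confirm that t1 was registered as bucketed: the output should report
-- "Num Buckets" as 4 and "Bucket Columns" as [id].
describe formatted t1;
```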
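Load the data with insert + select (a plain load data would only copy the file into the table directory without splitting rows into buckets):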
insert overwrite table t1 select * from t2 cluster by(id);
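One way to check the result, sketched under the assumption that t1 lives in the default database under the default warehouse directory (/user/hive/warehouse; adjust the path otherwise):

```sql
-- List the table directory; with 4 buckets there should be 4 data files,
-- one written by each reducer.
dfs -ls /user/hive/warehouse/t1;

-- Read back only the first of the 4 buckets.
select * from t1 tablesample(bucket 1 out of 4 on id);

-- Preview which bucket each source row maps to. For an int column the hash
-- is the value itself, so this is effectively id modulo 4.
select id, pmod(hash(id), 4) as bucket_no from t2;
```

Note that bucket_no in the last query is zero-based, while tablesample counts buckets from 1, so rows with bucket_no 0 are the ones returned by "bucket 1 out of 4".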