Concatenate all partitions in a Hive dynamically partitioned table

Posted by nfg76nw0 on 2021-05-29 in Hadoop


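My Hive table is partitioned by date and covers 2 years; each partition contains 200 files of 2 MB each.
I am able to concatenate the files in a partition by running the command "alter table table_name partition (partition_column_name='2017-12-31') concatenate".
Running this query manually for every partition takes too much time, so is there an easier way to do it?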

hadoop Hive hiveql

Source: https://stackoverflow.com/questions/56939168/concatenate-all-partitions-in-hive-dynamically-partitioned-table

1 answer


3bygqnnd1#

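Option 1: Select and overwrite the same Hive table: Hive supports insert overwrite into the same table it reads from. Use this option if you load data into the table using insert statements only (not by loading files through HDFS).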

hive> SET hive.exec.dynamic.partition = true;
hive> SET hive.exec.dynamic.partition.mode = nonstrict;
hive> Insert overwrite table <partition_table_name> partition(<partition_col>) 
      select * from <db>.<partition_table_name>;

You can also use sort by, distribute by, and additional Hive parameters to control the number of files created in the table.
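As a rough illustration of the distribute by idea, the sketch below prints a HiveQL script that adds `DISTRIBUTE BY` on the partition column to the Option 1 query, so that all rows of a partition are sent to the same reducer and typically end up in a single file per partition. The function name `build_rewrite_hql`, the table `default.partitn`, and the column `dt` are placeholders chosen for this example, not names from the question.

```shell
#!/bin/bash
# Hypothetical helper: prints a HiveQL script that rewrites a partitioned
# table onto itself, distributing rows by the partition column.
# $1 = table name, $2 = partition column (both supplied by the caller).
build_rewrite_hql() {
  local table="$1" part_col="$2"
  cat <<EOF
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
INSERT OVERWRITE TABLE ${table} PARTITION (${part_col})
SELECT * FROM ${table}
DISTRIBUTE BY ${part_col};
EOF
}

# Example usage (placeholder names; not executed here):
#   build_rewrite_hql "default.partitn" "dt" > rewrite.hql
#   hive -f rewrite.hql
```

One file per partition is only a reasonable target when partitions are small, as in the question (about 400 MB each); for much larger partitions a single reducer per partition may become a bottleneck.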
Option 2: Using a shell script:

```
bash$ cat cnct.hql
alter table default.partitn1 partition(${hiveconf:var1} = '${hiveconf:var2}') concatenate;
```

Trigger the above `.hql` script from a shell script (for loop):

```
bash$ cat trigg.sh
#!/bin/bash
# list the partitions, one col=value entry per line
# (this should be the same table that cnct.hql alters)
id=$(hive -e "show partitions default.partitn")
echo "partitions: " $id
for f in $id; do
  echo "select query for: " $f
  # split the partition on = and assign the two halves to two variables
  IFS="=" read -r var1 var2 <<< "$f"
  # pass the variables and execute the cnct.hql script
  hive --hiveconf var1="$var1" --hiveconf var2="$var2" -f cnct.hql
done
```
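The loop above starts a new Hive session for every partition, which adds startup overhead with about two years of daily partitions. A possible variation, sketched here under assumptions, is to generate all the `ALTER TABLE ... CONCATENATE` statements into one file and run it in a single session. The function names `partition_to_alter` and `generate_concat_hql` are made up for this example, and the sketch assumes partition values contain no quotes or escaped characters. Note that `CONCATENATE` applies only to RCFile and ORC tables.

```shell
#!/bin/bash
# Hypothetical helper: converts one line of SHOW PARTITIONS output
# (e.g. "dt=2017-12-31" or "year=2017/month=12") into an
# ALTER TABLE ... CONCATENATE statement for the given table.
partition_to_alter() {
  local table="$1" spec="$2"
  local clause="" kv key value
  local -a parts
  # Multi-level partitions are separated by "/"
  IFS='/' read -r -a parts <<< "$spec"
  for kv in "${parts[@]}"; do
    key="${kv%%=*}"     # text before the first "="
    value="${kv#*=}"    # text after the first "="
    clause+="${clause:+, }${key}='${value}'"
  done
  printf 'ALTER TABLE %s PARTITION (%s) CONCATENATE;\n' "$table" "$clause"
}

# Reads partition specs on stdin and prints one statement per non-empty line.
generate_concat_hql() {
  local table="$1" spec
  while IFS= read -r spec || [ -n "$spec" ]; do
    if [ -n "$spec" ]; then
      partition_to_alter "$table" "$spec"
    fi
  done
}

# Example usage (placeholder table name; not executed here):
#   hive -S -e "show partitions default.partitn" \
#     | generate_concat_hql default.partitn > concat_all.hql
#   hive -f concat_all.hql
```

Generating the file first also makes it possible to review the statements, or to filter them to a date range, before anything is run.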

Answered 2021-05-29
