Converting one row into multiple rows in Pig

9njqaruj posted on 2021-06-25 in Pig

I want to write a Pig script for the following transformation.
The input is:

ABC,DEF,GHI,JKL,AAA,aaa,1,2,3,bbb,1,2,3,ccc,1,2,3,BBB,aaa,1,2,3,bbb,1,2,3,ccc,1,2,3

The output should be:

ABC,DEF,GHI,JKL,AAA,aaa,1,2,3
ABC,DEF,GHI,JKL,AAA,bbb,1,2,3
ABC,DEF,GHI,JKL,AAA,ccc,1,2,3
ABC,DEF,GHI,JKL,BBB,aaa,1,2,3
ABC,DEF,GHI,JKL,BBB,bbb,1,2,3
ABC,DEF,GHI,JKL,BBB,ccc,1,2,3

Can anyone help me?

vcudknz3 #1

You can write your own custom UDF, or try the approach below.
Input file:

ABC,DEF,GHI,JKL,AAA,aaa,1,2,3,bbb,1,2,3,ccc,1,2,3,BBB,aaa,1,2,3,bbb,1,2,3,ccc,1,2,3,CCC,aaa,1,2,3,bbb,1,2,3,ccc,1,2,3

Pig script:

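-- Load each line as comma-separated fields, addressed by position ($0, $1, ...).
A = LOAD 'input.txt' USING PigStorage(',');
-- First group: label in $4, followed by three 4-field records in $5..$16.
B = FOREACH A GENERATE $0,$1,$2,$3,
                       FLATTEN(TOTUPLE($4)),
                       FLATTEN(TOBAG(
                                     TOTUPLE($5..$8),
                                     TOTUPLE($9..$12),
                                     TOTUPLE($13..$16)
                                    )
                              );
-- Second group: label in $17, followed by three 4-field records in $18..$29.
C = FOREACH A GENERATE $0,$1,$2,$3,
                       FLATTEN(TOTUPLE($17)),
                       FLATTEN(TOBAG(
                                     TOTUPLE($18..$21),
                                     TOTUPLE($22..$25),
                                     TOTUPLE($26..$29)
                                    )
                              );
-- Fields from $30 onward (the CCC group in the sample input) are not read by this script.
D = UNION B,C;
DUMP D;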

Output:

(ABC,DEF,GHI,JKL,AAA,aaa,1,2,3)
(ABC,DEF,GHI,JKL,AAA,bbb,1,2,3)
(ABC,DEF,GHI,JKL,AAA,ccc,1,2,3)
(ABC,DEF,GHI,JKL,BBB,aaa,1,2,3)
(ABC,DEF,GHI,JKL,BBB,bbb,1,2,3)
(ABC,DEF,GHI,JKL,BBB,ccc,1,2,3)
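
For the custom-UDF route mentioned at the start of this answer, the per-row logic might look roughly like the Python sketch below. It is only an illustration of the splitting step, not a ready-made Pig UDF: the function name `explode_row` and its parameters are made up for this example, and it assumes every row has 4 leading fields followed by any number of groups, each made of one label plus three 4-field records (the layout of the sample input). Unlike the positional script above, it handles however many groups a row contains.

```python
def explode_row(fields, prefix_len=4, subs_per_group=3, sub_width=4):
    """Split one wide row into one output row per sub-record.

    Assumed layout (taken from the sample data, not from any Pig API):
    prefix_len fixed fields, then repeated groups of
    [label, sub-record 1, ..., sub-record N], each sub-record sub_width wide.
    """
    prefix = list(fields[:prefix_len])
    rest = list(fields[prefix_len:])
    group_width = 1 + subs_per_group * sub_width

    # Refuse rows that do not match the assumed layout instead of guessing.
    if len(rest) % group_width != 0:
        raise ValueError("row does not divide into whole groups")

    rows = []
    for g in range(0, len(rest), group_width):
        label = rest[g]
        for s in range(subs_per_group):
            start = g + 1 + s * sub_width
            rows.append(prefix + [label] + rest[start:start + sub_width])
    return rows


if __name__ == "__main__":
    line = ("ABC,DEF,GHI,JKL,AAA,aaa,1,2,3,bbb,1,2,3,ccc,1,2,3,"
            "BBB,aaa,1,2,3,bbb,1,2,3,ccc,1,2,3")
    for row in explode_row(line.split(",")):
        print(",".join(row))
```

Wrapping this as an actual Pig UDF (registration, output schema) is left out here, since that depends on how your Pig installation is set up.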
