Pig FLATTEN without null values

628mspwn  posted on 2021-06-25  in  Pig

I have a relation in Pig whose last field is a bag:

(1139-50052,Aquatic,Consumer,6,makarina,2,{(),(Unknown)})
(1139-50052,Aquatic,Consumer,6,jabong,2,{(),(),(),(Unknown)})

I need to flatten the bag without keeping the empty values:

(1139-50052,Aquatic,Consumer,6,makarina,2,Unknown)
(1139-50052,Aquatic,Consumer,6,jabong,2,Unknown)

Please advise.

8ehkhllq  #1

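One option is to pass the bag to the BagToString() function, so that the null values are discarded, and then split the resulting string on the delimiter '_' (the default delimiter of BagToString()):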

FLATTEN(STRSPLIT(BagToString(BagName),'_+'))

Besides your input, this also works for other combinations, as the example below shows.
Input:

1139-50052      Aquatic Consumer        6       makarina        2       {(),(Unknown)}
1139-50052      Aquatic Consumer        6       jabong  2       {(),(),(),(Unknown)}
1139-50052      Aquatic Consumer        6       test1   2       {(unknown1),(),(),(Unknown2)}
1139-50052      Aquatic Consumer        6       test2   2       {(unknown1),(unknown2),(),(Unknown3)}

Pig script:

A = LOAD 'input' USING PigStorage() AS (f0,f1,f2,f3,f4,f5,B:{T:(f7)});
B = FOREACH A GENERATE f0,f1,f2,f3,f4,f5,FLATTEN(STRSPLIT(BagToString(B),'_+'));
DUMP B;

Output:

(1139-50052,Aquatic,Consumer,6,makarina,2,Unknown)
(1139-50052,Aquatic,Consumer,6,jabong,2,Unknown)
(1139-50052,Aquatic,Consumer,6,test1,2,unknown1,Unknown2)
(1139-50052,Aquatic,Consumer,6,test2,2,unknown1,unknown2,Unknown3)
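To see why the split pattern is '_+' rather than a plain '_', the join-then-split idea can be modelled outside Pig. The following is only a rough Python sketch, not Pig's implementation: the function name flatten_without_nulls is hypothetical, each empty tuple () is modelled as None, and it assumes an empty entry contributes an empty string to the joined text. Whether Pig's BagToString() actually behaves that way for empty tuples is not something this sketch claims.

```python
import re


def flatten_without_nulls(bag, delimiter="_"):
    """Join bag values with a delimiter, then split on runs of that delimiter.

    `bag` is a list of values where None stands in for an empty tuple ().
    This models the idea behind STRSPLIT(BagToString(bag), '_+'); it is not
    a reimplementation of either Pig function.
    """
    # Render empty entries as empty strings, so they leave adjacent delimiters.
    joined = delimiter.join("" if value is None else str(value) for value in bag)
    # Split on one or more consecutive delimiters, collapsing the gaps.
    parts = re.split(re.escape(delimiter) + "+", joined)
    # Drop the empty pieces that a leading or trailing delimiter leaves behind.
    return [part for part in parts if part]


# The four bags from the example input above.
rows = [
    ("makarina", [None, "Unknown"]),
    ("jabong", [None, None, None, "Unknown"]),
    ("test1", ["unknown1", None, None, "Unknown2"]),
    ("test2", ["unknown1", "unknown2", None, "Unknown3"]),
]

for name, bag in rows:
    print(name, flatten_without_nulls(bag))
```

One limitation follows directly from splitting on the delimiter: a value that itself contains '_' is broken into several pieces, so this approach only suits data where the delimiter never occurs inside a value.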
