I have an input.txt that looks like this:
{"charId":1111,"encounters":[{"alias":"A","guid":192,"data1":0,"data2":0,"temporary":1},{"alias":"B","guid":952,"data1":0,"data2":0,"temporary":1}]}
{"charId":2222,"encounters":[{"alias":"C","guid":544,"data1":0,"data2":0,"temporary":1}]}
{"charId":3333,"encounters":[]}
My question is how to get the following output:
(1111, A, 192, 0, 0, 1)
(1111, B, 952, 0, 0, 1)
(2222, C, 544, 0, 0, 1)
(3333, , , , , )
P.S. Here is my script, but it only outputs the first three lines.
raw_data = LOAD 'input.txt' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS (json:map[]);
a = FOREACH raw_data GENERATE json#'charId' AS (charId:chararray), FLATTEN(json#'encounters') AS (encounters:map[]);
b = FOREACH a GENERATE charId, encounters#'alias' AS alias, encounters#'guid' AS guid, encounters#'data1' AS data1, encounters#'data2' AS data2, encounters#'temporary' AS temporary;
Thank you very much for your help. I really appreciate it.
1 Answer
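The reason is that the FLATTEN operator always discards empty bags, so the record with an empty encounters list never reaches the final output. One option is to work around it as follows. I wouldn't say it is the best solution, but it at least solves your problem. Pig script: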
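A minimal sketch of one such workaround, not necessarily the exact script meant here: before flattening, swap an empty or null bag for a bag holding a single empty map, so that FLATTEN still emits one row for that charId. The cast of json#'encounters' to a bag of maps, the empty-map placeholder {([])}, and the alias names are assumptions, and the script is untested against Elephant Bird's JsonLoader.

-- Same loader call as in the question.
raw_data = LOAD 'input.txt' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS (json:map[]);

-- Assumption: the map value can be cast to a bag of maps so that IsEmpty applies to it.
a = FOREACH raw_data GENERATE
        (chararray) json#'charId' AS charId,
        (bag{tuple(map[])}) json#'encounters' AS encounters;

-- Replace an empty or null bag with a one-element bag containing an empty map,
-- so FLATTEN produces a row instead of dropping the record.
b = FOREACH a GENERATE
        charId,
        FLATTEN(((encounters IS NULL OR IsEmpty(encounters)) ? {([])} : encounters)) AS (encounter:map[]);

-- Lookups on the empty placeholder map return null for every key.
c = FOREACH b GENERATE
        charId,
        encounter#'alias' AS alias,
        encounter#'guid' AS guid,
        encounter#'data1' AS data1,
        encounter#'data2' AS data2,
        encounter#'temporary' AS temporary;

DUMP c;

The idea is that FLATTEN only drops a record when the bag has no tuples at all, so a placeholder tuple is enough to keep the charId; the missing fields then come through as nulls.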
Output: