I have a table whose rows have the following format:
user | purchase | time_of_purchase|quantity
Sample data:
1234 | Bread | Jul 7 20:48| 1
1234 | Shaving Cream | July 10 14:20 | 2
5678 | Milk | July 7 3:48 | 1
5678 | Bread | July 7 3:49 | 2
5678 | Bread | July 7 15:30 | 1
I want to build each user's purchase history in the following format:
1234 | {[Bread , Jul 7 20:48,1] ,[ Shaving Cream , July 10 14:20, 2 ]}
5678 | {[Milk, July 7 3:48 , 1 ] , [Bread , July 7 3:49 , 2], [Bread , July 7 15:30 , 1]}
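Is it possible to do this in a Hive or Pig script? I tried collect_list, but it does not keep the values from the different columns aligned when they are combined. I also tried the Brickhouse collect UDF, but it behaves like collect_set, so I lose some of the information.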
1 Answer
Pig script
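A minimal sketch of one way to do this in Pig, not necessarily the script the answer had in mind. It assumes the rows sit in a pipe-delimited text file; the path 'purchases.txt' and all relation names are hypothetical. Grouping by user produces a bag of whole tuples, so purchase, time, and quantity stay together for each row, and duplicate rows are kept.

```pig
-- Hypothetical input path; every field is loaded as text first because
-- the sample rows have spaces around the '|' delimiter.
raw = LOAD 'purchases.txt' USING PigStorage('|')
      AS (user:chararray, purchase:chararray,
          time_of_purchase:chararray, quantity:chararray);

-- Strip the padding and cast quantity to an integer.
purchases = FOREACH raw GENERATE
      TRIM(user)             AS user,
      TRIM(purchase)         AS purchase,
      TRIM(time_of_purchase) AS time_of_purchase,
      (int) TRIM(quantity)   AS quantity;

-- One output row per user; the grouped bag holds the complete tuples.
by_user = GROUP purchases BY user;

-- Project the three history fields out of each tuple in the bag.
history = FOREACH by_user GENERATE
      group AS user,
      purchases.(purchase, time_of_purchase, quantity) AS purchase_history;

DUMP history;
```

Pig does not guarantee the order of tuples inside a bag. If the history has to be chronological, the times would first need converting to a sortable form, followed by a nested ORDER inside the FOREACH.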
You can find a few examples at http://ybhavesh.blogspot.com/
Hope this helps.