SQL Hive Pig MapReduce

Posted by oewdyzsn on 2021-05-30 in Hadoop


Each row has 5 columns, separated by commas:

1st column is name
2nd column is date_of_purchase
3rd column is product
4th column is mode of payment
5th column is total_amount

Here is a sample of the data, to show what it contains:

surender,2014-03-09,TV,OFFLINE,20000
surender,2014-01-01,Mobile,ONLINE,18000
Raja,2014-09-21,Laptop,ONLINE,30000
Surender,2014-10-12,Laptop,ONLINE,40000
Raja,2014-FEB-11,MusicSystem,ONLINE,2000
Kumar,2014-07-09,Ipod,OFFLINE,4000
Kumar,2014-06-08,TV,ONLINE,20000
Raja,2014-11-07,SPeakers,OFFLINE,8000
Kumar,2014-10-18,Laptop,ONLINE,30000

I want to see how much each person spent in online mode and in offline mode.
Basically, the reducer output should look like this:

surender   OFFLINE   20000
surender   ONLINE    58000
Raja       OFFLINE   8000
Raja       ONLINE    32000
Kumar      OFFLINE    4000
Kumar      ONLINE    50000
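
A minimal local sketch of the map and reduce steps that would produce per-person, per-mode totals like the ones above. It is written in Python in the style of a Hadoop Streaming job but runs without Hadoop; the function names (`map_line`, `reduce_pairs`, `totals_by_name_and_mode`) are hypothetical, and it assumes that names differing only in letter case (surender, Surender) refer to the same person, which is what the expected totals imply.

```python
from itertools import groupby

SAMPLE = """\
surender,2014-03-09,TV,OFFLINE,20000
surender,2014-01-01,Mobile,ONLINE,18000
Raja,2014-09-21,Laptop,ONLINE,30000
Surender,2014-10-12,Laptop,ONLINE,40000
Raja,2014-FEB-11,MusicSystem,ONLINE,2000
Kumar,2014-07-09,Ipod,OFFLINE,4000
Kumar,2014-06-08,TV,ONLINE,20000
Raja,2014-11-07,SPeakers,OFFLINE,8000
Kumar,2014-10-18,Laptop,ONLINE,30000
"""


def map_line(line):
    """Map step: turn one CSV row into ((name, mode), amount)."""
    name, _date, _product, mode, amount = line.strip().split(",")
    # Assumption: case-insensitive names, so fold the key to lower case.
    return (name.lower(), mode.upper()), int(amount)


def reduce_pairs(pairs):
    """Reduce step: sum amounts per key.

    The input must already be sorted by key, which is what the
    shuffle/sort phase of MapReduce provides to a reducer.
    """
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(amount for _, amount in group)


def totals_by_name_and_mode(text):
    """Run map, a local sort standing in for the shuffle, then reduce."""
    pairs = sorted(map_line(line) for line in text.splitlines() if line.strip())
    return dict(reduce_pairs(pairs))


totals = totals_by_name_and_mode(SAMPLE)
for (name, mode), total in sorted(totals.items()):
    print(name, mode, total)
```

Because the key is lower-cased, the printed names are lower case as well (for example `raja` rather than `Raja`); keeping the original spelling for display would need an extra lookup.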

The final output should look like this:

surender 20000  58000
Raja     8000   32000
Kumar     4000   50000
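
The step from per-mode totals to one row per person is a pivot. The sketch below, in Python, shows one way to do it; the `pivot` helper and its `modes` parameter are hypothetical names, and the column order (OFFLINE first, then ONLINE) is taken from the desired output above. A person with no purchases in one mode gets 0 in that column, which is an assumption since the sample has no such case.

```python
def pivot(totals, modes=("OFFLINE", "ONLINE")):
    """Turn {(name, mode): total} into {name: [total for each mode]}."""
    rows = {}
    for (name, mode), total in totals.items():
        # Start every person with zeros so a missing mode shows as 0.
        row = rows.setdefault(name, [0] * len(modes))
        # Raises ValueError for a mode that is not listed in `modes`.
        row[modes.index(mode)] = total
    return rows


# Per-person, per-mode totals copied from the expected reducer output.
per_mode = {
    ("surender", "OFFLINE"): 20000,
    ("surender", "ONLINE"): 58000,
    ("Raja", "OFFLINE"): 8000,
    ("Raja", "ONLINE"): 32000,
    ("Kumar", "OFFLINE"): 4000,
    ("Kumar", "ONLINE"): 50000,
}

for name, (offline, online) in pivot(per_mode).items():
    print(f"{name:<10}{offline:<8}{online}")
```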

Could you give me a Hive or Pig query, or a MapReduce program?

hadoop Hive mapreduce apache-pig hadoop-streaming

Source: https://stackoverflow.com/questions/27549162/sql-hive-pig-mapreduce

1 Answer

gtlvzcf8 #1

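-- Load the comma-separated file with a typed schema
A = LOAD 'file_name' USING PigStorage(',') AS (name:chararray, date:chararray, product:chararray, mode:chararray, total:long);
-- Group by person and payment mode, then sum the amounts in each group
B = GROUP A BY (name, mode);
C = FOREACH B GENERATE group.name AS name, group.mode AS mode, SUM(A.total) AS total;
-- Regroup by person to collect that person's per-mode totals into one bag
-- (the order of tuples inside a bag is not guaranteed)
D = GROUP C BY name;
E = FOREACH D GENERATE group, C.total;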

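If the data has inconsistent capitalization, as in the sample you provided (surender vs. Surender), you need to convert the names to uppercase before grouping.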
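
To illustrate why the case conversion matters, here is a small Python sketch using the names from the sample data. Grouping on the raw names yields four groups, while grouping on upper-cased names yields three. In Pig the corresponding conversion would presumably use the built-in `UPPER` function on `name` before the first `GROUP`; the variable names below are hypothetical.

```python
# Name column of the nine sample rows, in file order.
names = ["surender", "surender", "Raja", "Surender", "Raja",
         "Kumar", "Kumar", "Raja", "Kumar"]

# Grouping keys without and with case normalization.
raw_groups = set(names)
normalized_groups = {n.upper() for n in names}

print(sorted(raw_groups))         # ['Kumar', 'Raja', 'Surender', 'surender']
print(sorted(normalized_groups))  # ['KUMAR', 'RAJA', 'SURENDER']
```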

Answered 2021-05-30
