Posted by mzaanser on 2021-06-21 in Pig

I'm new to Pig syntax and was wondering whether someone could give me a hint on translating this SQL code into Pig.

SELECT column1, column2, SUM(column3)
FROM table
WHERE column5 = 100
GROUP BY column2;

So far I have:

data = LOAD....etc.
filterColumn = FILTER data BY column5 = 100;
groupColumn = Group filterColumn By column2;
result = foreach groupColumn Generate group, column1, SUM(column3) as sumCol3; 
DUMP result;

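This doesn't work. The error message is "Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast."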

Answer #1, by laik7k3q

You can use the following Pig commands:

test = LOAD '<testdata>' USING PigStorage(',') AS (column1:<datatype>, column2:<datatype>, column3:<datatype>, column5:<datatype>);

A = FILTER test BY column5 == 100;

B = GROUP A BY column2;

-- After GROUP, the rows of each group sit in a bag named A (the grouped relation), not test.
C = FOREACH B GENERATE group, A.column1, A.column2, SUM(A.column3);

DUMP C;

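Note that USING PigStorage and the AS schema are both optional. Without USING, Pig falls back to PigStorage with a tab delimiter; without AS, the fields are untyped (bytearray) and have to be referenced by position ($0, $1, ...) rather than by name.
Hope this helps.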
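To illustrate the note about AS being optional, here is a rough sketch of the same pipeline with no schema in the LOAD statement. The file name 'testdata.csv', the comma delimiter, the four-column layout (column1, column2, column3, column5 at positions 0 to 3), and the int casts are all assumptions made for the example, not something the question specifies.

```pig
-- Hypothetical input file; no AS clause, so fields are bytearray and positional.
raw = LOAD 'testdata.csv' USING PigStorage(',');

-- Name the positional fields and cast the numeric ones explicitly (assumed to be ints).
typed = FOREACH raw GENERATE
            $0      AS column1,
            $1      AS column2,
            (int)$2 AS column3,
            (int)$3 AS column5;

A = FILTER typed BY column5 == 100;
B = GROUP A BY column2;

-- column3 is projected from the bag A before being summed.
C = FOREACH B GENERATE group, SUM(A.column3);
DUMP C;
```

Casting up front keeps SUM from having to guess how to treat untyped values, which is the kind of ambiguity the "use an explicit cast" message refers to.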

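Answer #2, by gzszwxb4

SUM() computes the sum of the numeric values in a single-column bag, and it expects a bag as its input. After GROUP, each record holds the key (group) and a bag named after the grouped relation (filterColumn), so the columns have to be projected from that bag. The FOREACH ... GENERATE would therefore be: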

result = foreach groupColumn Generate group, filterColumn.column1, SUM(filterColumn.column3) as sumCol3;
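For reference, a minimal end-to-end sketch of how this fits together. The question elides the LOAD statement, so the file name 'data.csv', the comma delimiter, and the five int columns below are assumptions for illustration only.

```pig
-- Hypothetical path and schema; adjust to the real data.
data = LOAD 'data.csv' USING PigStorage(',')
       AS (column1:int, column2:int, column3:int, column4:int, column5:int);

-- Keep only the rows where column5 equals 100.
filterColumn = FILTER data BY column5 == 100;

-- Each output record is (group, filterColumn): the key plus a bag of matching rows.
groupColumn = GROUP filterColumn BY column2;

-- Print the schema to see the bag and the name it carries.
DESCRIBE groupColumn;

-- Project column3 out of the bag, then sum it per group.
result = FOREACH groupColumn GENERATE
             group AS column2,
             filterColumn.column1,
             SUM(filterColumn.column3) AS sumCol3;

DUMP result;
```

Note that filterColumn.column1 yields a bag of every column1 value in the group rather than a single value, because column1 is neither the grouping key nor aggregated.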

Also, in the FILTER statement, equality is tested with ==, not =:
```
filterColumn = FILTER data BY column5 == 100;
```
