Using SUM() in Pig Latin

ddarikpa  posted on 2021-06-25  in  Pig

I have just started writing some scripts in Pig and I am trying to sum an int column. My script looks like this:

DATA = LOAD 'SomeFile' as (fingerPrint, size, str1, str2);
groupedChunks = GROUP DATA BY fingerPrint;

uniqueChunks = FILTER groupedChunks BY COUNT(DATA)==1;
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) as size;

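Now I have a relation with a single column, size. If I call DESCRIBE on it, I get the following output: sizes:{size: int}. This is the step where I need help: how do I get the sum of all the sizes in this column?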

raogr8fs #1

v = GROUP DATA ALL;
result = FOREACH v GENERATE SUM(DATA.size);
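
Note that grouping DATA itself sums size over every loaded row, including fingerprints that occur more than once. As a rough sketch, assuming the filtered `sizes` relation from the question is what should be totalled, the same GROUP ... ALL pattern could be applied to it instead. The aliases `allSizes`, `sizeTotal` and `totalSize` are made up for illustration:

```pig
-- Sketch only: assumes the sizes relation built in the question's script.
-- GROUP ... ALL puts every tuple of sizes into a single group,
-- so SUM sees the whole column at once.
allSizes = GROUP sizes ALL;
sizeTotal = FOREACH allSizes GENERATE SUM(sizes.size) AS totalSize;
DUMP sizeTotal;
```

The bag inside the group is named after the grouped relation, which is why the projection is `sizes.size` here and `DATA.size` above.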

nwwlzxa7 #2

Can you try this?

result = FOREACH (GROUP sizes ALL) GENERATE SUM(sizes);
DUMP result;

Update: complete code
Input file:

a       1       b       c
d       2       e       f

Pig script:

DATA = LOAD 'input.txt' as (fingerPrint, size, str1, str2);
groupedChunks = GROUP DATA BY fingerPrint;
uniqueChunks = FILTER groupedChunks BY COUNT(DATA)==1;
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) as size;
result = FOREACH (GROUP sizes ALL) GENERATE SUM(sizes);
DUMP result;

Output:

(3.0)
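
The total prints as 3.0 rather than 3 because the LOAD statement declares no types, so Pig treats every field as a bytearray and the numeric built-ins cast it to double. Below is a minimal sketch of the same script with types declared in the schema. It assumes the same tab-delimited input.txt, and the chararray/int types are my assumption about the data:

```pig
-- Assumed types: size is an int, the other three fields are strings.
DATA = LOAD 'input.txt' AS (fingerPrint:chararray, size:int, str1:chararray, str2:chararray);
groupedChunks = GROUP DATA BY fingerPrint;
-- Keep only fingerprints that occur exactly once.
uniqueChunks = FILTER groupedChunks BY COUNT(DATA) == 1;
-- Each remaining group holds one row, so MAX just extracts its size.
sizes = FOREACH uniqueChunks GENERATE MAX(DATA.size) AS size;
-- GROUP ... ALL collects all tuples into one bag before summing.
result = FOREACH (GROUP sizes ALL) GENERATE SUM(sizes.size) AS total;
DESCRIBE result;
DUMP result;
```

With an int column, SUM should return a long, so DESCRIBE is included to confirm the resulting type before relying on it.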
