How do I write a Pig script to extract the log lines between two given timestamps?

Posted by 6ojccjat on 2021-06-21 in Pig

I need help writing a Pig script that extracts the log lines falling between two given timestamps.
Sample log file:

2016/08/17 09:00:00 This is log line 1
2016/08/17 09:05:00 This is log line 2
2016/08/17 10:00:00 This is log line 3
2016/08/18 09:00:00 This is log line 4

eimct9ow #1

Assuming the data is tab-delimited, load the log into two fields and apply FILTER to the first field.

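-- Load the timestamp as a string: 'yyyy/MM/dd HH:mm:ss' is not ISO-8601, so parse it explicitly with ToDate.
A = LOAD 'log.txt' USING PigStorage('\t') AS (dt:chararray, line:chararray);
-- Both bounds are exclusive, as in the original comparison.
B = FILTER A BY ToDate(dt, 'yyyy/MM/dd HH:mm:ss') > ToDate('2016/08/17 09:00:00', 'yyyy/MM/dd HH:mm:ss')
            AND ToDate(dt, 'yyyy/MM/dd HH:mm:ss') < ToDate('2016/08/18 09:00:00', 'yyyy/MM/dd HH:mm:ss');
DUMP B;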
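
The sample lines in the question look space-separated rather than tab-delimited, in which case splitting on '\t' would leave the whole line in the first field. A minimal sketch for that case, assuming every line begins with a 19-character 'yyyy/MM/dd HH:mm:ss' timestamp (the relation names raw and parsed are illustrative only):

```pig
-- Assumption: each line starts with a fixed-width 19-character timestamp.
raw = LOAD 'log.txt' USING TextLoader() AS (line:chararray);

-- Cut the timestamp off the front of the line and parse it into a datetime.
parsed = FOREACH raw GENERATE
    ToDate(SUBSTRING(line, 0, 19), 'yyyy/MM/dd HH:mm:ss') AS dt,
    line;

-- Keep lines strictly between the two bounds (both exclusive).
B = FILTER parsed BY dt > ToDate('2016/08/17 09:00:00', 'yyyy/MM/dd HH:mm:ss')
                 AND dt < ToDate('2016/08/18 09:00:00', 'yyyy/MM/dd HH:mm:ss');
DUMP B;
```

With the four sample lines, only log lines 2 and 3 should pass this filter, because both bounds are exclusive. Switch to >= and <= if the boundary timestamps should be included.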

Using parameters

pig -f myscript.pig -param start_dt='2016/08/17 09:00:00' -param end_dt='2016/08/18 09:00:00'

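-- $start_dt and $end_dt are replaced with the -param values before the script runs.
A = LOAD 'log.txt' USING PigStorage('\t') AS (dt:chararray, line:chararray);
B = FILTER A BY ToDate(dt, 'yyyy/MM/dd HH:mm:ss') > ToDate('$start_dt', 'yyyy/MM/dd HH:mm:ss')
            AND ToDate(dt, 'yyyy/MM/dd HH:mm:ss') < ToDate('$end_dt', 'yyyy/MM/dd HH:mm:ss');
DUMP B;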
