I am loading a TSV file with a datetime column and a long column:
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:datetime, userid:long); DUMP A;
Sample input line:
Tue Feb 11 05:02:10 +0000 2014 205291417
Output line:
, 205291417
The date field comes out empty. What should I do?
1 Answer
eivgtgni1#
You should load the date as a chararray (date:chararray), then use FOREACH GENERATE with Pig's built-in ToDate function to convert it. The format string is based on SimpleDateFormat.
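Since the format string follows SimpleDateFormat conventions, one way to sanity-check a candidate pattern before running the Pig job is to try it in plain Java against the sample line. This is only a sketch: the pattern `EEE MMM dd HH:mm:ss Z yyyy` is an assumption derived from the sample input in the question, the class and method names are made up for illustration, and Pig's own parser may differ in details from `java.text.SimpleDateFormat`.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class TweetDatePattern {
    // Assumed pattern, chosen to match: Tue Feb 11 05:02:10 +0000 2014
    static final String PATTERN = "EEE MMM dd HH:mm:ss Z yyyy";

    // Parses one date string with the assumed pattern.
    static Date parseTweetDate(String text) {
        // Locale.ENGLISH so "Tue" and "Feb" parse regardless of the default locale.
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, Locale.ENGLISH);
        try {
            return fmt.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Pattern does not match: " + text, e);
        }
    }

    // Formats the parsed instant in UTC so the result is easy to compare by eye.
    static String toIsoUtc(Date d) {
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ENGLISH);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        return out.format(d);
    }

    public static void main(String[] args) {
        Date d = parseTweetDate("Tue Feb 11 05:02:10 +0000 2014");
        System.out.println(toIsoUtc(d)); // prints 2014-02-11T05:02:10Z
    }
}
```

If the pattern does not fit the input, `parseTweetDate` throws instead of silently returning an empty value, which makes a mismatch easier to spot than the blank field in the Pig output.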
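```
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:chararray, userid:long);
-- pattern chosen to match the sample line: Tue Feb 11 05:02:10 +0000 2014
B = FOREACH A GENERATE ToDate(date, 'EEE MMM dd HH:mm:ss Z yyyy') AS date, userid;
DUMP B;
```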