我试图选择包含某些关键字的记录组,并提取每个组中包含最长时间的行。
1-df1:
id1 id2 name time
1 1 xxxLOAD_TIME 1
1 1 xxxLOGIN_LOGIN_SESSION_TIMExxx 2
1 1 xxxxSome other timexxxx 3
2 2 xxSome other timex 1
3 1 xxxLOAD_TIME 1
3 1 xxSome other timexx 2
2-在创建bèflag(列顺序无关紧要)之后,应该如下所示。b\标志表示在id1+id2组中出现登录\会话和加载\时间:
id1 id2 name b_flag time
1 1 xxxLOAD_TIME 1 1
1 1 xxxLOGIN_LOGIN_SESSION_TIMExxx 1 2
1 1 xxxxSome other timexxxx 3
2 2 xxSome other timex 1
3 1 xxxLOAD_TIME 1 1
3 1 xxSome other timexx 2
3-按bĩU标志筛选
id1 id2 name b_flag time
1 1 xxxLOAD_TIME 1 1
1 1 xxxLOGIN_LOGIN_SESSION_TIMExxx 1 2
1 1 xxxxSome other timexxxx 3
3 1 xxxLOAD_TIME 1 1
3 1 xxSome other timexx 2
4-期望输出(按最大时间过滤):
id1 id2 name b_flag time
1 1 xxxxSome other timexxxx 3
3 1 xxSome other timexx 2
以下是我尝试的代码:
create table num1 as
select *
from (
select t.*, sum(b_flag) over(partition by id1,id2) as sum_b_flag,
max(time) over (partition by id1,id2) max_time,
ROW_NUMBER() OVER (PARTITION BY id1,id2) as rn /*ensure no duplicates*/
from (
select
t.*,
case when name LIKE '%LOAD_TIME' or name LIKE '%LOGIN_SESSION_TIME' then 1 end b_flag
from df1 as t
) t
) t
where sum_b_flag > 0 AND name like '%TIME' AND time = max_time AND t.rn = 1
此代码生成以下错误,可能表示内存不足:
错误:执行错误:处理语句时出错:失败:执行错误,从org.apache.hadoop.hive.ql.exec.tez.teztask返回代码2。vertex失败,vertexname=reducer 2,vertexid=vertex\u 158165816621\u 0012\u 183\u 01,诊断=[任务失败,taskid=task\u 158165816621\u 0012\u 183\u 01\u000006,诊断=[任务尝试0失败,info=[错误:运行任务时出错(失败):尝试\u 158165816621 \u 0012 \u 183 \u 01 \u000006 \u 0:java.lang.runtimeexception:java.lang.runtimeexception:org.apache.hadoop.hive.ql.metadata.hiveexception:处理org.apache.hadoop.hive.ql.exec.tez.tezprocessor.initializeandrunprocessor(tezprocessor)上的行时发生配置单元运行时错误。java:296)在org.apache.hadoop.hive.ql.exec.tez.tezprocessor.run(tezprocessor。java:250)位于org.apache.tez.runtime.logicalioprocessorruntimetask.run(logicalioprocessorruntimetask)。java:374)在org.apache.tez.runtime.task.taskrunner2callable$1.run(taskrunner2callable。java:73)在org.apache.tez.runtime.task.taskrunner2callable$1.run(taskrunner2callable。java:61)在javax.security.auth.subject.doas(主题)中的java.security.accesscontroller.doprivileged(本机方法)。java:422)在org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation。java:1730)在。。。。。。
暂无答案!
目前还没有任何答案,快来回答吧!