如何在hive中找到最长的持续时间

olqngx59  于 2021-05-31  发布在  Hadoop
关注(0)|答案(2)|浏览(391)

我在Hive逻辑上有个小问题。我需要在我的数据集中找到最长的客户

customer       time
 cust 1        8:01
 cust 1        8:02
 cust 2        8:02
 cust 2        8:03
 cust 2        8:04
 cust 2        8:05
 cust 3        8:02
 cust 3        8:03

在我的示例中,cust2拥有的客户时间最长,基于他/她参与的总时间。客户1总计数为2,客户2计数为4,客户3计数为2。我的问题在8:01到8:05之间,最长的时间属于客户2。因此,cust2应该作为结果出现。我如何实现这个逻辑。
请帮忙
注:需要无限制操作的解决方案

guicsvcw

guicsvcw1#

我想这会给你想要的

Select distinct customer from table where 
     (distinct time) = (select max(distinct time) from table)

或者如果你想要按组减少

Select  customer from( 
           (Select customer,(max(time)-
         min(time)) as time_diff
          from table group by customer
         order by time_diff desc LIMIT 1)
          ;
z2acfund

z2acfund2#

像这样:

select customer
from t
where time >= '8:01' and time <= '8:05'
group by customer
order by count(*) desc
limit 1;

相关问题