hiveql-将多行数据聚合到单行

njthzxwz  于 2021-06-28  发布在  Hive
关注(0)|答案(1)|浏览(267)

我正在努力将多行数据转换成一列。下面是我的表格:

site        country    users    country _rank
cnn.com     840        10000    1
cnn.com     31         4000     3
cnn.com     556        6000     2
rt.com      840        200      3
rt.com      33         6000     2
rt.com      400        10000    1

我想得到的结果是前两个国家的用户数,并将其放在一行中:

site       country_1    country_1_share     country_2     country_2_share
cnn.com    840          10000               556           6000
rt.com     400          10000               33            6000

我尝试了几种不同的方法:

select site, country_1, country_1_share,country_2,country_2_share
from (
  select site
  ,max(CASE WHEN country_rank = 1 THEN country END) AS country_1 
  ,max(CASE WHEN country_rank = 1 THEN users END) as country_1_share 
  ,max(CASE WHEN country_rank = 2 THEN country END) AS country_2
  ,max(CASE WHEN country_rank = 2 THEN users END) as country_2_share 
  from t1
  group by site
)

还有:

select a.site, a.country_1, b.country_1_share,c.country_2,d.country_2_share
from (
  select site, country as country_1
  from t1
  where max(CASE WHEN country_rank = 1 THEN country END)) a
 JOIN (
  select site, users as country_1_share
  from t1
  where max(CASE WHEN country_rank = 1 THEN users END)) b on (a.site=b.site)
 JOIN (
  select site, country as country_2
  from t1
  where max(CASE WHEN country_rank = 2 THEN country END)) c on (a.site = c.site)
 JOIN (
  select site, users as country_2_share
  from t1
  where max(CASE WHEN country_rank = 2 THEN users END)) d on (a.site = c.site)

任何洞察都将不胜感激!

vd2z7a6w

vd2z7a6w1#

这在hive 1.2.1上运行良好:

drop table if exists t1;

create table t1 
as
select 'cnn.com' site,  840 country ,  10000 users,   1 country_rank union all
select 'cnn.com' site,  31  country ,  4000  users,   3 country_rank union all
select 'cnn.com' site,  556 country ,  6000  users,   2 country_rank union all
select 'rt.com'  site,  840 country ,  200   users,   3 country_rank union all
select 'rt.com'  site,  33  country ,  6000  users,   2 country_rank union all
select 'rt.com'  site,  400 country ,  10000 users,   1 country_rank;

select site, country_1, country_1_share,country_2,country_2_share
from (
  select site
  ,max(CASE WHEN country_rank = 1 THEN country END) AS country_1 
  ,max(CASE WHEN country_rank = 1 THEN users END) as country_1_share 
  ,max(CASE WHEN country_rank = 2 THEN country END) AS country_2
  ,max(CASE WHEN country_rank = 2 THEN users END) as country_2_share 
  from t1
  group by site
)s;

 OK
site    country_1       country_1_share country_2       country_2_share
cnn.com 840     10000   556     6000
rt.com  400     10000   33      6000

相关问题