如何将mysql查询转换为hive查询

sqougxex  于 2021-04-05  发布在  Hive
关注(0)|答案(1)|浏览(471)

我有这个表。

CREATE TABLE ip_logs (
  `ip_address` VARCHAR(11),
  `start_date` VARCHAR(11),
  `end_date` VARCHAR(11),
  `loc_id` INTEGER
);

INSERT INTO ip_logs
  (`ip_address`,`start_date`,`end_date`, `loc_id`)
VALUES
 ('120.0.53.21','2020-01-03','2020-01-09', '5'),
 ('198.5.273.2','2020-01-10','2020-01-14', '4'),
 ('198.5.273.2','2020-01-10','2020-01-14', '4'),
 ('198.5.273.2','2020-01-10','2020-01-14', '4'),
 ('100.36.33.1','2020-02-01','2020-02-02', '4'),
 ('100.36.33.1','2020-02-01','2020-02-02', '4'),
 ('100.36.33.1','2020-02-01','2020-02-02', '4'),
 ('198.0.47.33','2020-02-22','2020-02-24', '2'),
 ('122.8.0.11', '2020-02-25','2020-02-30','4'),
 ('198.0.47.33','2020-03-10','2020-03-17', '2'),
 ('198.0.47.33','2020-03-10','2020-03-17', '2'),
 ('122.8.0.11', '2020-03-18','2020-03-23','4'),
 ('198.5.273.2','2020-03-04','2020-03-09', '3'),
 ('106.25.12.2','2020-03-24','2020-03-30', '1');

我使用这个查询来选择最常用的IP地址。

select  (
select ip_address
from ip_logs t2
where t2.loc_id = t1.loc_id
group by ip_address
order by count(*) desc
limit 1)

from ip_logs t1
group by loc_id

这在mysql8.0中可以工作。然而在hive中却不能工作,我得到这个错误。
cannot recognize input near 'select' 'ip_address' 'from' in expression specification无法识别表达式中'select''ip_address' 'from'附近的输入。
预期的输出是:

loc_id | ip_address
5        120.0.53.21
4        198.5.273.2
2        198.0.47.33
3        198.5.273.2
1        106.25.12.2
zynd9foi

zynd9foi1#

你可以尝试使用row_number()窗口函数。

select * from
(
select ip_address,loc_id,count(*) as frequency
row_number() over(partition by loc_id order by count(*) desc) as rn
from ip_logs group by ip_address,loc_id
)A where rn=1

相关问题