Hive——窗口函数(开窗函数)

x33g5p2x  于11个月前 转载在 Hive  
字(1.3k)|赞(0)|评价(0)|浏览(142)

创建好文件:

vim business.txt

数据准备:

jack,2017-01-01,10
tony,2017-01-02,15
jack,2017-02-03,23
tony,2017-01-04,29
jack,2017-01-05,46
jack,2017-04-06,42
tony,2017-01-07,50
jack,2017-01-08,55
mart,2017-04-08,62
mart,2017-04-09,68
neil,2017-05-10,12
mart,2017-04-11,75
neil,2017-06-12,80
mart,2017-04-13,94

创建新表:

create table business(
name string,
orderdate string,
cost int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

加载数据:

load data local inpath "/opt/module/data/business.txt" into table 
business;

需求:查询在2017年4月份购买过的顾客及总人数。

先查询一下在2017年四月份购买过的人有哪些。

select
    *
from 
    business
where substring(orderdate,0,7) = '2017-04';

over函数:个人理解为over函数以当前的行为中心和设置好的窗口范围 进行over函数之前的操纵,此代码为count(/*),此处的over函数的窗口范围为所有行,所有每一行都会计算总共查询出多少结果。

select
    name,
    count(*) over() 
from 
    business
where substring(orderdate,0,7) = '2017-04'
group by name;

需求:查询顾客的购买明细及月购买总额。

select
    name,
    orderdate,
    cost,
    sum(cost) over(partition by name,month(orderdate)) sum1
from 
    business;

需求:上述的场景,将每个顾客的cost按照日期进行累加。

select
    name,
    orderdate,
    cost,
    sum(cost) over(partition by name order by orderdate) sum1
from 
    business;

需求:查询用户上次的购买日期

select
    name,
    orderdate,
    lag(orderdate,1) over(partition by name order by orderdate)
from
    business;

需求:查询前20%时间的订单信息

select
    name,
    orderdate,
    cost,
    ntile(5) over(order by orderdate) groupID
from
    business;t1

select
    name,
    orderdate,
    cost
from
    (select
    name,
    orderdate,
    cost,
    ntile(5) over(order by orderdate) groupID
from
    business)t1
where groupID = 1;

相关文章