Hive——over(窗口函数)详解

x33g5p2x  于11个月前 转载在 Hive  
字(0.9k)|赞(0)|评价(0)|浏览(194)

over()
over() 函数在Hive中运用广泛,通常搭配row_number() ,min(),max(),sum()来使用,总结下over函数划定窗口的范围

  • over() 因为没设置窗口范围,此时每一行的窗口都是所有的行。
SELECT 
	id,
	name,
	num,
	sum(num) over() sum1 
from 
	aa

也就是每一行后面的sum1是对所有行的num进行sum(num)操作,如果设置好窗口范围,每一行后面的结果是对以当前行为中心按照窗口范围内的行进行sum(num)操作得出结果。一下会有对应的案例

  • over(order by id) id=1的窗口只有一行,id=2的窗口包括id=1,id=2
SELECT 
	id,
	name,
	num,
	sum(num) over(order by id) sum1 
from 
	aa

  • over(partition by name) 每一行根据 name来区分窗口
SELECT 
	id,
	name,
	num,
	sum(num) over(partition by name) sum1 
from 
	aa

  • over(partition by name order by id) 每一行根据 name来区分窗口,再根据order by 取具体的范围
SELECT 
	id,
	name,
	num,
	sum(num) over(partition by name order by id) sum1 
from 
	aa

  • over( order by id range between 1 preceding and 1 following ) range 代表范围 preceding 向前 following 向后 窗口范围当前行前后一行
SELECT 
	id,
	name,
	num,
	sum(num) over( order by id range between 1 preceding and 1 following ) sum1 
from 
	aa

  • over( order by id range between CURRENT ROW AND 1 following ) 窗口范围当前行和后面一行
SELECT 
	id,
	name,
	num,
	sum(num) over( order by id range between CURRENT ROW AND  1 following ) sum1 
from 
	aa

相关文章