SQL partition by one column and some field types

Posted by af7jpaap on 2021-06-26 in Hive


My table is much larger, but a small excerpt looks like this:

+--------+---+----------+--------+------------+---
|distance|qtt|deliver_by| store  |deliver_time| ...
+--------+---+----------+--------+------------+---
|   11   |  1|  pa      | store_a|  1111      |
|   123  |  2|  pa      | store_a|  1112      |
|   33   |  3|  pb      | store_a|  1113      |
|   33   |  2|  pa      | store_b|  2221      |
|   44   |  2|  pb      | store_b|  2222      |
|   5    |  2|  pc      | store_b|  2223      |
|   5    |  2|  pc      | store_b|  2224      |
|   6    |  5|  pb      | store_c|  3331      |
|   7    |  5|  pb      | store_c|  3332      |
+--------+---+----------+--------+------------+---

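There are multiple stores but only 3 possible deliverers (deliver_by: pa, pb and pc), each delivering products at a specific time. Treat deliver_time as a timestamp.
I want to select the whole table and add 6 new columns: the min and max deliver_time per deliver_by within each store. A store can be served by any of the three deliverers (pa, pb, pc), but it does not have to be served by all of them.
I can get an almost correct result with the query below. The problem is that when a given deliver_by value does not exist for a store, I do not get a null but rather the min/max of some other delivery for that store.
I would really like to use partition by, so I wrote this to add the new min/max columns: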

select
  min(deliver_time) over (partition by store, deliver_by='pa') as min_time_sd_pa
, max(deliver_time) over (partition by store, deliver_by='pa') as max_time_sd_pa

, min(deliver_time) over (partition by store, deliver_by='pb') as min_time_sd_pb
, max(deliver_time) over (partition by store, deliver_by='pb') as max_time_sd_pb

, min(deliver_time) over (partition by store, deliver_by='pc') as min_time_sd_pc
, max(deliver_time) over (partition by store, deliver_by='pc') as max_time_sd_pc

, distance, qtt, ....
from mytable

The correct output would be:

min_time_sd_pa|max_time_sd_pa|min_time_sd_pb|max_time_sd_pb|min_time_sd_pc|max_time_sd_pc|distance|qtt|deliver_by| store  |deliver_time
--------------+--------------+--------------+--------------+--------------+--------------+--------+---+----------+--------+------------
    1111      |  1112        |  1113        |   1113       |   null       |   null       |   11   |  1| pa       | store_a| 1111
    1111      |  1112        |  1113        |   1113       |   null       |   null       |   123  |  2| pa       | store_a| 1112
    1111      |  1112        |  1113        |   1113       |   null       |   null       |   33   |  3| pb       | store_a| 1113
    2221      |  2221        |  2222        |   2222       |   2223       |   2224       |   33   |  2| pa       | store_b| 2221
    2221      |  2221        |  2222        |   2222       |   2223       |   2224       |   44   |  2| pb       | store_b| 2222
    2221      |  2221        |  2222        |   2222       |   2223       |   2224       |   5    |  2| pc       | store_b| 2223
    2221      |  2221        |  2222        |   2222       |   2223       |   2224       |   5    |  2| pc       | store_b| 2224
    null      |  null        |  3331        |   3332       |   null       |   null       |   6    |  5| pb       | store_c| 3331
    null      |  null        |  3331        |   3332       |   null       |   null       |   7    |  5| pb       | store_c| 3332
---------------------------------------------------------------------------------------------------------------------------------------

What am I missing in my select min(..) over .., or how can I achieve this result in the simplest way? I am using HiveQL, but I guess this is generic across most SQL DBMSs.
Thanks

sql Hive Partition window

Source: https://stackoverflow.com/questions/45123160/sql-partition-by-one-column-and-some-field-types

1 answer


suzh9iv81#

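You can use a case expression inside min and max. The case returns null for rows of any other deliverer, and min and max ignore nulls, so a store with no deliveries by a given deliverer gets null in those columns.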

select
 min(case when deliver_by='pa' then deliver_time end) over (partition by store) as min_time_sd_pa
,max(case when deliver_by='pa' then deliver_time end) over (partition by store) as max_time_sd_pa
,min(case when deliver_by='pb' then deliver_time end) over (partition by store) as min_time_sd_pb
,max(case when deliver_by='pb' then deliver_time end) over (partition by store) as max_time_sd_pb
,min(case when deliver_by='pc' then deliver_time end) over (partition by store) as min_time_sd_pc
,max(case when deliver_by='pc' then deliver_time end) over (partition by store) as max_time_sd_pc
,m.*
from mytable m
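
To see why the query in the question falls short, it may help to run both forms side by side on the sample rows. The following is only a sketch: the inline mytable CTE stands in for the real table, the aliases min_pc_by_boolean_partition and min_pc_by_case are made up for illustration, and it assumes an engine that allows a select without a from clause.

```sql
-- Sample rows copied from the question, inlined so the query is self-contained.
with mytable as (
            select 11 as distance, 1 as qtt, 'pa' as deliver_by, 'store_a' as store, 1111 as deliver_time
  union all select 123, 2, 'pa', 'store_a', 1112
  union all select 33,  3, 'pb', 'store_a', 1113
  union all select 33,  2, 'pa', 'store_b', 2221
  union all select 44,  2, 'pb', 'store_b', 2222
  union all select 5,   2, 'pc', 'store_b', 2223
  union all select 5,   2, 'pc', 'store_b', 2224
  union all select 6,   5, 'pb', 'store_c', 3331
  union all select 7,   5, 'pb', 'store_c', 3332
)
select
  store
, deliver_by
, deliver_time
  -- Question's form: the boolean deliver_by = 'pc' is part of the partition key,
  -- so each store splits into a "pc" partition and a "not pc" partition.
  -- Rows in the "not pc" partition get the min over the other deliverers, not null.
, min(deliver_time) over (partition by store, deliver_by = 'pc') as min_pc_by_boolean_partition
  -- Answer's form: one partition per store; non-pc rows contribute null,
  -- which min() ignores, so a store without pc deliveries yields null.
, min(case when deliver_by = 'pc' then deliver_time end) over (partition by store) as min_pc_by_case
from mytable
order by store, deliver_time;
```

On this sample, the store_a rows (which have no pc deliveries) get 1111 in the first computed column, the minimum of the "not pc" partition, and null in the second, which is the behavior the question asks for.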

2021-06-26
