求和时sql选择大小写

ukxgm1gy  于 2021-08-09  发布在  Java
关注(0)|答案(3)|浏览(211)

我想基于GROUPBY子句的和应用一个使用case的查询。
我的table看起来像:

|---------------------|------------------|
|      customerId     |     duration     |
|---------------------|------------------|
|        A12345       |         5        |
|---------------------|------------------|
|        A12345       |         10       |
|---------------------|------------------|
|        A65421       |         5        |
|---------------------|------------------|

并希望它输出

|---------------------|------------------|
|      sum(duration)  | count(customerId)|
|---------------------|------------------|
|        0-5          |         1        |
|---------------------|------------------|
|        5-10         |         0        |
|---------------------|------------------|
|        10+          |         1        |
|---------------------|------------------|

我当前的sql类似于:

select
       case
            when sum(duration) between 0 and 5 then '0-5'
            when sum(duration) between 5 and 10 then '5-10'
            else '10+'
        end as bucket,
       count(customerId)
from table
group by 1, customerid

但这让我
无效操作:group by子句中不允许聚合;
感谢您的帮助。

laawzig2

laawzig21#

您需要嵌套聚合:

select -- then count the number of rows per bucket
   bucket,
   count(*)
from 
 ( -- aggregate the duration per customer first
   select customerid, 
          case
            when sum(duration) between 0 and 5 then '0-5'
            when sum(duration) between 5 and 10 then '5-10'
            else '10+'
        end as bucket
   from table
   group by customerid
 ) as dt
group by bucket
628mspwn

628mspwn2#

GROUPBY 1-->将按具有聚合函数的第一列进行分组 sum() 这是不允许的,所以查询应该如下

Select Case when duration between 0 and 5 then '0-5'
        when duration between 0 and 5 then '5-10'
        Else '10+'
        end as bucket,
        count(customerId) customerIdCount
From
(
    select
          customerid, sum(duration) as duration
    from table1
    group by customerid
) Q
Group by Case when duration between 0 and 5 then '0-5'
    when duration between 0 and 5 then '5-10'
    Else '10+'
    end

如果希望得到问题中所示的精确结果,则需要将所有这些bucket值放到一个表中并将其联接起来

Select t1.bucket, coalesce(customerIdCount,0) from table2 t1
Left Join 
(
    Select Case when duration between 0 and 5 then '0-5'
            when duration between 0 and 5 then '5-10'
            Else '10+'
            end as bucket,
            count(customerId) customerIdCount
    From
    (
        select
              customerid, sum(duration) as duration
        from table1
        group by customerid
    ) Q
    Group by Case when duration between 0 and 5 then '0-5'
        when duration between 0 and 5 then '5-10'
        Else '10+'
        end
) r on t1.bucket = r.bucket
;

这是小提琴

sbdsn5lh

sbdsn5lh3#

你需要先按客户分组,就像dnoeth已经建议的那样。但考虑一种改进的方法:

SELECT CASE bucket
         WHEN 0 THEN '0-5'  -- 5 excluded
         WHEN 1 THEN '5-10' -- 10 excluded
         ELSE        '10+'
       END AS sum_duration
     , count(*) AS customers
FROM  (
   SELECT customerid
        , trunc(sum(duration))::int / 5 AS bucket -- ①
   FROM   tbl
   GROUP  BY customerid
   ) sub
GROUP  BY bucket;

① 因为您将bucket设置为5,所以我们可以使用整数除法来获得整数bucket。应该短一点/快一点(尤其是桶多的时候)。如果 duration 是一个 integer 您可以简化的类型:

...
        , sum(duration) / 5 AS bucket
..

无论哪种方式,5(准确地说)最终都会排在组“5-10”中,等等-修复原始文件中的一个角落案例错误,这将重复计算。人们常常忽视这一点 BETWEEN 包括下限和上限,通常仅对离散数据类型有用。

相关问题