使用左连接条件在snowflake中创建递归查询?

nfeuvbwi  于 2021-07-24  发布在  Java
关注(0)|答案(2)|浏览(266)

我试图创建一个依赖于 LEFT JOIN 条件,但我不确定是否可能,特别是在雪花。
我有三张table: ITEM , ITEMHIERARCHY ,和 ITEMVALUE ```
CREATE TABLE ITEM
(
NAME STRING
);

INSERT INTO ITEM(NAME)
VALUES
('Item1'),('Item2'),('Item3'),('Item4'),('Item5'),('Item6');

CREATE TABLE ITEMHIERARCHY
(
ITEM STRING,
SUBITEM STRING
);

INSERT INTO ITEMHIERARCHY(ITEM,SUBITEM)
VALUES
('Item2','Item3'),('Item2','Item4'),('Item4','Item5'),('Item6','Item4');

CREATE TABLE ITEMVALUE
(
ITEM STRING,
VALUE NUMERIC(25,10)
);

INSERT INTO ITEMVALUE(ITEM,VALUE)
VALUES
('Item1',34.2),('Item3',40.5),('Item5',20.3),('Item6',77.7);

我的目标是返回所有 `ITEMs` 值和子项值汇总后:

Item1, 34.2
Item2, 60.8 //roll-up of Item3 + Item4
Item3, 40.5
Item4, 20.3 //roll-up of Item5
Item5, 20.3
Item6, 77.7 //since Item6 value is given, dont roll-up from Item4

注意,尽管 `Item6` 是从 `Item4` 因为已经有一个给定的 `77.7` 上 `ITEMVALUE` 表,则忽略汇总。
下面是我尝试的一个失败的递归查询,原因是 `LEFT JOIN` 在 `UNION ALL` 条款:

WITH RECURSIVE ITEMHIERARCHYFULL
-- Column names for the "view"/CTE
(ITEM,SUBITEM,VALUE)
AS
-- Common Table Expression
(

-- Anchor Clause
SELECT it.NAME ITEM, ih.SUBITEM, iv.VALUE
  FROM ITEM it
  --These left-joins work
  LEFT JOIN ITEMVALUE iv ON iv.ITEM = it.NAME 
  LEFT JOIN ITEMHIERARCHY ih ON ih.ITEM = it.ITEM
                             AND iv.VALUE IS NULL

UNION ALL

-- Recursive Clause
SELECT  ihf.ITEM, ih.SUBITEM,  
  IFF(ihf.VALUE IS NOT NULL,ihf.VALUE,iv.VALUE)
  FROM ITEMHIERARCHYFULL ihf
  LEFT JOIN ITEMVALUE iv ON iv.ITEM = ihf.SUBITEM
  LEFT JOIN ITEMHIERARCHY ih ON ih.ITEM = ihf.SUBITEM
                                AND iv.VALUE IS NULL 

)

-- This is the "main select".
SELECT ITEM, SUM(VALUE) AS VALUE
FROM ITEMHIERARCHYFULL
GROUP BY ITEM
ORDER BY ITEM
;

查询的目标是首先获取所有顶层 `ITEMs` 从 `ITEM` 表中,在 `ITEMVALUE` 表,如果找不到,则连接到 `ITEMHIERARCHY` 表以检索所有 `SUBITEMs` 构成了最高层 `ITEMs` . 然后我想递归地搜索 `ITEMVALUE` a的table `SUBITEM-VALUE` 匹配,或者,如果找不到,则检索 `SUBITEMs` 从 `ITEMHIERARCHY` table。
第一套 `LEFT-JOINs` 工作,但不是地下的 `UNION ALL` 告诉我错误:

SQL compilation error: OUTER JOINs with a self reference are not allowed in a recursive CTE.

有没有更好的方法来做我想做的事 `Snowflake` 还是我想的不对?
目前,我手动编写了递归层到5个级别,这意味着如果 `ITEMHIERARCHY` 表格变得更加复杂。
v440hwme

v440hwme1#

下面是一个工作示例,它给出了您所期望的结果。您也可以在sqlfiddle上查看它。

WITH CTE AS
  (
    SELECT 
        i.NAME
        , IH.SUBITEM AS descendant        
        , CASE WHEN IV.VALUE IS NULL THEN 1 ELSE 0 END AS LEVEL
    FROM ITEM AS i
    LEFT JOIN ITEMHIERARCHY AS IH
        ON i.NAME = IH.ITEM
    LEFT JOIN ITEMVALUE AS IV
        ON I.NAME = IV.ITEM
    UNION ALL
    SELECT 
        CTE.NAME
        , sIH.SUBITEM
        , 1 AS LEVEL
    FROM CTE
      INNER JOIN ITEM AS si
        ON CTE.descendant = si.NAME
      INNER JOIN ITEMHIERARCHY AS sIH
        ON si.NAME = sIH.ITEM
  ), CTE2 AS 
(
SELECT 
    CTE.NAME     
    , LEVEL
    , SUM(IV.VALUE) AS VALUE
    , ROW_NUMBER()OVER(PARTITION BY CTE.NAME ORDER BY CTE.LEVEL ASC) AS RNK    
FROM CTE
LEFT JOIN ITEMVALUE AS IV
    ON (CTE.LEVEL=0 AND CTE.NAME = IV.ITEM)
    OR (CTE.LEVEL <> 0 AND CTE.descendant = IV.ITEM)    
GROUP BY CTE.NAME, CTE.LEVEL
) 
SELECT 
    NAME
    , VALUE
FROM CTE2
WHERE RNK = 1
ORDER BY 
    NAME
;

结果:

NAME    VALUE
Item1   34.2000000000
Item2   60.8000000000
Item3   40.5000000000
Item4   20.3000000000
Item5   20.3000000000
Item6   77.7000000000
jm81lzqq

jm81lzqq2#

下面是一个堆栈溢出问题,关于为什么 LEFT JOINs 在递归查询中是不允许的:链接,基本上是防止的 ∞ recursion ,这在我看来是一个有点弱的原因。在第二个响应中还建议,如果您的sql方言支持 OUTER APPLY 你可以用它来代替函数等价,但是snowflake没有这个函数。
下面是我的手动“递归”解决方案,最多可用于3个层次:

SELECT rec.ITEM, 
  SUM(CASE
    WHEN rec.VALUE1 IS NOT NULL THEN rec.VALUE1
    WHEN rec.VALUE2 IS NOT NULL THEN rec.VALUE2
    ELSE rec.VALUE3
  END) VALUE

FROM (
  SELECT it.NAME ITEM, 
  ih1.SUBITEM SUBITEM1, CASE 
                         WHEN iv1.VALUE IS NOT NULL THEN iv1.Value
                         ELSE iv1s.Value 
                        END Value1,
  ih2.SUBITEM SUBITEM2, CASE 
                         WHEN iv2.VALUE IS NOT NULL THEN iv2.Value
                         ELSE iv2s.Value 
                        END Value2,
  ih3.SUBITEM SUBITEM3, CASE 
                         WHEN iv3.VALUE IS NOT NULL THEN iv3.Value
                         ELSE iv3s.Value 
                        END Value3

  FROM ITEM it

  LEFT JOIN ITEMVALUE iv1 ON iv1.ITEM = it.NAME 
  LEFT JOIN ITEMHIERARCHY ih1 ON ih1.ITEM = it.NAME
                             AND iv1.VALUE IS NULL
  LEFT JOIN ITEMVALUE iv1s ON iv1s.ITEM = ih1.SUBITEM

  LEFT JOIN ITEMVALUE iv2 ON iv2.ITEM = ih1.SUBITEM 
  LEFT JOIN ITEMHIERARCHY ih2 ON ih2.ITEM = ih1.SUBITEM
                             AND iv1.VALUE IS NULL
                             AND iv1s.VALUE IS NULL
                             AND iv2.VALUE IS NULL
  LEFT JOIN ITEMVALUE iv2s ON iv2s.ITEM = ih2.SUBITEM

  LEFT JOIN ITEMVALUE iv3 ON iv3.ITEM = ih2.SUBITEM 
  LEFT JOIN ITEMHIERARCHY ih3 ON ih3.ITEM = ih2.SUBITEM
                             AND iv1.VALUE IS NULL
                             AND iv1s.VALUE IS NULL
                             AND iv2.VALUE IS NULL
                             AND iv2s.VALUE IS NULL
                             AND iv3.VALUE IS NULL
  LEFT JOIN ITEMVALUE iv3s ON iv3s.ITEM = ih3.SUBITEM
) rec

WHERE CASE
    WHEN VALUE1 IS NOT NULL THEN VALUE1
    WHEN VALUE2 IS NOT NULL THEN VALUE2
    ELSE VALUE3
  END IS NOT NULL

GROUP BY ITEM

这显然是一种在语法上非常低效的方法,在每一步中,你都必须检查两者 ITEM 以及 SUBITEM 值,然后重复 NULL 检查以前的每一次 ITEMVALUE 或者 SUBITEMVALUE table。我放进了 SUBITEMs 对于每个级别,如果只运行查询的内部部分,则可以看到扩展是如何工作的。我还不得不使用 CASE 语句来让事情在sqlfiddle上运行,但我更愿意使用 IFF 以及 IFNULL(Value1,IFNULL(Value2,Value3)) .
以下是sql fiddle:link的工作代码和输出:

Item1, 34.2
Item2, 60.8
Item3, 40.5
Item4, 20.3
Item5, 20.3
Item6, 77.7

相关问题