sql递归查询在postgres12中非常慢

zqdjd7g9  于 2021-07-26  发布在  Java
关注(0)|答案(0)|浏览(388)

我有一个递归sql查询,大致如下

with recursive recCTE as (
  select parent,name from items 
  where id in (...) 
  union 
  select items.parent, items.name 
  from items 
  join recCTE on recCTE.parent=items.id
)
select * from recCTE
``` `id` 是的主键 `items` 表格:

CREATE TABLE IF NOT EXISTS items (
id BIGSERIAL PRIMARY KEY,
parent bigint REFERENCES items(id) ON DELETE CASCADE,
name text NOT NULL,
);

in子句可以是值列表,也可以是临时表上的select(这两种情况之间没有显著的性能差异)。性能相当差,因为输入一千个数字的列表需要一秒钟。我们正在往上一棵树,因此递归查询的每次传递都应该产生越来越少的结果。
执行计划显示一个递归联合,在非递归查询上使用嵌套循环,在递归部分使用哈希连接,在items表上使用seq扫描。我在table上做了分析。

CTE Scan on reccte (cost=283046.17..286189.29 rows=157156 width=144) (actual time=43.233..1644.449 rows=1091 loops=1)
Buffers: shared hit=26532 read=23328, local hit=14, temp read=3880 written=9664
CTE reccte
-> Recursive Union (cost=0.42..283046.17 rows=157156 width=71) (actual time=43.230..1644.078 rows=1091 loops=1)
Buffers: shared hit=26532 read=23328, local hit=14, temp read=3880 written=9664
-> Nested Loop (cost=0.42..11202.20 rows=1556 width=71) (actual time=43.224..45.111 rows=1000 loops=1)
Buffers: shared hit=4000, local hit=14
-> Seq Scan on temp1817829968_0 (cost=0.00..29.56 rows=1556 width=8) (actual time=0.111..0.216 rows=1000 loops=1)
Buffers: local hit=14
-> Index Scan using items_pkey on items (cost=0.42..7.18 rows=1 width=71) (actual time=0.001..0.001 rows=1 loops=1000)
Index Cond: (id = temp1817829968_0.value)
Buffers: shared hit=4000
-> Hash Join (cost=22587.04..26870.08 rows=15560 width=71) (actual time=350.280..399.117 rows=272 loops=4)
Hash Cond: (reccte_1.parent = items_1.id)
Buffers: shared hit=22532 read=23328, temp read=3880 written=9664
-> WorkTable Scan on reccte reccte_1 (cost=0.00..311.20 rows=15560 width=8) (actual time=0.567..0.604 rows=273 loops=4)
-> Hash (cost=14715.24..14715.24 rows=325024 width=71) (actual time=341.619..341.619 rows=325024 loops=4)
Buckets: 65536 Batches: 16 Memory Usage: 2035kB
Buffers: shared hit=22532 read=23328, temp written=9580
-> Seq Scan on items items_1 (cost=0.00..14715.24 rows=325024 width=71) (actual time=0.010..170.582 rows=325024 loops=4)
Buffers: shared hit=22532 read=23328
Planning Time: 0.384 ms
JIT:
Functions: 17
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 14.542 ms, Inlining 0.000 ms, Optimization 2.713 ms, Emission 41.963 ms, Total 59.218 ms
Execution Time: 1666.414 ms

准时制有问题吗?
我重写了客户机jdbc代码,使其不使用递归查询,只需这样做 `select parent,name from items where id in (...)` 递归地捕获父ID并重新运行查询。它的执行速度快了一百倍(几十毫秒而不是一秒)。
我发现很难相信优化器基本上不能像我在代码中做的那样做,那么这种类型的查询有什么问题呢?事实是我们爬上一棵树 `parent` 重视问题?有什么建议吗?
万分感谢!

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题