SQL(MariaDB)首先连接，然后筛选结果表,如何最佳化查询？

我有一个IMDB数据库和查询，返回的标题列表中，两个演员出现。下面是数据库ER图：
x1c 0d1x的数据
而查询本身：

select ti1.tconst
from title_principals as ti1 
join name_basics as nmb1 on ti1.nconst = nmb1.nconst 
where nmb1.primaryName = 'George Clooney'
and ti1.tconst in 
(
  select ti2.tconst
  from title_principals as ti2
  join name_basics as nmb2 on ti2.nconst = nmb2.nconst
  where nmb2.primaryName = 'Brad Pitt'
);

字符串
该函数返回两个参与者都出现在其中的标题id（tconst）：
| tconst |
| ------------ |
| 240772 |
| 326300 |
| 349903 |
问题是，当我将title_basics连接到title_principals时，查询的执行时间太长：

select ti1.tconst, tb.primaryTitle
from title_basics as tb 
join title_principals as ti1 on tb.tconst = ti1.tconst 
join name_basics as nmb1 on ti1.nconst = nmb1.nconst 
where nmb1.primaryName = 'George Clooney' 
and ti1.tconst in 
(
  select ti2.tconst
  from title_principals as ti2
  join name_basics as nmb2 on ti2.nconst = nmb2.nconst
  where nmb2.primaryName = 'Brad Pitt'
);

型
我认为数据库优化器应该在过滤了title_principals和name_basics的连接之后再连接title_basics（就像在第一个查询中一样），但是它似乎不是这样工作的。下面是两个查询的执行计划：
| 选择_类型|工作台|类型，类型|可能的关键字（_K）|钥匙|密钥长度|参考|列|额外的| Extra |
| --|--|--|--|--|--|--|--|--| ------------ |
| 主要|n进制1|全部|ix_名称_基础知识_nconst|空的|空的|空的|小行星126380|在何处使用| Using where |
| 主要|钛1|参考|ix_title_principal_tconst，ix_title_principal_nconst主要名称常数|ix_标题_委托人_nconst|五个| imdb.nmb1.nconst |四个|||
| 主要的||eq_参考|独特键|独特键|四个|函数|一个|||
| 物质化|n兆2|全部|ix_名称_基础知识_nconst|空的|空的|空的|小行星126380|在何处使用| Using where |
| 物质化|二氧化钛|参考|ix_title_principal_tconst，ix_title_principal_nconst主要名称常数|ix_标题_委托人_nconst|五个| imdb.nmb2.nconst |四个|||
| 选择_类型|工作台|类型，类型|可能的关键字（_K）|钥匙|密钥长度|参考|列|额外的| Extra |
| --|--|--|--|--|--|--|--|--| ------------ |
| 主要|结核分枝杆菌|全部|ix_标题_基础知识_tconst|空的|空的|空的|小行星9595802|在何处使用| Using where |
| 主要|钛1|参考|ix_title_principal_tconst，ix_title_principal_nconst主要名称常数|ix_标题_委托人_tconst|五个| imdb.tb.tconst |三个|在何处使用| Using where |
| 主要|n进制1|参考|ix_名称_基础知识_nconst| ix_名称_基础知识_nconst|五个| imdb.ti1.nconst |一个|在何处使用| Using where |
| 主要||eq_参考|独特键|独特键|四个|函数|一个|||
| 物质化|n兆2|全部|ix_名称_基础知识_nconst|空的|空的|空的|小行星126380|在何处使用| Using where |
| 物质化|二氧化钛|参考|ix_title_principal_tconst，ix_title_principal_nconst主要名称常数|ix_标题_委托人_nconst|五个| imdb.nmb2.nconst |四个|||
下面是show create table name_basics;返回的结果：

CREATE TABLE `name_basics` (
  `primaryProfession` text DEFAULT NULL,
  `nconst` int(11) DEFAULT NULL,
  `deathYear` int(11) DEFAULT NULL,
  `knownForTitles` text DEFAULT NULL,
  `ns_soundex` varchar(5) DEFAULT NULL,
  `sn_soundex` varchar(5) DEFAULT NULL,
  `primaryName` text DEFAULT NULL,
  `s_soundex` varchar(5) DEFAULT NULL,
  `birthYear` int(11) DEFAULT NULL,
  KEY `ix_name_basics_birthYear` (`birthYear`),
  KEY `ix_name_basics_sn_soundex` (`sn_soundex`),
  KEY `ix_name_basics_nconst` (`nconst`),
  KEY `ix_name_basics_s_soundex` (`s_soundex`),
  KEY `ix_name_basics_ns_soundex` (`ns_soundex`),
  KEY `ix_name_basics_deathYear` (`deathYear`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

型
然后，我在name_basics上为primaryName字段创建了一个索引：

+------+--------------+-------------+--------+-------------------------------------------------------+----------------------------+---------+------------------+------+------------------------------------+
| id   | select_type  | table       | type   | possible_keys                                         | key                        | key_len | ref              | rows | Extra                              |
+------+--------------+-------------+--------+-------------------------------------------------------+----------------------------+---------+------------------+------+------------------------------------+
|    1 | PRIMARY      | nmb1        | ref    | ix_name_basics_nconst,idx1                            | idx1                       | 1027    | const            |    1 | Using index condition; Using where |
|    1 | PRIMARY      | ti1         | ref    | ix_title_principals_tconst,ix_title_principals_nconst | ix_title_principals_nconst | 5       | imdb.nmb1.nconst |    4 | Using where                        |
|    1 | PRIMARY      | tb          | ref    | ix_title_basics_tconst                                | ix_title_basics_tconst     | 5       | imdb.ti1.tconst  |    1 |                                    |
|    1 | PRIMARY      | <subquery2> | eq_ref | distinct_key                                          | distinct_key               | 4       | func             |    1 |                                    |
|    2 | MATERIALIZED | nmb2        | ref    | ix_name_basics_nconst,idx1                            | idx1                       | 1027    | const            |    1 | Using index condition; Using where |
|    2 | MATERIALIZED | ti2         | ref    | ix_title_principals_tconst,ix_title_principals_nconst | ix_title_principals_nconst | 5       | imdb.nmb2.nconst |    4 |                                    |
+------+--------------+-------------+--------+-------------------------------------------------------+----------------------------+---------+------------------+------+------------------------------------+

型
如何最佳化查询？
我试着将筛选部分移到JOIN条件（用AND代替WHERE），就像建议的here一样，但也没有用。

您正在查找电影标题，请从title_basics中选择。您希望将此限制为某些参与者，因此在WHERE子句中使用EXISTS或IN。

select tconst, primarytitle
from title_basics as tb 
where tconst in
(
  select tp.tconst
  from title_principals tp
  where tp.nconst = (select nb.nconst from name_basics nb where nb.primaryName = 'George Clooney')
)
and tconst in
(
  select tp.tconst
  from title_principals tp
  where tp.nconst = (select nb.nconst from name_basics nb where nb.primaryName = 'Brad Pitt')
);

字符串
或者更短：

select tconst, primarytitle
from title_basics as tb 
where tconst in
(
  select tp.tconst
  from title_principals tp
  where tp.nconst in 
  (
    select nb.nconst
    from name_basics nb
    where nb.primaryName in ('George Clooney', 'Brad Pitt')
  )
  group by tp.tconst
  having count(*) = 2 -- both actors
);

型
为了快速找到参与者并获得他们的nconst ID，我们可以提供索引

create index idx1 on name_basics (primaryname, nconst);

型
为了从那里获得他们的电影和相关的tconst ID，我们可以提供

create index idx2 on title_principals (nconst, tconst);

型
对于最后一步，我们甚至可以提供一个覆盖索引：

create index idx3 on title_basics (tconst, primarytitle);

型

SQL(MariaDB)首先连接，然后筛选结果表,如何最佳化查询？

1条答案

相关问题

热门标签

最新问答