为什么MySQL在连接中不考虑(JSON)函数索引,而是考虑生成列上的索引?

qxsslcnc  于 5个月前  发布在  Mysql
关注(0)|答案(1)|浏览(46)

我在MySQL(8.0.35版)中使用了很多JSON列。通常,我在JSON属性上创建函数索引以加快查询速度。根据MySQL文档,* 函数索引被实现为隐藏的虚拟生成列 *。然而,它们的行为似乎与连接中生成列的索引不同。我将使用一个以某种方式构造的例子来解释我的观点。
示例:给定两个表productpurchasepurchase具有引用product的JSON属性$.productUuid

CREATE TABLE IF NOT EXISTS product (
    id         BINARY(16) NOT NULL,
    payload    JSON       NOT NULL,
    
    CONSTRAINT pk_product PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

CREATE TABLE IF NOT EXISTS purchase (
    id         BINARY(16) NOT NULL,
    payload    JSON       NOT NULL,
    
    CONSTRAINT pk_product PRIMARY KEY (id),
    INDEX `i_purchase_product` (
        ( CAST(payload->>'$.productUuid' AS CHAR(36)) COLLATE utf8mb4_bin )
    )
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO product (id, payload)
VALUES
  ( UUID_TO_BIN(UUID(), TRUE), '{ "name": "random drink" }' ),
  ( UUID_TO_BIN(UUID(), TRUE), '{ "name": "random dish" }' ),
  ( UUID_TO_BIN(UUID(), TRUE), '{ "name": "random tool" }' )
;

INSERT INTO purchase (id, payload)
SELECT
  UUID_TO_BIN(UUID(), TRUE), JSON_SET(payload, '$.productUuid', BIN_TO_UUID(id))
FROM product
;

字符串
对于以下查询

SELECT * FROM product a INNER JOIN purchase b ON BIN_TO_UUID(a.id) = b.payload->>'$.productUuid';


MySQL生成以下计划:

+----+-------------+-------+---------------+------+--------------------------------------------+
| id | select_type | table | possible_keys | key  | Extra                                      |
+----+-------------+-------+---------------+------+--------------------------------------------+
|  1 | SIMPLE      | a     | NULL          | NULL | NULL                                       |
|  1 | SIMPLE      | b     | NULL          | NULL | Using where; Using join buffer (hash join) |
+----+-------------+-------+---------------+------+--------------------------------------------+


该计划表明,函数索引甚至没有被考虑。如果我创建一个具有生成列和常规索引的表,情况就不同了。

CREATE TABLE IF NOT EXISTS purchase (
    id           BINARY(16)  NOT NULL,
    payload      JSON        NOT NULL,
    product_uuid VARCHAR(36) GENERATED ALWAYS AS (payload->>'$.productUuid') STORED NOT NULL,
    
    CONSTRAINT pk_purchase PRIMARY KEY (id),
    INDEX `i_purchase_product` (product_uuid)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;


现在,MySQL生成的计划显示该索引已被考虑:

+----+-------------+-------+--------------------+--------------------+-----------------------+
| id | select_type | table | possible_keys      | key                | Extra                 |
+----+-------------+-------+--------------------+--------------------+-----------------------+
|  1 | SIMPLE      | a     | NULL               | NULL               | NULL                  |
|  1 | SIMPLE      | b     | i_purchase_product | i_purchase_product | Using index condition |
+----+-------------+-------+--------------------+--------------------+-----------------------+


对于这种行为,是否有一个解释,最好是有记录的解释?

pbossiut

pbossiut1#

这是一个已知的bug:https://bugs.mysql.com/bug.php?id=98937
这甚至不是关于JSON的使用。任何函数索引都不能在JOIN中使用,即使相同的索引可以在另一个条件中使用。
我修改了你的例子,使用一个任意的字符串函数,而不是使用JSON函数:

CREATE TABLE IF NOT EXISTS product (
    id         BINARY(16) NOT NULL,
    abc CHAR(3) NOT NULL,
    CONSTRAINT pk_product PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

CREATE TABLE IF NOT EXISTS purchase (
    id         BINARY(16) NOT NULL,
    abc CHAR(3) NOT NULL, 
    CONSTRAINT pk_product PRIMARY KEY (id),
    INDEX `i_abc` ((REVERSE(abc)))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

INSERT INTO product (id, abc)
VALUES
  ( UUID_TO_BIN(UUID(), TRUE), 'abc' ),
  ( UUID_TO_BIN(UUID(), TRUE), 'abc' ),
  ( UUID_TO_BIN(UUID(), TRUE), 'abc' );

INSERT INTO purchase (id, abc)
SELECT
  UUID_TO_BIN(UUID(), TRUE),  REVERSE(abc)
FROM product;

字符串
使用函数索引的表达式进行搜索的效果与预期一样,它使用索引:

EXPLAIN 
SELECT * FROM purchase b WHERE REVERSE(b.abc) = 'abc';

+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+

| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | b     | NULL       | ref  | i_abc         | i_abc | 15      | const |    3 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+


但是使用函数索引的表达式连接并不使用索引:

EXPLAIN
SELECT * FROM product a
INNER JOIN purchase b
ON REVERSE(b.abc) = a.abc;

+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                      |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |   100.00 | NULL                                       |
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |   100.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+


这个bug是在2020年3月报告的(在我们写这篇文章的时候,已经过去了三年多)。唯一的解决办法是他们同意“提交一个文档bug来澄清当前的限制”。但是后来他们链接到了一个内部bug(我看不到)。
到目前为止,我还没有发现他们记录了这种限制。

相关问题