incubator-doris [Proposal]Doris UNIQUE table supports UPDATE syntax

izj3ouym  于 2022-04-22  发布在  Java
关注(0)|答案(2)|浏览(242)

Background

The current Doris table does not support the update operation, but there are many scenarios where the data needs to be updated. Due to the batch delete function we implemented before, it paved the way for the update.

Syntax

We only support single table update

UPDATE table_name
    SET assignment_list
    [WHERE where_condition]
    [ORDER BY...]
    [LIMIT row_count]

value:
    {expr | DEFAULT}

assignment:
    col_name = value

assignment_list:
    assignment [, assignment].

Design

There are two kinds of updates involved here, one that only contains the value column, and the other that contains the key column.

  • Include and update of value column

This method is relatively simple. You only need to plan the update statement into a query plan like select * from table insert into table during query planning, and you need to add an update node to modify the data to the updated value, or plan Into a statement similar to select a, b,'xx','xx1' from table insert into table and hand it directly to the tableSink node

  • Contains the update of the key column

This kind of relatively complicated, we need to divide the query into two parts, and need to use the MERGE semantics in the batch delete, first plan it into a query plan similar to select * from table insert into table, and then generate it first for the modification of the key column One piece of data that needs to be deleted, and then replace the value of the key column to generate new data that needs to be added, mainly generating two pieces of imported data from one piece of original data. Need update node to have the ability to repeat data

Subtasks

  1. Add index to unique table value column to speed up data scanning

  2. FE supports update statements and generates corresponding query plans. Be implements update node, which can be divided into two parts here

  3. Support updates that only contain the value column

  4. Support the update of the key column

背景

当前 Doris 表不支持 update 操作,但是有很多场景需要对数据进行更新,由于我们之前实现的批量删除功能,为实现update 铺平了道路。

语法

我们只支持单表更新

UPDATE table_name
    SET assignment_list
    [WHERE where_condition]
    [ORDER BY ...]
    [LIMIT row_count]

value:
    {expr | DEFAULT}

assignment:
    col_name = value

assignment_list:
    assignment [, assignment] .

设计方案

这里涉及两种更新,一种使只包含value 列的更新, 一种是包含key列的更新

  • 包含及value 列的更新

这种方式比较简单,只需要查询规划时将update 语句规划成类似 select * from table insert into table 的查询规划,并且需新增一个update node,用于修改数据为更新后的值, 或者 规划成 类似 select a, b, 'xx', 'xx1' from table insert into table 的语句,直接交给 tableSink 节点

  • 包含key列的更新

这种相对复杂一点,我们需要将查询分成两部,并且需要用到 批量删除里的MERGE 语义,先规划成类似 select * from table insert into table 查询规划, 之后针对key列的修改,先生成一条需要删除的数据,再替换key列值后生成需要新增的数据,主要将一条原始数据生成两条导入数据。需要update node 有数据重复的能力

子任务

  1. unique 表 value 列加索引,加快数据扫描速度

  2. FE 支持 update 语句,并生成相应的查询规划,be实现update node, 这里可以分为两部

  3. 支持只包含value 列的更新

  4. 支持包含key列的更新

cygmwpex

cygmwpex1#

Great features。

There is another scenario. For example, when using bitmap to calculate UV, it is only cumulative at present. For example, if 100 people have paid the order, UV should be equal to 100. But if 10 people cancel the order, UV should be equal to 90. In this case, bitmap needs to support the reset operation of a bit. This kind of scenario can be considered later to design.

0sgqnhkj

0sgqnhkj2#

Great features。

There is another scenario. For example, when using bitmap to calculate UV, it is only cumulative at present. For example, if 100 people have paid the order, UV should be equal to 100. But if 10 people cancel the order, UV should be equal to 90. In this case, bitmap needs to support the reset operation of a bit. This kind of scenario can be considered later to design.

The need for bitmap update does exist, but what you said may not be a simple syntax like update set bitmap=xxx where xxx.
Is that true?

bitmap 的更新需求确实是一个很重要的场景。不过你说的这类场景并不是 update 语句可以支持的。因为update支持的是列整体的更新,而不是列的数据的局部更新。
或者你知道有哪些其他的系统是支持这个功能的吗?我可以调研一下

相关问题