incubator-doris [Enhancement] AggregationNode optimization

6fe3ivhb  posted on 2022-04-22  in Java

Search before asking

  • I had searched in the issues and found no similar issues.

Description

background

I tested the current Doris (commit 2a25b90) with vectorization enabled (including storage-layer vectorization and the low-cardinality column optimization) on the SSB Q2.1 single-table query. The SQL is as follows:

SELECT
    sum(LO_REVENUE),
    (LO_ORDERDATE DIV 10000) AS year,
    P_BRAND
FROM lineorder_flat
WHERE P_CATEGORY = 'MFGR#12' AND S_REGION = 'AMERICA'
GROUP BY
    year,
    P_BRAND
ORDER BY
    year,
    P_BRAND;

The key information of the Profile is as follows:

- Total: 274ms

VAGGREGATION_NODE (id=1):(Active: 252.260ms, % non-child: 77.37%)
   - ExecOption: Streaming Preaggregation
   - BuildTime: 193.163ms
   - ExecTime: 6.583ms
   - ExprTime: 108.693ms
   - GetResultsTime: 0ns
   - MergeTime: 0ns
   - PeakMemoryUsage: 0.00 
   - RowsReturned: 280
   - RowsReturnedRate: 1.109K /sec

VOLAP_SCAN_NODE (id=0):(Active: 39.909ms, % non-child: 14.54%)
   - ScanTime: 6s275ms

BuildTime is the total time spent in the operator, ExprTime is the time spent evaluating the expressions that follow GROUP BY, and ExecTime is the time consumed by the aggregate functions.
BuildTime includes both ExprTime and ExecTime.

The profile shows that PreAggNode takes 193.163ms (its BuildTime), accounting for 70.50% of the total SQL time.
The ScanNode takes 39.909ms, which accounts for only 14.57% of the total SQL time.
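
As a quick check of the percentages above, the following sketch recomputes them from the profile figures. The variable names and the share() helper are made up for illustration and are not part of Doris. The profile does not itemize the part of BuildTime left over after ExprTime and ExecTime, so the last value is only a remainder, not a measured metric.

```python
# Timings copied from the profile above, in milliseconds.
total_ms = 274.0         # Total
build_ms = 193.163       # VAGGREGATION_NODE BuildTime
expr_ms = 108.693        # VAGGREGATION_NODE ExprTime
exec_ms = 6.583          # VAGGREGATION_NODE ExecTime
scan_active_ms = 39.909  # VOLAP_SCAN_NODE Active


def share(part_ms, whole_ms):
    """Return part_ms as a percentage of whole_ms, rounded to two decimals."""
    return round(100.0 * part_ms / whole_ms, 2)


# Share of the total query time spent in each node.
agg_share = share(build_ms, total_ms)
scan_share = share(scan_active_ms, total_ms)

# Share of BuildTime spent evaluating the GROUP BY expressions.
expr_share_of_build = share(expr_ms, build_ms)

# Remainder of BuildTime that is neither ExprTime nor ExecTime.
other_build_ms = round(build_ms - expr_ms - exec_ms, 3)

print(f"AggNode share of total:  {agg_share:.2f}%")
print(f"ScanNode share of total: {scan_share:.2f}%")
print(f"ExprTime share of Build: {expr_share_of_build:.2f}%")
print(f"Unattributed BuildTime:  {other_build_ms:.3f}ms")
```

By this arithmetic, expression evaluation alone is more than half of BuildTime, which is consistent with making it the focus of the optimization.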

Therefore, the next step is to optimize AggNode, with the focus on the expression evaluation of DateTime.

Solution

optimization

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct
