如何计算Hive中的计数

clj7thdc  于 2021-07-13  发布在  Hive
关注(0)|答案(0)|浏览(259)

我有两个表,我想对一个短语在数据中的次数进行求和。需要注意的是,我想做一个月到现在的分析,所以收集从一个月的第一天到当前日期-1(所以在脚本运行的前一天)之间的所有记录。我编写了一个脚本来获取计数(见下文),但我只希望有一个记录的计数总和。
这是我的密码:

WITH AB AS (
SELECT substr(A1.tm, 1,10) as the_DATE , A1.LOG_ID
FROM db1.table1 A1 , 
db1.table2 A2  WHERE A1.LOG_ID=A2.LOG_ID 
and substr(A1.tm, 1,10) between date_add(last_day(add_months(current_date, -1)),1) and date_sub(current_timestamp(),1)
),
BC AS (
SELECT  AB.the_DATE ,
COUNT ( DISTINCT (CASE WHEN (TXT like '% assistance%') THEN AB.LOG_ID ELSE NULL END )) AS
assistance ,
COUNT ( DISTINCT (CASE WHEN (TXT like '%help%') THEN AB.LOG_ID ELSE NULL END )) AS
helpp , COUNT(DISTINCT AB.LOG_ID) AS VOL_TOTAL
FROM AB LEFT JOIN db1.tabel2 BC ON AB.LOG_ID =BC.LOG_ID
where substr(tm, 1,10) between date_add(last_day(add_months(current_date, -1)),1) and date_sub(current_timestamp(),1)
group by AB.the_DATE
)
SELECT coalesce(BC.the_date ,date_sub(current_timestamp(),1)) as the_DATE  ,
COALESCE(BC.VOL_TOTAL,0) AS VOL_TOTAL  ,COALESCE (BC.assistance,0) AS assistance ,
COALESCE (BC.help,0) AS helpp
FROM BC;

产生

the_date | vol_total | assistance | helpp
2021-04-03      , 3        , 5          , 6
2021-04-06      , 3        , 5          , 6
2021-04-01      , 3        , 5          , 6
2021-04-04      , 3        , 5          , 6
2021-04-07      , 3        , 5          , 6

我希望输出为:

the_date | vol_total | assistance |    helpp
2021-04-07      , 15        , 25          , 30

因此,每个字段的总和与“theu date”字段一起计算为昨天的日期。我是新的Hive和它的功能,所以任何想法都将有助于获得我想要的输出。

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题