Impala query fails to retrieve results with a NullPointerException

lymgl2op  posted on 2021-06-29  in Hive

I ran the following query on Hive/Impala:

select count(p.id) as tweet_count, p.author as author,p.profile_image_url as profile_image_url,p.screen_name as screen_name,
concat_ws('/',min(p.postday),min(p.postmonth),min(p.postyear) ) as creation_date,p.message message,af.followerid as follower 
from post p 
inner join author_follower af on af.id like if(p.author= null, '', concat(p.author,'%'))
where p.hashtaglist like 'hashtagtobeused' 
group by author,profile_image_url,screen_name,message,follower
ORDER BY cast(min(postyear) as int),cast(min(postmonth) as int),cast(min(postday) as int),cast(min(posthour) as int) ASC;

But for some reason I get the following error instead of a result:
Your query has the following error(s):

Bad status for request 3304: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)

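I have checked the query but cannot find the problem. Can anyone help and point out where the issue is? Why do I get this error instead of a result set?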

ev7lccsx  #1

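Please consider reformatting the query carefully: in some cases Impala crashes with a SEGV when SQL parsing itself fails on something as simple as whitespace. If you are running Cloudera, the crash logs should be under /run/cloudera-scm-agent/process on the node that ran the query.
We resolved these problems by being careful with SQL formatting, which is good practice anyway because it makes errors in a query easier to spot.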

SELECT
    COUNT(p.id)                                                     AS tweet_count,
    p.author                                                        AS author,
    p.profile_image_url                                             AS profile_image_url,
    p.screen_name                                                   AS screen_name,
    concat_ws('/', MIN(p.postday), MIN(p.postmonth), MIN(p.postyear)) AS creation_date,
    p.message                                                       AS MESSAGE,
    af.followerid                                                   AS follower
FROM
    post p
INNER JOIN
    author_follower af
ON
    af.id LIKE IF(p.author = NULL, '', concat(p.author, '%'))
WHERE
    p.hashtaglist LIKE 'hashtagtobeused'
GROUP BY
    author,
    profile_image_url,
    screen_name,
    MESSAGE,
    follower
ORDER BY
    CAST(MIN(postyear) AS INT),
    CAST(MIN(postmonth) AS INT),
    CAST(MIN(postday) AS INT),
    CAST(MIN(posthour) AS INT) ASC;

(By the way, I used DbVisualizer to validate and reformat the query syntax; it is a great tool.)
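
A separate point worth checking, independent of formatting: in SQL, a comparison written as = NULL evaluates to NULL rather than TRUE, so the first branch of IF(p.author = NULL, ...) in the join condition should never be taken. Below is a minimal sketch of the difference. It assumes Impala's IF() treats a NULL condition the same as FALSE, and it uses only literals (the column aliases equals_null and is_null are made up for illustration) so that it does not depend on the post table:

```sql
-- Compare the two ways of testing for NULL, using literals only.
-- Assumption: IF() returns its third argument when the condition is NULL.
SELECT
    IF(CAST(NULL AS STRING) = NULL,  'null branch', 'else branch') AS equals_null,
    IF(CAST(NULL AS STRING) IS NULL, 'null branch', 'else branch') AS is_null;
```

The first column should come back as 'else branch' and the second as 'null branch'. If the intent was to special-case posts with no author, the condition would be written with p.author IS NULL instead. This is an observation about the query's logic only; it is not confirmed as the cause of the error above.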
