Hive multiple GROUP BY and minus operations

0mkxixxg  posted on 2021-05-29  in Hadoop

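I am trying to join these two SELECT statements on id = playerID and year = yearID (schemas below), and then subtract the two aliases, HAB - EG. Both SELECT statements group by player ID and year so that the values are summed before the division and the subtraction are performed further up the query. When I run it, Hive tells me to group by G, which seems odd. I don't need to group by G, only by ID and year: a player can have multiple entries in a table, so G, E, hits, and ab need to be summed before the calculation.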

This is the query I tried:

SELECT
    a.playerID AS ID,
    a.yearID AS yearID,
    (b.HAB - a.EG) AS `HAB-EG`
FROM 
    (SELECT
        SUM(playerID),
        SUM(yearID),
        (E/G) AS EG
    FROM fielding
    WHERE (
            yearID > 2005
            AND yearID < 2009
            AND G > 20 
            )GROUP BY playerID,yearID
    ) AS a
JOIN
    (SELECT
        SUM(id),
        SUM(year),
        (hits/ab) AS HAB
    FROM batting
    WHERE( 
            year > 2005
            AND year < 2009 
            AND ab > 40 
            ) GROUP BY id,year

    ) AS b ON a.playerID = b.id AND a.yearID = b.year;

Schema for the fielding table:

CREATE EXTERNAL TABLE IF NOT EXISTS fielding
(playerID STRING ,yearID INT ,teamID STRING ,lgID STRING ,
POS STRING ,G INT ,GS INT , InnOuts INT , PO INT,A INT, E INT,  
DP INT , PB INT , WP INT ,SB INT ,CS INT , ZR INT ) ROW
FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION      '/home/hduser/hivetest/fielding';

Schema for the batting table:

CREATE EXTERNAL TABLE IF NOT EXISTS batting(id STRING, year INT, team STRING,
 league STRING, games INT, ab INT, runs INT, hits INT, doubles INT, triples
 INT, homeruns INT, rbi INT, sb INT, cs INT, walks INT, strikeouts INT, ibb
 INT, hbp INT, sh INT, sf INT, gidp INT) ROW FORMAT DELIMITED FIELDS
 TERMINATED BY ',' LOCATION '/home/hduser/hivetest/batting';

sdnqo3pr  1#

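Try this. Select the grouping columns (playerID and yearID, id and year) as they are instead of summing them, and wrap E, G, hits, and ab in SUM() so the division runs on the per-player, per-year totals. The request to group by G most likely came from E/G, which refers to columns that are neither aggregated nor listed in GROUP BY: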

SELECT
    a.playerID AS ID,
    a.yearID AS yearID,
    (b.HAB - a.EG) AS `HAB-EG`
FROM 
    (SELECT
        playerID,
        yearID,
        (SUM(E)/SUM(G)) AS EG
    FROM fielding
    WHERE (
            yearID > 2005
            AND yearID < 2009
            AND G > 20 
            )GROUP BY playerID,yearID
    ) AS a
JOIN
    (SELECT
        id,
        year,
        (SUM(hits)/SUM(ab)) AS HAB
    FROM batting
    WHERE( 
            year > 2005
            AND year < 2009 
            AND ab > 40 
            ) GROUP BY id,year

    ) AS b ON a.playerID = b.id AND a.yearID = b.year;
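One thing to check: in the query above, G > 20 and ab > 40 sit in WHERE, so they filter individual rows before anything is summed. The question says a player can have several rows per year, so it may be that the thresholds are meant to apply to the yearly totals instead. That is an assumption about the intent, not something the question states. If so, a variant along these lines moves those two conditions into HAVING:

SELECT
    a.playerID AS ID,
    a.yearID AS yearID,
    (b.HAB - a.EG) AS `HAB-EG`
FROM
    (SELECT
        playerID,
        yearID,
        (SUM(E)/SUM(G)) AS EG
    FROM fielding
    WHERE yearID > 2005
      AND yearID < 2009
    GROUP BY playerID, yearID
    -- assumption: the games threshold applies to the summed G, not to each row
    HAVING SUM(G) > 20
    ) AS a
JOIN
    (SELECT
        id,
        year,
        (SUM(hits)/SUM(ab)) AS HAB
    FROM batting
    WHERE year > 2005
      AND year < 2009
    GROUP BY id, year
    -- assumption: the at-bats threshold applies to the summed ab, not to each row
    HAVING SUM(ab) > 40
    ) AS b ON a.playerID = b.id AND a.yearID = b.year;

The year conditions stay in WHERE because they apply to grouping columns and can be checked row by row. If the thresholds really are per row, keep the original version.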
