I am trying to understand Pig's explain feature (link).
Consider the following example:
A = load 'numbers' using PigStorage(',') as (name, age);
explain A;
This gives me:
# -----------------------------------------------
# New Logical Plan:
# -----------------------------------------------
A: (Name: LOStore Schema: name#5:bytearray,age#6:bytearray)
|
|---A: (Name: LOLoad Schema: name#5:bytearray,age#6:bytearray)RequiredFields:[0, 1]
# -----------------------------------------------
# Physical Plan:
# -----------------------------------------------
A: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-1
|
|---A: Load(file:///...pig-0.14.0/numbers:PigStorage(',')) - scope-0
2014-12-07 15:07:10,596 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-12-07 15:07:10,609 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-12-07 15:07:10,610 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
# --------------------------------------------------
# Map Reduce Plan
# --------------------------------------------------
MapReduce node scope-2
Map Plan
A: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-1
|
|---A: Load(file:///.../pig-0.14.0/numbers:PigStorage(',')) - scope-0--------
Global sort: false
----------------
What am I looking at here? I find the output rather confusing.
1 Answer
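It shows you the steps Pig will take. In your example, it explains how the alias A gets populated by loading the data, and since you have not done anything with A yet, the plan simply stores it into a placeholder, "fakefile". The output describes the data flow three times: as a logical plan, as a physical plan, and finally as the way that flow is broken down into map/reduce jobs.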
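To see the map/reduce breakdown do something more interesting, it can help to explain a script that has an actual transformation in it. The following is only a sketch that extends the script from the question: the aliases B and C and the field name n are made up for illustration, and the exact plan text will depend on your Pig version and execution mode.

```pig
-- Same load as in the question; assumes the file 'numbers' exists.
A = load 'numbers' using PigStorage(',') as (name, age);

-- Grouping has to bring all rows with the same key together,
-- so the MapReduce plan is expected to gain a reduce phase here.
B = group A by age;

-- Count the rows in each group (B, C and n are hypothetical names).
C = foreach B generate group as age, COUNT(A) as n;

-- Print the logical, physical and MapReduce plans for C.
explain C;
```

Compared with the load-only script, the MapReduce section should no longer be a single Map Plan: you would typically also see a Reduce Plan, with the grouping step sitting between the two.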
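As you have seen for yourself, this gets messy very quickly. You may want to take a look at Netflix's Lipstick, which presents the same information in a more approachable way.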