本文整理了Java中org.apache.spark.api.java.JavaPairRDD.groupBy()
方法的一些代码示例,展示了JavaPairRDD.groupBy()
的具体用法。这些代码示例主要来源于Github
/Stackoverflow
/Maven
等平台,是从一些精选项目中提取出来的代码,具有较强的参考意义,能在一定程度帮忙到你。JavaPairRDD.groupBy()
方法的具体详情如下:
包路径:org.apache.spark.api.java.JavaPairRDD
类名称:JavaPairRDD
方法名:groupBy
暂无
代码示例来源:origin: org.apache.spark/spark-core
@Test
public void groupByOnPairRDD() {
// Regression test for SPARK-4459
JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 1, 2, 3, 5, 8, 13));
Function<Tuple2<Integer, Integer>, Boolean> areOdd =
x -> (x._1() % 2 == 0) && (x._2() % 2 == 0);
JavaPairRDD<Integer, Integer> pairRDD = rdd.zip(rdd);
JavaPairRDD<Boolean, Iterable<Tuple2<Integer, Integer>>> oddsAndEvens = pairRDD.groupBy(areOdd);
assertEquals(2, oddsAndEvens.count());
assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0))); // Evens
assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0))); // Odds
oddsAndEvens = pairRDD.groupBy(areOdd, 1);
assertEquals(2, oddsAndEvens.count());
assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0))); // Evens
assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0))); // Odds
}
代码示例来源:origin: org.apache.spark/spark-core_2.11
@Test
public void groupByOnPairRDD() {
// Regression test for SPARK-4459
JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 1, 2, 3, 5, 8, 13));
Function<Tuple2<Integer, Integer>, Boolean> areOdd =
x -> (x._1() % 2 == 0) && (x._2() % 2 == 0);
JavaPairRDD<Integer, Integer> pairRDD = rdd.zip(rdd);
JavaPairRDD<Boolean, Iterable<Tuple2<Integer, Integer>>> oddsAndEvens = pairRDD.groupBy(areOdd);
assertEquals(2, oddsAndEvens.count());
assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0))); // Evens
assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0))); // Odds
oddsAndEvens = pairRDD.groupBy(areOdd, 1);
assertEquals(2, oddsAndEvens.count());
assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0))); // Evens
assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0))); // Odds
}
代码示例来源:origin: org.apache.spark/spark-core_2.10
@Test
public void groupByOnPairRDD() {
// Regression test for SPARK-4459
JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 1, 2, 3, 5, 8, 13));
Function<Tuple2<Integer, Integer>, Boolean> areOdd =
x -> (x._1() % 2 == 0) && (x._2() % 2 == 0);
JavaPairRDD<Integer, Integer> pairRDD = rdd.zip(rdd);
JavaPairRDD<Boolean, Iterable<Tuple2<Integer, Integer>>> oddsAndEvens = pairRDD.groupBy(areOdd);
assertEquals(2, oddsAndEvens.count());
assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0))); // Evens
assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0))); // Odds
oddsAndEvens = pairRDD.groupBy(areOdd, 1);
assertEquals(2, oddsAndEvens.count());
assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0))); // Evens
assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0))); // Odds
}
内容来源于网络,如有侵权,请联系作者删除!