Usage and code examples for the org.apache.spark.api.java.JavaPairRDD.sampleByKey() method

x33g5p2x, reposted on 2022-01-21 under "Other"

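This article collects code examples for the Java method org.apache.spark.api.java.JavaPairRDD.sampleByKey() and shows how JavaPairRDD.sampleByKey() is used in practice. The examples come mainly from platforms such as Github, Stackoverflow, and Maven, extracted from selected projects, and are intended as a practical reference. The details of JavaPairRDD.sampleByKey() are as follows:
Package: org.apache.spark.api.java
Class name: JavaPairRDD
Method name: sampleByKey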

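About JavaPairRDD.sampleByKey

Returns a subset of this RDD sampled by key (stratified sampling). The fractions argument maps each key to its sampling rate, withReplacement selects sampling with or without replacement, and an optional seed fixes the random number generator. The size of the resulting sample is approximate, not exact.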

Code examples

Code example source: origin: org.apache.spark/spark-core

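// JUnit 4 imports are assumed here; the source excerpt does not show its imports.
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.junit.Test;
import scala.Tuple2;

// Excerpt from a test class; `sc` is the JavaSparkContext created by the test fixture.
@Test
@SuppressWarnings("unchecked")
public void sampleByKey() {
  // Eight integers in 3 partitions, keyed by parity: keys 0 and 1 get four records each.
  JavaRDD<Integer> rdd1 = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8), 3);
  JavaPairRDD<Integer, Integer> rdd2 = rdd1.mapToPair(i -> new Tuple2<>(i % 2, 1));
  // Per-key sampling rates: 0.5 for key 0, 1.0 for key 1.
  Map<Integer, Double> fractions = new HashMap<>();
  fractions.put(0, 0.5);
  fractions.put(1, 1.0);
  // Sample with replacement, using the fixed seed 1L.
  JavaPairRDD<Integer, Integer> wr = rdd2.sampleByKey(true, fractions, 1L);
  Map<Integer, Long> wrCounts = wr.countByKey();
  assertEquals(2, wrCounts.size());
  assertTrue(wrCounts.get(0) > 0);
  assertTrue(wrCounts.get(1) > 0);
  // Sample without replacement, with the same fractions and seed.
  JavaPairRDD<Integer, Integer> wor = rdd2.sampleByKey(false, fractions, 1L);
  Map<Integer, Long> worCounts = wor.countByKey();
  assertEquals(2, worCounts.size());
  assertTrue(worCounts.get(0) > 0);
  assertTrue(worCounts.get(1) > 0);
}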

Code example source: origin: org.apache.spark/spark-core_2.11 and origin: org.apache.spark/spark-core_2.10

Both artifacts contain the same sampleByKey() test, line for line, as the spark-core example above.
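
The test above passes a sampling rate of 0.5 for key 0 and 1.0 for key 1. To make the effect of those per-key rates easier to see without a Spark cluster, here is a minimal plain-Java sketch of the idea behind the without-replacement case: each record is kept independently with the probability assigned to its key. This is a conceptual illustration only, not Spark's implementation. The class name SampleByKeySketch, its helper methods, and the choice to throw IllegalArgumentException for a key with no fraction are all assumptions made for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Conceptual sketch only: plain Java, no Spark dependency, not Spark's actual code.
public class SampleByKeySketch {

    // Keep each (key, value) pair independently with probability fractions.get(key).
    // Rejecting keys that have no fraction is a choice made for this sketch.
    static <K, V> List<Map.Entry<K, V>> sampleByKey(
            List<Map.Entry<K, V>> pairs, Map<K, Double> fractions, long seed) {
        Random rng = new Random(seed);
        List<Map.Entry<K, V>> kept = new ArrayList<>();
        for (Map.Entry<K, V> pair : pairs) {
            Double fraction = fractions.get(pair.getKey());
            if (fraction == null) {
                throw new IllegalArgumentException("No fraction for key " + pair.getKey());
            }
            // nextDouble() is in [0.0, 1.0), so 1.0 always keeps and 0.0 never keeps.
            if (rng.nextDouble() < fraction) {
                kept.add(pair);
            }
        }
        return kept;
    }

    // Count how many pairs remain for each key; keys with no pairs are absent.
    static <K, V> Map<K, Long> countByKey(List<Map.Entry<K, V>> pairs) {
        Map<K, Long> counts = new HashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            counts.merge(pair.getKey(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same data shape as the test: 1..8 keyed by parity, every value 1.
        List<Map.Entry<Integer, Integer>> pairs = new ArrayList<>();
        for (int i = 1; i <= 8; i++) {
            pairs.add(new SimpleEntry<>(i % 2, 1));
        }
        Map<Integer, Double> fractions = new HashMap<>();
        fractions.put(0, 0.5);
        fractions.put(1, 1.0);
        Map<Integer, Long> counts = countByKey(sampleByKey(pairs, fractions, 1L));
        System.out.println(counts);
    }
}
```

With these fractions, all four records for key 1 are always kept, while the number kept for key 0 depends on the seed and can be anywhere from zero to four. That is why the size of a per-key sample is approximate, and why a key can disappear from the counts entirely when its rate is low and its group is small.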
