Usage of the org.apache.spark.api.java.JavaSparkContext.wholeTextFiles() method, with code examples


This article collects code examples for the Java method org.apache.spark.api.java.JavaSparkContext.wholeTextFiles(), showing how JavaSparkContext.wholeTextFiles() is used in practice. The examples are drawn from selected projects on GitHub, Stack Overflow, Maven, and similar platforms, and should serve as useful references. Details of the JavaSparkContext.wholeTextFiles() method follow:
Package path: org.apache.spark.api.java.JavaSparkContext
Class name: JavaSparkContext
Method name: wholeTextFiles

About JavaSparkContext.wholeTextFiles

Reads a directory of text files from HDFS, a local file system, or any Hadoop-supported file system URI. Each file is read as a single record and returned as a (path, content) key-value pair: the key is the path of the file and the value is its content. An optional second argument gives a suggested minimum number of partitions.
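
As a quick orientation, here is a minimal, self-contained sketch of calling the method (the input path and local master setting are placeholders, not taken from the examples below):

import java.util.Map;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class WholeTextFilesDemo {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("wholeTextFilesDemo").setMaster("local[*]");
    JavaSparkContext sc = new JavaSparkContext(conf);
    // One (path, content) pair per file; the second argument is a suggested
    // minimum number of partitions.
    JavaPairRDD<String, String> files = sc.wholeTextFiles("/tmp/input", 2);  // placeholder path
    Map<String, String> byPath = files.collectAsMap();
    byPath.forEach((path, content) ->
        System.out.println(path + " -> " + content.length() + " chars"));
    sc.stop();
  }
}

Note the contrast with textFile, which yields one record per line rather than one record per file.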

Code examples

Code example source: databricks/learning-spark

public static void main(String[] args) throws Exception {
  if (args.length != 4) {  // the original checked != 3, but four arguments are read below
    throw new Exception("Usage BasicLoadCsv sparkMaster csvInputFile csvOutputFile key");
  }
  String master = args[0];
  String csvInput = args[1];
  String outputFile = args[2];
  final String key = args[3];

  JavaSparkContext sc = new JavaSparkContext(
      master, "loadwholecsv", System.getenv("SPARK_HOME"), System.getenv("JARS"));
  // Each record is a (file path, file content) pair.
  JavaPairRDD<String, String> csvData = sc.wholeTextFiles(csvInput);
  // ParseLine is defined elsewhere in the project; see the sketch below.
  JavaRDD<String[]> keyedRDD = csvData.flatMap(new ParseLine());
  JavaRDD<String[]> result =
      keyedRDD.filter(new Function<String[], Boolean>() {
        public Boolean call(String[] input) { return input[0].equals(key); }
      });

  result.saveAsTextFile(outputFile);
  }
}
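
ParseLine is a helper defined elsewhere in the learning-spark project. A hypothetical sketch of what such a function could look like, assuming the Spark 1.x Java API (where FlatMapFunction returns an Iterable) and naive comma splitting in place of a real CSV parser:

import java.util.ArrayList;
import java.util.List;
import org.apache.spark.api.java.function.FlatMapFunction;
import scala.Tuple2;

// Hypothetical sketch: turns each (file path, file content) pair from
// wholeTextFiles into one String[] of fields per non-empty line.
class ParseLine implements FlatMapFunction<Tuple2<String, String>, String[]> {
  public Iterable<String[]> call(Tuple2<String, String> file) {
    List<String[]> rows = new ArrayList<>();
    for (String line : file._2().split("\n")) {
      if (!line.isEmpty()) {
        rows.add(line.split(","));  // naive split; a real parser handles quoting/escaping
      }
    }
    return rows;
  }
}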

Code example source: org.apache.spark/spark-core_2.11 (the same test also appears verbatim in org.apache.spark/spark-core and org.apache.spark/spark-core_2.10)

@Test
public void wholeTextFiles() throws Exception {
 byte[] content1 = "spark is easy to use.\n".getBytes(StandardCharsets.UTF_8);
 byte[] content2 = "spark is also easy to use.\n".getBytes(StandardCharsets.UTF_8);
 String tempDirName = tempDir.getAbsolutePath();
 String path1 = new Path(tempDirName, "part-00000").toUri().getPath();
 String path2 = new Path(tempDirName, "part-00001").toUri().getPath();
 Files.write(content1, new File(path1));
 Files.write(content2, new File(path2));
 Map<String, String> container = new HashMap<>();
 container.put(path1, new Text(content1).toString());
 container.put(path2, new Text(content2).toString());
 JavaPairRDD<String, String> readRDD = sc.wholeTextFiles(tempDirName, 3);
 List<Tuple2<String, String>> result = readRDD.collect();
 for (Tuple2<String, String> res : result) {
  // Note that the paths from `wholeTextFiles` are in URI format on Windows,
  // for example, file:/C:/a/b/c.
  assertEquals(res._2(), container.get(new Path(res._1()).toUri().getPath()));
 }
}


Code example source: stackoverflow.com

// Assumes an existing SparkContext `sc` and an input directory `path`.
JavaSparkContext jsc = new JavaSparkContext(sc);
JavaPairRDD<String, String> rdd = jsc.wholeTextFiles(path);
// The original answer used rdd.toArray(); collect() is the non-deprecated equivalent.
for (Tuple2<String, String> str : rdd.collect()) {
  System.out.println("+++++++++++++++++++++++++++++++++++++++++++");
  System.out.println("File name " + str._1);
  System.out.println("+++++++++++++++++++++++++++++++++++++++++++");
  System.out.println();
  System.out.println("-------------------------------------------");
  System.out.println("content " + str._2);
  System.out.println("-------------------------------------------");
}

Code example source: com.davidbracewell/hermes-core

Broadcast<Config> configBroadcast = SparkStreamingContext.INSTANCE.broadcast(Config.getInstance());
JavaRDD<String> rdd = StreamingContext.distributed().sparkContext()
    .wholeTextFiles(corpusLocation)
    .values()
    .flatMap(str -> {
      // ... (the body of this lambda is truncated in the original snippet)
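
The pattern in this truncated snippet, keeping only the file contents via values() and flat-mapping each whole file into smaller records, can be sketched in plain Spark as follows (assuming the Spark 2.x Java API, where flatMap returns an Iterator; the path and per-line splitting are illustrative assumptions, not hermes-core's actual logic):

import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Assumes an existing JavaSparkContext `sc` and a directory of text files.
JavaRDD<String> lines = sc.wholeTextFiles("/tmp/corpus")   // (path, content) pairs
    .values()                                              // drop the paths, keep the contents
    .flatMap(content -> Arrays.asList(content.split("\n")).iterator());  // one record per line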
