This article collects code examples of the Java method org.apache.spark.api.java.JavaSparkContext.wholeTextFiles(), showing how JavaSparkContext.wholeTextFiles() is used in practice. The examples are drawn from selected projects on platforms such as GitHub, Stack Overflow, and Maven, so they should serve as useful references. Details of the JavaSparkContext.wholeTextFiles() method:
Package: org.apache.spark.api.java
Class: JavaSparkContext
Method: wholeTextFiles
Description: (none available)
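Before the Spark examples below, here is a rough idea of the method's contract: wholeTextFiles reads every file under a directory as a single (path, full content) record, rather than splitting files into lines as textFile does. The following plain-Java sketch (class and method names are my own, not Spark's) mimics that behavior locally:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class WholeTextFilesSketch {
    // Plain-Java approximation of sc.wholeTextFiles(dir):
    // each regular file under `dir` becomes one (path, whole content) pair.
    static Map<String, String> readWholeTextFiles(Path dir) throws IOException {
        Map<String, String> result = new HashMap<>();
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files.filter(Files::isRegularFile)::iterator) {
                result.put(p.toString(),
                        new String(Files.readAllBytes(p), StandardCharsets.UTF_8));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("whole-text-files-demo");
        Files.write(dir.resolve("part-00000"),
                "spark is easy to use.\n".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("part-00001"),
                "spark is also easy to use.\n".getBytes(StandardCharsets.UTF_8));
        // Two files in, two (path, content) records out.
        System.out.println(readWholeTextFiles(dir).size());
    }
}
```

Unlike this sketch, the real method also accepts Hadoop-style glob paths and returns a distributed JavaPairRDD rather than an in-memory map.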
Code example source: origin: databricks/learning-spark

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public static void main(String[] args) throws Exception {
  // Four arguments are expected, matching the usage message below.
  if (args.length != 4) {
    throw new Exception("Usage BasicLoadCsv sparkMaster csvInputFile csvOutputFile key");
  }
  String master = args[0];
  String csvInput = args[1];
  String outputFile = args[2];
  final String key = args[3];

  JavaSparkContext sc = new JavaSparkContext(
      master, "loadwholecsv", System.getenv("SPARK_HOME"), System.getenv("JARS"));
  // Each input file becomes one (path, content) record; ParseLine splits it into rows.
  JavaPairRDD<String, String> csvData = sc.wholeTextFiles(csvInput);
  JavaRDD<String[]> keyedRDD = csvData.flatMap(new ParseLine());
  // Keep only the rows whose first field matches the requested key.
  JavaRDD<String[]> result =
      keyedRDD.filter(new Function<String[], Boolean>() {
        public Boolean call(String[] input) { return input[0].equals(key); }
      });
  result.saveAsTextFile(outputFile);
}
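The ParseLine function used in the example above is not shown. As a rough illustration of what it does, here is a hypothetical plain-Java equivalent that turns one whole CSV file into rows of fields (the class name, method name, and the naive comma splitting are my assumptions, not the project's actual implementation, which uses a proper CSV parser):

```java
import java.util.ArrayList;
import java.util.List;

public class ParseLineSketch {
    // Hypothetical stand-in for ParseLine: split one whole-file string
    // into lines, then each line into comma-separated fields.
    static List<String[]> parseCsv(String wholeFile) {
        List<String[]> rows = new ArrayList<>();
        for (String line : wholeFile.split("\n")) {
            if (!line.isEmpty()) {
                rows.add(line.split(","));  // naive split; quoted fields need a real CSV parser
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> rows = parseCsv("holden,panda\nkay,cat\n");
        // First field of the first row is the key the filter above matches on.
        System.out.println(rows.get(0)[0]);
    }
}
```

Because wholeTextFiles hands the function an entire file at a time, a stateful format like CSV (where quoting can span lines) can be parsed correctly, which would not be safe with line-at-a-time textFile input.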
Code example source: origin: org.apache.spark/spark-core_2.11

@Test
public void wholeTextFiles() throws Exception {
  byte[] content1 = "spark is easy to use.\n".getBytes(StandardCharsets.UTF_8);
  byte[] content2 = "spark is also easy to use.\n".getBytes(StandardCharsets.UTF_8);
  String tempDirName = tempDir.getAbsolutePath();
  String path1 = new Path(tempDirName, "part-00000").toUri().getPath();
  String path2 = new Path(tempDirName, "part-00001").toUri().getPath();
  Files.write(content1, new File(path1));
  Files.write(content2, new File(path2));
  Map<String, String> container = new HashMap<>();
  container.put(path1, new Text(content1).toString());
  container.put(path2, new Text(content2).toString());
  JavaPairRDD<String, String> readRDD = sc.wholeTextFiles(tempDirName, 3);
  List<Tuple2<String, String>> result = readRDD.collect();
  for (Tuple2<String, String> res : result) {
    // Note that the paths from `wholeTextFiles` are in URI format on Windows,
    // for example, file:/C:/a/b/c.
    assertEquals(res._2(), container.get(new Path(res._1()).toUri().getPath()));
  }
}
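The test above normalizes both the expected and the returned keys through Hadoop's Path#toUri().getPath(), because wholeTextFiles returns URI-style keys (e.g. file:/C:/a/b/c on Windows) rather than plain local paths. The same normalization can be sketched with only the JDK's java.net.URI (the helper name is my own):

```java
import java.net.URI;

public class PathNormalizeSketch {
    // Strip a "file:" URI down to its plain path component, so that
    // "file:/tmp/part-00000" and "/tmp/part-00000" compare equal.
    static String normalize(String pathOrUri) {
        URI uri = URI.create(pathOrUri);
        return uri.getPath() != null ? uri.getPath() : pathOrUri;
    }

    public static void main(String[] args) {
        System.out.println(normalize("file:/tmp/part-00000"));
        System.out.println(normalize("/tmp/part-00000"));
    }
}
```

This is why the assertion looks up container with the normalized key instead of res._1() directly; without the normalization, the map lookup would fail on platforms where the returned key carries a scheme prefix.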
Code example source: origin: stackoverflow.com

JavaSparkContext jsc = new JavaSparkContext(sc);
JavaPairRDD<String, String> rdd = jsc.wholeTextFiles(path);
// collect() replaces the deprecated toArray(); both return List<Tuple2<String, String>>.
for (Tuple2<String, String> str : rdd.collect()) {
  System.out.println("+++++++++++++++++++++++++++++++++++++++++++");
  System.out.println("File name " + str._1);
  System.out.println("+++++++++++++++++++++++++++++++++++++++++++");
  System.out.println();
  System.out.println("-------------------------------------------");
  System.out.println("content " + str._2);
  System.out.println("-------------------------------------------");
}
Code example source: origin: com.davidbracewell/hermes-core

Broadcast<Config> configBroadcast = SparkStreamingContext.INSTANCE.broadcast(Config.getInstance());
JavaRDD<String> rdd = StreamingContext.distributed().sparkContext()
    .wholeTextFiles(corpusLocation)
    .values()
    .flatMap(str -> {