Usage of the org.apache.spark.sql.DataFrameWriter.saveAsTable() method, with code examples

x33g5p2x, reposted on 2022-01-18, category: Other

This article collects code examples of the Java method org.apache.spark.sql.DataFrameWriter.saveAsTable() and shows how it is used in practice. The examples are taken from selected open-source projects found on platforms such as GitHub, Stack Overflow, and Maven, so they should serve as a useful reference. Details of DataFrameWriter.saveAsTable() follow:
Package path: org.apache.spark.sql.DataFrameWriter
Class name: DataFrameWriter
Method name: saveAsTable

About DataFrameWriter.saveAsTable

saveAsTable(String tableName) persists the contents of a Dataset/DataFrame as the named table in the session catalog (for example, the Hive metastore). If the table already exists, the behavior depends on the SaveMode set via mode(): ErrorIfExists (the default), Append, Overwrite, or Ignore. Unlike save(), which only writes files to a path, saveAsTable registers the data as a table that can later be queried by name.
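Before the collected examples, here is a minimal, self-contained sketch of a typical call (the SparkSession settings and the people.json input path are illustrative assumptions, not taken from the projects quoted below):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class SaveAsTableExample {
  public static void main(String[] args) {
    // A local session for demonstration; add enableHiveSupport() if the table
    // should live in a Hive metastore rather than the built-in session catalog.
    SparkSession spark = SparkSession.builder()
        .appName("saveAsTable-example")
        .master("local[*]")
        .getOrCreate();

    // "people.json" is a placeholder input path for this sketch.
    Dataset<Row> df = spark.read().json("people.json");

    // Persist the Dataset as a managed table; Overwrite replaces an existing table.
    df.write()
        .mode(SaveMode.Overwrite)
        .saveAsTable("people_table");

    // The table can now be queried by name.
    spark.sql("SELECT * FROM people_table").show();

    spark.stop();
  }
}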

Code examples

Code example source (origin): com.cerner.bunsen/bunsen-core (the same code also appears in cerner/bunsen)

/**
 * Saves an RDD of bundles as a database, where each table
 * has the resource name. This offers a simple way to load and query
 * bundles in a system, although users with more sophisticated ETL
 * operations may want to explicitly write different entities.
 *
 * <p>
 * Note this will access the given RDD of bundles once per resource name,
 * so consumers with enough memory should consider calling
 * {@link JavaRDD#cache()} so that RDD is not recomputed for each.
 * </p>
 *
 * @param spark the spark session
 * @param bundles an RDD of FHIR Bundles
 * @param database the name of the database to write to
 * @param resourceNames names of resources to be extracted from the bundle and written
 */
public void saveAsDatabase(SparkSession spark,
    JavaRDD<BundleContainer> bundles,
    String database,
    String... resourceNames) {
  spark.sql("create database if not exists " + database);
  for (String resourceName : resourceNames) {
    Dataset ds = extractEntry(spark, bundles, resourceName);
    ds.write().saveAsTable(database + "." + resourceName.toLowerCase());
  }
}
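The Javadoc above notes that saveAsDatabase walks the bundle RDD once per resource name, so callers with enough memory may want to cache it first. A sketch of what such a call site might look like (the bundles and bundlesApi variables, database name, and resource names are illustrative, not taken from the Bunsen code):

// Assume `bundles` is a JavaRDD<BundleContainer> obtained elsewhere and
// `bundlesApi` is the object exposing the saveAsDatabase method above.
bundles.cache();  // avoid recomputing the RDD once per resource name

bundlesApi.saveAsDatabase(spark, bundles, "fhir_db",
    "patient", "observation", "condition");

bundles.unpersist();  // release the cached partitions when done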

Code example source (origin): org.apache.spark/spark-hive_2.10 (the same code also appears in spark-hive_2.11)

@Test
public void saveTableAndQueryIt() {
  Map<String, String> options = new HashMap<>();
  df.write()
    .format("org.apache.spark.sql.json")
    .mode(SaveMode.Append)
    .options(options)
    .saveAsTable("javaSavedTable");

  checkAnswer(
    sqlContext.sql("SELECT * FROM javaSavedTable"),
    df.collectAsList());
}
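With SaveMode.Append, saveAsTable creates javaSavedTable if it does not yet exist and appends rows to it otherwise; the test then checks that querying the table by name returns exactly the rows that were written.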

Code example source (origin): com.cerner.bunsen/bunsen-core (the same snippet also appears in cerner/bunsen)

.format("parquet")
.partitionBy("timestamp")
.saveAsTable(conceptMapTable);
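For context, a fuller sketch of how such a partitioned Parquet write might look end to end (the conceptMaps Dataset and the table name are assumptions for illustration, not taken from the Bunsen code):

// Assume `conceptMaps` is a Dataset<Row> that contains a "timestamp" column.
String conceptMapTable = "ontologies.conceptmaps";  // illustrative table name

conceptMaps.write()
    .mode(SaveMode.ErrorIfExists)  // fail rather than clobber an existing table
    .format("parquet")
    .partitionBy("timestamp")      // one partition directory per timestamp value
    .saveAsTable(conceptMapTable);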

Code example source (origin): org.apache.spark/spark-hive_2.11 (the same code also appears in spark-hive_2.10)

@Test
public void saveExternalTableAndQueryIt() {
  Map<String, String> options = new HashMap<>();
  options.put("path", path.toString());
  df.write()
    .format("org.apache.spark.sql.json")
    .mode(SaveMode.Append)
    .options(options)
    .saveAsTable("javaSavedTable");
  checkAnswer(
    sqlContext.sql("SELECT * FROM javaSavedTable"),
    df.collectAsList());
  Dataset<Row> loadedDF =
    sqlContext.createExternalTable("externalTable", "org.apache.spark.sql.json", options);
  checkAnswer(loadedDF, df.collectAsList());
  checkAnswer(
    sqlContext.sql("SELECT * FROM externalTable"),
    df.collectAsList());
}
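Note that SQLContext.createExternalTable, used above, has been deprecated in newer Spark releases in favor of the catalog API. A rough sketch of the equivalent call (assuming spark is a SparkSession and reusing the table name, data source, and options from the test):

Dataset<Row> loadedDF = spark.catalog()
    .createTable("externalTable", "org.apache.spark.sql.json", options);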

Code example source (origin): org.apache.spark/spark-hive_2.10 (the same code also appears in spark-hive_2.11)

@Test
public void saveExternalTableWithSchemaAndQueryIt() {
  Map<String, String> options = new HashMap<>();
  options.put("path", path.toString());
  df.write()
    .format("org.apache.spark.sql.json")
    .mode(SaveMode.Append)
    .options(options)
    .saveAsTable("javaSavedTable");
  checkAnswer(
    sqlContext.sql("SELECT * FROM javaSavedTable"),
    df.collectAsList());
  List<StructField> fields = new ArrayList<>();
  fields.add(DataTypes.createStructField("b", DataTypes.StringType, true));
  StructType schema = DataTypes.createStructType(fields);
  Dataset<Row> loadedDF =
    sqlContext.createExternalTable("externalTable", "org.apache.spark.sql.json", schema, options);
  checkAnswer(
    loadedDF,
    sqlContext.sql("SELECT b FROM javaSavedTable").collectAsList());
  checkAnswer(
    sqlContext.sql("SELECT * FROM externalTable"),
    sqlContext.sql("SELECT b FROM javaSavedTable").collectAsList());
}

Code example source (origin): com.cerner.bunsen/bunsen-core (the same snippet also appears in cerner/bunsen)

.format("parquet")
.partitionBy("timestamp")
.saveAsTable(valueSetTable);
