Usage of org.apache.spark.api.java.JavaRDD.<init>() with code examples

x33g5p2x · reposted 2022-01-21 · category: Other

This article collects code examples of the Java method org.apache.spark.api.java.JavaRDD.<init>(), showing how JavaRDD.<init>() is used in practice. The examples are drawn from selected open-source projects on platforms such as GitHub, Stack Overflow, and Maven, and should serve as useful references. Details of JavaRDD.<init>():
Package: org.apache.spark.api.java
Class: JavaRDD
Method: <init>

About JavaRDD.<init>

No description was provided in the original article. In Spark, this constructor has the signature `JavaRDD(RDD<T> rdd, ClassTag<T> classTag)`: it wraps a Scala RDD, together with the runtime class information Spark needs, so the RDD can be used through the Java-friendly API.
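Before the examples from real projects, here is a minimal sketch of the constructor pattern they all share: obtain a Scala `RDD<T>`, build a `ClassTag<T>` for the element type, and wrap both in a `JavaRDD`. This assumes Spark is on the classpath and a local master; the class and variable names are illustrative, not from any of the quoted projects.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.rdd.RDD;
import scala.reflect.ClassTag;

public class JavaRddInitSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("javardd-init").setMaster("local[*]");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Obtain a Scala RDD, e.g. by unwrapping a JavaRDD.
        RDD<String> scalaRdd = jsc.parallelize(java.util.Arrays.asList("a", "b")).rdd();

        // The constructor needs a ClassTag so Spark can recover the element type at runtime.
        ClassTag<String> tag = scala.reflect.ClassTag$.MODULE$.apply(String.class);
        JavaRDD<String> wrapped = new JavaRDD<>(scalaRdd, tag);

        System.out.println(wrapped.count());

        jsc.stop();
    }
}
```

In most application code the same wrapping is done via `JavaRDD.fromRDD(scalaRdd, tag)` or `scalaRdd.toJavaRDD()`; calling the constructor directly, as the examples below do, is typical of framework code that already holds an explicit `ClassTag`.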

Code examples

Example source: org.rcsb/mmtf-spark

/**
 * Get a {@link JavaRDD} from a {@link Dataset}.
 * @param atomDataset the dataset to convert
 * @param class1 the class of the dataset
 * @param <T> the type of the dataset
 * @return the {@link JavaRDD} from the dataset
 */
public static <T> JavaRDD<T> getJavaRdd(Dataset<T> atomDataset, Class<T> class1) {
  ClassTag<T> classTag = scala.reflect.ClassTag$.MODULE$.apply(class1);
  return new JavaRDD<T>(atomDataset.rdd(), classTag);
}

Example source: org.apache.pig/pig

@Override
public RDD<Tuple> convert(List<RDD<Tuple>> predecessors, POBroadcastSpark po) {
  SparkUtil.assertPredecessorSize(predecessors, po, 1);
  RDD<Tuple> rdd = predecessors.get(0);
  // Just collect the predecessor RDD, and broadcast it
  JavaRDD<Tuple> javaRDD = new JavaRDD<>(rdd, SparkUtil.getManifest(Tuple.class));
  Broadcast<List<Tuple>> broadcastedRDD = sc.broadcast(javaRDD.collect());
  // Save the broadcast variable to broadcastedVars map, so that this
  // broadcasted variable can be referenced by the driver client.
  SparkPigContext.get().getBroadcastedVars().put(po.getBroadcastedVariableName(), broadcastedRDD);
  return rdd;
}
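The collect-then-broadcast pattern used by the Pig converter above can be sketched on its own: wrap the RDD, pull its elements to the driver, and broadcast them as a read-only list. This is a sketch assuming a running `JavaSparkContext`; the class and method names are illustrative.

```java
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

public class BroadcastRddSketch {
    /**
     * Collects a small RDD on the driver and broadcasts its contents,
     * so every executor gets one read-only copy of the data.
     */
    static Broadcast<List<Integer>> broadcastRdd(JavaSparkContext jsc, JavaRDD<Integer> rdd) {
        List<Integer> collected = rdd.collect(); // pulls all elements to the driver
        return jsc.broadcast(collected);         // ships one copy per executor
    }
}
```

This pattern only makes sense for RDDs small enough to fit in driver memory; for large datasets, `collect()` will exhaust the driver heap, and a join or cogroup is the better tool.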

Example source: com.datastax.spark/spark-cassandra-connector-java_2.10

/**
 * Repartitions the data (via a shuffle) based upon the replication of the given {@code keyspaceName}
 * and {@code tableName}. Calling this method before using joinWithCassandraTable will ensure that
 * requests will be coordinator local. {@code partitionsPerHost} controls the number of Spark
 * partitions that will be created in this repartitioning event. The calling RDD must have rows that
 * can be converted into the partition key of the given Cassandra Table.
 */
public JavaRDD<T> repartitionByCassandraReplica(
    String keyspaceName,
    String tableName,
    int partitionsPerHost,
    ColumnSelector partitionkeyMapper,
    RowWriterFactory<T> rowWriterFactory
) {
  CassandraConnector connector = defaultConnector();
  ClassTag<T> ctT = rdd.toJavaRDD().classTag();
  CassandraPartitionedRDD<T> newRDD = rddFunctions.repartitionByCassandraReplica(
      keyspaceName,
      tableName,
      partitionsPerHost,
      partitionkeyMapper,
      connector,
      ctT,
      rowWriterFactory);
  return new JavaRDD<>(newRDD, ctT);
}

Example sources: com.datastax.spark/spark-cassandra-connector_2.10, com.datastax.spark/spark-cassandra-connector, com.datastax.spark/spark-cassandra-connector-java, and com.datastax.spark/spark-cassandra-connector-unshaded

The same repartitionByCassandraReplica method appears verbatim in each of these artifacts; the code is identical to the example above.
