Usage and Code Examples for the org.apache.parquet.schema.Types Class

x33g5p2x, reposted on 2022-01-30 in Other

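This article collects Java code examples for the org.apache.parquet.schema.Types class and shows how the class is used in practice. The examples come mainly from GitHub, Stack Overflow, Maven, and similar platforms and were extracted from selected projects, so they should serve as a useful reference. Details of the Types class:
Package path: org.apache.parquet.schema.Types
Class name: Types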

About Types

This class provides fluent builders that produce Parquet schema Types.

The most basic use is to build primitive types:

Types.required(INT64).named("id"); 
Types.optional(INT32).named("number");

The required(PrimitiveTypeName) factory method produces a primitive type builder, and the PrimitiveBuilder#named(String) builds the PrimitiveType. Between required and named, other builder methods can be used to add type annotations or other type metadata:

Types.required(BINARY).as(UTF8).named("username"); 
Types.optional(FIXED_LEN_BYTE_ARRAY).length(20).named("sha1");

Optional types are built using optional(PrimitiveTypeName) to get the builder.

Groups are built similarly, using requiredGroup() (or the optional version, optionalGroup()) to return a group builder. Group builders provide required and optional to add primitive types, which return primitive builders like the versions above.

// This produces:
// required group User {
//   required int64 id;
//   optional binary email (UTF8);
// }
Types.requiredGroup()
    .required(INT64).named("id")
    .optional(BINARY).as(UTF8).named("email")
    .named("User");

When required is called on a group builder, the builder it returns will add the type to the parent group when it is built and named will return its parent group builder (instead of the type) so more fields can be added.

Sub-groups can be created using requiredGroup() to get a group builder that will create the group type, add it to the parent builder, and return the parent builder for more fields.

// required group User {
//   required int64 id;
//   optional binary email (UTF8);
//   optional group address {
//     required binary street (UTF8);
//     required int32 zipcode;
//   }
// }
Types.requiredGroup()
    .required(INT64).named("id")
    .optional(BINARY).as(UTF8).named("email")
    .optionalGroup()
      .required(BINARY).as(UTF8).named("street")
      .required(INT32).named("zipcode")
    .named("address")
    .named("User");
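The way named() hands control back to the parent builder can be modeled in a few lines of plain Java. The sketch below is a toy, not the Parquet implementation: NestedBuilderSketch, Node, GroupBuilder, and FieldBuilder are hypothetical names, annotations such as UTF8 are left out, and primitive types are plain strings. It only illustrates how a type parameter for the parent lets named() return either the finished type or the enclosing builder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy model of a nested fluent builder (hypothetical, not Parquet code).
class NestedBuilderSketch {

  /** A finished schema node: a leaf field (children == null) or a group. */
  static final class Node {
    final String repetition;
    final String type;
    final String name;
    final List<Node> children;

    Node(String repetition, String type, String name, List<Node> children) {
      this.repetition = repetition;
      this.type = type;
      this.name = name;
      this.children = children;
    }

    String render(String indent) {
      String head = indent + repetition + " " + type + " " + name;
      if (children == null) {
        return head + ";\n";
      }
      StringBuilder sb = new StringBuilder(head + " {\n");
      for (Node child : children) {
        sb.append(child.render(indent + "  "));
      }
      return sb.append(indent).append("}\n").toString();
    }
  }

  /** Builds one primitive field; named() returns P, the parent builder. */
  static final class FieldBuilder<P> {
    private final String repetition;
    private final String type;
    private final Function<Node, P> onNamed;

    FieldBuilder(String repetition, String type, Function<Node, P> onNamed) {
      this.repetition = repetition;
      this.type = type;
      this.onNamed = onNamed;
    }

    P named(String name) {
      return onNamed.apply(new Node(repetition, type, name, null));
    }
  }

  /** Builds a group; P is the Node at top level or the parent builder when nested. */
  static final class GroupBuilder<P> {
    private final String repetition;
    private final Function<Node, P> onNamed;
    private final List<Node> fields = new ArrayList<>();

    GroupBuilder(String repetition, Function<Node, P> onNamed) {
      this.repetition = repetition;
      this.onNamed = onNamed;
    }

    FieldBuilder<GroupBuilder<P>> required(String type) {
      return field("required", type);
    }

    FieldBuilder<GroupBuilder<P>> optional(String type) {
      return field("optional", type);
    }

    private FieldBuilder<GroupBuilder<P>> field(String rep, String type) {
      // When the field is named, add it to this group and return this group.
      return new FieldBuilder<GroupBuilder<P>>(rep, type, node -> {
        fields.add(node);
        return this;
      });
    }

    GroupBuilder<GroupBuilder<P>> optionalGroup() {
      // The sub-group adds itself to this group once it is named.
      return new GroupBuilder<GroupBuilder<P>>("optional", node -> {
        fields.add(node);
        return this;
      });
    }

    P named(String name) {
      return onNamed.apply(new Node(repetition, "group", name, fields));
    }
  }

  static GroupBuilder<Node> requiredGroup() {
    // At the top level, named() simply returns the finished node.
    return new GroupBuilder<Node>("required", node -> node);
  }

  static String demo() {
    Node user = requiredGroup()
        .required("int64").named("id")
        .optional("binary").named("email")
        .optionalGroup()
          .required("binary").named("street")
          .required("int32").named("zipcode")
        .named("address")
        .named("User");
    return user.render("");
  }

  public static void main(String[] args) {
    System.out.print(demo());
  }
}
```

Running main prints the same nested layout as the schema comment above, minus the (UTF8) annotations that this toy does not model.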

Maps are built similarly, using requiredMap() (or the optionalMap() version) to return a map builder. Map builders provide key() to add a primitive as the key or groupKey() to add a group as the key. key() returns a MapKey builder, which extends a primitive builder, whereas groupKey() returns a MapGroupKey builder, which extends a group builder. A key in a map is always required.

Once a key is built, a primitive map value can be built using requiredValue() (or the optionalValue() version), which returns a MapValue builder. A group map value can be built using requiredGroupValue() (or the optionalGroupValue() version), which returns a MapGroupValue builder.

// required group zipMap (MAP) {
//   repeated group map (MAP_KEY_VALUE) {
//     required float key;
//     optional int32 value;
//   }
// }
Types.requiredMap()
    .key(FLOAT)
    .optionalValue(INT32)
    .named("zipMap");

// required group zipMap (MAP) {
//   repeated group map (MAP_KEY_VALUE) {
//     required group key {
//       optional int64 first;
//       required group second {
//         required float inner_id_1;
//         optional int32 inner_id_2;
//       }
//     }
//     optional group value {
//       optional group localGeoInfo {
//         required float inner_value_1;
//         optional int32 inner_value_2;
//       }
//       optional int32 zipcode;
//     }
//   }
// }
Types.requiredMap()
    .groupKey()
      .optional(INT64).named("first")
      .requiredGroup()
        .required(FLOAT).named("inner_id_1")
        .optional(INT32).named("inner_id_2")
      .named("second")
    .optionalGroupValue()
      .optionalGroup()
        .required(FLOAT).named("inner_value_1")
        .optional(INT32).named("inner_value_2")
      .named("localGeoInfo")
      .optional(INT32).named("zipcode")
    .named("zipMap");

Message types are built using buildMessage() and function just like group builders.

// message User {
//   required int64 id;
//   optional binary email (UTF8);
//   optional group address {
//     required binary street (UTF8);
//     required int32 zipcode;
//   }
// }
Types.buildMessage()
    .required(INT64).named("id")
    .optional(BINARY).as(UTF8).named("email")
    .optionalGroup()
      .required(BINARY).as(UTF8).named("street")
      .required(INT32).named("zipcode")
    .named("address")
    .named("User");

These builders enforce consistency checks based on the specifications in the parquet-format documentation. For example, if DECIMAL is used to annotate a FIXED_LEN_BYTE_ARRAY that is not long enough for its maximum precision, these builders will throw an IllegalArgumentException:

// throws IllegalArgumentException with message:
// "FIXED(4) is not long enough to store 10 digits"
Types.required(FIXED_LEN_BYTE_ARRAY).length(4)
    .as(DECIMAL).precision(10)
    .named("badDecimal");
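
The length check in the DECIMAL example follows from two's-complement arithmetic: a fixed-length array of n bytes holds signed values up to 2^(8n-1) - 1, so it can store every decimal with at most floor(log10(2^(8n-1) - 1)) digits. The sketch below reproduces that arithmetic with BigInteger. DecimalWidthSketch and its methods are hypothetical names, and the code is an assumption about how such a check can be computed rather than a copy of the Parquet source; only the error message text is taken from the example above.

```java
import java.math.BigInteger;

// Illustrative only: hypothetical helper, not part of the Parquet API.
class DecimalWidthSketch {

  /** Number of decimal digits that always fit in a signed value of numBytes bytes. */
  static int maxPrecision(int numBytes) {
    // Largest positive two's-complement value: 2^(8n - 1) - 1.
    BigInteger max = BigInteger.ONE.shiftLeft(8 * numBytes - 1).subtract(BigInteger.ONE);
    // floor(log10(max)) equals the digit count of max minus one.
    return max.toString().length() - 1;
  }

  /** Rejects a precision that the given byte length cannot always represent. */
  static void checkFixedDecimal(int length, int precision) {
    if (precision > maxPrecision(length)) {
      throw new IllegalArgumentException(
          "FIXED(" + length + ") is not long enough to store " + precision + " digits");
    }
  }

  /** Smallest byte length whose maxPrecision covers the requested precision. */
  static int minBytesFor(int precision) {
    int numBytes = 1;
    while (maxPrecision(numBytes) < precision) {
      numBytes++;
    }
    return numBytes;
  }

  public static void main(String[] args) {
    System.out.println(maxPrecision(4));  // 9
    System.out.println(minBytesFor(10));  // 5
    checkFixedDecimal(5, 10);             // passes silently
    checkFixedDecimal(4, 10);             // throws IllegalArgumentException
  }
}
```

With 4 bytes the largest value is 2147483647, which has 10 digits but cannot hold every 10-digit number, so only 9 digits are guaranteed and a precision of 10 is rejected.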

Code examples

Code example source: apache/hive

/**
 * Searches column names by name on a given Parquet message schema, and returns its projected
 * Parquet schema types.
 *
 * @param schema Message type schema where to search for column names.
 * @param colNames List of column names.
 * @param colTypes List of column types.
 * @return A MessageType object of projected columns.
 */
public static MessageType getSchemaByName(MessageType schema, List<String> colNames, List<TypeInfo> colTypes) {
 List<Type> projectedFields = getProjectedGroupFields(schema, colNames, colTypes);
 Type[] typesArray = projectedFields.toArray(new Type[0]);
 return Types.buildMessage()
   .addFields(typesArray)
   .named(schema.getName());
}

Code example source: apache/hive

if (typeInfo.getCategory().equals(Category.PRIMITIVE)) {
 if (typeInfo.equals(TypeInfoFactory.stringTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.BINARY, repetition).as(OriginalType.UTF8)
   .named(name);
 } else if (typeInfo.equals(TypeInfoFactory.intTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.INT32, repetition).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.shortTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.INT32, repetition)
    .as(OriginalType.INT_16).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.byteTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.INT32, repetition)
    .as(OriginalType.INT_8).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.longTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.INT64, repetition).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.doubleTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.DOUBLE, repetition).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.floatTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.FLOAT, repetition).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.booleanTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.BOOLEAN, repetition).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.binaryTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.BINARY, repetition).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.timestampTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.INT96, repetition).named(name);
 } else if (typeInfo.equals(TypeInfoFactory.voidTypeInfo)) {
  throw new UnsupportedOperationException("Void type not implemented");
 } else if (typeInfo.getTypeName().toLowerCase().startsWith(
   serdeConstants.CHAR_TYPE_NAME)) {
  return Types.optional(PrimitiveTypeName.BINARY).as(OriginalType.UTF8)
    .named(name);
 }
 // ... (the excerpt ends here; the remaining type branches are not shown)
}

Code example source: apache/hive

// Fragment; "// ..." marks lines omitted from the excerpt.
return Types.buildGroup(fieldType.getRepetition())
  .addFields(typesArray)
  .named(fieldType.getName());
// ...
subFieldType = getProjectedType(elemType, subFieldType);
return Types.buildGroup(Repetition.OPTIONAL).as(OriginalType.LIST)
  .addFields(subFieldType)
  .named(fieldType.getName());

Code example source: prestosql/presto

// Fragment; "// ..." marks lines omitted from the excerpt.
// ...
  return Types.primitive(PrimitiveTypeName.BINARY, repetition).as(OriginalType.UTF8)
      .named(name);
// ...
    typeInfo.equals(TypeInfoFactory.shortTypeInfo) ||
    typeInfo.equals(TypeInfoFactory.byteTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.INT32, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.INT64, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.DOUBLE, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.FLOAT, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.BOOLEAN, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.BINARY, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.INT96, repetition).named(name);
// ...
    serdeConstants.CHAR_TYPE_NAME)) {
  if (repetition == Repetition.OPTIONAL) {
    return Types.optional(PrimitiveTypeName.BINARY).as(OriginalType.UTF8).named(name);
  // ...
    return Types.repeated(PrimitiveTypeName.BINARY).as(OriginalType.UTF8).named(name);
// ...
    serdeConstants.VARCHAR_TYPE_NAME)) {
  if (repetition == Repetition.OPTIONAL) {
    return Types.optional(PrimitiveTypeName.BINARY).as(OriginalType.UTF8).named(name);
// ...

Code example source: apache/hive

/**
 * Searches column names by name on a given Parquet schema, and returns their corresponding
 * Parquet schema types.
 *
 * @param schema Group schema where to search for column names.
 * @param colNames List of column names.
 * @param colTypes List of column types.
 * @return List of GroupType objects of projected columns.
 */
private static List<Type> getProjectedGroupFields(GroupType schema, List<String> colNames, List<TypeInfo> colTypes) {
 List<Type> schemaTypes = new ArrayList<Type>();
 ListIterator<String> columnIterator = colNames.listIterator();
 while (columnIterator.hasNext()) {
  TypeInfo colType = colTypes.get(columnIterator.nextIndex());
  String colName = columnIterator.next();
  Type fieldType = getFieldTypeIgnoreCase(schema, colName);
  if (fieldType == null) {
   schemaTypes.add(Types.optional(PrimitiveTypeName.BINARY).named(colName));
  } else {
   schemaTypes.add(getProjectedType(colType, fieldType));
  }
 }
 return schemaTypes;
}

Code example source: org.apache.parquet/parquet-thrift

private ConvertedField visitPrimitiveType(PrimitiveTypeName type, OriginalType orig, State state) {
 PrimitiveBuilder<PrimitiveType> b = primitive(type, state.repetition);
 if (orig != null) {
  b = b.as(orig);
 }
 if (fieldProjectionFilter.keep(state.path)) {
  return new Keep(state.path, b.named(state.name));
 } else {
  return new Drop(state.path);
 }
}

Code example source: pentaho/pentaho-hadoop-shims

// Fragment; "// ..." marks lines omitted from the excerpt.
case DECIMAL:
 if ( f.getAllowNull() ) {
  return Types.optional( PrimitiveType.PrimitiveTypeName.BINARY ).as( OriginalType.DECIMAL ).precision( f.getPrecision() ).scale( f.getScale() ).named( formatFieldName );
 } else {
  return Types.required( PrimitiveType.PrimitiveTypeName.BINARY ).as( OriginalType.DECIMAL ).precision( f.getPrecision() ).scale( f.getScale() ).named( formatFieldName );
 // ...
  return Types.optional( PrimitiveType.PrimitiveTypeName.INT32 ).as( OriginalType.DECIMAL ).precision( f.getPrecision() ).scale( f.getScale() ).named( formatFieldName );
 } else {
  return Types.required( PrimitiveType.PrimitiveTypeName.INT32 ).as( OriginalType.DECIMAL ).precision( f.getPrecision() ).scale( f.getScale() ).named( formatFieldName );
 // ...
  return Types.optional( PrimitiveType.PrimitiveTypeName.INT64 ).as( OriginalType.DECIMAL ).precision( f.getPrecision() ).scale( f.getScale() ).named( formatFieldName );
 } else {
  return Types.required( PrimitiveType.PrimitiveTypeName.INT64 ).as( OriginalType.DECIMAL ).precision( f.getPrecision() ).scale( f.getScale() ).named( formatFieldName );
 // ...

Code example source: org.apache.parquet/parquet-column

public static MapBuilder<GroupType> optionalMap() {
 return map(Type.Repetition.OPTIONAL);
}

Code example source: org.apache.parquet/parquet-column

public static ListBuilder<GroupType> requiredList() {
 return list(Type.Repetition.REQUIRED);
}

Code example source: apache/hive

/**
 * Searches column names by indexes on a given Parquet file schema, and returns their corresponding
 * Parquet schema types.
 *
 * @param schema Message schema where to search for column names.
 * @param colNames List of column names.
 * @param colIndexes List of column indexes.
 * @return A MessageType object of the column names found.
 */
public static MessageType getSchemaByIndex(MessageType schema, List<String> colNames, List<Integer> colIndexes) {
 List<Type> schemaTypes = new ArrayList<Type>();
 for (Integer i : colIndexes) {
  if (i < colNames.size()) {
   if (i < schema.getFieldCount()) {
    schemaTypes.add(schema.getType(i));
   } else {
    //prefixing with '_mask_' to ensure no conflict with named
    //columns in the file schema
    schemaTypes.add(
     Types.optional(PrimitiveTypeName.BINARY).named("_mask_" + colNames.get(i)));
   }
  }
 }
 return new MessageType(schema.getName(), schemaTypes);
}

Code example source: com.alibaba.blink/flink-table

private static Type convertType(
  final String name, final InternalType type, final Type.Repetition repetition) {
  if (DataTypes.INT.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.INT32, repetition).named(name);
  } else if (DataTypes.SHORT.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.INT32, repetition)
      .as(OriginalType.INT_16)
      .named(name);
  } else if (DataTypes.BOOLEAN.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.BOOLEAN, repetition).named(name);
  } else if (DataTypes.BYTE.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.INT32, repetition)
      .as(OriginalType.INT_8)
      .named(name);
  } else if (DataTypes.DOUBLE.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.DOUBLE, repetition).named(name);
  } else if (DataTypes.FLOAT.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.FLOAT, repetition).named(name);
  } else if (DataTypes.LONG.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.INT64, repetition).named(name);
  } else if (DataTypes.STRING.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.BINARY, repetition).as(OriginalType.UTF8)
      .named(name);
  } else if (DataTypes.DATE.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.INT32, repetition).as(OriginalType.DATE).named(name);
  } else if (DataTypes.TIME.equals(type)) {
    return Types.primitive(PrimitiveType.PrimitiveTypeName.INT32, repetition)
      .as(OriginalType.TIME_MILLIS)
      .named(name);
  } else if (DataTypes.TIMESTAMP.equals(type)) {
    // ... (the excerpt ends here; the remaining branches are not shown)

Code example source: org.apache.parquet/parquet-column

public static ListBuilder<GroupType> optionalList() {
 return list(Type.Repetition.OPTIONAL);
}

Code example source: apache/hive

@Test
public void testUnannotatedListOfPrimitives() throws Exception {
 MessageType fileSchema = Types.buildMessage()
   .repeated(INT32).named("list_of_ints")
   .named("UnannotatedListOfPrimitives");
 Path test = writeDirect("UnannotatedListOfPrimitives",
   fileSchema,
   new DirectWriter() {
    @Override
    public void write(RecordConsumer rc) {
     rc.startMessage();
     rc.startField("list_of_ints", 0);
     rc.addInteger(34);
     rc.addInteger(35);
     rc.addInteger(36);
     rc.endField("list_of_ints", 0);
     rc.endMessage();
    }
   });
 ArrayWritable expected = list(
   new IntWritable(34), new IntWritable(35), new IntWritable(36));
 List<ArrayWritable> records = read(test);
 Assert.assertEquals("Should have only one record", 1, records.size());
 assertEquals("Should match expected record",
   expected, records.get(0));
}

Code example source: io.prestosql/presto-hive

// Fragment; "// ..." marks lines omitted from the excerpt.
// ...
  return Types.primitive(PrimitiveTypeName.BINARY, repetition).as(OriginalType.UTF8)
      .named(name);
// ...
    typeInfo.equals(TypeInfoFactory.shortTypeInfo) ||
    typeInfo.equals(TypeInfoFactory.byteTypeInfo)) {
  return Types.primitive(PrimitiveTypeName.INT32, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.INT64, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.DOUBLE, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.FLOAT, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.BOOLEAN, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.BINARY, repetition).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.INT96, repetition).named(name);
// ...
  return Types.optional(PrimitiveTypeName.BINARY).as(OriginalType.UTF8).named(name);
// ...
  return Types.optional(PrimitiveTypeName.BINARY).as(OriginalType.UTF8).named(name);
// ...
  int scale = decimalTypeInfo.scale();
  int bytes = ParquetHiveSerDe.PRECISION_TO_BYTE_COUNT[prec - 1];
  return Types.optional(PrimitiveTypeName.FIXED_LEN_BYTE_ARRAY).length(bytes).as(OriginalType.DECIMAL).scale(scale).precision(prec).named(name);
// ...
  return Types.primitive(PrimitiveTypeName.INT32, repetition).as(OriginalType.DATE).named(name);
// ...

Code example source: org.apache.parquet/parquet-avro

Schema.Type type = schema.getType();
if (type.equals(Schema.Type.BOOLEAN)) {
 builder = Types.primitive(BOOLEAN, repetition);
} else if (type.equals(Schema.Type.INT)) {
 builder = Types.primitive(INT32, repetition);
} else if (type.equals(Schema.Type.LONG)) {
 builder = Types.primitive(INT64, repetition);
} else if (type.equals(Schema.Type.FLOAT)) {
 builder = Types.primitive(FLOAT, repetition);
} else if (type.equals(Schema.Type.DOUBLE)) {
 builder = Types.primitive(DOUBLE, repetition);
} else if (type.equals(Schema.Type.BYTES)) {
 builder = Types.primitive(BINARY, repetition);
} else if (type.equals(Schema.Type.STRING)) {
 builder = Types.primitive(BINARY, repetition).as(UTF8);
} else if (type.equals(Schema.Type.RECORD)) {
 return new GroupType(repetition, fieldName, convertFields(schema.getFields()));
} else if (type.equals(Schema.Type.ENUM)) {
 builder = Types.primitive(BINARY, repetition).as(ENUM);
} else if (type.equals(Schema.Type.ARRAY)) {
 if (writeOldListStructure) {
 // ... (lines omitted from the excerpt)
 builder = Types.primitive(FIXED_LEN_BYTE_ARRAY, repetition)
   .length(schema.getFixedSize());
} else if (type.equals(Schema.Type.UNION)) {
 // ... (the excerpt ends here)

Code example source: org.apache.parquet/parquet-arrow

@Override
public TypeMapping visit(Struct_ type) {
 List<TypeMapping> parquetTypes = fromArrow(children);
 return new StructTypeMapping(field, addToBuilder(parquetTypes, Types.buildGroup(OPTIONAL)).named(fieldName), parquetTypes);
}
