How to create an array of structs for Parquet data in AWS Athena (Hive)

juzqafwq  posted on 2021-05-31 in Hadoop

I am trying to create a Hive table over Parquet data in AWS Athena, as follows:

CREATE TABLE IF NOT EXISTS db.test (
  country STRING ,
  day_part STRING ,
  dma STRING ,
  first_seen STRING, 
  geohash STRING ,
  last_seen STRING, 
  location_backfill ARRAY <
   element STRUCT <
    backfill_type: BIGINT, 
    brq: BIGINT ,
    first_seen: STRING, 
    last_seen: STRING ,
    num_days: BIGINT >>
  ) 
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  's3://<location>'
TBLPROPERTIES (
  'parquet.compress'='SNAPPY', 
  'transient_lastDdlTime'='<sometime>')

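I keep getting this error:
line 9:12: mismatched input 'struct' expecting {'(', 'array', '>'} (Service: AmazonAthena; Status Code: 400; Error Code: InvalidRequestException; Request ID: )
The syntax looks fine to me, but I am not sure. The data is stored at the S3 path. Any idea what is causing this?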

3qpi33ja1#

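Array elements are not named, so the word `element` before STRUCT is what the parser rejects at line 9:12. Specify only the element type (STRUCT):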

location_backfill ARRAY <
    STRUCT <
    backfill_type: BIGINT, 
    brq: BIGINT ,
    first_seen: STRING, 
    last_seen: STRING ,
    num_days: BIGINT >>
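
For reference, here is a minimal sketch of how the whole statement might look with that fix applied. The column list is copied from the question. The `EXTERNAL` keyword and the `STORED AS PARQUET` shorthand are my own substitutions rather than part of the question (as far as I know, Athena expects `CREATE EXTERNAL TABLE` for non-Iceberg tables), and the table properties are left out for brevity. `s3://<location>` is still a placeholder, and the leading comment lines can be dropped if your client rejects them.

```sql
-- Sketch only: EXTERNAL and STORED AS PARQUET are assumptions, not from the question.
-- The array element is unnamed, so ARRAY<...> contains only the STRUCT type.
CREATE EXTERNAL TABLE IF NOT EXISTS db.test (
  country STRING,
  day_part STRING,
  dma STRING,
  first_seen STRING,
  geohash STRING,
  last_seen STRING,
  location_backfill ARRAY<
    STRUCT<
      backfill_type: BIGINT,
      brq: BIGINT,
      first_seen: STRING,
      last_seen: STRING,
      num_days: BIGINT
    >
  >
)
STORED AS PARQUET
LOCATION 's3://<location>';
```

Once the table exists, the struct fields should be reachable by indexing into the array. Athena arrays are 1-indexed, so a quick check (run as a separate query) could be:

```sql
-- Hypothetical sanity check: read one field from the first array element.
SELECT geohash,
       cardinality(location_backfill) AS backfill_count,
       location_backfill[1].num_days AS first_num_days
FROM db.test
WHERE cardinality(location_backfill) > 0
LIMIT 10;
```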
