I am trying to run this query against AWS Glue:
CREATE TABLE bucketing_example
USING parquet
CLUSTERED BY (id) INTO 2 BUCKETS
LOCATION 's3://my-bucket/bucketing_example'
AS SELECT * FROM (
  VALUES
    (1, 'red'),
    (2, 'orange'),
    (5, 'yellow'),
    (10, 'green'),
    (11, 'blue'),
    (12, 'indigo'),
    (20, 'violet')
) AS Colors(id, value)
and I get the following exception:
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:163)
at org.apache.hadoop.fs.Path.<init>(Path.java:175)
at org.apache.spark.sql.catalyst.catalog.CatalogUtils$.stringToURI(ExternalCatalogUtils.scala:236)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getDatabase$1$$anonfun$apply$2.apply(HiveClientImpl.scala:343)
at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getDatabase$1$$anonfun$apply$2.apply(HiveClientImpl.scala:339)
at scala.Option.map(Option.scala:146)
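Reading the trace, the failure happens in HiveClientImpl.getDatabase → CatalogUtils.stringToURI, i.e. while converting the current database's location string to a Path, so I suspect the Glue database I am creating the table in has an empty location URI. If that is the cause, a variant like the following might avoid it (untested sketch; the database name and S3 paths are placeholders, not my actual setup):

```sql
-- Create a database with an explicit location so getDatabase does not
-- see an empty locationUri (my_db and the S3 paths are placeholders)
CREATE DATABASE IF NOT EXISTS my_db
LOCATION 's3://my-bucket/my_db/';

USE my_db;

-- Same CTAS as above, now inside a database with a non-empty location
CREATE TABLE bucketing_example
USING parquet
CLUSTERED BY (id) INTO 2 BUCKETS
LOCATION 's3://my-bucket/bucketing_example'
AS SELECT * FROM (
  VALUES (1, 'red'), (2, 'orange'), (5, 'yellow')
) AS Colors(id, value);
```

I have not confirmed whether setting the location on the Glue database itself (via the Glue console or API) would have the same effect.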
I have also tried running similar Spark SQL queries against bucketed tables created with Athena (still backed by Glue).
Although DESCRIBE EXTENDED shows the bucketing columns on the table, the Exchange on both sides of the join remains in the plan.
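To illustrate, this is roughly how I am checking the plan (the table names are placeholders for my actual Athena-created bucketed tables, both bucketed by id):

```sql
-- If bucketing were recognized, a sort-merge join on the bucket
-- column should not need an Exchange on either side
EXPLAIN
SELECT a.id, a.value, b.value
FROM bucketed_a a
JOIN bucketed_b b
  ON a.id = b.id;
-- The physical plan still shows Exchange hashpartitioning(id, ...)
-- on both sides of the join
```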
Does bucketing work with Glue and Spark SQL at all?