Ignoring quotes in a CSV file when loading it into a Hive table

sqougxex  posted on 2021-05-29  in  Hadoop

I have a CSV file containing data in the following format:

"SomeName1",25,"SomeString1"
"SomeName2",26,"SomeString2"
"SomeName3",27,"SomeString3"

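I am loading this CSV into a Hive table. In the table, columns 1 and 3 are inserted together with the quotes, which I don't want. I want column 1 to be SomeName1 and column 3 to be SomeString1. I tried: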

WITH SERDEPROPERTIES (
   "separatorChar" = "\t",
   "quoteChar"     = "\""
)

But it does not work, and the quotes ("") are retained.
What should I do here?
Table creation statement:

CREATE TABLE `abcdefgh`(
  `name` string COMMENT 'from deserializer',
  `age` string COMMENT 'from deserializer',
  `value` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'quoteChar'='\"',
  'separatorChar'='\t')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://a-b-c-d-e:9000/user/hive/warehouse/abcdefgh'
TBLPROPERTIES (
  'numFiles'='1',
  'numRows'='0',
  'rawDataSize'='0',
  'totalSize'='3134916',
  'transient_lastDdlTime'='1490713221')

cgh8pdjw1#

The sample data is comma-separated, so the separator should be a comma rather than a tab: "separatorChar" = ','

```
-- Note: OpenCSVSerde reads every column as string, whatever type is declared.
create external table mytable
(
    col1 string
   ,col2 int
   ,col3 string
)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties
(
    "separatorChar" = ','
   ,"quoteChar"     = '"'
)
stored as textfile
;

select * from mytable
;

+--------------+--------------+--------------+
| mytable.col1 | mytable.col2 | mytable.col3 |
+--------------+--------------+--------------+
| SomeName1    | 25           | SomeString1  |
| SomeName2    | 26           | SomeString2  |
| SomeName3    | 27           | SomeString3  |
+--------------+--------------+--------------+
```
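
If the table from the question already exists, it may not be necessary to drop and recreate it. The following is a minimal sketch, assuming the table is the unpartitioned `abcdefgh` shown in the question and that its files are plain comma-separated text: it changes the SerDe properties in place with `ALTER TABLE ... SET SERDEPROPERTIES`. Because a text-backed table is parsed by the SerDe at read time, the new separator should apply to subsequent queries without reloading the data.

```
-- Assumption: abcdefgh is the unpartitioned table from the question.
-- Switch the field separator from tab to comma and keep the double quote
-- as the quote character.
ALTER TABLE abcdefgh SET SERDEPROPERTIES (
  'separatorChar' = ',',
  'quoteChar'     = '"'
);

-- Check the result: the values should now come back without surrounding quotes.
SELECT name, age, value FROM abcdefgh LIMIT 3;
```

For a partitioned table, existing partitions keep their own SerDe properties, so the same statement would have to be repeated with a `PARTITION (...)` clause for each of them.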
