Extracting a timestamp with the Hive XML SerDe (Hadoop)

kzipqqlq  posted on 2021-06-02  in Hadoop

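I am trying to extract a timestamp from XML using the XmlSerDe in Hive. The external table is created over an HDFS directory. At the moment the timestamp value shows up as NULL in my table.
I wonder whether the timestamp needs to be cast; I am not sure. The rest of the XML data is read correctly and shows up in Hive.
The input file is: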

<example>
<date>2017-02-09 22:03:58</date>
</example>

The Hive create script:

create external table example (
date timestamp
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"column.xpath.date"="/example/date/text()"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION 'mypath'
TBLPROPERTIES (
"xmlinput.start"="<example>",
"xmlinput.end"="</example>"
);

Answer #1 (uyhoqukh)

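It seems that only Java primitive types (plus strings) are supported.
See the getPrimitiveValue method in XmlUtils.java. Its switch has no TIMESTAMP case, so a timestamp column falls into the default branch, the resulting IllegalStateException is swallowed by the empty catch block, and the method returns null: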

/**
 * (c) Copyright IBM Corp. 2013. All rights reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.

 */

package com.ibm.spss.hive.serde2.xml.processor;

import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;

/**
 * The XML utilities
 */
public class XmlUtils {

    /**
     * Private constructor
     */
    private XmlUtils() {
    }

    /**
     * Converts the string value to the java object for the given primitive category
     * 
     * @param value
     *            the value
     * @param primitiveCategory
     *            the primitive category
     * @return the java object
     */
    public static Object getPrimitiveValue(String value, PrimitiveCategory primitiveCategory) {
        if (value != null) {
            try {
                switch (primitiveCategory) {
                    case BOOLEAN:
                        return Boolean.valueOf(value);
                    case BYTE:
                        return Byte.valueOf(value);
                    case DOUBLE:
                        return Double.valueOf(value);
                    case FLOAT:
                        return Float.valueOf(value);
                    case INT:
                        return Integer.valueOf(value);
                    case LONG:
                        return Long.valueOf(value);
                    case SHORT:
                        return Short.valueOf(value);
                    case STRING:
                        return value;
                    default:
                        throw new IllegalStateException(primitiveCategory.toString());
                }
            } catch (Exception ignored) {
            }
        }
        return null;
    }

}
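
To illustrate what the listing above implies, here is a minimal, self-contained sketch of the same conversion pattern with a TIMESTAMP branch added. It is not the library's code: the class name TimestampSketch, the method convert, and the local Category enum (a stand-in for Hive's PrimitiveCategory, so the example compiles without Hive on the classpath) are all hypothetical. It also assumes that a java.sql.Timestamp is an acceptable value for a timestamp column, which may not hold for every Hive version.

```java
import java.sql.Timestamp;

// Hypothetical sketch, not part of the XmlSerDe sources.
class TimestampSketch {

    // Local stand-in for Hive's PrimitiveCategory (assumption for this example).
    enum Category { INT, STRING, TIMESTAMP, DATE }

    // Mirrors the shape of getPrimitiveValue: convert the string, or return null.
    static Object convert(String value, Category category) {
        if (value == null) {
            return null;
        }
        try {
            switch (category) {
                case INT:
                    return Integer.valueOf(value);
                case STRING:
                    return value;
                case TIMESTAMP:
                    // Timestamp.valueOf expects "yyyy-[m]m-[d]d hh:mm:ss[.f...]",
                    // which matches the sample value in the question.
                    return Timestamp.valueOf(value.trim());
                default:
                    // Any category without a case ends up here, as in the original.
                    throw new IllegalStateException(category.toString());
            }
        } catch (IllegalArgumentException | IllegalStateException e) {
            // Same net effect as the original: bad or unsupported input becomes null.
            return null;
        }
    }

    public static void main(String[] args) {
        // A category with a case is converted.
        System.out.println(convert("2017-02-09 22:03:58", Category.TIMESTAMP));
        // A category without a case (DATE here) silently yields null.
        System.out.println(convert("2017-02-09 22:03:58", Category.DATE));
    }
}
```

The second call shows why the column in the question comes back as NULL: a category with no case in the switch is turned into null without any error being reported. A workaround that needs no change to the SerDe would be to declare the column as string and convert it in a query, for example with CAST(... AS TIMESTAMP); the sample value already has the yyyy-MM-dd HH:mm:ss form Hive expects for such a cast.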
