sql—使用openx serde在athena中聚合json对象自己的键值属性

cotxawn7  于 2021-06-24  发布在  Hive
关注(0)|答案(1)|浏览(207)

我有一个类似以下两个示例事件的json结构:
事件1

{
      "event":{
             "type" : "FooBarEvent"
             "kv":{
                "key1":"value1",
                "key2":"value2",
                "3":"three",
                "d":"4"
             }
      }
    }

事件2

{
      "event":{
             "type" : "FooBarEvent"
             "kv":{
                "key1":"value1",
                "key2":"value2000",
                "e": "4"
             }
      }
    }

请注意,我不知道哪些键和值是预先进来,我想聚合(计数)他们。两个事件的输出如下所示:

+-----------+------+-----------+--------+
| EventType | Key  | Value     | Amount |
+-----------+------+-----------+--------+
| Foobar    | key1 | value1    | 2      |
+-----------+------+-----------+--------+
| Foobar    | key2 | value1    | 1      |
+-----------+------+-----------+--------+
| Foobar    | key2 | value2000 | 1      |
+-----------+------+-----------+--------+
| Foobar    | 3    | three     | 1      |
+-----------+------+-----------+--------+
| Foobar    | d    | 4         | 1      |
+-----------+------+-----------+--------+
| Foobar    | e    | 4         | 1      |
+-----------+------+-----------+--------+

在雅典娜,有没有一种不改变json结构的方法来实现这一点?如何最好地Map和展平/查询结构?

fv2wmkja

fv2wmkja1#

你好,应该可以用 UNNEST 功能和铸造 kv 到Map上去。假设您的数据存储在一个名为json\u data的表中,下面的查询应该可以工作

with data_formated as
(
    select *
    ,json_extract_scalar(json_field,'$.event.type') event_type
    ,cast(json_extract(json_field,'$.event.kv') as map(varchar,varchar)) key_value
    from json_data
)
,unnesting_data as
(
    select *
    from data_formated
    cross join unnest(key_value) as t (k,v)
)
select event_type,k,v,count(1) amount
from unnesting_data
group by 1,2,3
order by 1,2,3;

相关问题