hive从多个表生成嵌套的json

rseugnpd  于 2021-05-31  发布在  Hadoop
关注(0)|答案(0)|浏览(235)

我有3个表:hadoop上的customer/order/orderline,我想生成一个类似json的嵌套表

{
  "_source": {
    "idcustomer": 2411,
    "order": [
      {
        "idorder": 3376,
        "date": "2015-07-02 18:08:26.000",
        "lines": [
          {
            "product": "Product A",
            "qte": 1
          },
          {
            "product": "Product B",
            "qte": 2
          }
        ]
      }
    ]
  }
}

所以我尝试使用collect(来自udaf brickhouse),但显然不可能在两个级别上“collect”

SELECT to_json(named_struct(
    '_source', named_struct(
        'idcustomer', c.id_customer,
        'order', collect(named_struct(
            "idorder", o.id_order
            "date", o.date_vente,
            "lines", collect(named_struct(
                    'product', ol.product_name,
                    'qte', ol.product_quantity
                    ))
            ))
            )))
FROM customers c
LEFT JOIN orders o on c.id_customer = o.id_customer
left join orders_lines ol on ol.id_order = o.id_order
where o.id_customer is not NULL
and o.id_customer is NOT NULL
and ol.id_order is not null
GROUP BY c.id_customer;

Hive响应“线路”,收集。。

Error while compiling statement: FAILED: SemanticException [Error 10128]: line 15:22 Not yet supported place for UDAF 'collect'

有人有解决办法吗(用Hive、普雷斯托或 Impala )?

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题