I have a Spark DataFrame that was loaded from a multiline JSON file.
One of its columns (data) has the following schema:
root
|-- data: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- f: struct (nullable = true)
| | | |-- 0: struct (nullable = true)
| | | | |-- v: double (nullable = true)
| | |-- ts: string (nullable = true)
And here is some sample data:
array
0: {"f": {"0": {"v": 25.08}}, "ts": "2021-01-11T05:59:00.170Z"}
1: {"f": {"0": {"v": 25.92}}, "ts": "2021-03-22T03:29:00.170Z"}
2: {"f": {"0": {"v": 25.94}}, "ts": "2021-03-22T03:39:00.173Z"}
3: {"f": {"0": {"v": 25.95}}, "ts": "2021-03-22T03:49:00.170Z"}
4: {"f": {"0": {"v": 25.99}}, "ts": "2021-03-22T04:00:00.173Z"}
I only want to extract ts and v.
Example result
1 Answer
You can explode the array of structs into multiple rows, then select the struct fields you need: