I have a DataFrame with the following structure:
root
|-- very_hot: string (nullable = true)
|-- hot: string (nullable = true)
|-- cold: string (nullable = true)
|-- little_snow: string (nullable = true)
|-- medium_snow: string (nullable = true)
|-- very_cold: string (nullable = true)
|-- deep_snow: string (nullable = true)
|-- freezing: string (nullable = true)
|-- windy: string (nullable = true)
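Each column contains either True or False. I want to create a new column holding an array of the names of the columns whose value is True. How can I do this?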
Edit: here is my table:
+--------+-----+-----+-----------+-----------+---------+---------+--------+-----+
|very_hot|  hot| cold|little_snow|medium_snow|very_cold|deep_snow|freezing|windy|
+--------+-----+-----+-----------+-----------+---------+---------+--------+-----+
|    True|False|False|      False|      False|    False|    False|   False| True|
|   False|False| True|       True|      False|    False|    False|   False|False|
|   False|False| True|      False|       True|    False|    False|   False|False|
|   False|False|False|      False|      False|     True|     True|   False|False|
+--------+-----+-----+-----------+-----------+---------+---------+--------+-----+
The column I want should look like this:
+--------------------+
|            features|
+--------------------+
|     very_hot, windy|
|   cold, little_snow|
|   cold, medium_snow|
|very_cold, deep_snow|
+--------------------+
4 Answers
7cjasjjr1#
This Scala code
results in
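The snippet and its output are not reproduced in this copy of the answer. As a minimal sketch of one way to get the comma-separated `features` column from the question (an assumption, not necessarily the answerer's code): map each column to its own name when the value is the string "True" and to null otherwise, then let `concat_ws` join the names, since it skips nulls. The object name `FeaturesSketch` and the helper `withFeatures` are hypothetical, and only two of the question's rows are used as sample data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws, lit, when}

object FeaturesSketch {
  // Hypothetical helper: adds a "features" string column listing the
  // names of all columns whose value is the string "True".
  def withFeatures(df: DataFrame): DataFrame = {
    // when(...) without otherwise(...) yields null for non-matching rows
    val flags = df.columns.map(c => when(col(c) === "True", lit(c)))
    // concat_ws ignores null arguments, so only matching names are joined
    df.withColumn("features", concat_ws(", ", flags: _*))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("features-sketch")
      .getOrCreate()
    import spark.implicits._

    // First two rows of the table in the question (values are strings)
    val df = Seq(
      ("True", "False", "False", "False", "False", "False", "False", "False", "True"),
      ("False", "False", "True", "True", "False", "False", "False", "False", "False")
    ).toDF("very_hot", "hot", "cold", "little_snow", "medium_snow",
      "very_cold", "deep_snow", "freezing", "windy")

    withFeatures(df).select("features").show(false)
    spark.stop()
  }
}
```

Because the columns are strings rather than booleans, the comparison is against the literal "True". If the data were boolean, `col(c) === true` would be the equivalent check.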
q9rjltbz2#
Try this.
ipakzgxi3#
This code may help you:
snvhrwxg4#
Another option:
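The code for this answer is also missing here. One alternative, sketched under the assumption of Spark 2.4 or later (where the SQL higher-order function `filter` is available), is to build a real `array<string>` column instead of a joined string: collect the per-column names into an array and drop the null entries. The names `FeaturesArraySketch` and `withFeatureArray` are hypothetical, and this is not claimed to be the answerer's approach.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col, expr, lit, when}

object FeaturesArraySketch {
  // Hypothetical helper: adds a "features" array<string> column with the
  // names of all columns whose value is the string "True".
  def withFeatureArray(df: DataFrame): DataFrame = {
    // One entry per column: the column name, or null when not "True"
    val flags = df.columns.map(c => when(col(c) === "True", lit(c)))
    df.withColumn("features", array(flags: _*))
      // SQL higher-order function (Spark 2.4+): keep non-null entries only
      .withColumn("features", expr("filter(features, x -> x is not null)"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("features-array-sketch")
      .getOrCreate()
    import spark.implicits._

    // Last two rows of the table in the question (values are strings)
    val df = Seq(
      ("False", "False", "True", "False", "True", "False", "False", "False", "False"),
      ("False", "False", "False", "False", "False", "True", "True", "False", "False")
    ).toDF("very_hot", "hot", "cold", "little_snow", "medium_snow",
      "very_cold", "deep_snow", "freezing", "windy")

    withFeatureArray(df).select("features").show(false)
    spark.stop()
  }
}
```

An array column keeps each name as a separate element, which is easier to explode or filter later. A row with no True values gets an empty array rather than an empty string.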