I have a function that adds 2 columns:
def sum_num(num1: Int, num2: Int): Int = {
  num1 + num2
}
I have a DataFrame df with the following values:
+----+----+----+
|col1|col2|col3|
+----+----+----+
|1   |2   |5   |
|7   |4   |4   |
+----+----+----+
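I want to add a column by passing column names to the function, but the code below does not work. It fails with a type mismatch error: found Column, required Int.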
val newdf = df.withColumn("sum_of_cols1", sum_num($"col1", $"col2"))
  .withColumn("sum_of_cols2", sum_num($"col1", $"col3"))
1 Answer
vh0rcniy #1
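Change the function so that it takes and returns Column values instead of Int.
You have to operate on Spark SQL columns. You can do arithmetic on them directly; take a look at the operators you can use on Column.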
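The answer's own code did not survive in this copy of the thread, so the following is only a minimal sketch of what the advice seems to describe, not the answerer's exact code. It assumes a local SparkSession created just for illustration; the object name SumColumnsSketch and the app name are hypothetical. The key change is that sum_num takes and returns Column, so `+` builds a column expression that Spark evaluates row by row.

```scala
import org.apache.spark.sql.{Column, SparkSession}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object SumColumnsSketch {
  // Operates on Column expressions rather than Int values;
  // Column defines `+`, so this builds an expression evaluated per row.
  def sum_num(num1: Column, num2: Column): Column = num1 + num2

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (an assumption, not from the post).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sum-columns-sketch")
      .getOrCreate()
    import spark.implicits._ // enables the $"colName" syntax and toDF

    // Same data as the DataFrame in the question.
    val df = Seq((1, 2, 5), (7, 4, 4)).toDF("col1", "col2", "col3")

    val newdf = df
      .withColumn("sum_of_cols1", sum_num($"col1", $"col2"))
      .withColumn("sum_of_cols2", sum_num($"col1", $"col3"))

    newdf.show()
    spark.stop()
  }
}
```

With the sample data, sum_of_cols1 should hold 3 and 11, and sum_of_cols2 should hold 6 and 11. Note that the column references are written as $"col1" with quotes; that syntax comes from spark.implicits._ and produces a Column, which is why the original Int-typed function rejected it.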