PySpark: how to add an expression in withColumn

ybzsozfc posted on 2021-05-18 in Spark

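I want to add a new column that is the concatenation of two existing columns, and I am using the query below. What is wrong with this query? I see "null" in the new column.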

df.select(df['DEST_COUNTRY_NAME'],df['ORIGIN_COUNTRY_NAME']).withColumn("COMPLETE_PATH",df['DEST_COUNTRY_NAME'] + ",").filter(df['DEST_COUNTRY_NAME']=='Egypt').show()

+-----------------+-------------------+-------------+
|DEST_COUNTRY_NAME|ORIGIN_COUNTRY_NAME|COMPLETE_PATH|
+-----------------+-------------------+-------------+
|            Egypt|      United States|         null|
|            Egypt|      United States|         null|
|            Egypt|      United States|         null|
|            Egypt|      United States|         null|
|            Egypt|      United States|         null|
|            Egypt|      United States|         null|
+-----------------+-------------------+-------------+

webghufk #1

Try:

from pyspark.sql.functions import col, concat, lit
...
# withColumn takes the new column's name first, then the expression
df.withColumn("COMPLETE_PATH", concat(col("DEST_COUNTRY_NAME"), lit(",")))
