Scala: how to combine two DataFrames df1 and df2, keeping df2's values for the common columns

de90aj5v  posted on 2021-07-13  in Java

I have df1:

+-----+------------------------+---------------------------+----------------------+---------+
|JobId|TotalRecordType1Count   |TotalRecordType2Count      |TotalRecordType3Count |JobStatus|
+-----+------------------------+---------------------------+----------------------+---------+
|  100|                       0|                          0|                     0|Success  |
+-----+------------------------+---------------------------+----------------------+---------+

and df2:

+---------------------------+----------------------+
|TotalRecordType1Count      |TotalRecordType2Count |
+---------------------------+----------------------+
|                        800|                   900|
+---------------------------+----------------------+

Both df1 and df2 have exactly one row.
I want to combine df1 and df2 on the count columns they share, keeping df2's counts for those columns:

+-----+------------------------+---------------------------+----------------------+---------+
|JobId|TotalRecordType1Count   |TotalRecordType2Count      |TotalRecordType3Count |JobStatus|
+-----+------------------------+---------------------------+----------------------+---------+
|  100|                     800|                        900|                     0|Success  |
+-----+------------------------+---------------------------+----------------------+---------+

jfgube3f  #1

You can do a cross join and then select the columns you need:

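// For each column of df1, take it from df2 if df2 has a column with the same name, otherwise from df1
val cols = df1.columns.map(x => if (df2.columns.contains(x)) df2(x) else df1(x))

// Each DataFrame has one row, so the cross join also yields exactly one row
val result = df1.crossJoin(df2).select(cols: _*)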

result.show
+-----+---------------------+---------------------+---------------------+---------+
|JobId|TotalRecordType1Count|TotalRecordType2Count|TotalRecordType3Count|JobStatus|
+-----+---------------------+---------------------+---------------------+---------+
|  100|                  800|                  900|                    0|  Success|
+-----+---------------------+---------------------+---------------------+---------+
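
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same cross-join approach. The local SparkSession settings, the object name CombineKeepDf2 and the helper name keepRightValues are assumptions made for illustration; they are not part of the original question or answer. The sample data is copied from the tables in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CombineKeepDf2 {
  // Hypothetical helper: keeps left's column order, but takes the value
  // from right for every column name that both DataFrames share.
  def keepRightValues(left: DataFrame, right: DataFrame): DataFrame = {
    val cols = left.columns.map(c => if (right.columns.contains(c)) right(c) else left(c))
    left.crossJoin(right).select(cols: _*)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("combine-keep-df2")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question.
    val df1 = Seq((100, 0, 0, 0, "Success")).toDF(
      "JobId", "TotalRecordType1Count", "TotalRecordType2Count",
      "TotalRecordType3Count", "JobStatus")
    val df2 = Seq((800, 900)).toDF("TotalRecordType1Count", "TotalRecordType2Count")

    keepRightValues(df1, df2).show()
    spark.stop()
  }
}
```

Note that a cross join returns one row for every pair of rows from the two inputs, so this only gives a single combined row because df1 and df2 each contain exactly one row.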
