Spark DataFrame values to a Scala list

sh7euo9m  posted on 2021-07-12  in Spark

I have a DataFrame with a column that contains arrays:

+----------------------------+
|User    | Color             |
+----------------------------+
|User1   | [Green,Blue,Red]  |
|User2   | [Blue,Red]        |
+----------------------------+

I am trying to filter for User1 and get its colors into a Scala list, like this:

val colorsList: List[String] = List("Green","Blue","Red")

Here is what I have tried so far (the output is added as comments):
Attempt 1:

val dfTest1 = myDataframe.where("User=='User1'").select("Color").rdd.map(r => r(0)).collect()
println(dfTest1)  //[Ljava.lang.Object;@44022255
for(EachColor<- dfTest1){
  println(EachColor)    //WrappedArray(Green, Blue, Red)
}

Attempt 2:

val dfTest2 = myDataframe.where("User=='User1'").select("Color").collectAsList.get(0).getList(0)
println(dfTest2)  //[Green, Blue, Red]   but type is util.List[Nothing]

Attempt 3:

val dfTest32 = myDataframe.where("User=='User1'").select("Color").rdd.map(r => r(0)).collect.toList 
println(dfTest32)   //List(WrappedArray(Green, Blue, Red))

for(EachColor <- dfTest32){
  println(EachColor) //WrappedArray(Green, Blue, Red)
}

Attempt 4:

val dfTest31 = myDataframe.where("User=='User1'").select("Color").map(r => r.getString(0)).collect.toList    
//Exception : scala.collection.mutable.WrappedArray$ofRef cannot be cast to java.lang.String

Answer #1 (bihw5rsg)

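Spark hands the array column back as a WrappedArray, which is a Seq, so you can read it with getAs[Seq[String]] and then convert it with toList: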

val colorsList = df.where("User=='User1'")
                   .select("Color")
                   .rdd.map(r => r.getAs[Seq[String]](0))
                   .collect()(0)
                   .toList

Or, equivalently, collect the rows first and read the column from the first row:

val colorsList = df.where("User=='User1'")
                   .select("Color")
                   .collect()(0)
                   .getAs[Seq[String]](0)
                   .toList
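
Both snippets index the collected array with (0), which throws if no row matches the filter. The following is a minimal, self-contained sketch of the same extraction that returns an empty list in that case. It assumes a local SparkSession and rebuilds the sample data from the question; the object name ColorsToList, the helper colorsFor, and the app name are hypothetical and only there for illustration. It uses Row.getSeq instead of getAs[Seq[String]], which reads the same column value.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object ColorsToList {

  // Hypothetical helper: returns the Color array of the first row whose
  // User column equals `user`, or an empty list if there is no such row
  // or the Color value is null.
  def colorsFor(df: DataFrame, user: String): List[String] =
    df.where(col("User") === user)
      .select("Color")
      .collect()
      .headOption                       // None instead of an exception on no match
      .filterNot(_.isNullAt(0))         // skip a null array
      .map(_.getSeq[String](0).toList)  // array column -> Seq[String] -> List[String]
      .getOrElse(List.empty[String])

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .appName("colors-to-list")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Sample data matching the table in the question.
    val df = Seq(
      ("User1", Seq("Green", "Blue", "Red")),
      ("User2", Seq("Blue", "Red"))
    ).toDF("User", "Color")

    val colorsList: List[String] = colorsFor(df, "User1")
    println(colorsList) // List(Green, Blue, Red)

    spark.stop()
  }
}
```

Note that collect() brings every matching row to the driver, so this is only appropriate when the filter is expected to match a small number of rows.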
