A Scala function that takes a class type as a parameter

twh00eeo  · asked 8 months ago · in Scala

In Scala, I want to create a function that takes a class type as a parameter. For example:

case class Person(name: String, age: Int)
case class Order(orderId: String, amount: Int)

def readParquet(path: String): DataFrame = {
  return spark.read.parquet(path)
}

def readParquetAsDataset(path: String, MyObject: MyClassThatChange): DataFrame = {
  return readParquet(path).as[MyClassThatChange]
}

In the end, I would like to be able to do:

var dsPerson = readParquetAsDataset(myPath, Person)
var dsOrder = readParquetAsDataset(myPath, Order)

so that the rows of each Dataset are converted to the right data types.


oxf4rvwz1#

You need to take the encoder as an implicit parameter. Note that your classes must be top-level for Spark's product-encoder derivation to find them at runtime.
Side note: in Scala you don't need `return`; it is only for jumping out of a block early, which you usually want to avoid.

import org.apache.spark.sql.{DataFrame, Dataset, Encoder, SparkSession}

case class Person(name: String, age: Int)
case class Order(orderId: String, amount: Int)

def readParquet(path: String, spark: SparkSession): DataFrame =
  spark.read.parquet(path)

// requires a generic T with an implicit type class instance of Encoder[T];
// returns the typed Dataset (a DataFrame is just Dataset[Row])
def readParquetAsDataset[T: Encoder](path: String, spark: SparkSession): Dataset[T] =
  readParquet(path, spark).as[T] // .as takes the implicit Encoder[T] instance

val myPath = ""
val spark: SparkSession = SparkSession.builder().master("local[*]").getOrCreate()

import spark.implicits._ // brings the encoder derivation into scope

val dsPerson = readParquetAsDataset[Person](myPath, spark)
val dsOrder = readParquetAsDataset[Order](myPath, spark)
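If the `[T: Encoder]` context-bound syntax is unfamiliar: it is just sugar for an extra implicit parameter list, `(implicit ev: Encoder[T])`. A minimal sketch of the same mechanism, using a hypothetical `Schema` type class instead of Spark's `Encoder` so it runs without a Spark dependency:

```scala
// Toy illustration of how a context bound like [T: Encoder] works:
// the compiler resolves an implicit instance for T at the call site.
trait Schema[T] { def columns: List[String] }

case class Person(name: String, age: Int)
case class Order(orderId: String, amount: Int)

object Schema {
  // one implicit instance per case class, analogous to Spark's derived encoders
  implicit val personSchema: Schema[Person] =
    new Schema[Person] { val columns = List("name", "age") }
  implicit val orderSchema: Schema[Order] =
    new Schema[Order] { val columns = List("orderId", "amount") }
}

// [T: Schema] desugars to (implicit ev: Schema[T])
def columnsOf[T: Schema]: List[String] = implicitly[Schema[T]].columns

println(columnsOf[Person]) // List(name, age)
println(columnsOf[Order])  // List(orderId, amount)
```

The key point is the same as in the answer above: the caller picks the type argument (`columnsOf[Person]`, `readParquetAsDataset[Person]`), and the compiler supplies the matching implicit instance.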
