Spark Scala: mocking spark.implicits for unit testing

Posted by nom7f22z on 2021-05-19 in Spark

To keep unit tests for Spark and Scala simple, I am using ScalaTest and mockito-scala (together with MockitoSugar). This lets you do things like:

val sparkSessionMock = mock[SparkSession]

You can then usually work all the magic with "when" and "verify".
But if some implementation needs to import

import spark.implicits._

in its code, the simplicity of unit testing seems to vanish (or at least I have not yet found the most suitable way around it).
I end up with this error:

org.mockito.exceptions.verification.SmartNullPointerException: 
You have a NullPointerException here:
-> at ...
because this method call was *not* stubbed correctly:
-> at scala.Option.orElse(Option.scala:289)
sparkSession.implicits();

Simply mocking the call to the "implicits" object on the SparkSession does not help, because of a typing issue:

val implicitsMock = mock[SQLImplicits]
when(sparkSessionMock.implicits).thenReturn(implicitsMock)

does not compile, because the compiler requires the type of the object that belongs to your mock:

require: sparkSessionMock.implicits.type
found: implicitsMock.type
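
The mismatch arises because, in the Spark versions I am aware of, implicits is declared as an object nested inside SparkSession, so its type is a singleton type that only that one object inhabits. Below is a minimal plain-Scala sketch of the same situation. Session and Demo are hypothetical stand-ins, not Spark classes.

```scala
// Hypothetical stand-in for SparkSession: `implicits` is a nested object.
class Session {
  object implicits {
    val label: String = "real implicits"
  }
}

object Demo {
  val session = new Session

  // The only value that conforms to session.implicits.type is the object itself.
  val same: session.implicits.type = session.implicits

  // Any other instance is rejected by the compiler with a type mismatch,
  // which is the same kind of error as the one quoted above:
  // val other: session.implicits.type = new Session().implicits

  def main(args: Array[String]): Unit =
    println(same eq session.implicits) // reference equality check
}
```

A mock of SQLImplicits is a different instance, so it can never have the type sparkSessionMock.implicits.type, no matter how it is stubbed.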

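Please don't tell me that I should rather call SparkSession.builder.getOrCreate()... because at that point it is no longer a unit test but a heavier integration test.
(Edit): here is a complete reproducible example: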

import org.apache.spark.sql._
import org.mockito.Mockito.when
import org.scalatest.{ FlatSpec, Matchers }
import org.scalatestplus.mockito.MockitoSugar

case class MyData(key: String, value: String)

class ClassToTest()(implicit spark: SparkSession) {
  import spark.implicits._

  def read(path: String): Dataset[MyData] =
    spark.read.parquet(path).as[MyData]
}

class SparkMock extends FlatSpec with Matchers with MockitoSugar {

  it should "be able to mock spark.implicits" in {
    implicit val sparkMock: SparkSession = mock[SparkSession]
    val implicitsMock = mock[SQLImplicits]
    when(sparkMock.implicits).thenReturn(implicitsMock)
    val readerMock = mock[DataFrameReader]
    when(sparkMock.read).thenReturn(readerMock)
    val dataFrameMock = mock[DataFrame]
    when(readerMock.parquet("/some/path")).thenReturn(dataFrameMock)
    val dataSetMock = mock[Dataset[MyData]]
    implicit val testEncoder: Encoder[MyData] = Encoders.product[MyData]
    when(dataFrameMock.as[MyData]).thenReturn(dataSetMock)

    new ClassToTest().read("/some/path/") shouldBe dataSetMock
  }
}
Answer #1 (vwoqyblh)

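You can't mock it. Implicits are resolved at compile time, whereas mocking happens at runtime (runtime reflection, bytecode manipulation via Byte Buddy). You can't import at compile time implicits that are only mocked at runtime. You have to resolve the implicits manually. (In principle, you could resolve implicits at runtime by launching the compiler again at runtime, but that would be very hard.)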
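
To illustrate the point without Spark or Mockito, here is a minimal plain-Scala sketch. Codec, Session, ReaderWithImport and ReaderWithParam are hypothetical names standing in for Encoder, SparkSession and the class under test. The compiler rewrites every implicit lookup into an ordinary argument. An import-based lookup therefore needs a working implicits object at runtime, while a parameter-based lookup only needs a value supplied by the caller.

```scala
// Hypothetical stand-in for Spark's Encoder type class.
trait Codec[A] { def name: String }

final case class MyData(key: String, value: String)

// Hypothetical stand-in for SparkSession with its nested implicits object.
class Session {
  object implicits {
    implicit val myDataCodec: Codec[MyData] =
      new Codec[MyData] { val name = "codec from session.implicits" }
  }
}

// Import-based lookup: the compiler turns implicitly[Codec[MyData]] into
// session.implicits.myDataCodec, which dereferences session.implicits at runtime.
class ReaderWithImport(session: Session) {
  import session.implicits._
  def codecName: String = implicitly[Codec[MyData]].name
}

// Parameter-based lookup: the instance is chosen at the call site, so this
// class never touches session.implicits.
class ReaderWithParam(session: Session)(implicit codec: Codec[MyData]) {
  def codecName: String = codec.name
}

object ImplicitsDemo {
  def main(args: Array[String]): Unit = {
    val session = new Session
    println(new ReaderWithImport(session).codecName)

    // The caller decides which instance is in scope, e.g. one built by hand in a test.
    implicit val testCodec: Codec[MyData] =
      new Codec[MyData] { val name = "codec supplied by the test" }
    println(new ReaderWithParam(session).codecName)
  }
}
```

With the parameter-based variant, a test can pass any session object, including a mock, because the class body never asks the session for its implicits.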
Try:

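import org.apache.spark.sql._
import org.mockito.Mockito.when
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers
import org.scalatestplus.mockito.MockitoSugar

case class MyData(key: String, value: String)

// The encoder is now an implicit parameter, so nothing has to be
// imported from spark.implicits.
class ClassToTest()(implicit spark: SparkSession, encoder: Encoder[MyData]) {
  def read(path: String): Dataset[MyData] =
    spark.read.parquet(path).as[MyData]
}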

class SparkMock extends AnyFlatSpec with Matchers with MockitoSugar {

  it should "be able to mock spark.implicits" in {
    implicit val sparkMock: SparkSession = mock[SparkSession]
    val readerMock = mock[DataFrameReader]
    when(sparkMock.read).thenReturn(readerMock)
    val dataFrameMock = mock[DataFrame]
    when(readerMock.parquet("/some/path")).thenReturn(dataFrameMock)
    val dataSetMock = mock[Dataset[MyData]]
    implicit val testEncoder: Encoder[MyData] = Encoders.product[MyData]
    when(dataFrameMock.as[MyData]).thenReturn(dataSetMock)

    new ClassToTest().read("/some/path") shouldBe dataSetMock
  }
}

//[info] SparkMock:
//[info] - should be able to mock spark.implicits
//[info] Run completed in 2 seconds, 727 milliseconds.
//[info] Total number of tests run: 1
//[info] Suites: completed 1, aborted 0
//[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
//[info] All tests passed.

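Note that "/some/path" must be the same in both places. In your code snippet the two strings differ: the stub uses "/some/path", while the call uses "/some/path/".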
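
The trailing slash matters because Mockito matches a stubbed call by argument equality, and a call that matches no stub falls back to the mock's default answer. Here is a minimal sketch with plain Mockito, using java.util.function.Function as a hypothetical stand-in for DataFrameReader. It assumes the default settings of Mockito.mock, where an unstubbed call that returns an object yields null. Other default answers, such as smart nulls, behave differently.

```scala
import java.util.function.{Function => JFunction}
import org.mockito.Mockito.{mock, when}

object PathStubDemo {
  def main(args: Array[String]): Unit = {
    // Stand-in for readerMock: a mocked String => String function.
    val reader = mock(classOf[JFunction[String, String]])
    when(reader.apply("/some/path")).thenReturn("stubbed data")

    // Exactly the same argument: the stub applies.
    println(reader.apply("/some/path"))

    // Different argument (trailing slash): no stub matches, so the
    // default answer is returned, which is null under the assumption above.
    println(reader.apply("/some/path/"))
  }
}
```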
