Flink data type mismatch when adding a time attribute via a table source

vwhgwdsa · posted 2021-06-24 in Flink

I am trying to add a table source with an event-time attribute, following the Flink documentation. My code is as follows:

class SISSourceTable
    extends StreamTableSource[Row]
    with DefinedRowtimeAttributes
    with FlinkCal
    with FlinkTypeTags {
  private[this] val profileProp = ConfigurationManager.loadBusinessProperty
  val topic: String = ...
  val schemas = Seq(
    (TsCol, SQLTimestamp),
    (DCol, StringTag),
    (CCol, StringTag),
    (RCol, StringTag)
  )

  override def getProducedDataType: DataType = DataTypes.ROW(extractFields(schemas): _*)

  override def getTableSchema: TableSchema =
    new TableSchema.Builder()
      .fields(extractFieldNames(schemas), extractFieldDataTypes(schemas))
      .build()

  override def getRowtimeAttributeDescriptors: util.List[RowtimeAttributeDescriptor] =
    Collections.singletonList(
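      // Use the existing TsCol field as the rowtime attribute and generate
      // watermarks assuming ascending timestamps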
      new RowtimeAttributeDescriptor(
        TsCol,
        new ExistingField(TsCol),
        new AscendingTimestamps
      )
    )

  override def getDataStream(execEnv: StreamExecutionEnvironment): DataStream[Row] = {
    val windowTime: Int = profileProp.getProperty("xxx", "300").toInt
    val source = prepareSource(topic)
    val colsToCheck = List(RCol, CCol, DCol)

    execEnv
      .addSource(source)
      .map(new MapFunction[String, Map[String, String]]() {
        override def map(value: String): Map[String, String] = ...
      })
      .map(new MapFunction[Map[String, String], Row]() {
        override def map(value: Map[String, String]): Row = {
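          // TsCol arrives as epoch seconds in a String, so convert it to a millisecond Timestamp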
          Row.of(new Timestamp(value(TsCol).toLong * 1000), value(DCol), value(CCol), value(RCol))
        }
      })
      .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor[Row](Time.seconds(windowTime)) {
        override def extractTimestamp(element: Row): Long = element.getField(0).asInstanceOf[Timestamp].getTime
      })
  }
}

The source I use inside the getDataStream method is a Kafka string source. From each Kafka record I extract a TsCol field, which I want to use as the event time. However, TsCol is a 10-digit (seconds) timestamp stored as a String, so I need to convert it to a 13-digit (milliseconds) Long. When I tried using that 13-digit Long in the Row, I got an exception saying the rowtime can only be extracted from a SQL_TIMESTAMP column, so I ended up converting TsCol to java.sql.Timestamp instead. But when I register the table source above and run the Flink job, I get the following exception:

org.apache.flink.table.api.TableException: TableSource of type com.mob.mobeye.flink.table.source.StayInStoreSourceTable returned a DataStream of data type ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING> that does not match with the data type ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING> declared by the TableSource.getProducedDataType() method. Please validate the implementation of the TableSource.
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:113)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:55)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlan(StreamExecTableSourceScan.scala:55)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlanInternal(StreamExecExchange.scala:84)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlanInternal(StreamExecExchange.scala:44)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecExchange.translateToPlan(StreamExecExchange.scala:44)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecGroupWindowAggregate.translateToPlanInternal(StreamExecGroupWindowAggregate.scala:140)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecGroupWindowAggregate.translateToPlanInternal(StreamExecGroupWindowAggregate.scala:55)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecGroupWindowAggregate.translateToPlan(StreamExecGroupWindowAggregate.scala:55)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:97)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlan(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:97)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlanInternal(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecLookupJoin.translateToPlan(StreamExecLookupJoin.scala:40)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToTransformation(StreamExecSink.scala:185)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:133)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:50)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode.translateToPlan$(ExecNode.scala:52)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlan(StreamExecSink.scala:50)
    at org.apache.flink.table.planner.delegation.StreamPlanner.$anonfun$translateToPlan$1(StreamPlanner.scala:61)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)
    at scala.collection.Iterator.foreach(Iterator.scala:937)
    at scala.collection.Iterator.foreach$(Iterator.scala:937)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
    at scala.collection.IterableLike.foreach(IterableLike.scala:70)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:69)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike.map(TraversableLike.scala:233)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:226)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:60)
    at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:149)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:439)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.insertInto(TableEnvironmentImpl.java:327)
    at org.apache.flink.table.api.internal.TableImpl.insertInto(TableImpl.java:411)

I am confused why
ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING>
does not match the data type
ROW<`t` TIMESTAMP(3), `mac` STRING, `c` STRING, `r` STRING>
I ran into a similar error elsewhere and fixed it by replacing the Timestamp with a Long, which worked. But here I need the t column to be extracted as rowtime, so it has to be of type TIMESTAMP(3). I would really appreciate any help with this.

i2loujxw #1

Which Flink version are you using? If I'm not mistaken, you are on a version earlier than 1.9.2, right?
If so, the exception message is not very helpful, because of a bug that was fixed in https://issues.apache.org/jira/browse/flink-15726. Before that fix, the same type was actually printed twice.
Your implementation has a couple of issues. The type mismatch is most likely caused by the GenericTypeInformation that is derived for the output of this map operator:

.map(new MapFunction[Map[String, String], Row]() {
        override def map(value: Map[String, String]): Row = {
          Row.of(new Timestamp(value(TsCol).toLong * 1000), value(DCol), value(CCol), value(RCol))
        }
      })

Try changing it to:

.map(new MapFunction[Map[String, String], Row]() {
        override def map(value: Map[String, String]): Row = {
          Row.of(new Timestamp(value(TsCol).toLong * 1000), value(DCol), value(CCol), value(RCol))
        }
      }).returns(Types.ROW(Types.SQL_TIMESTAMP, Types.STRING, Types.STRING, Types.STRING))

Secondly, you don't need to assign timestamps and watermarks in the table source. They will be assigned automatically based on the information provided via DefinedRowtimeAttributes.
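Put together, a minimal sketch of getDataStream without the manual watermark assignment could look like this (reusing the helpers from the question; rowtime and watermarks then come from the RowtimeAttributeDescriptor declared in getRowtimeAttributeDescriptors):

  override def getDataStream(execEnv: StreamExecutionEnvironment): DataStream[Row] = {
    val source = prepareSource(topic)

    execEnv
      .addSource(source)
      .map(new MapFunction[String, Map[String, String]]() {
        override def map(value: String): Map[String, String] = ... // parse the raw Kafka string
      })
      .map(new MapFunction[Map[String, String], Row]() {
        override def map(value: Map[String, String]): Row =
          // seconds -> milliseconds, as in the question
          Row.of(new Timestamp(value(TsCol).toLong * 1000), value(DCol), value(CCol), value(RCol))
      })
      // declare the produced Row type explicitly so it is not a GenericTypeInformation
      .returns(Types.ROW(Types.SQL_TIMESTAMP, Types.STRING, Types.STRING, Types.STRING))
      // no assignTimestampsAndWatermarks here: DefinedRowtimeAttributes handles rowtime and watermarks
  }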
