pyspark: How to output the "Underlying SQLException" from Azure Databricks instead of the generic exception message?

Posted by x8goxv8g 4 months ago in Spark

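We call an Azure Databricks notebook from a Data Factory pipeline, and the notebook ingests data into Azure Synapse. Whenever the notebook run fails, it shows only the following error message:
com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.
But when we open the run log and scroll down to this exception message, right below it we find: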

Underlying SQLException(s):
  - com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: The column [4] is not nullable and also USE_DEFAULT_VALUE is false, thus empty input is not allowed. [ErrorCode = 107090] [SQLState = S0001]

Or sometimes this:

Underlying SQLException(s):
  - com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: String or Binary would be truncated

The code we use to ingest the data is:

try:
    # Append the DataFrame to the Synapse table through the Databricks Synapse connector.
    data.write.format('com.databricks.spark.sqldw').option("url", connection_string).option("dbTable", table) \
                .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
                .option("tempDir", Connection.storageaccount_path + 'store/dataload') \
                .save(mode="append")
except Exception as e:
    # Re-raise with the table name so the failing load can be identified.
    raise Exception("error took place for table: " + table + " : " + str(e))

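So the Underlying SQLException(s): section is what states the actual error. But it never appears in the runError output we see on the ADF pipeline, so we cannot identify errors in bulk with Azure Log Analytics. We always have to scroll through the error logs manually, one failure at a time.
We have thousands of runs per day in production, and many pipelines fail regularly. Because we cannot see the exact error message, we cannot monitor failures effectively.
Is there a way to make Databricks output the Underlying SQLException(s): instead of the generic message: com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.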

pyspark

Source: https://stackoverflow.com/questions/77492660/how-to-output-the-underlying-sqlexception-from-azure-databricks-instead-of-the

1 Answer

9rbhqvlz1#

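To get the actual error message instead of the generic one, split the exception text into lines and pick out the lines that hold the actual error by their index.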

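try:
    # Write the DataFrame to Azure Synapse through the Databricks Synapse connector.
    df1.write.format('com.databricks.spark.sqldw').option("url", "URL").option("dbTable", "demo4") \
                .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
                .option("tempDir", "tempdir") \
                .option("forwardSparkAzureStorageCredentials", "true") \
                .save(mode="append")
except Exception as e:
    # Split the full exception text into individual lines.
    error_message = str(e)
    error_parts = error_message.split('\n')
    # Lines 2 to 4 held the underlying SQLException details in this run;
    # the indices depend on the layout of the message.
    print("Error occurred:", error_parts[2], error_parts[3], error_parts[4])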

Here I split the error message on \n (newline) into a list and, to get the expected result, printed the elements at indices 2, 3, and 4 of that list.
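Fixed indices break if the message layout shifts, for example when extra lines come before the connector message. As a rough sketch of an alternative, the text can be searched for the Underlying SQLException(s): marker shown in the question and the bullet lines after it collected. The function names extract_underlying_sql_exceptions and summarize_error are hypothetical helpers, not part of the connector. The parsing assumes the message layout quoted in the question, with each underlying error on a line starting with "- ".

```python
from typing import List

# Marker text taken from the log excerpt in the question.
MARKER = "Underlying SQLException(s):"


def extract_underlying_sql_exceptions(message: str) -> List[str]:
    """Return the bullet lines that follow the marker, without the leading '- '.

    Returns an empty list when the marker is not present.
    """
    _, found, tail = message.partition(MARKER)
    if not found:
        return []
    details = []
    for line in tail.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            details.append(stripped[2:].strip())
        elif stripped:
            # First non-blank, non-bullet line (e.g. a stack frame) ends the list.
            break
    return details


def summarize_error(table: str, exc: Exception) -> str:
    """Build a short message, falling back to the full text if nothing was parsed."""
    details = extract_underlying_sql_exceptions(str(exc))
    reason = " | ".join(details) if details else str(exc)
    return "error took place for table: " + table + " : " + reason
```

In the except block, raise Exception(summarize_error(table, e)) would then put the underlying error at the front of the raised message. Whether that text reaches the ADF runError output unchanged is an assumption to verify in your own pipeline.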
