我有一个代码,它曾经在一个简单的表上工作,当同一个表被分区到多个子分区时就停止工作了。
在分布式应用程序(spark)中,我们的代码可以同时从不同的计算机并行执行批删除查询(删除不同的记录)。
大多数查询都可以工作,但其中一个查询在套接字连接超时时失败:
java.sql.BatchUpdateException: Batch entry 0 DELETE FROM my_table WHERE vessel_id='xxxxxx' AND day='2020-09-15 00:00:00+00'::timestamp was aborted: An I/O error occurred while sending to the backend. Call getNextException to see other errors in the batch.
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
When the code retries to run the task the connection fails on
:FATAL: the database system is in recovery mode
在数据库日志中,我看到:
2020-09-21 16:44:27 UTC::@:[26848]:DETAIL: Failed process was running: DELETE FROM my_table WHERE vessel_id=$1 AND day=$2
2020-09-21 16:44:27 UTC::@:[26848]:LOG: terminating any other active server processes
2020-09-21 16:44:27 UTC:172.31.4.110(59468):postgres@postgres:[27705]:WARNING: terminating connection because of crash of another server process
2020-09-21 16:44:27 UTC:172.31.4.110(59468):postgres@postgres:[27705]:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-09-21 16:44:27 UTC:172.31.4.110(59468):postgres@postgres:[27705]:HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-09-21 16:44:27 UTC:10.3.1.138(57926):rdsrepladmin@[unknown]:[26740]:WARNING: terminating connection because of crash of another server process
2020-09-21 16:44:27 UTC:10.3.1.138(57926):rdsrepladmin@[unknown]:[26740]:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-09-21 16:44:27 UTC:10.3.1.138(57926):rdsrepladmin@[unknown]:[26740]:HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-09-21 16:44:27 UTC::@:[22480]:WARNING: terminating connection because of crash of another server process
2020-09-21 16:44:27 UTC::@:[22480]:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-09-21 16:44:27 UTC::@:[22480]:HINT: In a moment you should be able to reconnect to the database and repeat your command.
2020-09-21 16:44:27 UTC:127.0.0.1(31826):rdsadmin@rdsadmin:[27967]:FATAL: the database system is in recovery mode
知道为什么表被分区时数据库会失败吗?为什么其他计算机上的所有其他连接都已关闭,数据库进入恢复模式?
暂无答案!
目前还没有任何答案,快来回答吧!