Cassandra: What causes ReadStage exceptions in JVMStabilityInspector?

Posted by bcs8qyzn on 2023-05-22 in Cassandra

I have just upgraded from Cassandra 3.11.13 to Cassandra 4.1.1, and I can see errors like this in the logs:

ERROR [ReadStage-1] 2023-05-18 09:00:13,032 JVMStabilityInspector.java:68 - Exception in thread Thread[ReadStage-1,5,SharedPool]
java.lang.NullPointerException: null
        at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:854)
        at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:793)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:219)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:158)
        at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
        at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:770)
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.mergeStaticRows(UnfilteredRowIterators.java:494)
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.<init>(UnfilteredRowIterators.java:407)
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.create(UnfilteredRowIterators.java:426)
        at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.access$000(UnfilteredRowIterators.java:391)
        at org.apache.cassandra.db.rows.UnfilteredRowIterators.merge(UnfilteredRowIterators.java:135)
        at org.apache.cassandra.db.SinglePartitionReadCommand.withSSTablesIterated(SinglePartitionReadCommand.java:817)
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:766)
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:625)
        at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:459)
        at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:419)
        at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:61)
        at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
        at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
        at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
        at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
        at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:124)
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:120)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:829)
ERROR [ReadStage-3] 2023-05-18 09:00:41,324 JVMStabilityInspector.java:68 - Exception in thread Thread[ReadStage-3,5,SharedPool]
java.lang.NullPointerException: null

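It is a 6-node cluster, with 8 CPUs and 64 GB RAM per node, Java 11, Shenandoah GC, and a 31 GB heap. Do you have any idea where to start troubleshooting this? Thank you.
The output of nodetool tpstats is shown below: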

Pool Name                    Active Pending Completed Blocked All time blocked
RequestResponseStage         0      0       4597757   0       0               
MutationStage                0      0       2280940   0       0               
ReadStage                    0      0       3770853   0       0               
CompactionExecutor           0      0       6828      0       0               
MemtableReclaimMemory        0      0       34        0       0               
PendingRangeCalculator       0      0       10        0       0               
GossipStage                  0      0       11283     0       0               
SecondaryIndexManagement     0      0       1         0       0               
HintsDispatcher              0      0       0         0       0               
MigrationStage               0      0       5         0       0               
MemtablePostFlush            0      0       49        0       0               
PerDiskMemtableFlushWriter_0 0      0       22        0       0               
ValidationExecutor           0      0       0         0       0               
Sampler                      0      0       0         0       0               
ViewBuildExecutor            0      0       0         0       0               
MemtableFlushWriter          0      0       34        0       0               
CacheCleanupExecutor         0      0       0         0       0               
Native-Transport-Requests    1      0       1745221   0       0               

Latencies waiting in queue (micros) per dropped message types
Message type                      Dropped     50%      95%      99%      Max     
READ_RSP                          0           924.0    2299.0   2759.0   8239.0  
RANGE_REQ                         0           642.0    1916.0   2759.0   2759.0  
PING_REQ                          0           0.0      0.0      0.0      0.0     
PAXOS2_COMMIT_REMOTE_RSP          0           0.0      0.0      0.0      0.0     
PAXOS2_COMMIT_AND_PREPARE_RSP     0           0.0      0.0      0.0      0.0     
_SAMPLE                           0           0.0      0.0      0.0      0.0     
VALIDATION_RSP                    0           0.0      0.0      0.0      0.0     
SCHEMA_PULL_RSP                   0           0.0      0.0      0.0      0.0     
SYNC_RSP                          0           0.0      0.0      0.0      0.0     
PAXOS2_CLEANUP_START_PREPARE_REQ  0           0.0      0.0      0.0      0.0     
PAXOS2_PREPARE_REFRESH_REQ        0           0.0      0.0      0.0      0.0     
SCHEMA_VERSION_REQ                0           0.0      0.0      0.0      0.0     
HINT_RSP                          0           0.0      0.0      0.0      0.0     
BATCH_REMOVE_RSP                  0           0.0      0.0      0.0      0.0     
PAXOS2_CLEANUP_RSP                0           0.0      0.0      0.0      0.0     
PAXOS2_CLEANUP_FINISH_PREPARE_RSP 0           0.0      0.0      0.0      0.0     
PAXOS_COMMIT_REQ                  0           642.0    1916.0   2299.0   2759.0  
SNAPSHOT_RSP                      0           0.0      0.0      0.0      0.0     
COUNTER_MUTATION_REQ              0           0.0      0.0      0.0      0.0     
PAXOS2_PROPOSE_REQ                0           0.0      0.0      0.0      0.0     
GOSSIP_DIGEST_SYN                 0           924.0    35425.0  88148.0  88148.0 
PAXOS_PREPARE_REQ                 0           924.0    2299.0   2759.0   14237.0 
PREPARE_MSG                       0           0.0      0.0      0.0      0.0     
PAXOS2_PREPARE_REFRESH_RSP        0           0.0      0.0      0.0      0.0     
PAXOS_COMMIT_RSP                  0           1109.0   2299.0   2759.0   2759.0  
HINT_REQ                          0           0.0      0.0      0.0      0.0     
BATCH_REMOVE_REQ                  0           642.0    1916.0   2299.0   2759.0  
STATUS_RSP                        0           0.0      0.0      0.0      0.0     
READ_REPAIR_RSP                   0           535.0    535.0    535.0    535.0   
PAXOS2_PROPOSE_RSP                0           0.0      0.0      0.0      0.0     
GOSSIP_DIGEST_ACK2                0           1109.0   35425.0  61214.0  61214.0 
CLEANUP_MSG                       0           0.0      0.0      0.0      0.0     
REQUEST_RSP                       0           642.0    73457.0  88148.0  219342.0
TRUNCATE_RSP                      0           0.0      0.0      0.0      0.0     
UNUSED_CUSTOM_VERB                0           0.0      0.0      0.0      0.0     
REPLICATION_DONE_RSP              0           0.0      0.0      0.0      0.0     
SNAPSHOT_REQ                      0           0.0      0.0      0.0      0.0     
ECHO_REQ                          0           0.0      0.0      0.0      0.0     
PAXOS2_REPAIR_REQ                 0           0.0      0.0      0.0      0.0     
PAXOS2_CLEANUP_COMPLETE_RSP       0           0.0      0.0      0.0      0.0     
PREPARE_CONSISTENT_REQ            0           0.0      0.0      0.0      0.0     
FAILURE_RSP                       0           1916.0   1916.0   1916.0   1916.0  
BATCH_STORE_RSP                   0           924.0    2299.0   2759.0   2759.0  
SCHEMA_PUSH_RSP                   0           0.0      0.0      0.0      0.0     
MUTATION_RSP                      0           924.0    2299.0   2759.0   14237.0 
FINALIZE_PROPOSE_MSG              0           0.0      0.0      0.0      0.0     
ECHO_RSP                          0           0.0      0.0      0.0      0.0     
PAXOS2_REPAIR_RSP                 0           0.0      0.0      0.0      0.0     
INTERNAL_RSP                      0           0.0      0.0      0.0      0.0     
FAILED_SESSION_MSG                0           0.0      0.0      0.0      0.0     
PAXOS2_CLEANUP_COMPLETE_REQ       0           0.0      0.0      0.0      0.0     
_TRACE                            0           0.0      0.0      0.0      0.0     
SCHEMA_VERSION_RSP                0           0.0      0.0      0.0      0.0     
FINALIZE_COMMIT_MSG               0           0.0      0.0      0.0      0.0     
SNAPSHOT_MSG                      0           0.0      0.0      0.0      0.0     
PREPARE_CONSISTENT_RSP            0           0.0      0.0      0.0      0.0     
PAXOS_PROPOSE_REQ                 0           642.0    1916.0   2299.0   3311.0  
PAXOS_PREPARE_RSP                 0           924.0    2299.0   2759.0   14237.0 
MUTATION_REQ                      0           642.0    1916.0   2299.0   11864.0 
PAXOS2_CLEANUP_RSP2               0           0.0      0.0      0.0      0.0     
READ_REQ                          0           642.0    1597.0   2299.0   9887.0  
PING_RSP                          0           0.0      0.0      0.0      0.0     
PAXOS2_COMMIT_REMOTE_REQ          0           0.0      0.0      0.0      0.0     
RANGE_RSP                         0           1109.0   2299.0   2299.0   2299.0  
VALIDATION_REQ                    0           0.0      0.0      0.0      0.0     
SYNC_REQ                          0           0.0      0.0      0.0      0.0     
PAXOS2_PREPARE_RSP                0           0.0      0.0      0.0      0.0     
_TEST_1                           0           0.0      0.0      0.0      0.0     
GOSSIP_SHUTDOWN                   0           0.0      0.0      0.0      0.0     
PAXOS2_CLEANUP_START_PREPARE_RSP  0           0.0      0.0      0.0      0.0     
TRUNCATE_REQ                      0           0.0      0.0      0.0      0.0     
_TEST_2                           0           0.0      0.0      0.0      0.0     
GOSSIP_DIGEST_ACK                 0           924.0    35425.0  61214.0  61214.0 
PAXOS2_CLEANUP_REQ                0           0.0      0.0      0.0      0.0     
SCHEMA_PUSH_REQ                   0           0.0      0.0      0.0      0.0     
FINALIZE_PROMISE_MSG              0           0.0      0.0      0.0      0.0     
PAXOS2_CLEANUP_FINISH_PREPARE_REQ 0           0.0      0.0      0.0      0.0     
PAXOS2_PREPARE_REQ                0           0.0      0.0      0.0      0.0     
BATCH_STORE_REQ                   0           642.0    1916.0   2299.0   9887.0  
COUNTER_MUTATION_RSP              0           0.0      0.0      0.0      0.0     
REPAIR_RSP                        0           0.0      0.0      0.0      0.0     
PAXOS2_COMMIT_AND_PREPARE_REQ     0           0.0      0.0      0.0      0.0     
STATUS_REQ                        0           0.0      0.0      0.0      0.0     
SCHEMA_PULL_REQ                   0           0.0      0.0      0.0      0.0     
READ_REPAIR_REQ                   0           446.0    535.0    535.0    535.0   
REPLICATION_DONE_REQ              0           0.0      0.0      0.0      0.0     
PAXOS_PROPOSE_RSP                 0           924.0    2299.0   2759.0   3311.0

Any help is much appreciated, thanks.

ykejflvf #1

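Unfortunately, there is nothing in your post that provides a clue as to what the problem is.
In fact, you have truncated the most important part of the error, which is the stack trace.
In any case, you will need to investigate further and analyse the logs for clues. Pay attention to the messages in the few minutes leading up to the error, but I suggest you "zoom out" by looking only at system.log.
DEBUG messages are not helpful at this stage of your investigation, because they are not relevant until you have zoomed in on a specific problem. Cheers!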
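
One way to collect the messages leading up to each error is a small script along the following lines. This is only a minimal sketch, assuming the log line prefix looks like the excerpt in the question (level, thread name in brackets, then a timestamp with milliseconds). The function names `parse_line` and `context_before_errors`, the five-minute window, and the log path passed on the command line are illustrative assumptions and not something Cassandra provides.

```python
import re
import sys
from collections import deque
from datetime import datetime, timedelta

# Matches the prefix seen in the question's excerpt, e.g.
# "ERROR [ReadStage-1] 2023-05-18 09:00:13,032 JVMStabilityInspector.java:68 - ..."
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(?P<rest>.*)$"
)


def parse_line(line):
    """Return (timestamp, level, thread, rest), or None for continuation lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
    return ts, m.group("level"), m.group("thread"), m.group("rest")


def context_before_errors(lines, marker="JVMStabilityInspector",
                          window=timedelta(minutes=5)):
    """Yield (error_line, preceding_lines) for each ERROR line containing marker.

    preceding_lines holds the log entries seen within `window` before the error.
    Stack-trace lines (which have no level/timestamp prefix) are skipped.
    """
    recent = deque()  # (timestamp, raw line) pairs still inside the window
    for raw in lines:
        line = raw.rstrip("\n")
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, level, _thread, rest = parsed
        # Discard entries that are older than the window relative to this line.
        while recent and ts - recent[0][0] > window:
            recent.popleft()
        if level == "ERROR" and marker in rest:
            yield line, [text for _, text in recent]
        recent.append((ts, line))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python log_context.py <path to system.log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for error, context in context_before_errors(fh):
            print("=" * 80)
            for entry in context:
                print(entry)
            print(error)
```

Running it against system.log prints, for each matching error, the entries logged in the preceding window, which makes it easier to spot a compaction, flush, or schema event that happened just before the exception.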
